Killer Context: How AI Will Eat Security and Software

Presenter: Daniel Miessler

Transcript:

Our next session is Killer Context: How AI Will Eat Security and Software, with Daniel Miessler. Daniel is a cybersecurity and AI thought leader running Unsupervised Learning and advising on cyber and AI adoption. He is the creator of a widely read blog with 3,000-plus essays and projects spanning security, tech, and philosophy, a former technical leader at Apple, HP, and Robinhood, a consultant to Fortune 1000 organizations, and the architect of open source tools like Fabric AI and SecLists, a staple in Kali Linux.

So please welcome to the stage Daniel Miessler.

Thanks so much, and thanks to the conference for having me. All right. So today I want to talk about something pretty intense and crazy and, yeah, kind of wild. I want to talk about the future of hacking and the future of software. And by the future of hacking, what do I actually mean? Like, what scale or scope am I talking about?

Actually, most of it. Most hacking, most security, most software: attack, defense, bug bounty, pretty much all of it. And I am aware of how big of a claim that is, so I just want to call that out in the beginning. And the reason I think I can make a prediction like this is because it's actually a stochastic prediction, which is a fancy word that I actually love, which means it's directionally true.

But you're not actually making predictions about things you can't predict, like specific technology or specific companies. And I love this. This is the way I think of stochastic. Let's say you have some bar somewhere, an outdoor bar, and you have a person who comes and drinks and gets drunk every single night. And they always stumble home. Now, you could use all the supercomputers in the world to try to predict their next step, and it wouldn't work.

But you do know that they're going home. So that's my favorite way of thinking about stochastic. So, a little bit of background. I've been in security since 2000 or, yeah, 1999. And I went heavy into AI around 2022. And basically the way I see it is kind of the basis of the word security, which is actually a Latin combination of "se" and "cura," which is "without worry."

So I love this concept of just abstracting security a little bit into trying to help people and businesses just not be worried and be able to do their best work. Most of my background is in technical assessment of different types. Mostly web attack assessment and that sort of thing, but assessment in general.

So going back like 15 years or so, whenever I do a security assessment, I sort of start it in an unconventional way. Rather than starting with the vulns and kind of working around, what I do is I start with interviewing people and having conversations with, like, the leaders of the company to try to figure out what it is they're actually trying to do.

And then I move down through the tiers and have conversations with everyone. I'm trying to extract what they consider to be the worst situations. And that's basically how I proceed, all the way down to the people doing the technical work. And as I keep gathering more and more of this information, I actually turn that into diagrams.

So, back when I was doing this on-prem, I would do this with an actual whiteboard, and everyone would be at the office. They would come in and I would interview them. That doesn't happen so much since the pandemic. Most people stay remote now. But what would happen is, over the course of time, they would see this board materialize.

And it was quite valuable for the company, because everyone who came in would, like, change the board and say, no, no, that's not correct. So after a week or two of that, I would then do the technical assessment and then start figuring out what we actually need to fix. But the key idea is taking the context all the way from the top, all the way through the different layers, and using that as the background of the technical assessment.

And that's really how I view security assessment, and have for a long time, before AI. So in a completely separate thread (we're going to jump around a bit here in the beginning), talking about consumer tech: in 2013, 2014, I started thinking a lot about this stuff, and I ended up writing a book trying to predict what was going to happen, again in a stochastic way.

It's not a great book. You should not buy it. I turned it into a blog post. You could just use AI to read it. That's much, much better than reading it, because it's kind of crappy. But the ideas were pretty good, and they're starting to happen.

So the first idea is that you will have an AI-powered digital assistant that knows everything about you and will advocate for you. The second idea is basically everything gets an API. So everything has what I call a daemon, and it's broadcasting information about itself, and the DA basically uses those daemons to advocate for you. And the third idea was that it would provide an augmented reality interface to you, using glasses or lenses or something.

And the final piece, the fourth piece, was that once everything has these daemons, you would be able to use other AI to move towards human goals. So those were kind of the ideas. In 2018 I got a job at Apple doing information security stuff for them, but actually joined a machine learning team.

So it was actually an AI team that I joined. So I got lots of exposure to doing AI stuff with security. Early 2021, I left Apple to go build up AppSec and vuln management for Robinhood, and ended up doing a presentation there on using context to run a vuln management program. So this whole thing is just kind of building another brick in the path.

So after doing that, I decided to go build and do my own consulting stuff. And this happened a few months, like four or six months, before ChatGPT happened, which turned out to be great timing. And obviously I went absolutely crazy the moment ChatGPT happened. A lot of people in this room actually probably got calls and texts from me saying, stop what you're doing.

I don't care what you're doing. Just stop what you're doing, and you've got to pivot into this. Yeah. Everyone got calls about this. So the first place that my head went with all this is security assessment and managing security programs. But I kind of quickly realized it was bigger than security as well. And a lot of the questions that we can ask and answer can be more universal.

So in March of '23, I wrote this post called SPQA, which is a horrible name, like it's just an acronym. It's state, policy, questions, and actions. And you can basically build this for anything. You could build this for, like, managing a church or managing a giant organization or whatever. So you have policy, which is what you're trying to accomplish.

You have questions that you constantly want the answers to. You have actions that we or the AI can take to help you towards those goals. So I'm starting to zero in tighter on all these different components here. And I got decent traction, but I wanted to start working on something more tangible. So for a talk at Black Hat that year, I put together a fake company called Alma and gave it tons of context, just like a regular security company, like I would do during an assessment.

So it's got the company mission, how they differentiate themselves, their goals, how they do business, the risk register, the security team, its members, all the projects. Basically collecting tons of this stuff for the organization. And this is very similar to what I was doing before without AI. And then what I started doing with this is asking questions. Like, you could basically manage a security program.

I don't know how well you can see that, but: what percentage of endpoints have this? What controls do we have in this region of the country? And if you're in security, if you're a defender, you know that basically your security program is answering these constantly, from different people asking them in different ways. Security questionnaires especially are a lot of this.

So here's an example of a CSO making a statement: no more connections to a particular resource. And we're asking the question, should this be allowed? So the CSO said that earlier, and we have this system. This is a live system. This was '23. This is a live agent responding back saying, no, it's not allowed, because the CSO said in earlier context that you can no longer do that.
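To make that exchange concrete, here is a minimal sketch of answering a "should this be allowed?" question from earlier context. Everything in it is an assumption for illustration: the `Statement` record, the `is_allowed` helper, and the resource name are hypothetical, and plain matching on the resource name stands in for whatever the live agent did with a language model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statement:
    """One piece of stored context (hypothetical structure, not from the talk)."""
    author: str
    resource: str
    allowed: bool
    order: int  # position in the context; higher means said later


def is_allowed(context: list[Statement], resource: str) -> tuple[Optional[bool], str]:
    """Answer 'should this be allowed?' from the most recent matching statement."""
    matches = [s for s in context if s.resource == resource]
    if not matches:
        # Nothing in context covers this resource, so report "unknown" rather than guess.
        return None, "no statement about this resource in context"
    latest = max(matches, key=lambda s: s.order)
    verdict = "allowed" if latest.allowed else "not allowed"
    return latest.allowed, f"{verdict}: {latest.author} said so in earlier context"


# Hypothetical context: the CSO first permits, then later forbids, connections.
context = [
    Statement(author="CSO", resource="legacy-db", allowed=True, order=1),
    Statement(author="CSO", resource="legacy-db", allowed=False, order=2),
]

allowed, reason = is_allowed(context, "legacy-db")
print(allowed, reason)
```

The later statement wins, which is the behavior described in the example: the answer cites who set the policy rather than just returning yes or no.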


So through '23 and '24, I've basically been building around this concept of context. And I want to give some more examples of that. So later in '23, I built a thing called Threshold, which is an app that takes content from over 3,000 different sources. And it tells me how good the content is, independent of any other factor.

I basically have a universal assessor of the content, and I can control the slider for how good the content has to be before it will actually show up for me. Currently, I'm launching a thing called Same Page. It's basically doing the same thing that I've been doing in my security assessments for all these years, which is collecting all that context, and turning that into an app.
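The slider idea behind Threshold can be sketched in a few lines. This is only an illustration under stated assumptions: the `Item` record, the 0 to 100 score range, and the sample titles are made up, and the scoring itself (the hard part, done by the assessor) is taken as given.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    quality: float  # hypothetical 0-100 score produced by some content assessor


def visible_items(items: list[Item], slider: float) -> list[Item]:
    """Keep only items at or above the slider value, best first."""
    kept = [item for item in items if item.quality >= slider]
    return sorted(kept, key=lambda item: item.quality, reverse=True)


# Made-up feed with made-up scores.
feed = [
    Item("Deep dive on recon", 91.0),
    Item("Vendor press release", 35.0),
    Item("Solid IR write-up", 78.0),
]

print([item.title for item in visible_items(feed, slider=75.0)])
```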

Another thing I've had for, I don't know, 12 years or something, is called Helios Attack Surface Management. I just completed migrating this to AI as well. So once again, it's about gathering context at scale and then doing something with it. And the last one I'll mention is something I'm building now: I'm trying to make myself, like, a presidential intel report.

So I could basically say what I really care about, and then I can go and collect people and their opinions from, like, X or Bluesky or whatever, and just collect them all into one place so it can find me the patterns. I just basically want an intel report. So all of these separate ideas are loosely around this concept of context and AI.

And I felt like I had a pretty unified theme there. But at the beginning of this year, I was like, wait a minute, this is actually something completely different. So I think I have a simpler and much more powerful way of describing all of this. And that's something I call unified entity context. And that's obviously not going to be the name that Gartner uses, which is going to be the name that everyone uses.

But it's what I'm using for now. So keep everything that I've talked about so far in your mind. And now let's talk about cybersecurity specifically and some use cases, because there's a lot of similarities here. So for SOC, it's looking at data: lots of different logs, threat intel reports, IAM systems, endpoint data, and all that stuff. For IR,

it's a lot of the same stuff. But you're trying to create a narrative of what happened and determine the scope, right? So determine how bad it is, blast radius, all that. With pentesting, you're gathering tons of information and figuring out how you can get to a goal. Put the pieces together. Demonstrate value. Same with red team, but with a different scope.

With vuln management, we actually have to understand the organization really well, otherwise we can't do remediation. Program management: you've got to have all this stuff as well. Projects, budgeting, people management. GRC: it's like, who do we have to be compliant with? You know, what regulations do we fall under? So the common issue here is that most of these actually require the ability to see multiple parts of the org and be able to piece them together.

And that's the thing I think is really powerful. This is why security analysts and incident responders are so valuable, because they're actually mapping those pieces together. It's not like a single task. And that's the part that's hard. It's connecting the dots. So I'll take vuln management as another example of this. Or a stronger example of this.

And I just want to ask, what is the hard part about vulnerability management? Is it finding vulnerabilities? Would you say you don't have enough, like you're trying to talk to vendors because you don't have enough vulns? Is it making dashboards? It's not dashboards. It's fixing the vulns, right? And the reason it's hard to fix vulns is because, okay, what application is it a part of? What repo is that?

What dev team? Right. All this context about the org is what makes you able to do remediation. And you would think that would be easy. But if you've done this, you know it's extremely not easy, especially when the teams are changing all the time. The projects are changing all the time. It's a mess. So here's the question: how much of our inability to do a good job at vuln management is a security problem, and how much of it is an organizational knowledge problem?
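The application, repo, dev team chain can be sketched as a lookup over org context. This is a hedged illustration, not a real system: the two mapping tables, the `route_finding` helper, and every name in them are hypothetical. The point it shows is that the routing fails exactly where the organizational knowledge is missing.

```python
# Hypothetical org context: which repo builds each application,
# and which dev team owns each repo.
APP_TO_REPO = {"payments-api": "org/payments", "login-web": "org/identity"}
REPO_TO_TEAM = {"org/payments": "team-payments", "org/identity": "team-identity"}


def route_finding(application: str) -> str:
    """Resolve application -> repo -> dev team, or flag the gap in org knowledge."""
    repo = APP_TO_REPO.get(application)
    if repo is None:
        return "unowned: no repo known for this application"
    team = REPO_TO_TEAM.get(repo)
    if team is None:
        return f"unowned: no team known for repo {repo}"
    return team


# Made-up findings: (finding id, affected application).
findings = [("finding-1", "payments-api"), ("finding-2", "old-batch-job")]
for finding_id, app in findings:
    print(finding_id, "->", route_finding(app))
```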

Like, what do you think the percentages are? And now ask that question for all areas of security. And where it starts to get very strange is that this is not even really a security thing. Software and service industries in general are based on asking a set of questions to some data sets or collections of data sets. Like, HR software is asking HR questions and getting back HR answers.

Same for every other area of software. Project management, same thing: you're asking project management questions and getting back project management answers. Do we really think that these all need separate repositories? That they need separate data stores, their own APIs? Like, why are these all separate software industries and software verticals? I don't think they will be for that much longer.

I think that starts to go away when we migrate to this thing called unified entity context, or whatever it ends up being named. And I want to make something clear: you can't just take all the data from all the different places of the business, like the most sensitive financial data, and dump it in with, like, network telemetry logs or whatever.

You can't do that, obviously, because of regulations and reasons like that. But logically it's going to be very similar. It's going to be unified, and you'll have controls that keep certain types of data from other types. So if you think about a person: your history, your belief system, your aspirations, favorite books and music. You start to collect all the things that matter to a human.

You pull that into context: family goals, medical history, schedule and calendar, and transcripts. You know, why is my relationship not working? When you have all of that context, you can answer extraordinarily powerful questions. How to improve my health, for example. That's a good one. If you're a company, go back to the stuff that we had in the Alma context.

So history of the company, its goals. Like, it's been breached in the past, we got a CSO at this time, whatever it is. Slack messages, current projects, desired RR. Like, you just put as much as you can in there, and this becomes the baseline for everything going forward. Once you have that, then you take your best AIs and you point them at that context, and now they can see the dots.

So I think we might have this entire thing backwards. Instead of cybersecurity or finance or whatever industry being at the center, and then the question is, oh, how do we use AI, I think it's more like this: the context of the entity is actually the center. The collection of all that context and all that knowledge about the thing that you care about is actually the most important thing.

Software verticals go away. They kind of just blur into this, and it becomes use cases on top of context. So how does this apply to hacking? Basically, I think where hacking is going is that the whole game turns into how good your world model is of yourself, if you're a defender, and how good your world model is of your target, if you're an attacker.

So it's a battle of these world models: how accurate they are and how close to real time they are kept. So it's a competition between the attacker and defender, but it's actually a competition between the attacker's AI system and the defender's AI system.

So everyone listening to this, and every attacker, and hopefully every defender, will have a stack similar to this, with a whole bunch of modules and eventually millions of agents. And it's a whole bunch of really powerful small modules that do one thing well. So it's gathering context about all the employees and all their social media posts and what they're talking about in forums, and maybe they're leaking data somewhere.

Automated crawls, parsing things like DNS changes. So all the IT infrastructure. Each of those is a separate thing, and they're kind of like separate industries that we've gone through, separate products. And those become modules inside of this larger system. Writing exploits, PoCs, all of that becomes modules. So let's say the attacker is looking at someone who has, like, five main web applications.

They're also going to be gathering additional domains. They're going to be gathering all the pages for those. They're going to learn about their new marketing things. Oh really? If you have new marketing, it's a new product. Cool. There must be infrastructure. Let's go find the resources for that infrastructure. So you could basically spin up all these modules pointing at this new target and start pulling in that context.
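As a rough sketch of what "small modules pointed at a target" might look like in code: each module is a function that reads and extends a shared context for one target. All names here (`domain_module`, `marketing_module`, `infra_module`, `run_stack`) are hypothetical, and the modules return canned data instead of doing real discovery.

```python
from typing import Callable

# Shared context for one target: category name -> set of facts.
Context = dict[str, set[str]]
Module = Callable[[str, Context], None]


def domain_module(target: str, ctx: Context) -> None:
    """Stand-in for domain discovery; adds canned hostnames."""
    ctx.setdefault("domains", set()).update({f"www.{target}", f"shop.{target}"})


def marketing_module(target: str, ctx: Context) -> None:
    """Stand-in for watching marketing pages for new product launches."""
    ctx.setdefault("products", set()).add("new-product")


def infra_module(target: str, ctx: Context) -> None:
    """A new product implies new infrastructure, so add candidate hostnames to check."""
    for product in ctx.get("products", set()):
        ctx.setdefault("domains", set()).add(f"{product}.{target}")


def run_stack(target: str, modules: list[Module]) -> Context:
    """Run every module in order against one target, accumulating context."""
    ctx: Context = {}
    for module in modules:
        module(target, ctx)
    return ctx


ctx = run_stack("example.com", [domain_module, marketing_module, infra_module])
print(sorted(ctx["domains"]))
```

Pointing the same stack at a new target is one more `run_stack` call, which is the property the talk leans on: the modules are built once and reused.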

So let's say we find some good stuff and we start to set off exploit agents to go and attack. If we're an attacker, maybe we're trying to extract data. We're trying to figure out what data is the most dangerous to the target and what they're most likely to pay a ransom for. If we're a bounty person, we're trying to create, like, a PoC or something so we can get paid, and have it be as good as possible.

But that's not the cool part. The cool part is that this thing never sleeps. So when the attacker builds this thing and the modules work well, they can just give it a new target, give it a new target, give it a new target, and it just keeps working. So new forum posts come out, new employees get hired, new comments get made.

Like, oh, I hate this company. They've got really insecure stuff. They use default passwords for their Cisco gear. They're just so lame. They're probably going to go under soon. Oh, really? So one of my agents just went and picked that up and said, let's go after the Cisco infrastructure. So this automation, just the fact that you only have to build the system once. You still have to maintain it, but you really only have to build the system once, and it just keeps working.

Now, it sounds complex, but like I said, you really just have to set it up well, and it's the type of thing that can keep running. And the other thing is, when the models and agents get better, it improves the whole system, right? The models get smarter, your agents get smarter, the context management gets better.

It improves the whole system. There's actually an example of this now, which there wasn't the first time I talked about this. Anthropic released a paper, and they said in there that they had what looked like a full, large organization doing months of attacks. It was being done by one person with Claude in about two weeks.

And they basically had all the prompts, all the stuff. What I wanted to ask was, why did you allow it? Were you just watching so you could talk about it later? Like, what were you doing? That was confusing to me. But, yeah, this is not theoretical. People are already doing it. And because the attacker has this...

Yeah.

Do you see a few models emerging as the most common, like an 80/20-style set of common blue and red models, or are they going to be hyper-specific per user? You mean the AI models? Oh, no. That's the type of thing I would put in the category of not possible to predict, where, like, if I stood up here and I was like, well, it's going to be, you know, GPT-17.3 and it's going to have this feature.

Who knows? Nobody knows. I mean, I would say it's probably going to be a tiered system of some elite models being used for certain things, but a whole lot of open source models that don't even go to the cloud, all the way down, so it's like .0001 of a penny to do a lot of tasks. So ultimately, this whole stack should have hundreds or thousands of different models.

Some cloud and some local, probably mostly local, I'm guessing. I'm not sure. Good question though. So because the attacker has the stack, we as defenders need the same stack, right? So, we're kind of used to thinking about it this way with attack surface management, we're supposed to be constantly checking our own thing. This is like that.

Just way more advanced and way more comprehensive. So what I built first around this is... hold on.

Yeah. So this is essentially what a stack is going to look like. This is one particular stack that I'm using, the automated stack that I'm using for attack surface management, which I've basically moved over to being AI based. But really it's kind of the same thing that was working in the past. But the better the agents get, I mean, it just magnifies everything.

All right, so this is the way I'm sort of thinking about this and trying to build this. I'm literally building it for myself. I want to basically say: okay, Tesla modified their scope, and now they have lots of other top-level domains that are available. Kai, which is the name of my AI system, go ahead and attack those.

And, you know, me and a bunch of my buddies, we're going to go out and eat. I sit down to eat, the food is coming, and Kai starts talking in my ear like, hey, I found some stuff. Should I write it up now and submit it? I'm like, yeah, write it up now and submit it.

We need to be first. So I want to be having interactions with my AI system that are going through this whole stack and finding things dynamically for me while I'm doing other things. And keep in mind that the whole time, the target is changing. The target is changing constantly. So your AI stack has to be monitoring it constantly as well.

And the other thing is it's just a race between your AI system, your AI stack, finding the new stuff that's getting released versus them finding it first.

Yeah. So the entire game is who has the better model of your own system, the attacker. Or, you know, ideally you would because you're supposed to be able to just hit those products and APIs directly, which hopefully they can't do. But I'm afraid that for the next few years at least, the attackers are going to be doing this way faster than the defenders.

So next thing that's crazy. Yeah, yeah.

All right, so on that point, how do you solve for the challenge of, you know, let's say it's the good guys. The good guys find that, you know, it's short purple people with curly hair that are the... oh, we can't say that. That's violating these rules, and we can't single out this kind of, you know, whatever we're not allowed to say.

But the bad guys don't have those restrictions, right? They don't have to go through risk assessments, and "that's a sensitive topic," whatever. It's like, how do we solve that mismatch in timing, the asymmetry that tends to come up every time? Yeah. And unfortunately I don't think there's a way to solve that, because it is a great point.

The defender will always be slower, because the attacker can read about something Sunday morning and just blast out on Slack: holy crap, we're doing this Monday morning for 3,000 of our targets. Write it up now. We're launching at 9 a.m. And if a defender hears about a new technique, they're like, I guess let's get a meeting together, and we'll talk in the meeting about whether we can have other meetings about whether or not we can do this.

And that's going to be way slower than the attacker. Yeah. Let's put it in the budget for next year. Exactly. Maybe let's talk about the budget. Yeah. So I'm really worried about that. The other cool idea about this is, imagine how this thing improves, how the attacker's system improves.

One of the things that I'm working on here is, let's say Jason here does a new talk, and he puts out a new technique. Jason Haddix, over here. So he puts out a new technique on how to attack AI, and I've seen a number of things that he's talked about, because maybe we worked on the slides together or we hung out or whatever, but I haven't seen this talk yet.

And he mentions three things I've never seen before. And I'm just like, hey, Kai, did you see that? He's like, yep. All right, so when can you have that in there? He's like, hold on. Okay, done. So my system just got improved, because we just learned something we've never seen before from new research. And Kai can be crawling the whole internet, watching all the talks, pulling out all the stuff that it didn't know, to improve itself.

And guess what that means? The defender has to be doing the exact same thing. So, like, I'm already monitoring bug bounty systems. There's a bunch of projects that actually publish scopes, full scopes for all their different programs. So that is a thing that you can add back into the thing as well.


So your agent infrastructure can point to that stuff. I'm not testing live against them yet. Still working on that piece, but I really want it to be fully automated.
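Watching published scopes for changes comes down to comparing snapshots. Here is a minimal sketch under stated assumptions: the `diff_scope` helper and the domain names are made up, and how the snapshots get fetched from a bounty platform is left out entirely.

```python
def diff_scope(previous: set[str], current: set[str]) -> dict[str, set[str]]:
    """Compare two snapshots of a program's published scope."""
    return {"added": current - previous, "removed": previous - current}


# Made-up snapshots of one program's in-scope domains.
yesterday = {"example.com", "api.example.com"}
today = {"example.com", "api.example.com", "example.io", "example.dev"}

changes = diff_scope(yesterday, today)
print(sorted(changes["added"]))
```

Anything in `added` is what an agent stack would pick up as new work; `removed` matters just as much for staying inside the program's rules.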

So the real game here is just maintaining these models. We absolutely have to do that. So what we end up looking at, basically, is the time it takes for something to be uncovered: every stone, every new S3 bucket. Like we've already experienced before: maybe 10, 15 years ago, you could leave the S3 bucket out there for a little while before it got found.

Right? But now, with this many agents looking at your stuff and knowing exactly where to look, that's going to move into, you know, hours, minutes, eventually seconds. So that will probably take some time. So what does this mean for us, if this is correct? I think if you're a bounty player, or any type of offensive tester, you need to rebuild your stack to be context based, where you are constantly pulling in context from whatever your targets are.

You need to have separate modules for the social piece, separate modules for the DNS discovery, separate modules for recon, for parsing new business deals, all sorts of stuff. It's basically your context versus theirs, and then your automation against theirs. And if you're a defender, you need to try to determine how fast you could build this AI stack for yourself.

You've got to be building your context for your company. And it's not just your top-level domains that you put into an attack surface management tool. It's all the same stuff, because the attacker is going to have those exact same modules. So you also need to build your own agent automation stack that can actually do things based on that context.

And finally, if you're just trying to figure out where things are going with this AI stuff, just remember the core idea. The game isn't adding AI to stuff that we care about. The game is having real time world models of what we care about, and then using AI to take action on those things. Thanks for your time.

Oh, and also, one last thing. For any AI assessments or security training, we've got Jason and Julia from Arcanum, and they're right over here. Thanks.

Oh, yeah. Who has questions? Yeah, I thought there might be a few. Start at the front, work back. First time I remember having to sit in the front. Honestly, the slide that you had about all the little submodules that one might have, that are running out there continuously gathering information, continuously taking in context, working with each other: this sounds to me like the new version of someone's ultra-crazy hash cracking rig running at home.

That is a huge CPU resource draw, a huge power intake. Do you see that being something practical for the everyday person? Or, as information gets more and more complex and more Venn diagrams kind of need to be drawn between relationships, does this get away from feasibility? Yeah. It's a good question about feasibility.

I mean, I think just like anything else, you can start with a few modules, like monitoring forum posts, getting lists of employees. We've got one thing that we built that creates dossiers, like personality dossiers, on all the employees. So you could just, like, oh, you can send them, you know, phishing about pets or whatever. So, I mean, if you're trying to do phishing, maybe you only build like ten of these modules.

And to your point, you can also optimize with, hey, I really need to get off of OpenAI for this. I need to switch to a Llama. And you're also watching the models. So you watch the benchmarks and you realize this thing is going to do as well as OpenAI, so you swap it out. Now it's a local model.
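The swap-it-out decision can be sketched as "pick the cheapest model that is good enough for this module." The model names, scores, and costs below are invented for illustration, and `pick_model` is a hypothetical helper; real benchmark numbers would come from whatever evaluation you trust for the task.

```python
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str
    benchmark: float      # hypothetical score on the task's benchmark, 0-1
    cost_per_task: float  # hypothetical cost in dollars
    local: bool


def pick_model(options: list[ModelOption], min_score: float) -> ModelOption:
    """Pick the cheapest model whose benchmark score meets the bar for this module."""
    good_enough = [m for m in options if m.benchmark >= min_score]
    if not good_enough:
        raise ValueError("no model meets the required score")
    return min(good_enough, key=lambda m: m.cost_per_task)


# Invented options: one hosted model, one local model.
options = [
    ModelOption("hosted-large", benchmark=0.92, cost_per_task=0.01, local=False),
    ModelOption("local-small", benchmark=0.90, cost_per_task=0.000001, local=True),
]

choice = pick_model(options, min_score=0.88)
print(choice.name, choice.local)
```

Re-running the selection whenever benchmark scores change is what lets each module drift toward cheaper models without touching the rest of the stack.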

So over time, all those modules should get cheaper and cheaper. But that's not even what's actually probably going to happen. What's probably going to happen is, like, I've got an AI stack that just kind of helps you with anything in life and also work, but someone's going to release a full AI attack stack like that and just put it on GitHub.

And then that's... oh, now you just download it and you put in your keys and you turn it on and you point it at things and it starts attacking. Like, that's probably going to happen soon. I wonder if something like that would be taken down. But anyway, there's going to be open source versions.

So yeah, kind of to that point, given that there isn't, like, a hello world project that you can just go on GitHub and pull down and get started with. And I was at a talk earlier today and they put it really well: it's not best practices, there's just no practice. Yeah. Right. So from the perspective of the blue team, what's the stupid five-year-old version of the first step to get this started, right?

Yeah, I would take an iterative approach. I would probably start outside in. So I would start with gathering context to replicate your attack surface management system, assuming you have one. That would be a different step if you don't have one yet. But I love to capture things as questions. Just like, do we have any new domains that just came out within the last hour?

That's a question that I just always want the answer to. So that's one question. And there's a context, with a source of data that I need to connect to, to be able to answer it. And then I just ask another question, and another question. And that turns into the modules. So I would say start with, like, your top ten questions. That's probably where I would start.
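One way to picture "questions turn into modules" is a registry that maps each standing question to the function that answers it from some data source. This is a sketch under stated assumptions: the `FIRST_SEEN` table, the `QUESTIONS` registry, and the timestamps are all made up, and a real version would read from your own asset inventory.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable

# Hypothetical data source: domain -> time it was first seen.
FIRST_SEEN = {
    "example.com": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "new.example.com": datetime(2025, 6, 1, 11, 30, tzinfo=timezone.utc),
}


def new_domains_last_hour(now: datetime) -> list[str]:
    """Return domains first seen within the hour before `now`."""
    cutoff = now - timedelta(hours=1)
    return sorted(d for d, seen in FIRST_SEEN.items() if seen >= cutoff)


# Each standing question maps to the module that answers it.
QUESTIONS: dict[str, Callable[[datetime], list[str]]] = {
    "Do we have any new domains from the last hour?": new_domains_last_hour,
}

# A fixed "now" keeps the example reproducible.
now = datetime(2025, 6, 1, 12, 0, tzinfo=timezone.utc)
for question, answer in QUESTIONS.items():
    print(question, answer(now))
```

Adding the next question means adding one more entry and, if needed, one more data source, which is the iterative path described above.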

Anybody else? Surely there must be another question. All right. Thanks a lot. There you go.