Presenter:
Transcript:
Our first speaker. He has 30 years of experience in ICS cybersecurity, red teaming, and risk assessment. USAF veteran, author of Hacking Exposed Industrial Control Systems and ChatGPT for Cybersecurity Cookbook. Founder and head of Product Innovation at ThreatGEN and director of Cyber Innovation at MorganFranklin Cyber. Pioneer in gamification and AI-driven cybersecurity training. Creator of ThreatGEN
Red vs. Blue and AutoTableTop. May I present to you, Clint Bodungen. Yeah, I got it right. Okay. My title has changed at MorganFranklin since then, but. All right, this is going to be a magic trick, because they don't give us a monitor to see our own presentation. Since I don't have mine completely memorized,
I have to click here and click here. So we may end up getting into a situation where I'm clicking here and forgetting to click here. So if I'm talking about something and you're like, that doesn't match the slides, just let me know. All right. So we'll start off with a little story. All right. Who knows what this is.
Who's played this game. I know normally if we had more people in here, that would be somebody that would say thank you. I'm going to do that five times. I give it to you. Okay. Who knows what this is. Okay. So this is created by the creators of The Sims. This game came out in, I think, like 2009 or something like that, and back then, I loved this game.
And there's other games like SimEarth, right? And Sim evolution and all that stuff. So in this game you start with essentially particles or amoebas and single-celled organisms, and you do things over the course of the game to evolve your characters, and they evolve into building a big society. And so I've always loved things like that.
Simulations. And this sent me down the path that led me here today. And so I really got interested in trying to understand, for some reason, because I'm a big nerd when it comes to stuff like this that bothers everybody else, what creates gene selection and dominant gene selection through evolution. And I started thinking about this kind of stuff.
It's like, well, you know, what traits and what conditions and what environmental pressures cause different gene selections and gene pools and stuff like that. You know, this is the kind of stuff that excites me, and everybody else is just kind of like, what is wrong with you? And so that led me down the path to this, which is this book that came out, I think, in 2016 or so.
Well, so let me backtrack a little bit. In 2013, I got into game development, and that was for creating training and stuff like that, which became project. And when I combined my Wonder Twin powers of game development and my super nerdy interest in evolution, I really started getting into understanding how I can create these simulations. Why?
For no other reason. Literally, I just had this weird dream. I want to create a simulator that can figure out the best conditions and how to create dominant and recessive genes and all that. And I don't know, maybe I'm like Doctor Moreau or something, but I really wanted to create this. And so then I found this book, like three years after I got into game development.
And this gets into the really nerdy stuff, like all the math, right? All the calculations. And it's basically the mathematics behind evolution. But at this point, and by the way, this is not a talk on generative AI. This is not even a talk on AI. Some of this stuff I'll show you, and that we release, because all the research I'm doing here is open source.
I started trying to figure out if I needed AI for this. How do I build these simulations? And it was a long learning path. And then I came across this book here. If you are into the nerdy stuff like I am, like genetic algorithms, this bio-evolution stuff, and want to build these types of simulations,
This is like the book. This book is really awesome. It walks you through all the different code examples and the different types of bio-evolutionary algorithms. And it's in, like, Python, C, C++, all that. But this is where I really started to get an idea of how I can start to create these simulations. But at this point, it had nothing to do with cybersecurity.
I wasn't even thinking cybersecurity. I just wanted to do this really nerdy crap. So I ended up creating, and I think, Beesley, I think you've seen this next piece before. Haha. And so I built that, finally ended up building this program, a little Python script. And for what it does, all things considered, it's actually pretty small, where you can enter in all of these little calculations and all these parameters.
And you can, you know, okay, how many random traits do you want? And at this point, my thought pattern was, I want to learn what creates different genes, what creates different protection mechanisms for an ecosystem, and, you know, what kind of pressures does it take? And so, like, I had this really weird idea: I want to create, like, brand new, unheard-of genes.
Okay, I know that sounds weird. But so it has all these parameters, and you can, you know, adjust these parameters in an ecosystem, figure out how it all works, and have fun with it. And so there's a video of it in action. Right.
How do we do this? There we go. All right. So basically what you're looking at is: all the green is food, all of the different colors are different particular genes, and the red are the predators. And over time all these different environmental conditions change. And so the non-red ones will pair up and mate and create offspring, and the reds can create offspring too.
So the non-red ones feed off the food. The food will grow back. The red are predators, and they feed off the prey. But they all also have to deal with all the environmental conditions like snow and seasons and rain and cloud and, you know, all these things. And then the different traits that it starts off with are things like camouflage, speed, adaptability, all those things.
Right. And then when it runs a simulation, you can kind of see: how long did it run? Over a percentage of time, what was the most common trait? What percentage of traits and genes did you see across time? What were the environmental factors that affected things?
And this is just a quick summary of the overall data. And then I would go and run these things like Monte Carlo simulations, essentially changing all the variables and seeing what happens: what happens when I change this, what happens when I run this a thousand times, a million times. Right. And so you might be asking yourself, what does this have to do with cybersecurity?
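A minimal sketch of the kind of simulation and Monte Carlo sweep described above, in Python since the talk mentions a Python script. This is not the speaker's program: it drops the spatial grid, the traits, and the weather, and tracks only food, prey, and predator counts. Every function name, parameter, and default value here is an assumption chosen for illustration. It shows the three ways a run can end (predators wipe out the prey, the prey outlast the predators, or both coexist for the whole run) and how repeating runs with different seeds summarizes which outcome a setting tends to produce.

```python
import random
from collections import Counter


def run_once(seed, steps=200, prey=60, predators=10, food=200,
             regrowth=40, birth_rate=0.25, hunt_rate=0.01,
             predator_birth=0.3, predator_death=0.15):
    """Run one toy ecosystem and report how it ended (all values hypothetical)."""
    rng = random.Random(seed)
    for _ in range(steps):
        food += regrowth                       # food grows back every step
        fed = min(prey, food)                  # each prey eats one unit of food
        food -= fed
        births = sum(rng.random() < birth_rate for _ in range(fed))
        prey = fed + births                    # prey that found no food starve

        kills = 0
        for _ in range(predators):             # each predator hunts once per step
            remaining = prey - kills
            if remaining > 0 and rng.random() < min(1.0, hunt_rate * remaining):
                kills += 1
        prey -= kills

        newborn = sum(rng.random() < predator_birth for _ in range(kills))
        deaths = sum(rng.random() < predator_death for _ in range(predators))
        predators = predators - deaths + newborn

        if prey == 0:
            return "predators_win"             # the simulation ends: everything eaten
        if predators == 0:
            return "prey_win"                  # the simulation ends: no threats left
    return "coexist"                           # still balanced when time ran out


def monte_carlo(runs, **params):
    """Repeat the simulation with different seeds and tally the outcomes."""
    return Counter(run_once(seed, **params) for seed in range(runs))


if __name__ == "__main__":
    print(monte_carlo(200))
    print(monte_carlo(200, hunt_rate=0.02))
```

Comparing the two tallies printed at the end is the "change one variable and run it many times" loop in miniature; finding settings where `coexist` dominates corresponds to the equilibrium the talk goes on to describe.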
Can anybody think about what this might have to do with cybersecurity. Anybody? Yeah. Trying. But even more directly right. So the not just the process. Think about not just the process that I, that I followed, but think about how this could potentially. Okay, here, I'll give you a hint or maybe just the answer okay. So what if we treated our cybersecurity and our network systems?
What if we treated this like a living ecosystem? Okay. What if we had a system that could continuously monitor the health? Okay. Some of these systems we kind of have. Right. But what if we had a continuous system that would always be monitoring the health of our network ecosystem? It always knew about healing, what things could heal the problems,
what's going to harm us, and it protected itself. It balanced out, because that's one thing that you find out, one of the things that I found out when creating this. I call it Pi valve, by the way. One thing I'm probably going to mention several times, just as a reminder: all this research, all the code, everything you're seeing here,
I'm about to release in a white paper and a GitHub repository. It's all open source. You can follow along, you can contribute. But only about a handful, maybe ten people, know about this research that I'm doing. So congratulations. You're now among the elite. This is kind of my first announcement of all this, but.
So one of the things that I ended up noticing was it eventually finds an equilibrium. And that was kind of groundbreaking for me. So if I have that, that's the perfect system. The perfect setting is: through that evolutionary simulation that I did, what settings did I have to have, what conditions did I have to have, what needed to be there for it to find an equilibrium?
And that was kind of the challenge. It became a game for me. I was like, oh, how can I get this to where it's in perfect balance and the simulation never ends? Because the simulation will end if the predators, the red dots or the red team, take over all of the food, take over all the prey, and kill everything.
Then the simulation ends. Or, now, this would be a perfect world in cybersecurity, if the non-predators found a way to eliminate all of the red team, right, the predators. That's nice. But in the real cybersecurity world, that's never going to happen, okay? You're never going to eliminate all the threats. So the best you can hope for is an equilibrium.
Right. So every time that they find a new exploit, a new way in, what if you had the ability to adapt to that and counter it? Okay. Right. Or maybe stay ahead. And you'll see in that simulation too, you'll notice that sometimes the red dots go way down and the prey takes over, and then it goes back.
The red dots kind of come back. And it really is like looking at a cybersecurity landscape. And so that got me thinking. This was my moment. What if I could create a system that could simulate that living ecosystem and automatically monitor systems and create an equilibrium, always able to counter the threats, know what threats are out there, know how to heal itself?
And so it's kind of like, why did I want to put my energy into this? Because those of you that know me know that I just have so much time on my hands. Right. So, why put my time and energy into this? And so first and foremost, you know, I've been dealing with risk analysis and risk assessment for, you know, more than two decades, almost three decades now, more than three decades.
Who knows? I'm too old to remember, but we can't keep pace with the amount of dynamic threats because our risk assessment methods are too static. Right? We have too many, you know, how many times do you do a vulnerability assessment per year? You know how many times you do a risk assessment, even if you do it 50 times a year, each one of those is a static snapshot in time.
Okay. And so how many of you use all the data that you have available to you for your risk analysis? How many of you are using all the threat data out there, all the scan data, all the risk assessments, all of the threat information and threat intelligence out there, all the information in your asset database, how many of you are actually taking the vulnerability assessments and putting it into your asset database?
How many of you have a comprehensive database that correlates everything? Few, if any. There's one person, if that, in here, and I'm going to bet none, that has all that information correlated. And when I say correlated I mean, you know, okay, I've got this threat, and this threat uses these TTPs to affect this vulnerability.
And this particular vulnerability is attached to these particular assets because I haven't patched them yet. And because of that, I now know that I have a direct attack path because there's that same correlation 3 or 4 times down the chain on the network to get from the outside to the inside. Nobody knows that because we don't correlate data that way.
Okay. And so that's what I set out to create, using bio-evolution algorithms as an example. Okay. And the third one there, "the defenders think in graphs": that's a mistake. I mistyped that. It was two in the morning and I was probably in a fight with my wife and kids or something, I don't know. But either way. So, I mean, think multidimensional, okay.
Attackers, red teamers, they think multidimensionally. As defenders, we all too often are saying, okay, here's my list of vulnerabilities. Let me go to this list and let me see, based on the CVSS score, how am I going to prioritize these? Or maybe you have some clever metric to be able to prioritize your mitigations for those vulnerabilities, but it's still a linear process.
Okay. You're not taking into account this multidimensional correlative database of information to give you a true probability of the most likely threats and what the criticality of those would actually be. Okay. Whereas if you're a red teamer or if you're pentesting, you know, you're always taking into account a lot of information in terms of the network attack path, the number of vulnerabilities, whether I have to do privilege escalation to do this.
And so it's a different way of thinking. And then the last one is a problem we all face, right. Security control spending is rarely optimized for risk reduction. Right. And so that's the age-old "we have a budget problem." Or, you know what, it's not really a budget problem. People would not have as much of a problem with their cybersecurity budget if they actually allocated the budget that they did have properly.
And what I mean by that is, how many of you are blindly spending budget on controls that may be just casting a safety net, or that may or may not be targeted, or you have overlapping controls because these are best practices and this is what the industry says you have to have? Right. So how do you know how to truly target your cybersecurity budget to where you need it the most, the biggest bang for your buck, right?
So that's a problem. Okay. So see, I'm Marty Marty, hold on I hit the wrong button.
And. Wait. Okay. That's the one I was on before. Right? Okay. I don't actually have the same slides. So there's the problem right there with not having a sync. My version here is different from that version there. So I'll have to go with it. All right. So this is where Project Darwin comes in. So people are already asking, what does Darwin stand for?
Looks like an acronym. I have no idea. This is what I did. I said, Darwin sounds like a perfect code name for a project that deals with bio-evolution and cybersecurity. And then I said, ChatGPT, hey, what can this acronym be? And that's the crappy thing it came up with. I have no idea. So if you have some good ideas, I'm all for it.
It's just Project Darwin. All right? So, all right, I think maybe. Yeah, that's where I'm at. Okay. I'm caught up. All right. So, disclaimer: of all the things we're going to talk about here, I'm not complete. I'm not finished. This is a project. It's in progress and it's maybe about half done. It's probably buggy.
Yeah, no pun intended. So but I'm showing you what I have here kind of talk about, because my goal is to, like, look, if I put something out there, open source, and you can use it. Awesome. If you can help. Awesomer. If you can, like, just take these ideas and create something on your own. Awesomest. Right.
So this is just here for whatever you want to use it for. All right. So in short, this is basically what Project Darwin is. So we use a graph-based list of assets. Right. So we have to start with assets, and we do this in a graph database as opposed to a relational database. Okay. What's a graph database as opposed to a relational database, you might ask.
Who knows about graph databases? Well, two people that will at least admit it and want to raise their hand. Okay. So a graph database is, so, I'm pretty sure we all know what a relational database is, right? Or even a document-based database, where you see the JSON. It's relatively structured as opposed to unstructured.
But a traditional database is rows and columns. And you may have a one-to-one or one-to-many relationship, and then they can be correlated. So what a graph database is, and I actually meant to have, where are you, where were you. All right, I meant to have an actual fleshed-out one here, and I forgot to add that slide.
And I realized it when I got here this morning. So picture, if you will, lots of nodes all over the place with connecting lines, kind of like the one over there on the left, but like thousands of them. And this can really scale hundreds of thousands of these things and connecting lines. But what are those connecting lines?
Those connecting lines will say things like, if we think about this in terms of cybersecurity, right: so I have all these asset nodes, and over here we have MITRE TTPs. And then we have threats. And so you'll have threat lines connecting to maybe a CVE. It says this threat uses this TTP.
This threat uses this vulnerability. This vulnerability is used by this tactic or technique. And then we have mitigations: this mitigates that vulnerability or that tactic or technique, or secures it. And then we have: this asset has this vulnerability, this asset has this mitigation. So if you can picture that, it's a matrix of nodes and topics. Would it be Neo4j?
Yeah yeah yeah yeah. So yeah, look, there's standing room only. I appreciate you not interrupting me while we're talking here. So, you know, I'm just kidding. All right. No, actually, I don't mind if you do ask questions. There are so few of us here that if you have a question, just stop me right in the middle of it. I don't care.
But I think maybe, though, they might not appreciate that, because they might want to hear the questions. So for the recording, the question was asking whether the graph database is Neo4j. Okay. Neo4j is one of the most widely used graph databases. And there's a few others out there, but yeah. All right. So that's kind of what graph databases are.
It helps to create a matrix of relationships of, in this case they're called nodes. And it is many-to-many, okay. It's almost always many-to-many. So this helps you visualize the correlations from end to end of a threat, using a vulnerability, using a tactic or technique that is mitigated by this.
And you can trace it all the way back to an asset. That's a very important concept when we're talking about Project Darwin, because that is the crux of how we're simulating the next part here. And where are we at? I gotta sync up here. Oh yeah. All right. So then the next part of that is our algorithms.
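The relationships just described (threat uses technique, technique uses vulnerability, asset has vulnerability, mitigation mitigates vulnerability, asset has mitigation) can be sketched without a graph database at all. The snippet below is only an illustration, assuming a plain list of typed edges in Python; the node names, the relationship labels such as `EXPLOITS` and `HAS_VULN`, and the helper functions are all made up and are not Project Darwin's schema. It traces from a threat all the way back to the assets that are still exposed.

```python
from collections import defaultdict

# (source, relationship, target) triples. Every identifier is hypothetical.
EDGES = [
    ("threat:ExampleGroup", "USES", "technique:T-EXAMPLE-1"),
    ("technique:T-EXAMPLE-1", "EXPLOITS", "vuln:CVE-EXAMPLE-0001"),
    ("technique:T-EXAMPLE-1", "EXPLOITS", "vuln:CVE-EXAMPLE-0002"),
    ("asset:web-server", "HAS_VULN", "vuln:CVE-EXAMPLE-0001"),
    ("asset:hmi-01", "HAS_VULN", "vuln:CVE-EXAMPLE-0002"),
    ("mitigation:patch-0002", "MITIGATES", "vuln:CVE-EXAMPLE-0002"),
    ("asset:hmi-01", "HAS_MITIGATION", "mitigation:patch-0002"),
]


def build_index(edges):
    """Index edges in both directions so they can be followed either way."""
    outgoing, incoming = defaultdict(set), defaultdict(set)
    for src, rel, dst in edges:
        outgoing[(src, rel)].add(dst)
        incoming[(dst, rel)].add(src)
    return outgoing, incoming


def exposed_assets(threat, edges):
    """Follow threat -> technique -> vulnerability -> asset, skipping assets
    that already have a mitigation deployed for that vulnerability."""
    outgoing, incoming = build_index(edges)
    findings = []
    for technique in sorted(outgoing[(threat, "USES")]):
        for vuln in sorted(outgoing[(technique, "EXPLOITS")]):
            fixes = incoming[(vuln, "MITIGATES")]
            for asset in sorted(incoming[(vuln, "HAS_VULN")]):
                deployed = outgoing[(asset, "HAS_MITIGATION")]
                if not (deployed & fixes):
                    findings.append((threat, technique, vuln, asset))
    return findings


if __name__ == "__main__":
    for chain in exposed_assets("threat:ExampleGroup", EDGES):
        print(" -> ".join(chain))
```

In a real graph database the same question would be a path query over labeled relationships; the dictionaries here only stand in for that so the many-to-many idea is visible in a few lines.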
And so I'll talk about that a little bit more as we go. But we're using three main algorithms. The fourth one doesn't count. So we're using an algorithm that is called an artificial immune system. We're using an algorithm called ant colony optimization. And we're using an algorithm called, or just, genetic algorithm. So, genetic algorithms.
Anyway, so, the SOM, that's a mistake. I took that one out. And I'll talk about these in detail here in just a minute. So you have your graph database of all your information. Then you have your algorithms that make use of that. And then we have our data infusion. So in addition to our asset data and telemetry and all that, we're also gaining other external intelligence, like KEV data, or Common, or Known, Known Exploited Vulnerabilities.
"Common" doesn't start with a K. Known Exploited Vulnerabilities, MITRE ATT&CK, MITRE D3FEND, and I guess now ATLAS, if we're talking about AI, data from the National Vulnerability Database and all that. And then we're putting it all together, and we're finding all the solutions for all of that. So, I guess what most would call the STIG
data. I forgot what STIG stands for. It's the technical... no. STIG is all your mitigations, your fixes, your recommended procedures to fix things. I forget what it stands for, but. So this is just kind of how we're using all those sources. We're using our KEV data just so we have a good idea of the vulnerabilities, so we can help determine exploitability.
MITRE ATT&CK is going to help us determine our chain of techniques and tactics. MITRE D3FEND helps us correlate mitigations and controls. The NVD, the National Vulnerability Database, is going to help us get all of our baseline vulnerability information, baseline exploitability, and all the metrics there. And then, of course, the internal telemetry and assets is going to be your asset database, your SIEM, all your internal stuff.
So this is all the data sources we're getting. We're pumping all that data into the graph database and building correlations. Some of this stuff is really easy to build correlations for. By the way, I don't even know what time I'm supposed to stop here. Is it quarter till? What time am I supposed to stop? Okay, okay. There you are.
Okay. All right. So you have this massive graph database with all the correlations, and some of the stuff is really easy to build correlations for because it comes in JSON format or something like that. Like a lot of your MITRE data already has a lot of correlations built into the JSON format.
So that's really easy. If you don't have correlative information on this, that's where we will use generative AI. We'll use generative AI to help build correlations, and we'll put you in the loop to help correlate that. But that's where one of the hang-ups was for me in the long-ago past, when I was trying to do all this stuff: it's just too difficult to correlate all this information.
And I figured, you know, that's probably why a lot of you in your companies don't build this correlative type of information, in your risk analysis, because it's so difficult to build those correlations. And so that's, that's one area where I'm, I'm in support of using AI and generative AI. So.
All right. And so this is kind of the the basic general architecture is like so you have the first thing that happens is you have all of your external threat information vulnerability data that goes into and I call this a it's really important here. It's an asset data model. Okay. People call this a digital twin. That's not a digital twin.
If if it doesn't actually simulate or emulate and it's not functional and I can't get in there and actually it doesn't actually do anything. It's not a digital twin. It's just when you take all of your data, all your vulnerabilities, everything you have, that's just an asset data model. Put that on the graph. Right. And so and then we feed that into our algorithms.
And then what we're hoping to get out of this: at the end of this entire project, we would like a system that has all these algorithms, that makes use of all that data, that gives you dynamic risk scores, like a living risk score, always evaluating, always working. Giving you attack paths that are feasible, so that we replace that stupid metric of likelihood in the risk calculation that we all learned from the CISSP, because it's an arbitrary, BS, made-up number.
So we replace that with actual data that says, look, we know these attack paths and this exploitability are feasible and highly likely because of the data. There's your likelihood metric. And then, hopefully, you get an optimal control or mitigation portfolio that will help us get that big bang for our buck.
So ultimately, what we're looking for is: we want to know how much controls cost. We want to know what impact each one is helping us prevent. And then a dashboard to display all this. Right. So how this works is we have... Yes. Okay. Up there I have asset data model; here I have digital twin.
So I gotta make sure they're good. Anyway. So like I said before, it all starts with your graph. Right. Your graph database with all that information. Okay. This correlated information, the graph database, is going to enable the next piece of that, having all those connections made. This is where ant colony optimization comes in.
This is like the coolest thing ever. Ant colony optimization is a swarm simulation. And by the way, none of this is AI. You can use this in neural nets and AI, but this is just a Python script. And so ant colony optimization is where all of these little swarms go out and they find a path looking for a goal based off of whatever criteria you feed it.
And it literally leaves breadcrumbs, or pheromone trails, for the next ants. And they search through and find all the main paths. And so what we're doing is we're taking all that information in the graph database. And this is why, you know, it's safe for all environments, because we're not actually unleashing anything on your network. This is literally taking data.
We can take copies of your data. We can look at your data. And when I say we, I mean us, like, this isn't a company thing or anything like that. We could look at your data and traverse through there logically. Okay. So it's not like attacking your network. And so you can give it whatever criteria you want, the desirability.
And there's a formula for that. I got a formula for that. But there's desirability. There's exploitability. There's, you know, these are the crown jewels, this is the value that I get out of this. So you get all this, and the little ant swarms will go through there looking for the juiciest morsels of assets out there, based on criteria.
And you'll run hundreds, thousands of iterations of this, and they'll start to find the most likely path in the most critical paths of the most critical data based on whatever you have in your database. So data is key here.
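The ant behavior described above can be sketched in a few lines of Python. This is a simplified version of the usual Ant System rule, in which an ant chooses its next hop with probability proportional to `pheromone ** alpha * desirability ** beta`, applied to a made-up attack graph. The graph, the exploitability numbers, the parameter defaults, and the idea of reporting each path's share of successful walks as a data-derived likelihood are all assumptions for illustration, not the project's actual code or formula.

```python
import random

# Hypothetical attack graph: node -> {next_node: exploitability in (0, 1]}.
GRAPH = {
    "internet":   {"web-server": 0.8, "vpn": 0.2},
    "web-server": {"app-server": 0.7, "jump-host": 0.3},
    "vpn":        {"jump-host": 0.5},
    "app-server": {"historian": 0.6},
    "jump-host":  {"historian": 0.4},
    "historian":  {},                      # the "crown jewels" in this example
}


def walk(graph, pheromone, start, goal, alpha, beta, rng):
    """One ant walks from start toward goal; returns its path or None."""
    path, node = [start], start
    while node != goal:
        options = [n for n in graph[node] if n not in path]
        if not options:
            return None                    # dead end, this ant gives up
        weights = [pheromone[(node, n)] ** alpha * graph[node][n] ** beta
                   for n in options]
        node = rng.choices(options, weights=weights)[0]
        path.append(node)
    return path


def aco(graph, start, goal, ants=20, iterations=50,
        alpha=1.0, beta=2.0, evaporation=0.3, seed=0):
    """Return (share_of_successful_walks, path) pairs, most travelled first."""
    rng = random.Random(seed)
    pheromone = {(a, b): 1.0 for a in graph for b in graph[a]}
    counts = {}
    for _ in range(iterations):
        walks = (walk(graph, pheromone, start, goal, alpha, beta, rng)
                 for _ in range(ants))
        paths = [p for p in walks if p]
        for edge in pheromone:             # old trails fade
            pheromone[edge] = max(1e-6, pheromone[edge] * (1 - evaporation))
        for path in paths:                 # successful ants reinforce their trail
            hops = list(zip(path, path[1:]))
            score = 1.0
            for a, b in hops:
                score *= graph[a][b]       # easier paths leave more pheromone
            for hop in hops:
                pheromone[hop] += score
            counts[tuple(path)] = counts.get(tuple(path), 0) + 1
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(((c / total, list(p)) for p, c in counts.items()), reverse=True)


if __name__ == "__main__":
    for share, path in aco(GRAPH, "internet", "historian"):
        print(f"{share:.2%}  " + " -> ".join(path))
```

Nothing here touches a network: the ants only traverse data, which is the point made above about running this safely against a copy of the asset graph.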
Then you have your artificial immune system. Your artificial immune system is another set of algorithms that is designed to take in those known critical paths, validated and verified critical paths. And it's also going to take into account all of the general risk, the general health. So think of it like an immune system: our environment is the body.
And if it's running optimally, it's healthy. Then you have these attack paths. Those are weaknesses. Those are vulnerabilities. Those are the states or the conditions that can potentially harm you. So for example, you know, if I'm not eating properly and I have a bad diet, then that is a vulnerability.
That's an attack path. My immune system is compromised. Right. And so these are things that the artificial immune system takes care of in the same sense of the word, in that it's looking for those weaknesses, those conditions, those attack paths, vulnerabilities. But then it also knows about what are the genes that can help, what are the things that can help heal.
So that would be white blood cells in the biological system, right? Or different dominant traits that protect against certain predators in the environmental ecosystem. In your network, it's going to know: well, do you have monitoring set up? Do you have your firewall? What are your firewall rules? And it knows about all of this. So it can start to make recommendations. Your immune system is going to say, we need to deploy this in this area because we're weak here.
And then you have your genetic algorithms. This is your strategic plan. This is the plan that's going to take all that information from the ACO and from the AIS, and now it's going to say, well, we know, based off of our immune system and the overall health of our system, and think of it like this is the genes, right?
So we know what the critical points of our system are. We know what the valuable parts are. We also know what the threats out there are. So it's taking all this information and evaluating it, just like biology. And it's trying to figure out what systems do I need to put in place in order to protect the system.
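One way to picture the genetic algorithm's job is as a search for the set of controls that removes the most attack-path risk within a fixed budget. The sketch below is a generic textbook GA (bit-string genomes, tournament selection, one-point crossover, bit-flip mutation, elitism) in Python. The control names, costs, path risks, budget, and the small cost penalty in the fitness function are invented for illustration and are not taken from Project Darwin.

```python
import random

# (control name, cost, attack paths it blocks). All values are hypothetical.
CONTROLS = [
    ("waf", 30, {"p1"}),
    ("network-segmentation", 50, {"p1", "p2", "p3"}),
    ("mfa-on-vpn", 20, {"p3"}),
    ("patch-historian", 25, {"p1", "p2"}),
    ("edr", 40, {"p2"}),
]
PATH_RISK = {"p1": 0.336, "p2": 0.096, "p3": 0.04}   # e.g. from path analysis
BUDGET = 60


def fitness(genome):
    """Risk removed by the selected controls, lightly penalized by cost."""
    chosen = [c for c, bit in zip(CONTROLS, genome) if bit]
    cost = sum(c[1] for c in chosen)
    if cost > BUDGET:
        return -1.0                        # over-budget portfolios are rejected
    covered = set()
    for _, _, paths in chosen:
        covered |= paths
    reduced = sum(PATH_RISK[p] for p in sorted(covered))
    return reduced - 0.001 * cost          # prefer the cheaper of equal portfolios


def evolve(pop_size=30, generations=60, mutation=0.1, seed=0):
    """Evolve bit strings where bit i means 'deploy CONTROLS[i]'."""
    rng = random.Random(seed)
    n = len(CONTROLS)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = [pop[0][:]]                                  # elitism: keep the best
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)       # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n - 1)                    # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation else bit
                     for bit in child]                     # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)


def selected(genome):
    """Names of the controls switched on in a genome."""
    return [c[0] for c, bit in zip(CONTROLS, genome) if bit]


if __name__ == "__main__":
    best = evolve()
    print(selected(best), round(fitness(best), 3))
```

With only five controls a brute-force search would do; the GA shape matters when there are hundreds of candidate controls and the path risks keep changing, which is the continuously re-evaluated situation the talk describes next.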
This is where the artificial immune system and the genetic algorithms work in concert. Whenever your systems are changing, constantly changing, it's constantly evaluating and constantly trying to make adjustments. And then over time, this is where you do get into artificial intelligence or machine learning. Not generative AI, not yet, but actual classical artificial intelligence and machine learning.
It's where over time it will start to learn patterns, and it will start to understand that whenever we have these conditions, that's when we are most likely and most usually have a breach or an attack or an outage. Whenever the world conditions are in a certain way, that's when we know these things happen. And so you can start to get recommendations on when you need to build this, when you need to secure this, okay.
So don't think of this system as, you know, this isn't going to be one of those things where it will automatically shut down firewall rules, or automatically do this. It's all about real-time information, real-time intelligence, and a way to always be giving you real-time data and recommendations based on your exact system immediately in that moment.
Not just a snapshot in time, but constantly. So as long as you're feeding it data, as long as it's getting data, it's always taking this into account.
So it's kind of reiteration here. So the AIS is constantly evaluating and monitoring this. That's the artificial immune system, constantly evaluating and monitoring. And it will update the immune system. Basically, it notifies the ACO, the ant colony optimization, on what the vulnerabilities are, because the ant colony optimization has to know about the vulnerabilities.
So those two communicate, and then you have all of those paths inform the genetic algorithm about the overall fitness of the environment and all the solutions that are out there, all the threats that are out there. And then, sorry, that third bullet, I took that out, took the SOM out of that. And so.
Okay. So where are we at in the current state?
Right now we have, I don't have it here. So we have the artificial immune system partially implemented. We have the graph database working, as in we can take in any amount of data. We have a generic generative AI system that can help build correlations. And we have the code that can take anything that already has JSON to build this artificial,
I'm sorry, to build this graph database. Okay. We have the artificial immune system in pre-proof-of-concept, like, alpha mode, where it can take all that information and start to give you a dashboard sense of, you know, systems normal, or we have weaknesses here, and this kind of thing. Think of it like Pascal's favorite, Security Onion.
Think of it like Security Onion. But.
And I guess with Security Onion, the difference would be, instead of using, like, Elasticsearch, instead of using a lot of unstructured data where we have to build transforms, think of it like Security Onion built on top of a graph database. So there's a lot more correlations that you can do naturally and on its own without having to always build manual transforms
and stuff like that. So it's like making Security Onion more flexible. And then the other part that's built is the ant colony optimization. So the graph database and the first parts of the artificial immune system were necessary in order to get the ant colony optimization working. So I have little ants that run around logically through all this information and start to tell you things.
So.
Where are we at here? Okay, I know what this is. Shut up.
So I have no clue. Well, I actually do know what it is in general, but I have no clue how to solve it. What this is, when it comes to ant colony optimization, is very common. It's kind of like the base algorithm or the base formula. But here's what's cool. I found this out by accident and it happened to work.
So I was messing with somebody, a colleague. I had this on my phone, and I was like, I'm going to mess with somebody. So I took out a napkin. We were at a happy hour and I took out a napkin. I encourage you to do it. Take a picture of that and just try it.
Just try it out, you know, impress your friends at happy hours and parties. So I wrote this on a napkin, and I'm sitting there writing, and my colleague looks over: what are you doing? What? Hold on. I'm acting like I'm writing. I already had it written out. I'm just writing, like, dude, what do you.
And they asked, what's that? And I just slid it across the table: I figured it out. They looked at this algorithm handwritten on a napkin, and it's like, what is this? I figured out how to actually calculate likelihood in the risk formula. This is actually how you figure out likelihood in risk analysis and exploit and exposure management.
This is it. This is going to protect everyone. And so we kind of lost it. It's like, what is that? So what this really is. But you should do that. It's the funniest thing ever at happy hours. Just write it down and be like, dude, I figured it out. Just the fact that you wrote that equation on a napkin,
they're just going to look at you and be like, how did you come up with this? All right. So the reason I give this to you is because probably nobody in here knows how to figure this out and what it means. I didn't, but I used generative AI to help me figure this out.
I used generative AI to help me figure out this entire premise, this entire concept of how to take these generic algorithms, the genetic algorithms and all this stuff, and equate it. I needed help with that. I'm not that smart. I needed help trying to figure all this out. But it was interesting when I finally figured out what this is, okay.
And I verified everything. I put me, the almost-human, in the loop. And so I verified all this. And what I found out was that this is just an optimization formula. It has nothing to do with cybersecurity. But you see that eta, and then the i and the j, and the beta up at the top right.
What I did was figure out that if we solve for that using cybersecurity metrics, cybersecurity categories, then that makes that entire equation, the entire algorithm for the ant colony optimization... You didn't. Where were you? Thank you.
That was the fifth time, by the way. That was the fifth time. So that's the last one. So now that we've solved for putting the cybersecurity categories into that, that makes the ant colony care. So that's why this is relevant: this is where I'm at in the phase of things. I'm at the ant colony optimization phase.
This was like a cool breakthrough, in that we figured out that if we took the generic ant colony optimization algorithm and put security metrics into it, it makes all the ants actually care about the cybersecurity metrics. That helps them determine why they should go to the next thing. And that's what the first algorithm is.
The first algorithm is the probability, the reason why the probability exists: how likely it is that an ant would go to the next node, and which node it chooses. And this is what makes it choose based on cybersecurity metrics. All that data is in your graph database, and the code that we have helps put it all together.
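For readers following along without the slide: the standard ant colony optimization transition rule gives the probability of moving from node i to node j as tau_ij^alpha * eta_ij^beta, divided by the sum of that same product over every allowed next node. Tau is the pheromone on the edge and eta is the heuristic desirability of the move. The sketch below shows one way security data could feed eta. The `heuristic` function, the `exposure` field, the placeholder CVE labels, and the alpha and beta values are assumptions for illustration, not the formulation from the talk.

```python
# Hypothetical node attributes, as they might come out of the graph database.
NODES = {
    "eng-01":  {"cves": [], "exposure": 0.0},
    "hist-01": {"cves": ["CVE-EXAMPLE-2", "CVE-EXAMPLE-3"], "exposure": 0.0},
}


def heuristic(attrs):
    """Assumed desirability (eta): more known CVEs and more exposure attract ants."""
    return 1.0 + len(attrs["cves"]) + attrs.get("exposure", 0.0)


def transition_probabilities(current, neighbors, nodes, pheromone,
                             alpha=1.0, beta=2.0):
    """Standard ACO rule: p(i->j) is proportional to tau_ij^alpha * eta_ij^beta."""
    weights = {}
    for nxt in neighbors:
        tau = pheromone.get((current, nxt), 1.0)  # unvisited edges start at 1.0
        eta = heuristic(nodes[nxt])
        weights[nxt] = (tau ** alpha) * (eta ** beta)
    total = sum(weights.values())
    return {node: weight / total for node, weight in weights.items()}


probs = transition_probabilities("ws-01", ["eng-01", "hist-01"], NODES, pheromone={})
print(probs)
```

With equal pheromone on both edges, the node carrying two CVEs gets a weight of 9 against 1, so an ant picks it about nine times out of ten. That is the sense in which the ants "care" about the security metrics.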
All right. So this is a very quick video. It's going to look like a Commodore 64 type of thing. This is the last slide, sort of. And it goes really fast. But this is it working with some generic data and stuff.
So it's going really fast. What this is, is all the ants going through the system really fast, creating attack paths and finding vulnerabilities. And in the background, each of those little nodes has all that information about it: it has these vulnerabilities and all this. And so it's using all that information
in the graph database that's tied to those nodes to make determinations and decisions on what node it should go to next. And it's finding the attack paths. So that was really fast. Just a really quick thing. But then what you see here is we start to get further data. We start to see, well, this is how many iterations we ran.
These are the CVEs that it found. These are the most-used CVEs. This is the MITRE ATT&CK that it used. These are the nodes that it found. And so you can start to track all the data that every ant goes through, and you start to get this information. Now, all of a sudden, hopefully you can start to see the value here.
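The loop described here (ants walking the graph, reinforcing the paths they complete, and tallying what they touched) can be sketched roughly as follows. This is a toy version under stated assumptions: the four-node graph, the placeholder CVE labels, the evaporation rate, the deposit rule, and all function names are hypothetical and are not taken from the speaker's repository.

```python
import random
from collections import Counter

# Hypothetical toy network: node attributes and directed edges.
NODES = {
    "ws-01":   {"cves": ["CVE-EXAMPLE-1"]},
    "hist-01": {"cves": ["CVE-EXAMPLE-2", "CVE-EXAMPLE-3"]},
    "eng-01":  {"cves": []},
    "plc-01":  {"cves": ["CVE-EXAMPLE-4"]},
}
EDGES = {
    "ws-01": ["hist-01", "eng-01"],
    "hist-01": ["plc-01"],
    "eng-01": ["plc-01"],
    "plc-01": [],
}


def choose_next(current, pheromone, rng, alpha=1.0, beta=2.0):
    """Pick the next node with probability proportional to tau^alpha * eta^beta."""
    options = EDGES[current]
    if not options:
        return None
    weights = [
        (pheromone[(current, n)] ** alpha) * ((1 + len(NODES[n]["cves"])) ** beta)
        for n in options
    ]
    return rng.choices(options, weights=weights, k=1)[0]


def run_colony(start, target, iterations=50, ants=10, evaporation=0.1, seed=0):
    """Run the colony and return (path counts, CVE counts) over completed walks."""
    rng = random.Random(seed)
    pheromone = {(a, b): 1.0 for a, outs in EDGES.items() for b in outs}
    path_hits, cve_hits = Counter(), Counter()
    for _ in range(iterations):
        finished = []
        for _ in range(ants):
            path = [start]
            while path[-1] != target:
                nxt = choose_next(path[-1], pheromone, rng)
                if nxt is None or nxt in path:  # dead end or loop: abandon walk
                    break
                path.append(nxt)
            if path[-1] == target:
                finished.append(path)
                path_hits[tuple(path)] += 1
                for node in path:
                    cve_hits.update(NODES[node]["cves"])
        for edge in pheromone:  # evaporation on every edge
            pheromone[edge] *= 1 - evaporation
        for path in finished:  # deposit along each completed path
            for edge in zip(path, path[1:]):
                pheromone[edge] += 1.0 / len(path)
    return path_hits, cve_hits


path_hits, cve_hits = run_colony("ws-01", "plc-01")
print(path_hits.most_common(1))
print(cve_hits.most_common(3))
```

The two counters correspond loosely to the summary screen in the video: which paths the ants converge on, and which CVEs show up most often along the way.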
So if you have ant colony optimization algorithms going through all of your data, and it's all correlated, now all of a sudden you can start to see where your most likely attack paths are, and you start to chain everything together. Okay. None of that's AI. It's all Python script. All right. So what are my next steps? I still need to iterate and refine the ant colony optimization code and models,
so just clean up some bugs and stuff like that. I need to fully implement the artificial immune system to get all that working together properly. And what I don't have in yet is the genetic algorithm phase, which is the long-term strategic thinking and putting it all together. And then clean dashboards and stuff like that.
So a key takeaway is, if you're going to use this on your own, try to utilize graph databases. Think in graphs, and correlate all the information using graphs. This may seem complex to a lot of you, but beyond what we're doing here, using graph databases gives you so much power over your data in many instances.
And using these bio-inspired engines and algorithms allows us to mirror the benefits of actual natural selection and immune systems in a diverse living ecosystem. And if we create our cyber ecosystems that way and think of it that way, it's kind of a different way of thinking.
Right? So if we think of it like a living ecosystem and stop treating everything like a one-off or a once-a-year check-the-box, we might actually get down the road of doing something real with cyber as opposed to just checking the box. So I guess that's that. You know, this is just a list of future possibilities, things we can do with this, right?
I was just brainstorming: what all can we do with this? How far can we go? And then finally, look, if you're interested in this, just follow me on LinkedIn. Probably next week, the white paper for all this is coming out and I'll be releasing the GitHub repository. So just look for that on LinkedIn.
If a couple of weeks have gone by and you didn't see it, maybe you missed it or you didn't get my notification or whatever, just reach out to me on LinkedIn, email me, whatever. I forgot to put my email address here. It's pretty easy to find. It's out there, it's just not here. So yeah, you can email me or whatever. And if nothing else, if you just want to nerd out and geek out on algorithms and AI and code and all this stuff, I'm happy to do that
too. Again, this is a side project. I'm going to open source it all. So I think that's it. Do I have any questions? I think that's pretty much it. Like I said, it's just kind of a passion project, and I'm happy to work with any of you on it. We have a question here. Yeah. If you were to use this in an actual system, what kind of computational power?
I have no clue. I don't know yet. What I can tell you is, I'm running a system where I put in 50,000 nodes. This is done on this laptop right here, which is an M3 Max with 32 gigs of RAM. And it took a couple of hours to get all, no,
I take that back. To get the correlations done for 50,000 nodes and get everything built, it took like 24 to 48 hours. On faster machines it's going to happen faster. But once everything's built, it takes hardly any computational power to run it all. It's literally just a Python script, and it's just running through and iterating.
And I'm not doing any AI or anything like that. So the computational power is negligible once you get everything built. The only AI that I've used in this entire thing so far is the generative AI, to build the correlations in the graph database that weren't automatically done through JSON.
Questions? No, no, it did not. No, this is just the stuff where I stay up late at night and think of weird crap. That's it. And with that. Thank you, Clint. You can take more questions offline if you want to. What? You can take more questions offline if you want to.
Yeah. Yeah, I'll stick around. Except for you, Pascal, you get out of here. Yeah. No, I got answers for you. All right.