MSPs spend a lot of time and energy trying to align with one of the many cybersecurity frameworks out there to improve their security posture. Whether you do it to meet regulatory requirements or just to improve your business operations, how do you know when you are failing or succeeding? I sit down with Jim Harryman of Kinetic Technology Group to discuss how evidence comes into play. Policy, process, and people are key to collecting the appropriate evidence, and Jim and I talk through how to make that part of your cultural habits.
[00:00:00] Welcome to MSP 1337. I'm your host Chris Johnson, a show dedicated to cybersecurity challenges,
[00:00:14] solutions, a journey together, not alone. Welcome everybody to this episode of MSP 1337.
[00:00:26] It is coming out on time this week. Yay. So Jim, with Kinetic Technology Group, welcome to the show.
[00:00:34] Jim Harryman: Good to be here.
[00:00:36] Chris Johnson: I think pretty much everybody knows at this point that you're on
[00:00:41] like a monthly rotation to get at least one episode in. We're going to have to start calling
[00:00:44] this the Chris and Jim fireside chat. But part of the reason I've targeted you to be on
[00:00:52] the show is you've been heavily involved in the Trustmark and pursuing the CIS Controls. And
[00:01:00] when I look back on a lot of the episodes, they've often, I don't want to say that we're attached to,
[00:01:07] but they've followed kind of the pattern of like a CIS control or a safeguard that's come up.
[00:01:11] I mean, we've even had an episode, I don't know that you were on it, but we talked just about
[00:01:15] MFA and what it means and why it's important to talk about that. What does it mean to have
[00:01:20] centralized management of accounts? And so as you're getting ready for your assessment or
[00:01:26] going through the assessment for the trust mark and one of the things that comes up
[00:01:30] for everybody that's going through it is, how much evidence do I need? What kind of
[00:01:34] evidence are they looking for? How specific? How vague? I mean, it just goes on and on and on.
[00:01:40] And I don't want to say that it's exhausting. The questions, I think it's actually quite
[00:01:44] the opposite. It's actually refreshing to know that for everybody going through it,
[00:01:51] whether it's the Trustmark or the CIS framework or any framework for that matter,
[00:01:55] it's not easy. And there are things that you are going to find that you don't know how to
[00:02:02] collect the evidence for or you're not sure what evidence would look like. But I wanted to talk
[00:02:07] about it more from the perspective of, yes, we can collect evidence, but why are we doing that?
[00:02:14] What's important? Forget the Trustmark for a second, and getting a badge. But Jim,
[00:02:21] walk me through what got you on the bandwagon of doing any framework or following any rules that
[00:02:28] said I want to have somebody assess whether or not we're doing this right. Well, I can tell you
[00:02:34] from my side of things that it was a lack of discipline that I knew existed in my own
[00:02:42] routine. And I still see it in our security outcomes within our peer group,
[00:02:55] and in people saying, I don't have time. And honestly, I get sick
[00:03:04] of hearing that, because it's like, look, you've got time to do all these other things. And
[00:03:12] just the couple of hours just to go down and give yourself a reality check of what you are
[00:03:19] and aren't doing is not that much time. It's just to start, right? Just to start and say,
[00:03:24] yes, we're doing this, whether you can prove it or not. But at least in your mind,
[00:03:28] you're like, okay, we're doing now I need to go back and start doing these things. So
[00:03:32] for me, it was a lack of discipline and needing to be held accountable to something, right?
[00:03:40] And it started with a peer group member who wanted the same thing: to
[00:03:49] get together with a group of people outside of our peer group who were striving to
[00:03:54] enhance and build upon their maturity from a cybersecurity perspective. And then I kind of went
[00:04:04] off the grid a little bit because, you know, I'm like, well, this is cool, but I really feel like
[00:04:10] I need, I need to attach a dollar sign to this thing before it really kicks me in the rear.
[00:04:18] So we partnered up with a group that was doing SOC 2 assessments. And it wasn't really,
[00:04:30] their framework wasn't necessarily built on one specific standard, whether it's CIS or NIST or anything
[00:04:39] like that. It was just kind of a general thing. I mean, the SOC 2 assessment, you know,
[00:04:45] assesses you on certain areas, but within that, you had to meet certain safeguards as well from a
[00:04:52] technology perspective. So that was kind of the deal that got me rolling. It was really knowing
[00:04:59] myself: the lack of discipline, the time thing, and really just dedicating to it
[00:05:05] and realizing that if I didn't change, the likelihood of my company actually making it through,
[00:05:13] you know, the next 10 years or whatever was going to be pretty small.
[00:05:21] So that raises a question. So you've been doing this now for, well, you've been engaged in a process
[00:05:28] of aligning with a framework that goes back now, what, probably three or four years?
[00:05:33] If not... Yeah, we're starting kind of like our fifth year in the process. Yeah.
[00:05:39] So what... So I get the whole accountability part. I think that's kind of great for those listening
[00:05:44] like, hey, if you don't have accountability in place, you know, whether it's through a peer group
[00:05:48] or, you know, you committed the money to do a Trustmark or something along those lines,
[00:05:54] it starts to, you know, if dollars are attached to it, it tends to have a little bit
[00:05:58] more of a stickiness of I should probably get this done. There's a value proposition at least
[00:06:02] even if it's not a big one. Did you find that the company as a whole at the beginning
[00:06:08] were they really resistant to this and not wanting to get on board? Or was it kind of a
[00:06:13] collective like everybody's kind of recognizing like, hey, we've got to do something more than
[00:06:17] we are today? So initially I had to get some people that were leaders within my company
[00:06:27] on board. I had just hired a finance person and then I had my director of technology and service
[00:06:39] delivery as well at the time. Sure. And I did have to... I mean, initially I just had to tell
[00:06:46] them, flat out, point blank: I believe that if we don't take this seriously, that
[00:06:56] the longevity of this company as it stands is probably pretty short.
[00:07:07] And going back to, whatever they call it, the 80-20 rule, or whatever name they give that...
[00:07:14] The 80-20 principle? Right. The 80-20 principle, I wanted to be the 20%
[00:07:21] of people in our industry that were moving in that direction and kind of leading the charge.
[00:07:28] And so, I mean, we're a small company and we're 13 employees. When we started this, we were probably
[00:07:34] nine employees, right? Yeah. And so initially the selling it to those two people was pretty easy
[00:07:44] because they both were like, well, we love this company and we want to see it go beyond
[00:07:52] you someday when you retire or whatever. Well, no one ever retires. Come on, let's be honest.
[00:08:00] And so then fast forward. So here we are today. You're being assessed for the Trustmark.
[00:08:06] And along this way, did you see a shift? Because one of the things that kind of...
[00:08:10] I've been thinking about and one of the reasons why I thought it'd be good to talk to you
[00:08:14] knowing that we've had some evidence conversations around what you're working on for the Trustmark,
[00:08:18] but it got me thinking about things outside of, say, the scope of the Trustmark safeguards
[00:08:24] and thinking about things that are more... I don't want to say generic, but like, hey,
[00:08:28] I have a service ticket. I put my time in, but there's no notes. Like, where's my notes?
[00:08:33] Like, why did you not put notes in? Oh, I was busy. And it's like, okay, well,
[00:08:36] what's the ramifications? And so I guess I was drawing these parallels between
[00:08:42] if you're running an MSP and you're taking seriously the services that you deliver,
[00:08:46] both internally and to your clients. Why would this really be all that different? It's just a
[00:08:54] different ask, right? Like I think to some extent the reporting might look different. The type of
[00:09:00] information being tracked might be different. But as far as employees and the human element here,
[00:09:05] I would think it's the same. Like it's not... This shouldn't be anything new.
[00:09:12] Well, from my experience with the conversations that I've had with a number of other MSP owners,
[00:09:20] if you're struggling in the service delivery category, you will struggle in meeting a framework.
[00:09:28] And so, and this kind of... There's a maturity element here that has to be considered.
[00:09:34] Absolutely. So, I mean, 10 years ago is when I really realized that we were not that operationally
[00:09:44] mature. We had systems in place and we had things of that nature. But ultimately,
[00:09:52] I learned to follow the principle of if it's not written down, it's not a process.
[00:10:01] Right. That's a great way of saying it. So, I think to what you just said, if you're listening
[00:10:06] to this, I think a policy can be in your head. You can execute a process that's in your head
[00:10:12] and procedures in your head. But I would argue that anybody else in your organization,
[00:10:17] they can potentially understand the policy that's in your head. They are unlikely to follow
[00:10:22] the procedures the exact same way. It's very true. I mean, so from that, we learned a lot of things.
[00:10:32] I mean, initially it was a lot of hard work and a lot of micromanagement, which I don't like.
[00:10:40] Micromanaging from a get it done standpoint? Yeah, I think so. Just basically just following up
[00:10:47] and checking and making sure that things were being done, from entering time on tickets
[00:10:54] to actually doing the work. You can't close a ticket unless there's time entered and notes put in,
[00:11:00] and reviewing those things to make sure that they're being done properly.
[00:11:05] Right. So, wow. Yeah. Go ahead. No, I was just going to say, I mean, now it's so ingrained.
[00:11:15] I mean, any habit that you develop, any skill that you're trying to learn,
[00:11:20] you put enough time into it and all of a sudden it becomes more like a routine. It doesn't seem
[00:11:26] like it's a burden. And so the point is that, yes, initially following a framework or
[00:11:34] doing any of that stuff is going to be a burden. It's going to encroach upon your personal time or
[00:11:40] the time that you think that you have to give someplace else. But once you start doing it,
[00:11:46] it does become like any other habit in developing skills. I'm a guitar player. I mean,
[00:11:52] I didn't just pick it up and start playing. I had to dedicate time to do that. And I think
[00:11:59] that for any of us that own businesses particularly, the survival instinct needs to kick in a little bit
[00:12:07] here because if you want to be at a certain level operationally mature or security mature
[00:12:16] or anything else, it's going to take some dedication to do it.
[00:12:22] So that kind of leads us into the sort of transition piece of like,
[00:12:27] you know, as a solution provider figuring out that if you don't have some maturity in place around
[00:12:33] service delivery, operations, it's kind of hard to tackle cybersecurity or really any sort of
[00:12:42] defined set of safeguards, regardless of what they might be for. You know, there could be a
[00:12:45] set of safeguards for, you know, how to run your business. And obviously, if you're
[00:12:50] organizationally not looking to improve that process, it's not going to really matter which
[00:12:56] one you tackle. So I think that makes it very clear like, hey, you know, get processes in place,
[00:13:01] get your procedures in place, make sure that everybody in your organization is kind of
[00:13:05] following that mindset. But then that gets us into this thing we've kind of skirted,
[00:13:10] talking about evidence in particular. But, you know, talk to me a little bit about
[00:13:15] your process, you know, leading up to having evidence for SOC 2 or the Trustmark. Like
[00:13:21] I feel like that's a big shift in the type of evidence that you're now collecting that's a
[00:13:27] little bit more, there's more effort involved in a lot of cases. Sure. You know,
[00:13:34] initially it was so overwhelming, man. I think our first process really took about nine months,
[00:13:41] right? I mean, from beginning to end, from the time I said, we're going to do it. And I
[00:13:45] wrote the check and signed the contract to go through the process. I mean, it took nine months.
[00:13:51] And part of that was me, you know, part of it was my lack of wanting to delegate it because I
[00:13:58] really wanted to absorb as much of it as I could. But I realized about, you know, four and a half
[00:14:05] months into it that it's not something that I can do on my own and that I really needed to
[00:14:10] start engaging the rest of my staff to do this. And so initially when we did the SOC 2, it was
[00:14:19] Type 1. We never actually achieved the Type 2, though we were prepared to do so and then
[00:14:27] decided to move down the Trustmark route with CompTIA. But I wish that
[00:14:38] evidence collection were a little bit more automated. And I think that's starting to
[00:14:44] show up a little bit more. But for us, it was certainly not. I mean, we were logging into
[00:14:50] 72 different platforms and showing evidence that we had MFA turned on.
[00:14:59] It wasn't exactly centrally managed, right? Right. It definitely was not centrally managed. I mean,
[00:15:03] a lot of that has changed in the last several years for us.
[00:15:07] It didn't exist back in the day, right? Like, man, SSO. Like, we look at the SSO tax website,
[00:15:12] and man, for some of the vendors who have SSO now, the cost
[00:15:17] is prohibitive for a lot of smaller businesses. And still is. I mean, it's like you have to
[00:15:24] go up to the enterprise level with a minimum buy of, you know, say 50 seats or more. Right.
[00:15:30] And then you're a company of, I mean for us, 13, right? Or whatever the case may be. So it is,
[00:15:37] it is kind of ridiculous. And believe me, I've been making every single one of them that I
[00:15:43] encounter aware of my displeasure. Right, right. And I think it comes with some overhead costs
[00:15:51] that we have to, I don't want to say appreciate, but understand, based on us
[00:15:56] going through the steps to secure ourselves. It's not so different. So talk to me a little bit about,
[00:16:03] you know, you were mentioning some of the things that you had to do, you know, MFA,
[00:16:07] show proof of that. How has that changed? Because, you know, one of the things that's
[00:16:10] really interesting about what you said about like, you know, automated systems.
[00:16:13] Yes, technology is moving very rapidly. What was very manual four or five years ago,
[00:16:18] you can now do a lot of that data collection very fast, whether it's through a PSA,
[00:16:22] or sorry, an RMM or some of those tools that can go and harvest. But you know, I think we're still
[00:16:27] dealing with that 80-20 rule again, only this is where technology still has the limitations of what
[00:16:34] it can do from an automation standpoint. It still involves the human element. And I think sometimes
[00:16:40] we lose sight of tools being able to do all of it. And we get upset when it like, it's
[00:16:47] only doing like this one small piece. You're like, yeah, but that one small piece would
[00:16:50] take you hours if you had to do it manually. So like, take the wins where they are.
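As a concrete illustration of the kind of scripted evidence collection Chris and Jim are contrasting with logging into dozens of portals for screenshots, here is a minimal sketch. Everything in it is a hypothetical stand-in: the platform names, the `check_mfa` lookup, and the evidence record format are invented for illustration, since each real platform exposes (or doesn't expose) its own API.

```python
import json
from datetime import datetime, timezone

# Hypothetical stand-in: in practice each platform needs its own API client,
# and some offer no API at all (hence the manual screenshots Jim mentions).
def check_mfa(platform: str) -> bool:
    known_good = {"psa", "rmm", "email"}  # pretend these report MFA enforced
    return platform in known_good

def collect_evidence(platforms):
    """Build a timestamped evidence record for each platform checked."""
    now = datetime.now(timezone.utc).isoformat()
    return [
        {"platform": p, "control": "MFA enforced", "pass": check_mfa(p), "checked_at": now}
        for p in platforms
    ]

if __name__ == "__main__":
    evidence = collect_evidence(["psa", "rmm", "email", "legacy-portal"])
    # Writing records like these to a dated file gives an assessor a reviewable artifact.
    print(json.dumps(evidence, indent=2))
```

Even a crude script like this captures the point made above: automation may only cover one small piece, but that piece replaces hours of manual portal-hopping.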
[00:16:55] So evidence collection today for you, how has that changed over, you know, the course of
[00:17:01] the last couple of years? Because, I mean, was the evidence lighter when you did it the
[00:17:05] first time? Or what was it that they asked for that you thought in the beginning was like,
[00:17:10] I don't know how we're ever going to collect that evidence. And then today,
[00:17:13] you're like, I'm collecting the evidence. I'm not necessarily writing it the way they
[00:17:16] want to read it, but we're doing the things we say we're doing.
[00:17:20] Yeah, I mean, I think for us, it was, again, back to the processes and everything that we
[00:17:28] really established during our first assessment. And so
[00:17:34] and it just again, it became a routine like, you know, we are not always collecting
[00:17:42] evidence, but we know where to get the evidence for what we need to do now. So
[00:17:49] when we were getting ready to strive for the SOC 2 Type 2, the whole deal there is to,
[00:17:56] you know, prove it over a six- to nine-month period of time as opposed to
[00:18:03] one point in time. That snapshot is almost, I don't want to say easier,
[00:18:08] but like, you're only going to be in that spot in time one time anyway.
[00:18:13] Right. Exactly. No matter what you do.
[00:18:15] So, you know, a lot of it was like, you know, some of it, especially on the SOC 2 side,
[00:18:20] it's like, okay, well, we have meetings to review these things, right? And so we
[00:18:26] formulated templates and started using, you know, Ninety for meeting documentation and
[00:18:33] minutes and things of that nature that, you know, showed that, hey, we were reviewing these
[00:18:38] and we would take screenshots of those reviews and collect them into a place and be done with
[00:18:44] that. And so that's how it kind of started. Now it's probably very similar to that.
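The filing habit described above, screenshots and meeting minutes gathered into one place, really comes down to a predictable naming convention. A minimal sketch in Python follows; the folder layout and the safeguard ID are invented for illustration and are not anything SOC 2 or the Trustmark prescribes.

```python
from datetime import date
from pathlib import Path
from typing import Optional

def evidence_path(root: str, safeguard: str, artifact: str,
                  when: Optional[date] = None) -> Path:
    """Build a predictable path like root/2024-06/5.1/mfa-review.png,
    so evidence for any safeguard and month is easy to find at assessment time."""
    when = when or date.today()
    return Path(root) / f"{when:%Y-%m}" / safeguard / artifact

if __name__ == "__main__":
    print(evidence_path("evidence", "5.1", "mfa-review.png", date(2024, 6, 14)))
```

The design point is simply that an assessor (or you, six months later) should never have to hunt for proof; the month and the safeguard tell you exactly where the artifact lives.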
[00:18:55] I mean, things have changed somewhat. But you know, when we do,
[00:19:01] you know, we do backup restore testing not only for our organization, but for every client that we
[00:19:08] do backup routines for. So on a quarterly basis, we go in and do that, we track it,
[00:19:14] you know, we document it, take the necessary steps. Resilience, right? You're adding it in.
[00:19:20] So, to some extent, the whole effort of collecting evidence from an assessment standpoint
[00:19:27] is really about being able to prove something right now. And where you're at in the path that
[00:19:32] you're on, correct me if I'm wrong here, is about being able to prove it at any given time
[00:19:36] because you've asked or you want to look at something and like, because if it fails,
[00:19:41] then it's failing for everybody. Right, that's correct. So that's, that's really the approach
[00:19:47] that I've tried to take. Look, I don't consider myself an assessor, but I tried to look at it from
[00:19:53] that mindset now particularly. It's like I'm constantly assessing my own business.
[00:20:03] That's spot on. I shouldn't have interrupted, you were onto something, but I was going to say
[00:20:08] that what you said is something I want to make sure we capture, right?
[00:20:13] Like you're saying the thing that I think a lot of organizations are forgetting about, and that is
[00:20:18] you should be checking all of your systems on a basis that's more frequent than once a year.
[00:20:23] And I would argue that somewhere between a week and a quarter, depending on the systems,
[00:20:30] you know, they may vary. Like I'm not saying, hey, you should do everything every day.
[00:20:35] But we have tools now that allow us to do continuous vulnerability scanning. We didn't
[00:20:40] have that before, man. You were like, dude, I'm going to run the scan. So like don't try to like
[00:20:44] run any big jobs on your computer because the scan is going to literally take all the resources that
[00:20:48] we have and the internet's going to be non-existent for like the next hour and a half.
[00:20:52] Right. Yeah, it is. I mean, literally, that's how it is. So, kind of taking
[00:20:59] an assessment mindset and building some things into your routine. Look, there are some
[00:21:04] safeguards that we look at way more frequently than others. They're part of a weekly meeting
[00:21:11] that we have within our organization that we're like, okay, these KPIs that we're looking at
[00:21:18] to make sure that these things are happening are correct. Whether it's a score that a particular
[00:21:26] product gives us on the vulnerability scanning and where we're at on the remediation plan
[00:21:31] and things of that nature. So it's really just kind of reviewing the things that we have in place
[00:21:35] and then the places where we don't have something in place, then saying, okay, well, those we're
[00:21:41] going to have to take a little bit, whether it's monthly or quarterly or whatever, and have them
[00:21:48] more on a schedule. And so we basically have a checklist, you know, that we have to get
[00:21:54] through every month regarding
[00:22:02] various areas that we are just verifying that the things that we put in place are continuing
[00:22:09] to work. And the cool thing is that we do that, and we find things that are broken, and we go
[00:22:15] in and fix them. Things just don't run perfectly forever, right? Well, I mean,
[00:22:22] I think that's one of the things that we often miss. I mean, the age-old rule of, nah,
[00:22:26] we'll just let it patch on the next cycle. Yeah, I mean, you can get away with that on some
[00:22:32] things, but you know, it's really not the way that you should look at it. But if you have a
[00:22:39] process, right, is kind of your point, right? Like, yeah, I guess I was maybe a little bit
[00:22:45] oversimplifying it. Yeah, if it doesn't patch, it catches it the next time.
[00:22:48] But I think if you go back in time, you know, largely we would say, I'll catch it the next
[00:22:53] time. Well, how many times before we proactively go and investigate? Like, is that documented?
[00:22:59] If an endpoint, you know, goes through three patch cycles, or two, or whatever it is,
[00:23:04] you know, what's the protocol? And I think it's some of these little things that
[00:23:08] you've described as part of your process, maybe we don't call them little things. But
[00:23:12] I mean, the reality is they all add up to making a resilient environment. And I think that's
[00:23:18] ultimately the goal here. You didn't start out on a journey of, I just want to do SOC 2 because
[00:23:24] it was there. We don't know what we don't know. And one of the things that you find out when
[00:23:28] you go through the Trustmark is you do a gap analysis. What are the things that I know I'm
[00:23:32] doing? What are the things that I know I'm not doing? And what are the things that I just
[00:23:34] genuinely don't understand? And hopefully that comes out somewhere between a 60-40
[00:23:40] and a 70-30, and it can be either direction. I'm not proposing that everybody's going to be, you know, 70, 80%
[00:23:46] deployed. But like, you have to know where you are to know where you can go. Yep. Absolutely.
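The gap analysis described above, sorting safeguards into things you know you're doing, things you know you're not, and things you don't yet understand, is easy to tally once each safeguard has been scored. A minimal sketch; the safeguard names and statuses below are made up purely to show the arithmetic.

```python
from collections import Counter

# Hypothetical assessment results; the three statuses mirror the buckets above.
safeguards = {
    "MFA on all accounts": "doing",
    "Quarterly restore tests": "doing",
    "Asset inventory": "not_doing",
    "DNS filtering": "not_doing",
    "Service provider management": "unknown",
}

def gap_summary(results: dict) -> dict:
    """Return the share of safeguards in each bucket, as rounded percentages."""
    counts = Counter(results.values())
    total = len(results)
    return {status: round(100 * n / total) for status, n in counts.items()}

if __name__ == "__main__":
    print(gap_summary(safeguards))  # {'doing': 40, 'not_doing': 40, 'unknown': 20}
```

The percentages themselves matter less than the "unknown" bucket; those are the items to research first, since you can't remediate what you don't yet understand.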
[00:23:53] And understand that it's never going to really stop because, I mean, look,
[00:24:02] our realm changes so quickly now. I mean, it used to be like, oh, well, we've got
[00:24:08] a realm. Are we in... you know, oh man, straight out of Zelda.
[00:24:17] You're not wrong. Like, wait, so yeah, it's a good point. The realm, the world that we live in
[00:24:22] today or the realm we live in, depending on what kingdom you're from, you may find that your
[00:24:28] clients don't deal with some of the things that you are, because your clients aren't all
[00:24:33] in the same vertical or the same industry, or targeted by the same, you know, geographic threat
[00:24:38] actor, whatever, fill in the blank. So yeah, that's a good point. Like it's always changing.
[00:24:45] Yeah. And I think we've actually covered evidence quite well. We're not
[00:24:49] telling you what the evidence needs to look like. I think that's up to you as you go
[00:24:53] through your own process. Jim, what else would you leave our listeners with, whether they're
[00:24:58] currently going after the Trustmark or any other framework for their own
[00:25:01] satisfaction, or they haven't even gone down that path? You know, how would you advise someone
[00:25:06] as a solution provider? Like, hey, what are some of the things you should do first? We're going to
[00:25:10] assume that you are working toward or have worked through, from a business operation standpoint,
[00:25:16] a service delivery standpoint. And now you're trying to enhance your cybersecurity posture.
[00:25:21] We'll give them that win. And I don't want to say assume, because we know what
[00:25:24] that means, but that they are doing some of those things, because I think
[00:25:28] our listeners are at that level or better. Yeah, I think so. And I think from my standpoint, really,
[00:25:37] it's just, you've got to, like when you're digging a hole or you're planting a garden,
[00:25:44] you know, you've got to stick the shovel into the ground, right? Okay, you've got to
[00:25:49] get started. And like digging a garden or planting a tree or something like that,
[00:25:57] I mean, it's going to make you sweat a little bit. It's going to hurt, like working out.
[00:26:02] We did talk about that before we got on the call. It's like, you know, I mean, it's
[00:26:07] it's going to be hard. Yeah, it's going to be hard, but it does get easier. And as you realize
[00:26:13] that, look, I can do this, there are things that I can put in place, and I'm
[00:26:18] identifying those things as I'm going along, then as I start going back and taking it
[00:26:23] from the beginning again, the next time is going to be easier. And then the next time is
[00:26:28] going to be easier after that. That's like, if you wait to weed the garden until it's overgrown
[00:26:32] with weeds, it's going to be backbreaking work. But if you maintain it from the beginning,
[00:26:37] it can actually be, for some people, enjoyable. I think that's a good way to describe
[00:26:43] it. I would also add to that, especially with cybersecurity, you don't need to do it
[00:26:50] all at once. As Mike Stewart says, any forward progress is good progress. And I think that's
[00:26:56] true, like you have to start somewhere. Don't try to boil the ocean. Don't try to get all done in one
[00:27:01] day. It's not going to happen. Don't ChatGPT all your policies, because then you'll have policies
[00:27:06] you can't follow. So yeah, I think that that pretty much sums it up. For those of you
[00:27:12] listening, this has been an episode of MSP 1337. Thanks and have a great week.
[00:27:24] Welcome to MSP 1337. I'm your host Chris Johnson, a show dedicated to cybersecurity
[00:27:31] challenges solutions, a journey together, not alone.
[00:27:35] Welcome everybody to another episode of MSP 1337. I'm joined this week by Purandar Das
[00:27:47] with Sotero. Welcome to the show. Thank you, Chris. My pleasure. I appreciate you taking
[00:27:55] the time out for our audience. So one of the things that we always want to do is anytime
[00:27:59] we have a guest, especially someone that hasn't been on the show before,
[00:28:02] as if you could just tell the audience a little bit about yourself, kind of your path to where
[00:28:06] you are today. I would love to hear you talk a little bit about Sotero. I'm very interested in
[00:28:12] what you guys are doing over there. And then we'll jump into your topic and I'll let you
[00:28:16] share that with the audience. Sounds good, Chris. Thank you. As you said, I'm Purandar Das. I'm
[00:28:22] co-founder and CEO of Sotero. We are a Boston-based startup. And I like to tell people that
[00:28:29] we are broadly in the cybersecurity space, but more specifically focused on data security.
[00:28:37] I mean, obviously the question is how did I get here? How did we get here?
[00:28:41] I've actually, the bulk of my career has been in marketing and marketing technology.
[00:28:48] I have been the CTO at two of the largest marketing services providers in this country,
[00:28:52] and one of them is actually one of the largest on the planet. What that did for me was expose me to
[00:29:01] two things. One was data management at scale. When you talk about building marketing solutions for
[00:29:06] the largest brands in the country or in the world, we're talking about tens of billions
[00:29:12] of transactions or records that store sensitive information. The flip side of that is how do
[00:29:18] you collect, store, but protect this information while the company is using them to drive their
[00:29:23] marketing, their customer relationship building. When you put those two things together, we're
[00:29:28] talking about data security and privacy management at massive volumes. That challenge
[00:29:35] is what got me and my co-founder started in the data security space because what we saw
[00:29:40] was limitations around how to achieve security and privacy at scale that was meaningful,
[00:29:47] but didn't become an obstacle to enabling companies to operate and access the data.
[00:29:54] We're going to send that anyways. They're going to send the data either way.
[00:29:57] Exactly. Yeah, but with all the regulations coming on board, and administrations and
[00:30:06] privacy groups becoming more stringent and focused on how to protect the consumer,
[00:30:12] addressing that challenge was what got us started, and we said, hey, there's got to be a better
[00:30:16] way of doing this, the classic: we need to figure out a better way of achieving data
[00:30:21] security and privacy. That's what got us started on this thing. We looked at this problem as Sotero
[00:30:28] and said most security, if not all of security, is focused on the outside, meaning
[00:30:34] locks on the network, locks on access, which is kind of ironic, because the one thing
[00:30:40] it didn't really protect was the data or the information itself. We said, why don't we
[00:30:45] look at it as an inside out problem and come up with a solution for facilitating this process.
[00:30:52] And we came up with the patented ability to query encrypted attributes without the need to decrypt them.
[00:31:00] So it sounds very technical and complex, but if you think about it, what it does is elevate
[00:31:06] data protection from at rest, meaning protected while nobody's using it, to protecting the data
[00:31:12] while it's being used. So that was our initial entry point into this space.
[00:31:19] So I mean, we could just go so many different directions with this, but I think the one that
[00:31:24] pops out at me, I think about it and you're gonna have to tell the audience what this topic is,
[00:31:29] but I'm going to throw this out there. I think one of the challenges that we have in
[00:31:33] protecting data is that so many people have become almost desensitized to what sensitive data is.
[00:31:42] They understand what data is; they understand when a medical record should probably be protected.
[00:31:48] We still see it all the time like sign this document, put your Social Security number on it,
[00:31:52] and then scan an email back to me. Well, how do I ensure that I'm sending you something
[00:31:58] that's going to get there securely, let alone like why do I like so many
[00:32:03] steps are being taken to do something that should not be happening? It shouldn't have
[00:32:07] to send this stuff around like this. Yeah, absolutely. I mean, the hacker aspect of it
[00:32:14] or the data theft aspect of it has become extremely complex because the talent and the
[00:32:20] skills being deployed on that side matches the talent and skills on the traditional product
[00:32:25] development or the data management space. So no, it's not just one data asset that has
[00:32:30] your credit card numbers. They are collating data from a broad base of sources that actually makes that
[00:32:39] hack or that attack or that criminal attempt so much more sophisticated where they're actually
[00:32:46] crafting programs built around the ability to link your data, not just your credit card number,
[00:32:51] but potentially your medical history or travel history and all of this and build a hack
[00:32:57] attempt that's so much more sophisticated that it continues to fool people. But it's all
[00:33:02] based on the fact that they have access to data and building a profile. So they're not just saying
[00:33:08] like, hey, I got your credit card number. They're like, Hey, yeah, that's a win. We would have said
[00:33:11] that's a win. If I was a bad guy, having someone's credit card number without them knowing about
[00:33:15] it, that's at least a win, right? Because I can do something with that. But to your point,
[00:33:20] we're talking about like, that's just part of the breadcrumb trail that they're using to
[00:33:24] build something that allows for a much larger impact when they finally do deploy their attack.
[00:33:29] Exactly. And even the monetization of that, right? There's the whole data collation aspect of it,
[00:33:34] then the sales aspect of it, then the people that are building platforms to leverage those
[00:33:39] things, providing people code, or even the phishing code, which is the starting point.
[00:33:46] You're depressing me.
[00:33:49] But it is reality. That's what we're facing. I mean, what's the one thing
[00:33:54] that's consistent every day when you listen to the news or read the news? It's about new
[00:33:58] hack attempts or new breaches that have occurred. Or I think the ones that are worse than the
[00:34:04] actual breaches to me right now is the doom and gloom of what's going to happen
[00:34:09] if we don't do something, but then no one says what should we do? They just kind of
[00:34:12] leave it out there like, yeah, if we don't protect critical infrastructure,
[00:34:17] you know, bad things are going to happen. It's like, hmm, that's probably true. So
[00:34:21] what's your suggestion? Right? Like, give me something to do and then I can at least take
[00:34:25] action. Let me be tactical. So that's where we kind of looked at this and said, hey,
[00:34:32] there's a multitude of point solutions, right? Network access management, endpoint
[00:34:39] monitoring, log monitoring. If you look at this, the one thing that's still not
[00:34:44] protected is the data. And that's really an operational challenge because no company
[00:34:49] wants to introduce a data product because they're under the belief that it slows down commerce
[00:34:54] or process. The moment there's friction in the process, companies are scared because that means
[00:34:59] an impact on the revenue and they go like, oh, we're not going to do anything at the data level
[00:35:03] because that's going to slow things down. We said whatever we come up with has to operate
[00:35:08] at the speed of business, meaning no impact on the business process,
[00:35:14] or don't create any additional friction so that the business can continue to function,
[00:35:19] but still be able to provide this level of security and protection that data truly needs,
[00:35:25] which is one perspective. The second perspective is you can look at it as the ultimate backstop.
[00:35:31] If everything else fails, your endpoint monitoring fails, they find a new API or a vehicle,
[00:35:36] a third party software vehicle that they piggyback on, you want to be able to
[00:35:40] protect your data in the event something bad happens. We took that approach as an inside out
[00:35:45] requirement and that's what we've done is provide a layer of protection at the data asset level.
[00:35:52] Very simply, if you think about all the things that happen at the network level,
[00:35:56] we achieve all of those at the data level with no impact to business. That's what we do.
[00:36:00] So since you didn't tell them, I'm going to say that today's topic is all about protecting
[00:36:04] your data. Is that fair? Pretty much. I mean, it's all about protecting your data.
[00:36:09] What can you do to keep your data protected without impacting your business? And that kind of
[00:36:14] goes into, as you said, it can go in a multitude of directions, right? There is not one single
[00:36:20] data asset. Right? No, and I think we struggle with that a lot, right? I think you look at
[00:36:25] anybody that's been in business for more than a few years, you suddenly go,
[00:36:28] do you know where all your data is? And of course their answer is like, yeah,
[00:36:31] it's all on the server. And it's like, well, what about the data you sent to other people?
[00:36:34] What about the data you shared with fill in the blank? What about and all of a sudden they're like,
[00:36:39] okay, it's in a few places. Like, well, are you on any social media platforms? Do you,
[00:36:44] you know, like the information is out there and it's probably everywhere.
[00:36:49] And then maybe knowing where it is is less important than knowing how to protect it
[00:36:54] where it is. I think it would probably be a more comfortable feeling to just say, I
[00:36:58] don't necessarily care that it's there, I just care that it's protected there.
[00:37:02] Yeah, which you actually bring up a good point, right? That's also a multifaceted or a two-dimensional
[00:37:07] issue, right? Knowing where the data is and knowing what form the data is in because you have
[00:37:12] databases, you have file stores, you have cloud stores, you have small data assets or repositories.
[00:37:18] So think about the discovery aspect of it, and providing a single platform for protecting
[00:37:24] data in all of its forms, everywhere, all the time. If you look at our tagline, which is a plug here,
[00:37:31] but it's also relevant, we say we protect data everywhere all the time.
[00:37:37] So that raises a question back to what you said earlier about how threat actors are
[00:37:41] not just getting a credit card and that's it. They're done. It's all the other pieces.
[00:37:46] And I think this is an interesting, made me think like, if I think about data,
[00:37:52] and I think about the different types of data, not all data is equal, right? So I'm thinking like,
[00:37:56] okay, a credit card, that's kind of a big deal. What if it's like, I drive a blue Charger?
[00:38:03] Well, there's lots of blue Chargers, but "blue Charger" is a data point that is attached to my car,
[00:38:10] but it's not really that important until you start using that to build that bigger package
[00:38:16] because you've pieced together information about me. And we were talking about this before the show,
[00:38:22] like when you can get access to that information very rapidly. And I think that's why I think we're
[00:38:29] seeing this new explosion of the doom and gloom of why things like critical infrastructure are
[00:38:35] suddenly being talked about in the news and the potential for bad things to happen is because
[00:38:40] of the way in which they're collecting the information to create the attack.
[00:38:44] It's all of the little things adding up to a big problem. Exactly. I mean, it's a question like how
[00:38:52] does security and privacy keep up with innovation, right? Companies and organizations are focused on
[00:38:57] commercialization or bringing to market new products that drive new revenue. And I kind of
[00:39:03] cite this all the time, a simple thing like a sprinkler controller, right? We're all excited
[00:39:08] because that minimizes your water bill kind of gives you control, we can turn it on,
[00:39:11] turn it off from wherever. But think about the potential data that a small device like that could
[00:39:17] be collecting, right? It could essentially predict your routine. It could tell them when
[00:39:25] they think you're going to be at home, when they think the house
[00:39:28] is going to be vulnerable. You attach that to the fact that you drive a car, and it gives them
[00:39:33] a level of credibility when somebody has a conversation with you and says, hey, I know
[00:39:38] the Charger that you drive, right? It kind of makes you more amenable to start giving them more
[00:39:43] information than you would if the context wasn't there. I don't want to say IoT is bad. It's not
[00:39:51] inherently bad. But I think about devices like a thermostat or to your point, the sprinkler system,
[00:39:57] you start piecing those together and all of a sudden I know an awful lot about your
[00:40:02] patterns. Like, you know, you watch the movies and they're like someone gets kidnapped
[00:40:06] because they've been following them for weeks. They know they make the same path and the same
[00:40:10] coffee shop, the same, you know, every time. Well, what if it doesn't require any sort of
[00:40:15] physical tracking? We're just literally taking the data bytes that are coming in and saying, oh,
[00:40:20] we know that they leave the house every day at this time because that's when the lock gets
[00:40:24] turned on. We know, and we can see through the ring cameras that no one's home when that
[00:40:28] happens. Like suddenly you've built out a pretty extensive profile, and you haven't done anything
[00:40:32] or gone anywhere. Yeah, and so to that point, right? I mean, the company that's selling
[00:40:38] that information doesn't realize the value or the potential challenge that it's going to raise,
[00:40:44] right? Because they're selling it to monetize it to another business that says, hey, I'm going
[00:40:48] to leverage this data to sell them more services and you get a cut of the revenue.
[00:40:52] Fine, ship them the data. Right? Somebody that gets their hands on that data set
[00:40:57] collates it with something else. And then immediately, like you said,
[00:41:01] they're able to develop a profile that essentially tells them what you do, when you do it, and when
[00:41:07] you're vulnerable. Yeah, I kind of think about, like, if we want to take this to scale, this is
[00:41:13] not really about, say, attacking an individual or a household, or even
[00:41:19] residential versus commercial, it doesn't necessarily matter. But like, what if
[00:41:21] you were to monitor that level of data set for an entire neighborhood or subdivision
[00:41:26] or a small city or a large city and start recognizing patterns with things like water
[00:41:30] consumption or electricity consumption, and then you can start going, well, if we can manipulate
[00:41:36] those things, then we can control the commerce happening for that city. Or, you know, hey,
[00:41:42] suddenly the cost of water just went through the roof. Why is everybody talking about a water
[00:41:46] shortage? Well, there isn't one. We're just talking about one, making you think that it's
[00:41:50] happening so that people panic. It's funny you bring that up because that's so relevant with this
[00:41:56] notion of charging price based on demand, right? It's something to your point
[00:42:06] that with a data set and with the right analytics, you can create artificial demand and drive up
[00:42:14] prices. Absolutely. Or at least create confusion in the market so that you don't
[00:42:19] rely on it anymore. And that can be just bad, too. And it's not surprising that's where this
[00:42:27] starts, right? So going back in time, thinking about how we would have protected data, I remember I had,
[00:42:34] I think it was the co-founder of Tenable on and we were talking about security back in the
[00:42:39] late nineties, early 2000s and how security was basically do what you're told. And if you
[00:42:45] get to do some security, that's better than no security. So yes, the marketing department says
[00:42:49] you're absolutely going to expose the folder to the public internet because otherwise you've
[00:42:54] created friction. Fast forward to today, obviously a lot of those problems are minimized. We tend
[00:43:01] to do some of those things better than before. But we've now also introduced things like AI
[00:43:07] and large language models that allow for that really fast data collection, that
[00:43:14] joining of data sets together. In fact, in some cases, and I'd love your thoughts on this,
[00:43:20] what data sets are talking to other data sets? Right? Like, so you use an LLM, maybe
[00:43:25] it's ChatGPT, I'm just saying that as an example. But like, you use it, but what other
[00:43:35] going to give me back as an answer? I think we're at the very, very starting point of
[00:43:42] data sets with an LLM. I mean, I think you mentioned that LLMs have been around for a while, or the
[00:43:47] models have been around for a long time. What's given them the power is the available computing.
[00:43:52] That kind of became the foundation for collating, digesting, and processing a massive amount of
[00:44:00] data, and so fast. You couldn't even think that was possible a couple
[00:44:05] of years ago, and suddenly it's there. That has already started concerns about what's in there,
[00:44:13] in that data set that's already publicly available. As these evolve to drive more
[00:44:20] organizational-specific needs, you're going to see the mingling of truly sensitive data
[00:44:28] about a consumer's interaction with the company with the behavior that they've exhibited
[00:44:34] in the broader open community. Think about all the conclusions and inferences that can be drawn
[00:44:41] when you start mixing all this data together with really no safeguards or no controls.
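To make the linkage risk the speakers describe concrete, here is a toy sketch: two individually mundane data sets, joined on quasi-identifiers, reveal more than either does alone. All vendors, field names, and values below are invented for illustration.

```python
# Two "harmless" data sets, each from a different (hypothetical) source.
sprinkler_logs = [  # from a hypothetical smart-sprinkler vendor
    {"zip": "75001", "car": "blue Charger", "away_window": "8am-6pm"},
]
marketing_list = [  # from a hypothetical data broker
    {"zip": "75001", "car": "blue Charger", "name": "J. Doe", "street": "Oak Ave"},
]

# Link records on quasi-identifiers (zip code + car description). Neither
# data set names a person AND their schedule, but the join does.
profiles = [
    {**a, **b}
    for a in sprinkler_logs
    for b in marketing_list
    if (a["zip"], a["car"]) == (b["zip"], b["car"])
]

print(profiles[0]["name"], profiles[0]["away_window"])  # J. Doe 8am-6pm
```

This is the "breadcrumb trail" effect from earlier in the conversation: the merged profile now says who lives where, what they drive, and when the house is likely empty, even though no single source disclosed all of that.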
[00:44:49] No rules, yeah. Which also means that garbage in, garbage out. If you don't have any control
[00:44:55] over the data sets, then you have literally no control over which ones are garbage and which
[00:45:01] ones are not as it puts data back into the output. Unfortunately, in today's world, garbage out also
[00:45:08] has a massive impact and quite usually a negative impact on the individual or the organization
[00:45:16] that's being referenced in that garbage output. We've really talked a lot about the negativity,
[00:45:22] well I shouldn't say negativity. The power of AI, the power of compute to get data,
[00:45:29] get answers. What we really haven't talked about is, is there a path forward that isn't just us
[00:45:35] speculating on if we were to do a better job protecting critical infrastructure,
[00:45:39] we'd be less worried about bad things happening to it? So absolutely, there is certainly a path
[00:45:46] forward. I mean, we're happy to be a part of that, and we're not the only one;
[00:45:51] there's multiple organizations trying to solve this problem. The first thing is keep data
[00:45:58] secure. I mean stop the hackers or the criminals from accessing your data. That's the number one
[00:46:03] thing and that relates to not this data but it also relates to the infrastructure challenge,
[00:46:08] making sure that they're secure and making sure that secure is preventing them from being
[00:46:13] contaminated, attacked or being compromised. That's a question of real-time monitoring
[00:46:20] threat detection and threat prevention. That involves real-time massive, well I mean I won't
[00:46:28] say massive but really quick detection of threats on even the smallest devices whether it's a
[00:46:33] niote or thinker, they normally become the gateway for somebody getting into a network.
[00:46:38] So to rephrase what you just said, you're essentially saying we've got to monitor everything
[00:46:42] and act on the things that are out of the ordinary which says, you know, Cetero
[00:46:47] doing something like that but you know, like you said there's others. You think part of this is like
[00:46:52] getting organizations to recognize that you have to turn them on. You got to monitor.
[00:46:56] Like I think there's a lot of still not doing it, like just not doing it at all or they don't know
[00:47:01] where their infrastructure is spread out so they're monitoring one network but they're not
[00:47:06] monitoring all the networks in their organization. Monitoring and we can relate that to saying
[00:47:11] monitor all of the points of access to a network. Don't just focus on the laptops that you
[00:47:17] give out to your employees. There's a multitude of more entry points to your network that you're not
[00:47:22] thinking about whether it's API gateways, third-party software that has access. It's not just the
[00:47:27] laptops and devices and people say we've done what we need to do for ransomware protection
[00:47:31] because we've got our endpoints protected. Your endpoints translates to usually the laptops.
[00:47:36] There's a lot more. Right? Monitor all of those in real time. From a vendor perspective
[00:47:41] I think it's equally critical that vendors realize that creating a new software product
[00:47:47] that requires organizations to completely upgrade their environments for this to be effective is
[00:47:53] not a solution. Oh, it's very cost-effective. Yeah and it's not practical or feasible to
[00:47:59] expect an organization to completely rip apart everything so we've got to build
[00:48:04] something. We've got to build products and solutions that work in their existing environments.
[00:48:09] And I think a good example of that would be like Dicom imaging systems in hospitals where
[00:48:14] they're running legacy operating systems but the technology still works just like it did when
[00:48:20] they bought it. It's still doing what it's supposed to do but the ability to secure it
[00:48:25] obviously is requiring things to change. We always called it compensating controls, but
[00:48:32] compensating controls can get extremely expensive too with that legacy gear. But to your point,
[00:48:38] you can't just go and uproot a $15 million solution just because the OS is antiquated in that model. I
[00:48:46] mean, you know, hospitals would be going out of business if every three to
[00:48:50] five years they had to upgrade that. Upgrade everything, right? I mean, I'll give you a more
[00:48:54] practical example. We talked to enterprises as part of our outreach and we've had
[00:49:00] enterprises tell us, I mean obviously we won't mention the name, these are multi-billion
[00:49:04] dollar companies that say half of our processing is still done on mainframes. Does your product work
[00:49:09] on the mainframe? And yeah absolutely, I mean that's part of what we thought about because we come
[00:49:15] from the world where organizations use a lot of different technologies to achieve their business
[00:49:20] outcomes and we're saying we understand that. And the product platform we build will always make
[00:49:25] sure that we work with everything that an organization has, and not expect you to either
[00:49:31] protect just a fraction of what you have, or upgrade everything for it to be effective.
[00:49:36] I mean I will say this about, you know, some of the critical infrastructure stuff. I was listening
[00:49:40] to, I think it's the guys behind runZero, I think that's the company. He was talking about how,
[00:49:47] you know, like seeing something in critical infrastructure that's now connected to the internet.
[00:49:52] The only thing you can see is that I can make it do this or that but I don't know what
[00:49:56] either of those things are and I don't know what it is but now I can see it. So now I can
[00:50:00] toggle this switch that opens a dam or closes a dam but I have no idea. So in those cases we're talking
[00:50:05] about like operational, you know, technology that's older than most of the IT or the IoT stuff that
[00:50:12] we use today. Like, it was created before the idea of the internet, 50-plus years ago. Do you think
[00:50:17] we need to upgrade that, or do we need to do a better job of providing better naming conventions?
[00:50:24] It's like, I think that there are people out there that end up having done something
[00:50:29] out of curiosity that causes bad things to happen, because we've seen kids get into systems and go,
[00:50:37] oh, this is cool. Click. No idea what click did, they just wanted to find out what would happen.
[00:50:42] Had it said what this is, you know, if it had a little bit more information to it...
[00:50:46] The double edge is, I feel like sometimes we just don't have enough
[00:50:49] information at our fingertips to make better decisions, because what we are monitoring
[00:50:54] isn't telling us very much.
[00:50:57] All right, so we spoke about the negative aspects of AI or LLMs. I mean this is where that becomes
[00:51:02] so much more useful, right? It's going to be hard to come up with standard naming conventions
[00:51:07] and enforce them; we're talking about a decade-long process, right?
[00:51:11] We have plenty of time.
[00:51:14] Part of what we do for our discovery and classification is leverage LLMs,
[00:51:19] right? Because they're so much more powerful in looking at the data asset and looking at the context
[00:51:25] and saying, hey this is what this most likely is going to be, right? So there is certainly an
[00:51:31] opportunity to be able to do that at scale on a more reasonable timeframe using technology
[00:51:39] to achieve what we need.
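As a rough sketch of what automated discovery and classification can look like (the speaker describes leveraging LLMs for this; the regex-plus-context heuristic below is a simplified stand-in, with invented field names and patterns):

```python
import re

# Heuristic stand-in for LLM-based data discovery/classification: look at
# sample values plus context (the column name) and guess what the field most
# likely is. A real pipeline would hand this same context to an LLM instead.
PATTERNS = {
    "ssn":         re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "credit_card": re.compile(r"^\d{4}([ -]?\d{4}){3}$"),
    "email":       re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}

def classify_field(column_name: str, samples: list[str]) -> str:
    # First, test the sample values against known sensitive-data shapes.
    for label, pattern in PATTERNS.items():
        if samples and all(pattern.match(s) for s in samples):
            return label
    # Fall back on context: the column name itself is a strong hint.
    for label in PATTERNS:
        if label.replace("_", "") in column_name.replace("_", "").lower():
            return label
    return "unclassified"

print(classify_field("cust_ssn", ["123-45-6789", "987-65-4321"]))  # -> ssn
```

The advantage of an LLM over fixed patterns, per the conversation, is exactly the ambiguous cases: data that matches no known regex but whose surrounding context makes its sensitivity "most likely" clear.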
[00:51:39] So now you're getting into making sure that LLMs are being built in such a way
[00:51:45] that they maintain the integrity.
[00:51:46] Yeah, yeah. And I mean, so again, people kind of tend to focus on one
[00:51:54] aspect of LLMs, right? I mean, if you think about the entire process end to end,
[00:51:58] it starts from data collection, data processing, data storage, data persistence,
[00:52:03] the cleansing and sanitization of a prompt, the sanitization of the output, validation that
[00:52:10] there's no copyright or sensitive confidential information in there and then pushing that
[00:52:14] for consumption. So we're talking about an end to end process, right? Like everything else.
[00:52:21] We are happy to say that we're in there early, building that end-to-end platform that enables
[00:52:30] what we refer to as trusted interactions with LLMs.
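A minimal sketch of the in-and-out guardrails just described, sanitize the prompt on the way in, validate the output on the way out; the redaction pattern and the stubbed model call are illustrative placeholders, not Sotero's actual implementation.

```python
import re

# Placeholder patterns for SSN- and credit-card-shaped strings.
SENSITIVE = re.compile(r"\d{3}-\d{2}-\d{4}|\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}")

def sanitize(text: str) -> str:
    """Redact sensitive-looking strings before they reach (or leave) the model."""
    return SENSITIVE.sub("[REDACTED]", text)

def trusted_interaction(prompt: str, model) -> str:
    clean_prompt = sanitize(prompt)   # cleanse the prompt going in
    raw_output = model(clean_prompt)  # the model call itself is a stub here
    return sanitize(raw_output)       # validate/sanitize the output coming back

echo = lambda p: f"You said: {p}"     # stand-in for a real LLM call
print(trusted_interaction("My SSN is 123-45-6789", echo))
# -> You said: My SSN is [REDACTED]
```

A production version of this idea adds the other pipeline stages named above, data collection, storage, and copyright/confidentiality validation of outputs, but the wrap-every-call structure is the core of a "trusted interaction."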
[00:52:30] So you have to teach people to think differently, right? Because I think about
[00:52:34] like if I'm a great Python programmer that doesn't mean I understand what's going to happen
[00:52:38] when you go in and log into your ChatGPT prompt and you start asking questions,
[00:52:45] it may not know how to process that but you keep asking the questions and suddenly it's like oh
[00:52:50] that does make sense, I can totally give you an answer for that. And the reality is, it shouldn't
[00:52:53] have given you that; there should have been alarms going off, telling people that what is being asked
[00:52:58] is not okay for anybody to ask. Exactly. And if it's allowed to continue, well, now it's also
[00:53:03] corrupting the model. The model, yeah. I mean, think about this: we've trained an entire generation
[00:53:10] of developers, right? And coders, to not worry about security and privacy because everything was
[00:53:15] automated for them. It just became simpler, where they said, I just need this answer, I'm
[00:53:19] going to get it, I'm not going to think about it. So in that context you want to
[00:53:24] be able to make sure that all of this is automated that they're operating within an
[00:53:29] environment that takes care of all of that because expecting every developer or coder out
[00:53:34] there to be cognizant and aware of this is not practical. Yeah, but you remember, if you go back
[00:53:39] in time to the days when we used, you know, C# and C++, and you wrote it wrong
[00:53:44] and it caused the computer to crash when you went through it? Like, that
[00:53:49] doesn't happen anymore, right? Like, you might slow down, you might actually use something
[00:53:53] that's like, wow, this is taking a really long time to give me the answer that I'm looking for,
[00:53:57] but you don't see a reboot, you don't see a... Yeah computers don't crash anymore.
[00:54:00] No, not like that. And we're using computers at scale, so 10 computers can fail and the other 90
[00:54:06] are still up and running, generating either the wrong or a potentially
[00:54:12] dangerous output. So we have just a few minutes left. We've talked a lot about
[00:54:18] you know how we should go about doing this I think differently than what many of us have
[00:54:23] and I can tell you, I'm doing a workshop for a group this month on how to approach AI,
[00:54:32] how to approach the risks of AI, and, like, one of them is, you know, what are the questions
[00:54:35] that you ask when you work with any vendor? Like, you know, one question would be, just because
[00:54:40] the vendor doesn't call out that they're an AI company. Like, ChatGPT, OpenAI, we know those are
[00:54:47] AI companies, like that's what their model is, that's their business. But, you know,
[00:54:51] if you said Microsoft, you don't automatically say Microsoft and AI in the same statement. But
[00:54:56] the idea here is, with risk profiling and things along those lines, you might ask questions
[00:55:01] like do you use AI in your products? Do you use AI and how does it interact with
[00:55:07] fill in the blank? What data of mine are you storing? Give me some ideas or some suggestions.
[00:55:13] Maybe you've got something to share that would help everybody. Yeah I think there's a few
[00:55:19] basic questions you ask yourself right if I'm going to interact with an LLM model what is my objective?
[00:55:24] Am I just trying to get a general or a generic response from the data that already exists there,
[00:55:30] right? If that's the thing, that's one thing. Then, am I worried when I do that that I'm going to
[00:55:35] potentially extract and consume copyrighted or confidential information that could put me in
[00:55:41] legal jeopardy because I inadvertently consumed it? The first one is like, if I was asking about,
[00:55:47] like, an FAQ, like it could be a knowledge base I'm asking questions against, and the
[00:55:52] results will be much better than just trying to search a text phrase in a database, right?
[00:55:59] Using its engine, yeah. Okay, so then the second one you're talking about, you know, making sure that
[00:56:06] when you are using these prompts to understand where it's potentially pulling data from so
[00:56:11] you don't get in legal jeopardy. In legal jeopardy, like, for example, you ask for some
[00:56:15] images, it gives you a copyrighted image from somebody, you're not aware of it, you use it,
[00:56:19] and hey, guess what, you just potentially violated a copyright, but you're not even aware of it, right?
[00:56:25] That's similar to what we just saw recently with The New York Times, where we had a, you know,
[00:56:31] large language model that literally went behind their gate, scraped all of that data, and
[00:56:35] of course now, you know, there's a multi-million-dollar lawsuit. Yeah, yeah. I mean,
[00:56:42] to your point, they have a right to be upset because they've spent millions of dollars
[00:56:46] paying for and sourcing this information that now anybody can consume, right? Well, and not only that,
[00:56:51] but we're consuming it without knowing where it came from, even. Like, it's not like it says at the
[00:56:54] bottom, New York Times, copyright. Yeah, the third aspect of it is potentially saying, okay,
[00:57:02] that's fine for the generic or the publicly available information, but what do I need to do
[00:57:07] if I need to leverage LLMs for driving a business process or optimizing a business process
[00:57:14] within my organization? Does that include exposing my sensitive data? How do I prevent my sensitive data or
[00:57:20] confidential information from being integrated with the open source information or the publicly
[00:57:26] available information? We see this a lot in the IP space, where, like, maybe you're
[00:57:31] building software, you ask a, you know, chat prompt for help, you're struggling, your
[00:57:35] code's not working right, and then of course, suddenly it works great,
[00:57:40] and you're like, okay, but then what happened to what you put in there? Well, now it has
[00:57:45] that entire, you know, piece of code. Heaven forbid it gets into the wrong hands. Like, I don't
[00:57:49] even mean your software itself, but just being able to use that to come after your
[00:57:54] consumers in the future, because now I've got this piece of code, I understand that you're
[00:57:58] writing software. Yeah, you've thrown all your data assets in there to generate some kind
[00:58:02] of report or output, and guess what, all of that's in there, right? So this really gets into what does it
[00:58:08] mean to, say, have private LLMs that you can leverage for the compute function.
[00:58:15] You know, in a lot of cases, if you're a big organization, you don't necessarily need to
[00:58:18] go check with the rest of the world and the LLMs that might be out there; you just need to
[00:58:22] compute on what you have. Yeah, and so that leads to the fourth issue here, right? Which is
[00:58:29] bias and skewness in data. How good is your data to give you an accurate, statistically valid answer,
[00:58:37] right? If you throw all the data in there and have it train your model, right, the proprietary model for
[00:58:43] you, that doesn't mean that the data set is clean from a bias perspective. Yeah, too
[00:58:49] true. We saw this in Time and Newsweek, where they show a bar chart or a pie chart, and
[00:58:54] the percentages are so close, but they made it look really big, so when it's blown up in scale,
[00:58:59] it looks like it's way different. Yeah, and just because your prior practices have resulted in
[00:59:06] you skewing your interactions towards a specific audience segment, that doesn't mean that you
[00:59:11] want to do it in the future. But if you're training on your data, that's where you're going to
[00:59:14] end up. So there's a statistics aspect of it that also needs to be implemented while you're
[00:59:19] training or preparing data sets, to make sure that at least you're aware that there's a bias in
[00:59:25] the data, so you can account for it or be cognizant that your data is biased. To some degree it's kind
[00:59:31] of like saying you have to do maintenance, just like you would on a car. Like, you can't just let the
[00:59:34] data continue to evolve on its own without somebody doing an integrity check, period. Exactly,
[00:59:40] and what's a good data set for one organization or for one objective is not necessarily a good
[00:59:45] data set for another. Sure, sure. And in fact, one wouldn't necessarily know unless they understand what's
[00:59:51] actually the makeup of that, too. It's very easy to get excited about it. And so here's the biggest
[00:59:58] problem that underlies all of this stuff, right: the resources or skills needed to make this happen.
[01:00:04] You're taking what an organization has within it and saying, no, we've got a new set of
[01:00:10] objectives, technologies, and capabilities. Can they adapt to it? Or do you go look for a platform
[01:00:17] that normalizes all of this and puts this information in a consumable form in front of the people that
[01:00:23] are sitting there, saying: this is what the data says it is, here's the potential bias in there,
[01:00:28] here's the security problems, here's how you protect yourself. I think what you just said
[01:00:33] is kind of our reality, right? Like, how many people could make a great pie chart in an Excel
[01:00:37] spreadsheet yesterday? Now they have Copilot and they can make pie charts all day long. Exactly,
[01:00:42] making pie charts. Exactly, yeah. Wow, this was great. You really opened my eyes to a few things on
[01:00:50] the risks of AI and LLMs, and also the potential that it brings with it. So,
[01:00:56] for those of you listening, this has been an episode of MSP 1337. Thanks, and have a great week.