What happens if your RMM goes down and doesn't come back? We sit down with Mike Stewart of Anchor Networks and Charles Love of ShowTech Solutions to get their take on the importance of Runbooks. By the way, they were at a peer group and brought this up. Let's just say those who sleep alone at night are always looking for friends to share their no-sleep drama.
[00:00:06] Welcome to MSP 1337. I'm your host, Chris Johnson, a show dedicated to cybersecurity challenges, solutions, a journey together, not alone.
[00:00:21] Welcome everybody to another episode of MSP 1337. This week is a little bit different as we rarely have more than one guest on the show. This week we have two, Mike Stewart from Anchor Networks and Charles Love from ShowTech Solutions. Guys, welcome to the show.
[00:00:36] Thanks, Chris.
[00:00:37] Thanks.
[00:00:39] So, this is a shift in some of the conversations that many have heard on this show and that is to start with a story.
[00:00:49] That's actually how the episode, that's how 1337 started was to tell a story and it was to tell the whole story, the truth, like not what was necessarily published in the Glamour magazine, right?
[00:00:59] What happens when you get hit by ransomware. And so what I wanted to do is have you guys, for those listening,
[00:01:06] the title of this episode is runbooks. We'll just call it that simple, it's just runbooks.
[00:01:11] But there's a reason why it's on the front of both of your minds and why, as Charles, you said, when the story unfolds, imagine yourself now hiding in a bathroom.
[00:01:23] So, guys, talk me through the story that took place at your MSP Ignite peer group recently.
[00:01:31] You know, what took place and why, you know, where did this challenge come from?
[00:01:37] You want to go, Charles?
[00:01:39] You know what? You started this nightmare. I think you should start.
[00:01:44] Absolutely. Yeah. So, I was going through some of the security trust marks, figuring out how we better our security posture as an organization.
[00:01:55] And one of the topics that I started talking about within my company was just around how do we recover from a significant situation where we may lose complete access to all data?
[00:02:10] We may lose an entire platform, whether it's an insider threat, someone pushing the wrong button or the vendor just doing something wrong and us having to restart from zero.
[00:02:19] And it scared a lot of people within my company. So I decided to take it to peer group and see if they were willing to help.
[00:02:26] And apparently I scared a lot of people in peer group as well.
[00:02:28] But talking through like RMM was a good example is, you know, you wake up one morning and RMM is just gone, not unavailable, but absolutely gone.
[00:02:38] You don't have access to the portal. You can't get anything into it at all.
[00:02:43] And how to recover from that? Do you know where to start, how to set it back up, whether it's with the same tool or a new tool, but going through and figuring out how you recover from zero?
[00:02:53] And then while we were talking about that in peer group, the next day we come in and the CrowdStrike situation happened, kind of in that same area of recovering from some major impact like that.
[00:03:08] And the impact doesn't have to be specific to your company to have it have catastrophic impact on your company.
[00:03:17] Yep.
[00:03:19] Well, there's so much to unpack from what you said, Mike. Right.
[00:03:24] So, yeah, so we were in peer group and Mike brought up the fact, hey, what happens if we lose our RMM?
[00:03:29] And we're like, oh, you mean like like it's down like that happens.
[00:03:33] I'm on Concord. It happens once a quarter. Right.
[00:03:36] And but it normally comes back up. He's like, no, no, no, no, no, no.
[00:03:40] What happens if it goes away?
[00:03:43] And I'm like, well, what do you what do you mean go away?
[00:03:46] He goes, what happens if for some reason, either some somebody takes over the account, nukes all the data.
[00:03:52] Right. It's kind of like the Microsoft conversation where they tell you, you should do a third party backup.
[00:03:58] And I'm thinking during this whole thing, and the whole meeting ground to a halt, like we just felt the oxygen suck out of the room.
[00:04:07] Right. Because none of us had ever really thought about this.
[00:04:11] Sure. To this extent.
[00:04:13] So so when Mike said, what happens if you lose RMM?
[00:04:17] You know, we're all like, oh, well, we have other remote access tools. He goes, so what happens if it doesn't come back?
[00:04:24] I was like, well, that then then I'm buying a new RMM maybe.
[00:04:30] Right. Yeah, there's there's one like I have a hall pass where if all of a sudden I were to to move on to another one, it's going to be this other one.
[00:04:38] Right. Right. I was like, oh, we could just go to that vendor and call it a day.
[00:04:42] He's like, absolutely.
[00:04:44] Because Charles, you have like two thousand endpoints.
[00:04:46] I said, yeah. How long is it going to take you to deploy those?
[00:04:49] Oh, damn. Yeah. Right. So so then.
[00:04:53] So this was the I forget what it is, but there's a like story and then a rebuttal story rebuttal.
[00:04:59] So Mike and I had a very healthy conversation where everybody in the peer group watched us in horror.
[00:05:05] And I said, well, I could just fine. Let's just pretend we have Azure AD.
[00:05:09] I can redeploy the agents.
[00:05:10] It's going to take me a couple of days, but I'll get it done.
[00:05:13] He's like, all right. What about the scripts?
[00:05:15] I'm like, damn it, Mike. So I go, what do you mean? What about the scripts?
[00:05:18] He goes, well, you have a bunch of automation.
[00:05:20] I was like, oh, yeah, I have like a ton of automation.
[00:05:23] Do you back that up?
[00:05:26] I do now. Right.
[00:05:28] A couple of weeks ago it was no.
[00:05:30] So so then then then Mike was like really going deep into what what automated jobs do you run?
[00:05:38] How often do you run them?
[00:05:39] All these things that, you know, we kind of never thought of.
[00:05:44] And he brings up a really valid point, because when you think of, let's say, an IT Glue or, what's the other one, that ConnectWise one?
[00:05:54] Is this docs or the other glue one, right?
[00:05:57] ITBoost. ITBoost.
[00:05:59] Yeah. So when you think of Glue and Boost, both of them have a concept of a runbook.
[00:06:05] And in our brains, that's that's kind of what I go off of.
[00:06:09] If I traded a customer with another MSP, I'll send them my run book.
[00:06:14] They'll send me their run book, you know, things like that.
[00:06:17] But.
[00:06:18] Our RMM providers, the PSA providers, like, these vendors don't offer runbook
[00:06:25] things, right? So I can give a customer, excuse me, I can give an MSP this entire runbook of all their passwords, all their documents, all the changes, but I can't pull any of that stuff.
[00:06:39] Right. So what were some of the other things, Mike, that you were so scripts?
[00:06:43] Yeah, the user accounts, the permission levels that you may create, because a lot of times you're creating custom permission groups, what access certain people have. Integrations was a big one, what integrations have to be set up again and reconfigured.
[00:06:57] Yes.
[00:06:58] And all of that.
[00:06:59] And the other big piece that we are seeing is a lot of these vendors that we work with don't back up their data.
[00:07:07] And I'm sure if we read through the hundreds of pages of contracts that we sign, it's very similar to Microsoft of we're not responsible for your data.
[00:07:17] It would not surprise me if most of the vendors were set up that way.
[00:07:20] So it's going through and there's no easy way to back up platforms.
[00:07:24] Well, right.
[00:07:25] Right.
[00:07:26] Especially because you're starting to talk about things that are proprietary, right?
[00:07:29] Like you're not restoring something RMM to another RMM if they don't speak the same language that allows you to say import configurations type stuff.
[00:07:40] So, you know, the way you guys are describing this is interesting to me because I think runbooks is a word that we don't use, but it isn't necessarily something that you're not doing.
[00:07:50] I think you guys went down a very deep rabbit hole on scenarios that would often be associated with tabletop exercises that we would then often generate things that are more conducive to what we would call playbooks, right?
[00:08:03] So like if, you know, ransomware occurs, what does the playbook tell us to do?
[00:08:08] What does our incident response plan tell us to do?
[00:08:10] All of those things that are tied to a workflow that's very much reactive because the event has occurred.
[00:08:16] What you're describing is how do we ensure that what we implement can stay implemented and if it goes away that we can switch to something else, not because you've been crippled,
[00:08:28] but because you now have to make a decision that takes you in a different direction, which is where that whole, and I think, you know, business impact analysis is kind of what you guys went through, right?
[00:08:38] Like you're, you're identifying, these are the assets.
[00:08:41] These are the things that our company or even business disruption might even be a better way of saying it.
[00:08:47] Like before we get down into the, an incident has occurred and we are being ransomed.
[00:08:51] You're just talking about like, what if it just went away?
[00:08:54] Like what if, heaven forbid that product just got wiped?
[00:08:58] I mean, we've seen the recent news.
[00:09:00] We've totally seen it.
[00:09:00] Yeah.
[00:09:01] We've seen it.
[00:09:01] So yeah.
[00:09:02] We witnessed it last week.
[00:09:04] We saw certain people suddenly show up at other vendors and you're like, wait, what's going on?
[00:09:07] And it's like, well, yeah.
[00:09:09] The company that we worked for is now defunct.
[00:09:11] You know, like, what do you mean they're defunct?
[00:09:12] They still are.
[00:09:13] I'm still getting emails from them.
[00:09:14] They're like, try replying to that email.
[00:09:15] I'm like, Oh, Oh, Oh, got it.
[00:09:17] Got it.
[00:09:17] Like new plan.
[00:09:19] Yeah.
[00:09:19] And, and while we were, while we were there at peer groups, somebody kind of pulled me
[00:09:23] aside and said, I'm going to tell you something funny.
[00:09:25] I said, sure.
[00:09:26] What?
[00:09:26] He goes, aren't you and Mike the runbooks for your companies?
[00:09:29] And I'm like, well, I guess so.
[00:09:33] Right.
[00:09:33] Cause we know, we know how all the things are kind of set up.
[00:09:36] You're the skeleton.
[00:09:39] But if like RMM is the beast, right?
[00:09:41] Right.
[00:09:42] Because that has the most integrations that has the most scripting.
[00:09:46] So, but, and this, by the way, not to plug peer group, but I'm going to plug peer group.
[00:09:51] This is, I've been in my bubble.
[00:09:54] Right.
[00:09:54] And we go to these meetings to hear what our peers are thinking about.
[00:10:01] Right.
[00:10:01] Like this goofy session where like nobody wanted to eat at lunch.
[00:10:06] Everyone was going to throw up.
[00:10:07] It was great.
[00:10:08] But, uh, this is Alka-Seltzer, please.
[00:10:11] Yeah.
[00:10:12] This is why we go to peer group.
[00:10:13] Right.
[00:10:13] And we keep joking.
[00:10:15] And every time Mike jumps on a call, we're like, so what's the nightmare for today, Mike?
[00:10:20] He's like, well, boy, do I have like, but he's thinking of that stuff that a lot of people just are kind of ignorant to.
[00:10:29] And if there's one vendor, I just, let me just say this.
[00:10:32] There's this one vendor who now offers QuickBooks online backups.
[00:10:35] Okay.
[00:10:36] Right.
[00:10:37] And I've never really heard of a SaaS backup vendor.
[00:10:41] Like, cause they always go, Oh, we backup Google and Microsoft and that's all that matters.
[00:10:45] I've heard one that does SAP, but yeah.
[00:10:47] Yeah.
[00:10:48] So now we're starting to see these vendors.
[00:10:50] So I wonder if we're ever going to come to a point where you're going to have a vendor being able to runbook a PSA, to runbook an RMM.
[00:11:01] Well, right.
[00:11:03] Hmm.
[00:11:03] So, so you, you bring up something that's kind of interesting, but yeah, but, but I think if we rewind a little bit, you know, some of these things you, to your, to your point of like, haven't thought about it at this depth is tied to the fact of like, but how?
[00:11:18] How? Right.
[00:11:18] So like, if you were to say hypothetically QuickBooks is gone, what are you restoring this data that this third party is backing up to?
[00:11:26] So you can get back to business because it now doesn't exist anymore.
[00:11:29] Yeah.
[00:11:30] We're talking about the, the ultimate extreme.
[00:11:32] Right.
[00:11:33] So, um, I was sharing this with Mike before you, before you joined us, Charles, like I, we had a company, they were hit with ransomware.
[00:11:41] They definitely had the, you know, the breach.
[00:11:43] They said, we're not going to pay.
[00:11:44] We're going to restore on our own.
[00:11:46] They didn't have any backups.
[00:11:48] So they rebuilt accounting from the reports that had been generated over the last, however many years, that was their like worst case scenario.
[00:11:55] Their application wasn't gone.
[00:11:57] So I think kind of to your point, run books come in two flavors, right?
[00:12:02] They come in a very specialized to the application.
[00:12:04] If it were to actually go away, what's the alternative, what's RMM version two that I now have to transfer and build out on versus the alternative, which I think is a far more likely scenario of, I can't use what I have.
[00:12:18] I've got to restore the data that's been backed up to the same thing all over again, but I can't put it back in the same place that it was right.
[00:12:25] The, the equivalent of a ransom.
[00:12:27] Yeah.
[00:12:27] So, so let me just say this, this, this also is Mike and his craziness.
[00:12:33] I said, and I love him.
[00:12:36] Right.
[00:12:36] And I just, I mean, I'm just saying, this is not a bad thing, but this is, uh, Mike pushes me and I push Mike, right.
[00:12:42] Sure.
[00:12:42] I like to think of that way.
[00:12:43] I think he pushes me more than I push him.
[00:12:46] Um, I said, oh, well, I'll, I'll just take screenshots or whatever, and I'll store it in SharePoint or something like that.
[00:12:54] And then he's like, and what if SharePoint goes away?
[00:12:57] It's like, damn it, Mike.
[00:12:58] Right.
[00:12:59] He's like, so there, there's a, there's a business procedure that Mike is actually doing where not only is he storing data in a secure platform.
[00:13:09] Mike's going old school.
[00:13:10] He's printing it.
[00:13:12] Right.
[00:13:12] So he has a physical copy.
[00:13:15] Right.
[00:13:16] Uh, like I, I want to say he has spreadsheets on his bed sheets, but that's right.
[00:13:21] That's right.
[00:13:23] But like, he has a physical copy of things and he has a digital copy of things in two different mediums.
[00:13:30] And I'm like, dang it.
[00:13:31] That's, that's what we need to do.
[00:13:33] So I, I'm going to say there's a caveat here though.
[00:13:36] So I think for each of you, each of you having different businesses in different locations, there are a lot of things at play.
[00:13:42] So there's really two, two flavors of run books, right?
[00:13:46] One is I hire Charles to come work for me and I have a run book for each application.
[00:13:50] I need you to be able to understand and be successful with how you're supposed to use that tool.
[00:13:56] That's one scenario, right?
[00:13:58] Like those are run books that are extremely important.
[00:13:59] So something doesn't get broken because they failed to follow, you know, first you put the key in the ignition, make sure your foot's on the brake, all of those things.
[00:14:07] Right.
[00:14:08] The run books that we're talking about right now.
[00:14:10] I don't think you just jump to conclusions on unless you've done due diligence with, say, something like a business impact analysis, where you're looking at what are the risks to my organization: business disruption, natural disasters, those types of things.
[00:14:24] Right.
[00:14:24] So, you know, you would have like network failure, hardware failure, power outages again, back to natural disasters, security breaches.
[00:14:35] And there's probably others that you would want to take into consideration as you build your run book, which now if I think about it, it's like, okay, well, what do my playbooks also look like?
[00:14:48] Because I have this because I know the scenario that is taking place and I know what I'm supposed to have everybody do.
[00:14:55] But the playbook now is the reflection of the event has occurred.
[00:14:58] Charles knows that he has to call Mike.
[00:15:01] Mike knows that he's going to call everybody else on his team.
[00:15:04] Those things that come into play that aren't even necessarily about the application coming back online or what the alternative is.
[00:15:10] But I think they both take us all the way back to saying, if you haven't done a business impact analysis, you're potentially spinning your wheels on low probability,
[00:15:21] high impact. But if it's low probability, like, how much time should you be spending on it?
[00:15:26] Yeah, we got backups of that.
[00:15:28] We've got them in two different locations.
[00:15:30] And in fact, I just sent some to Charles just so he has a copy in Florida because that's a little bit of regional diversity here.
[00:15:36] So I think what you uncovered is brilliant because no one's thinking about this.
[00:15:42] How many would be truly paralyzed if any of those things you're describing happened?
[00:15:47] Yeah, so go ahead, Mike.
[00:15:49] I was going to say, and I agree with that because I started going down that route of like how much time am I putting into something that most likely isn't going to happen?
[00:15:58] But what it really did accomplish is it had us looking at different areas that were impactful.
[00:16:06] So as we were creating these things, we were looking at, oh, we need to have an idea of what security groups are there.
[00:16:10] Oh, why do we have that security group?
[00:16:12] Oh, we need to have a copy of what jobs are running.
[00:16:15] Why are those jobs running?
[00:16:16] So it almost provided a check and balance for us to be able to go through those exercises and figure things out.
[00:16:22] And again, yeah, identifying like, you know, quick conversation of if this product goes out, you know, what is that impact?
[00:16:30] Am I going to be in a point?
[00:16:31] And if it's not going to be a big impact of, oh, I just buy a new product and roll it out.
[00:16:37] Less priority to have those types of run books or playbooks in place.
[00:16:40] But starting at such a high level, doing it at such a deep level, allowed us to really identify a lot of gaps in areas that needed to be addressed in that sense.
[00:16:55] And I kind of find it funny because as an industry, we've learned nothing, right?
[00:17:00] You know, when the Kaseya breach happened back in the day and we have friends who lost access to their tools.
[00:17:10] And we didn't really think about that.
[00:17:15] You know, everyone, you know, we have a couple of people in question.
[00:17:17] We're like, hey, I'll help your customers.
[00:17:20] This guy will help your customers, things like that.
[00:17:22] So this is kind of like business continuity for the MSP.
[00:17:26] It is.
[00:17:27] So a run book lives inside your playbook.
[00:17:29] Right.
[00:17:30] So like you think about run books, like recipes for a, you know, cookies and the playbook is like you're going to have a party that you need cookies at.
[00:17:42] So you can have a playbook that has lots of recipes in it.
[00:17:45] Right.
[00:17:45] You can have a playbook that says if we have business disruption where these four critical applications are taken down inside my playbook, I have four run books for those critical apps.
[00:17:56] And I think to the point that you guys started the conversation on is.
[00:18:00] You need to be able to have the conversation of what if it doesn't come back because business disruption, you can only survive so long on that gap of being able to go back to work.
[00:18:11] But it doesn't matter what the scenario is, right?
[00:18:14] The service.
[00:18:15] You can't deliver services to your clients.
[00:18:17] So that means there's a lot of problems coming into play really fast.
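The "runbook lives inside your playbook" idea above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, field names, application names, and steps are hypothetical and are not taken from any documentation product mentioned in the conversation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Runbook:
    """The recipe: how to rebuild or replace one application."""
    application: str
    steps: list[str]


@dataclass
class Playbook:
    """The party: one scenario, who gets called, and the recipes it needs."""
    scenario: str
    contacts: list[str]
    runbooks: dict[str, Runbook] = field(default_factory=dict)

    def add(self, runbook: Runbook) -> None:
        self.runbooks[runbook.application] = runbook

    def missing_runbooks(self, critical_apps: list[str]) -> list[str]:
        """Critical applications that have no runbook in this playbook yet."""
        return [app for app in critical_apps if app not in self.runbooks]


# Hypothetical scenario and applications, for illustration only.
playbook = Playbook(
    scenario="Business disruption: critical tools unavailable",
    contacts=["owner", "service manager"],
)
playbook.add(Runbook("RMM", [
    "Confirm with the vendor whether the platform is coming back",
    "Redeploy agents",
    "Rebuild scripts and jobs from the last export",
]))

critical_apps = ["RMM", "PSA", "Documentation", "Backup"]
print(playbook.missing_runbooks(critical_apps))
```

Listing the critical applications first and then asking which ones have no runbook is one way to turn the "what if it doesn't come back" question into a short to-do list.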
[00:18:21] Yeah.
[00:18:22] So what Mike has kind of pushed me forward on is really analyzing each vendor and each technology and where it kind of fits.
[00:18:33] Right.
[00:18:34] Is it nice to have?
[00:18:36] Like, look, if my third party patch system went away, I can figure it out.
[00:18:43] Right.
[00:18:43] But we had a situation.
[00:18:45] I'm not going to name the vendor, but one of the vendors, our database was inadvertently disabled.
[00:18:52] Sure.
[00:18:53] And that software touches 2000 endpoints.
[00:18:58] And we lost all management of that.
[00:19:01] This one was like on a scale of one to 10.
[00:19:03] It was like an eight.
[00:19:04] Wasn't quite a 10.
[00:19:05] Wasn't like a zero being nothing.
[00:19:06] Yeah.
[00:19:07] But for that hour, we're like, holy smokes, what do we do?
[00:19:10] Right.
[00:19:10] But they fixed it within the hour and life went on and nobody was impacted.
[00:19:14] Nobody really knew.
[00:19:16] But what happens if that tech or whoever inadvertently purged our database and it's gone?
[00:19:24] Yeah.
[00:19:25] Which can't happen.
[00:19:28] And again, we are describing a little bit some of the more extremes, not saying that that one can't happen.
[00:19:34] But like, let's use the CrowdStrike one as an example.
[00:19:36] I think it's a great example.
[00:19:37] Like, it doesn't have to be CrowdStrike, but the fact that someone has to be in front of the machine to bring it back times 2000 endpoints.
[00:19:48] That is not realistic for any MSP to do in a reasonable amount of time from a business disruption standpoint.
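A rough back-of-the-envelope sketch of why hands-on recovery across 2000 endpoints is hard to do quickly. Only the 2000 figure comes from the conversation; the minutes per endpoint, the number of technicians, and the hours per day are made-up assumptions, and travel time between sites is ignored.

```python
import math


def hands_on_recovery_days(endpoints: int,
                           minutes_per_endpoint: float,
                           technicians: int,
                           hours_per_day: float = 8.0) -> int:
    """Working days needed when every endpoint needs someone at the keyboard."""
    total_minutes = endpoints * minutes_per_endpoint
    minutes_available_per_day = technicians * hours_per_day * 60
    return math.ceil(total_minutes / minutes_available_per_day)


# 2000 endpoints is from the discussion; 20 minutes and 5 technicians are assumptions.
print(hands_on_recovery_days(endpoints=2000, minutes_per_endpoint=20, technicians=5))
```

Changing the assumed technician count shows how much extra "boots on the street" from another MSP would shorten the window.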
[00:19:56] It's not possible unless you have some sort of, you know, like every client you have, that there's potentially another MSP that you could call upon and say, hey, I need boots on the street.
[00:20:08] And this client already knows that if they come in with this badge on, they've been approved by us.
[00:20:15] Because you look at the Delta situation and I think it's crazy the way, you know, I'm not going to say that.
[00:20:22] But we'll just say that it's a different, you know, so Microsoft and CrowdStrike are going after you because they think that you didn't do what you should have.
[00:20:33] Like, I mean, I had this conversation with Matt Horning the other day back and forth in Teams and he's like, don't we teach our clients and our peers?
[00:20:43] Like, you don't just accept free help just because they can come in and help you.
[00:20:48] Like, does the person that's letting you into their office to reboot with the flash drive, how are you deciding that they legitimately are who they say they are?
[00:20:57] And in this case, maybe it's because, well, they're wearing a Microsoft polo, so they must be with Microsoft.
[00:21:02] Like, you know, I think we, I think there's some things to be taken away from this that goes hand in hand with what you're describing.
[00:21:09] Outages will happen, hurricanes and tornadoes and all of those things are happening on a more frequent basis than we've ever seen before.
[00:21:15] Or it's just a matter of time before something gets hit that is tied to critical infrastructure that we have to make pretty, you know, deep decisions really fast that involve like, yeah, that RMM tool has been wiped off the map.
[00:21:29] They have a three to six months timeframe before they'll be able to bring everything back online.
[00:21:34] You're like, okay, I have 30 days max.
[00:21:37] Yeah.
[00:21:38] And it's going through the process of identifying that.
[00:21:41] I mean, one of the big areas that I got hung up on around this was like the amount of work it was going to take.
[00:21:46] Like, oh, I got to do a screenshot.
[00:21:47] Anytime there's a change to a new user or information, someone has to go in and update it.
[00:21:54] Like, it just got to the point where it seemed super unrealistic.
[00:21:57] And chatting through peer group and with others, we found that a lot of times the answers are as simple as we are running reports.
[00:22:07] And those reports are stored in this location.
[00:22:09] Yeah.
[00:22:10] That location is being backed up this way.
[00:22:12] And then there's your redundancy.
[00:22:14] And it just gives people a point to look at.
[00:22:17] So, you know, the RMM jobs, like we have the jobs on a monthly basis being exported and saved to a folder.
[00:22:24] If we need to rebuild jobs, we at least, it may not have every piece of detail in it, but we at least have the names of the jobs of, oh, that one's been created by, you know, the community.
[00:22:34] We can go back and repair it.
[00:22:35] Or we can get an idea of how to recover just based off the title and not having, like, every specific piece of information.
[00:22:42] You can see the recipe in the report.
[00:22:44] Yep.
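The monthly job export described above can be put to work with a simple comparison between two months. This is a minimal sketch that assumes each export is a plain text file with one job name per line. That file format and the function names are hypothetical; a real RMM export will likely look different and would need its own parsing.

```python
from __future__ import annotations

from pathlib import Path


def load_job_names(export_file: Path) -> set[str]:
    """Read one job name per line, ignoring blank lines."""
    lines = export_file.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def diff_exports(previous: Path, current: Path) -> dict[str, list[str]]:
    """Report which job names disappeared or appeared between two exports."""
    before = load_job_names(previous)
    after = load_job_names(current)
    return {
        "removed": sorted(before - after),
        "added": sorted(after - before),
    }
```

Even with only job names in the export, a "removed" list gives a starting point for asking why a job went away, which is the kind of check and balance mentioned earlier in the conversation.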
[00:22:45] You know, you guys have been plugging the Trustmark pretty consistently of, like, things that you need to do.
[00:22:50] And I thought it's important to add to this, like, what you said about storing those reports and keeping track of the evidence that's being automated and placed in different folders.
[00:23:01] Some of the safeguards say something as simple as, like, turn on audit logging.
[00:23:05] It doesn't say review your audit logs every two weeks.
[00:23:08] It just says make sure logging is turned on.
[00:23:11] And for that very reason, right, like, people forget that it's not necessarily about prevention.
[00:23:17] It's minimizing business disruption because we can't stop everything from happening.
[00:23:23] Yep.
[00:23:24] Yeah.
[00:23:24] And what I could tell you is based upon our meeting, what I have started to do is pay a little bit more attention to things.
[00:23:33] It's where we're grading every vendor on criticality, right?
[00:23:40] Like, if they were to go away or if the system was down, how bad is it?
[00:23:43] Or if I had to rebuild it.
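Grading vendors on criticality could be sketched as a simple score. The two 1-to-10 scales, the multiplication, and the example grades are all assumptions made for illustration; the conversation does not describe an actual scoring method.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Vendor:
    name: str
    impact_if_gone: int   # 1 = nice to have, 10 = cannot deliver service
    rebuild_effort: int   # 1 = buy a new product and roll it out, 10 = months of work

    @property
    def criticality(self) -> int:
        # Assumed scoring: impact multiplied by rebuild effort.
        return self.impact_if_gone * self.rebuild_effort


def rank_by_criticality(vendors: list[Vendor]) -> list[Vendor]:
    """Highest score first: these tools need runbooks before the others."""
    return sorted(vendors, key=lambda v: v.criticality, reverse=True)


# Hypothetical grades, for illustration only.
vendors = [
    Vendor("Third-party patching", impact_if_gone=3, rebuild_effort=2),
    Vendor("RMM", impact_if_gone=9, rebuild_effort=9),
    Vendor("Documentation platform", impact_if_gone=7, rebuild_effort=6),
]

for vendor in rank_by_criticality(vendors):
    print(vendor.name, vendor.criticality)
```

A ranking like this is only a way to decide which tools get documented first; the grades themselves still come from the team's own judgment.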
[00:23:45] So what we've started to do, and I'm doing my two key thing.
[00:23:49] I'm putting a copy of the video on a thumb drive in the safe in the office.
[00:23:53] And then we're doing another one in a secure, like, online storage thing.
[00:23:59] But I'm just talking to myself in the Zoom video.
[00:24:04] I hit record to this computer.
[00:24:06] Yeah.
[00:24:06] And I walk through all the settings and things like that and the jobs and where to find them and all that kind of stuff.
[00:24:14] So I've started on all of our top tier products making all these goofy videos of me just trying to find stuff.
[00:24:22] So that in the event, tool XYZ, we decide to either move on, go to a different one, or we want to reset, things like that.
[00:24:29] Yeah.
[00:24:30] I at least have something to go back to, right?
[00:24:35] It's kind of like, it's the whole conversation about do you upgrade a server or do you stand up a new one, right?
[00:24:41] Well, if you upgrade it, it's easier, but it comes with all the garbage.
[00:24:44] If you stand up a new one, great.
[00:24:47] But you don't know where the installers are for the old stuff, right?
[00:24:51] But at least if we have, I always like to say, as long as I know where I'm getting to, I'll figure it out, right?
[00:24:58] So I can use my friends and be like, hey, Mike, I forget, how do we install 9.9 again?
[00:25:04] He's like, oh, here's the script.
[00:25:06] Cool.
[00:25:06] And move on with life, right?
[00:25:08] So we have been adopting a better documentation of our own stuff to identify these gaps because of this meeting we had.
[00:25:21] Yeah.
[00:25:22] Yeah.
[00:25:22] And I was going to say, this has also caused me to start pushing our vendors harder too, finding out more like, you know, what do their audit logs look like?
[00:25:33] Because, again, the insider threat, whether it's intentional or unintentional, of techs going in and doing something, most of the audit logs just show like they signed in and signed out.
[00:25:43] Well, that's great, but what are they doing in there?
[00:25:46] How do I know what's being changed, when settings are being changed?
[00:25:49] Can I be alerted when certain settings get changed or people are in areas that they shouldn't be?
[00:25:56] It's pushing the vendors harder on that, asking more detailed information around backups.
[00:26:01] How can I, you know, even this scenario is like, you know, tech goes in and deletes everything.
[00:26:06] How do I recover from that with your platform?
[00:26:08] And they say, start over.
[00:26:11] And that won't be a good enough answer.
[00:26:12] But at least you have an answer, right?
[00:26:15] Yep.
[00:26:15] Well, and the old school thinking of everything needs to be with one vendor so you have that one throat to choke has been proven inaccurate this year.
[00:26:27] Right.
[00:26:27] This year, last year.
[00:26:28] Right.
[00:26:28] So one vendor who owns a bunch of the tools we have was having a major issue.
[00:26:33] So now I can't get into my knowledge base.
[00:26:36] I can't get into RMM.
[00:26:37] I can't get into, you know, my other tools.
[00:26:40] So we've actually really started to look at, do we really want everything under the same umbrella?
[00:26:47] Right.
[00:26:48] For that, you know, for that exact reason.
[00:26:51] Because what if, you know, they think we didn't pay a bill, so they shut us off, even though.
[00:26:57] Which has happened.
[00:26:58] We've all had this conversation on calls.
[00:27:00] Yes.
[00:27:01] So real quick in the time we have left, I think it's important to note that runbooks, one, they don't have to be manual processes.
[00:27:08] These, in some cases, these can be automated.
[00:27:11] And there's, I think, several different types, some of which I know as you're going through the Trustmark, you're going to identify, like reviewing audit logs.
[00:27:18] No one says Mike Stewart has to review the audit logs.
[00:27:21] It just says that you need to be reviewing audit logs.
[00:27:23] You might be outsourcing that to a third party.
[00:27:26] So you've got some function happening there.
[00:27:29] It is a runbook.
[00:27:30] And if you don't define what that runbook is supposed to look like or understand how that runbook is being used, that's problematic.
[00:27:37] Daily backups is technically a runbook, right?
[00:27:40] For each client, you've got a runbook that says we backup these things at this level of frequency.
[00:27:45] And then the other one would be like constant monitoring.
[00:27:49] That is a form of runbook.
[00:27:50] You're monitoring for system performance.
[00:27:52] Like, why is that CPU pinned out at 100% for the last 24 hours?
[00:27:56] Well, because someone's, you know, moving the entire OneDrive to a flash drive, right?
[00:28:01] So, you know, but then specialized runbooks is what we largely talked about today.
[00:28:05] Like, when bad things happen to infrastructure.
[00:28:09] And I think the one big takeaway that I've gotten from this is you've clearly articulated the number one reason to have tabletop exercises that do not have to be complicated and can easily be accomplished in under an hour.
[00:28:21] Hey, what if it's gone?
[00:28:24] Okay, let's talk through that.
[00:28:25] Because if it's gone, we need to have a plan in place.
[00:28:28] And I know it's an extreme example, but like, we know there are scenarios when something can be down and it can be down for long enough that you're getting phone calls from your clients that make it even worse.
[00:28:39] So what's the plan, right?
[00:28:40] Like, what would you do going forward?
[00:28:43] What is it that you're doing with your staff to articulate with your clients?
[00:28:47] Because I don't think this is a, I'm the MSP and we have runbooks.
[00:28:51] This is largely something you have to incorporate as part of your playbooks, so that your clients are part of that conversation and, when they go, hey, I can't do something,
[00:29:01] their panic is at least reduced by knowing that they need to make a phone call and have that conversation that says, yep, we've executed the XYZ playbook.
[00:29:10] This is what we're running through right now.
[00:29:12] And, oh, that happens to be XYZ RMM.
[00:29:16] We're going through that runbook right now, and we'll let you know as soon as we have an update.
[00:29:21] Yep.
[00:29:22] Yeah.
[00:29:23] Because usually they just want to know, right?
[00:29:24] Like most clients in my experience, even when I was working for the school district, they'd call me even though they knew I was on the phone with support, to find out how long it was going to be before we were back online.
[00:29:32] I'm like, well, the longer I talk to you, the longer it's going to be before I have an update.
[00:29:39] Any last words, guys?
[00:29:40] Because I might go hide in the bathroom now too.
[00:29:43] Yeah.
[00:29:43] Yeah.
[00:29:44] You know, we kind of talked about it, the tabletop exercise.
[00:29:48] Um, I would recommend everybody sit with their team and, and just kind of workshop these like, Hey, because funny enough, I asked my guys and I said, Hey, when RMM is down, what tools, what tools do you use?
[00:30:04] And the new guys are spouting off.
[00:30:06] I'm like, no, what do you, what do you mean?
[00:30:08] TeamViewer.
[00:30:09] What are you talking about?
[00:30:10] Like I pay.
[00:30:11] Oh, Microsoft.
[00:30:13] Yeah.
[00:30:13] I pay for these other two tools.
[00:30:15] I kid you not.
[00:30:16] One of them, he's only been here about a month.
[00:30:17] He goes, Oh, we have that.
[00:30:20] Right.
[00:30:20] It's like, like, yes, yes, we have that.
[00:30:23] So I realized that even my own team doesn't know some of the fail safes that we have built in.
[00:30:29] So we're, we're trying to rectify that.
[00:30:31] I can see an interesting runbook along those lines.
[00:30:34] Like, these are all the if-then statements in your code that say there's no red alarm going off, but, you know, someone's telling you.
[00:30:43] Yep.
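The "if-then statements" idea above can be sketched in a few lines of Python. This is only an illustration, not anything either MSP described running: the `Situation` fields, the `next_step` function, and the tool names are all hypothetical, and the wording of each step is a placeholder for whatever a team would actually document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Situation:
    rmm_reachable: bool    # can technicians log in to the RMM at all?
    rmm_alarm: bool        # did the RMM raise an alert on its own?
    client_reported: bool  # did a client call or email about a problem?


def next_step(s: Situation, fallback_tools: List[str]) -> str:
    """Return the next runbook step for the given situation."""
    if not s.rmm_reachable:
        # The RMM itself is the outage: fall back to a documented tool.
        if fallback_tools:
            return f"RMM down: switch to {fallback_tools[0]} and update clients"
        return "RMM down: no fallback documented, escalate"
    if s.rmm_alarm:
        return "Work the alert in the RMM"
    if s.client_reported:
        # No red alarm going off, but someone is telling you.
        return "No alarm, but a client reported it: investigate manually"
    return "No action"


# Hypothetical tool names; list whatever the team actually pays for.
tools = ["remote-tool-a", "remote-tool-b"]
print(next_step(Situation(False, False, True), tools))
```

Writing the fallback tools into the runbook as an explicit list is the point of the sketch: a technician who joined a month ago can read which tools exist instead of having to guess.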
[00:30:45] What about you, Mike?
[00:30:47] Uh, again, there's a lot of jokes going around, like the, the sleepless nights that I cause for people.
[00:30:52] And that, that's not my intention.
[00:30:55] Um, but I, I got to share, I got to share, uh, 'cause it can't just all be on me.
[00:31:00] Like I got, I got to share the sleepless nights with everyone.
[00:31:03] Uh, but, but it's, it's again, not always about like, this is going to happen, but it gets you, it gets the creative juices going.
[00:31:11] Absolutely.
[00:31:11] If it does happen and then it creates it, it makes it less, less scary when something does happen, whether it's that exact situation or, or a variant of it.
[00:31:20] You also made it finite, right?
[00:31:22] So like when we do tabletop exercises, we can keep adding variables in that constantly change, make worse or better.
[00:31:29] However you want to do it in a tabletop.
[00:31:31] When you're talking specifically, as you've got a playbook for whatever it is, you're tracking it in your, you know, uh, as an organization.
[00:31:38] But when you start talking about a runbook, you're getting very specific to a very specific thing, right?
[00:31:44] You could say the network, or you could say an application or you fill in the blank, but at least it's finite as far as what's being impacted in the conversation.
[00:31:51] And you're not ending up on this, like, you know, squirrel of like, well, what happens over here?
[00:31:57] And, you know, fill in the blank that now has us talking about something completely different.
[00:32:01] You can easily bring it back and go, Hey, we're talking about our RMM tool.
[00:32:04] Not, not Azure AD, not all of these other things.
[00:32:08] Yep.
[00:32:09] This is great guys.
[00:32:12] Lots to chew on.
[00:32:13] Maybe this will be a workshop we should strategize for CCF or ChannelCon.
[00:32:19] Maybe sooner than that for everybody listening.
[00:32:21] This has been an episode of MSP 1337.
[00:32:24] Thanks and have a great week.

