Overview
Before diving into the concept of what the Modern IR Runbook is, it’s important to understand where it came from. One of the biggest issues in IR is simply that it’s chaos.
Often, there’s a lot of money invested in playbooks, having the right people, and ensuring your team is as prepared as possible. But when the crunch hits, these preparations often fail to function as intended.
There’s a new approach though to set your business and your team up to better handle these incidents whenever they happen, because they will happen. It’s inevitable.
Consider the following scenarios:
Scenario 1:
This will hit home for many of us that have experienced an incident. It’s 3:00am and the phone rings. There’s been a security breach. What’s happening now is that we’re determining whether everyone on the team received the notification. Is this clearly defined for all teams involved? Does everyone know what is expected of them? Is your team truly prepared?
As we set up this communication to inform people, we realize that time is ticking away. And those first couple of hours in an incident, especially if it’s a big breach, are absolutely critical.
Scenario 2:
This is the scenario you’ll be able to experience. You still get that call at 3:00am because threat actors don’t care that it’s 3:00am. This time you have the Modern IR Runbook approach established in your environment. Actually, there are multiple runbooks, not just one that should be integrated into your SOAR platform.
As a result, there is less chaos and more cohesiveness. The team knows what they are doing; they’re already informed about what has happened. They have established communication. Everyone is following the established guidelines, saving time in the process.
“Often, there’s a lot of money invested in runbooks, having the right people, and ensuring your team is as prepared as possible. But when the crunch hits, these preparations often fail to function as intended.”
The Current Playbook Structure
Let’s look at the current playbook structure. We can’t talk about this modern approach without first talking about what we are dealing with today.
These challenges, they really are challenges. You’re probably already using some of these less-than-ideal playbooks in your own organization. How can you transition to a more effective approach? How do you create these runbooks at a high level?
It all begins with an automated, alert-centric response. It will boost your efficiency through the use of technology. Limiting the human element initially ensures that critical first steps flow seamlessly rather than relying on an individual to begin initial triage.
Let’s look at two ransomware scenarios—how this would look if it were applied to a real ransomware case. You’ll then understand that no matter how it starts, you can reach the same conclusion.
Most playbooks consist of many words that do not provide useful information. Anyone who’s worked a case will say that they’re usually full of fluff that doesn’t tell the team what they’re doing, where to look, or how to identify the threat in their environment based on the initial behaviors identified by your organization’s tool(s).
Most are in a Word document format that is too bulky and can contain dozens of pages. If it’s 3:00am and you have multiple teams working on an incident, they must sift through those bulky documents to find what’s relevant while the clock is ticking.
Another format is the decision tree, popular with some organizations and likewise, bulky. There are also the two narrower formats, such as swim lane diagrams and checklists. In any case, if they’re either too bulky or too narrow, missteps happen. If the playbook is too bulky, the team skims through looking for what’s relevant and will miss something important.
Another problem with these playbooks is that they are often organized by incident archetype, such as ransomware, DDoS, data exfiltration, and other common keywords for the different types of incidents. The issue arises when an incident begins with phishing, transitions to ransomware, and ultimately involves data exfiltration, which requires navigating through three different labels to determine the appropriate response.
Long term, this doesn’t work. It may help one case, but the IR team will likely struggle the entire time, going from one playbook to the next to figure out what they must do based on the evidence available.
Finally, playbooks are highly standardized. They start with identify and then go on to analysis, containment, and eradication. As incident responders, we know that most cases rarely run linear. Often times, we find that once we make it to containment, a new finding is discovered and previous steps must be completed again. It doesn’t necessarily work the way it should. Your organization and team must be adaptable during every type of incident and all hurdles that will come their way.
Maturing the Practice
How do we mature in the playbook practice? A good place to begin is by identifying a better approach. Every category of an incident response step, from learning to IT recovery, clearly can benefit from automation.

This doesn’t imply that the humans on your team aren’t essential. They definitely are. For example, today, Tier 1 and Tier 2 are absolutely necessary when it comes to identifying IOCs and TTPs.
With the traditional approach, Tier 1 and Tier 2, are going to look at alerts first and then start IOC enrichment and pivoting on TTPs to try to determine what’s going on. At that point, if it still seems suspicious, they’re going to send it over to Tier 3, which, for some organization, may be the incident response team. They may handle evidence capture, and analysis, and containment.
Depending on your organization, there may be more than Tier 1 through Tier 3, but it’s easy to see where automation can be implemented, especially in Tier 1, which is around the alerting portion. A human should not be the first to handle alerting, given the sophistication of the available tools. Moreover, your organization’s tooling can primarily handle IOCs and TTPs. Essentially, your tooling should do the heavy lifting at first. Tier 1 and likely Tier 2 are not as necessary at that point.
Ransomware is a popular trigger for IR. No matter what ransomware incident you’re working on, the steps are generally the same. If your organization has a ransomware playbook, which may be necessary for a number of reasons, it may be time to reevaluate how useful it is for your team. If a ransomware attack hits your network, it’s too late to open your ransomware playbook. You know you have been hit by ransomware. The real question is how did it get there and what all happened before it executed.
While every organization has different teams running its cases and may have a slightly different setup, there are some that are essential. However, it’s important to note that not every team is involved in every step of the incident process, which means they don’t need to read a 70-page Word document that covers all those processes. All the network team needs to know is where to pull evidence relevant to their role based on the known behaviors identified. . They know their roles; they just need to find the data for the IR team to analyze.
Automated Alert-Centric Response
In the initial triage of an incident, threat analysts use the tools that are available to them. If they are going through IOCs to figure out all the pieces, it’s going to take time. The same is true of TTP pivoting. And of course, MITRE tagging is also possible and should be taken into consideration during this process.
Taking an automated approach to incident response makes all of this critical. MITRE tagging can help your team identify the different behaviors observed in the network and help identify which type of threat your team is dealing with. At that point, MITRE tagging is critical for responding to an incident properly. Many SOAR platforms come with basic MITRE tagging built in but be aware of the specific threats that target the industry of your specific organization. It’s also critical to have those techniques already established in the platform so your SOAR can react to them appropriately.
This doesn’t start with the incident archetypes, but with the actual alert types. It will broaden your starting point to be much more flexible from the outset. By starting with an alert, you don’t box yourself in. The automated alert-centric model can lead the team anywhere, no matter what they’re dealing with.
The benefit is that after setting up your Runbooks this system doesn’t require human interaction for it start your investigations, resulting in lower manual labor costs.
What Does Automated Enrichment Look Like?
In looking at IOCs, what does automated enrichment look like? It’s a very simple idea and most analysts already have a clear understanding of it.

In the above example, IP includes information such as geolocation, which can throw flags when a US-based business gets a login originating from a Russian locale. With a URL , your team is examining information such as the reputation and certificate. These tasks are quick, but it’s a waste for a human to do them when you have tooling that can handle it in significantly less time.
What does pivoting look like? Specifically, it’s a scoring approach. The SOAR platform has already scored the given data by the time an analyst takes a look. At a certain point, it’s critical to pivot from using the tooling and turn over some tasks to the team. You shouldn’t 100% trust the tool for everything. Your tool can’t do beginning-to-end incident response, but it will handle those necessary repetitive tasks to free up your team for more in-depth analysis.
Ransomware Scenarios
Consider the alerts from two different tools: one is the EDR or SIEM tool alert for a malware IP—a very basic alert. Then there’s the email security tool alerting you to a phish. At this point, the team is addressing both a malware IP and a phishing alert. From there, a runbook can run to do IOC enrichment. Example: VirusTotal shows the IP has a relation to a file-sharing site. Your tool also indicates a file was downloaded. Maybe a few other system-standard tools are being abused, likely for reconnaissance.
On to the TTPs: since the IP is known and your tools have identified other potential IOCs, we can see some C2 activity taking place. There’s also some account manipulation in the mix, scheduled tasks for email, and a credential harvester which is tied back to the filesharing site. So, that’s how the threat actors got unauthorized access: someone decided to input their credentials to a prompt. From here, the threat actor moved on to reconnaissance, and established persistence via an email rule.
With the IOCs and TTPs already set up in the SOAR platform, Tier 3 is likely to get notified at this point. Evidence capture can begin, and based off the behaviors, the logs analyzed.
For this specific situation you need evidence such as:
- Email Flow Logs that will provide more details about emails than what’s in the body
- Network logs indicate the presence of malicious traffic.
- Authentication logs
- File capture artifacts refer to scheduled tasks and email rules that follow a high-level naming convention.
What Does Evidence Capture Look Like in the Runbook Form?
First, not all of these things should necessarily be in the same runbook. They should be split up depending on the behavior and relevant team. Unauthorized access should go to the access team, who will pull that evidence. That’s all they need to know based on the case.
Then, on to execution, and from where that evidence should be pulled. That’s all that needs to be known at that point to do the evidence capture. This is just a small example; however, it is clearly how it has excluded the extra fluff and gets your team straight to where the evidence is located.

Lastly, web server logs can often be the make or break in many cases. Sometimes they’re setup to be pruned after a few days, so if not reviewed or pulled quickly, the data is essentially gone. In other instances, they might not be logged correctly. It’s critical to ensure these things are working before a breach or incident occurs.
It is also easy to see how these steps can be put into a runbook and your tool can perform many of the evidence capture steps for more time saving. However, even if your organization uses automation for all of the repetitive tasks, analysis takes humans. So, this is where your people enter into the incident response process.
Not every organization keeps a malware analysts as part of their team, but an automated sandbox can suffice. However, most often, a human is capable of going much deeper to do static and dynamic analysis. Your people are also looking at network traffic and authentication events. One question begets more, and now they have all that evidence captured either by automation or Tier 1 or Tier 2. The runbook tells them where to find specific evidence based on what they are dealing with. . So, what does analysis look like in runbook form? Like evidence capture, this is also very simple. Don’t overcomplicate it. All the analyst needs to know is that when they access the authentication logs, the relevant Windows event IDs are the ones the team needs to focus on.

Then in each category, more Windows Event IDs, may be outlined if your environment is a Windows shop. Then in artifact capture, they’re looking at email rules as mentioned before. Reviewing email flow logs for scheduled tasks is also critical. More than anything, it’s significant to see how much easier this process is than 50–60–70-page documents.

Now that they’ve been able to come to the same conclusion, and much quicker, it’s led them to the same thing. They are implementing host containment, account containment, and network containment measures. A human might have taken an hour to manually analyze the IOCs and TTPs, but automation can complete the task in just a minute. That’s a time savings of 59 minutes already that now went into evidence capture, which, if you had that automated, your SOAR platform would have set this to pull all those logs as well. So, instead of Tier 1 and Tier 2 taking roughly an hour to pull all the logs after remembering where they are at 3 AM. Your SOAR already knew and did it within 2 minutes instead of a couple of hours.
Then the analysts can get in there rapidly to start analyzing, which is very critical. In terms of compliance reporting, it’s important to address the incident promptly to determine if reporting is necessary.
Now is the time to review everything built up in the organization’s incident response program, take it apart, and put it back together. It doesn’t need to be immediately but look at the limitations the current playbooks are putting on the team, especially if it’s an internal team. Examine the current resources and consider how they can be improved. Begin thinking about how to transition from this rigid approach to a more adaptable one.
Remember that this all really starts with that automated alert-centric response. It must begin there. Let the technology in place handle that. The business has invested in those tools, and they are already capable of handling many of these tasks.
The best way to get started here is by:
- Leverage existing tools
- Create a road map to SOAR
- Build off of key takeaways
- Continuous improvement
Simply implementing SOAR doesn’t mean it’s done. It’s just the first step. The platform will need to be monitored, modified, and changed as the threat landscape evolves and as new tools are added.
About the Author
Abby Dykes is the Threat Operations and Forensics Team Lead on the Novacoast Threat Operations team. Specializing in Threat Hunting, Managed Detection & Response, and Incident Response/Forensics. Abby is passionate about helping organizations across various industries further strengthen their security posture. She provides recommendations on how to mitigate vulnerabilities and hygiene issues, as well as eradicate active threats present in their environment.