
The Future Of Endpoint Monitoring: SIEM, EDR, XDR, and Data Collectors

The endpoint monitoring space has evolved through a number of stages to arrive where it is today. Novacoast’s Adam Gray shares his perspective on how the space is moving forward but in many ways is still spinning its wheels.

If EDR and EPP were as good as advertised, we wouldn’t have jobs. We would be done.

But the fact is, even the most advanced endpoint monitoring tools—those that fall under the categories of SIEM, EDR, XDR, and whatever the next big thing will be called—are at best only marginally effective. What follows is some background on the space and where I think we’re headed.

A Brief History 

To be able to properly appreciate the forecast for endpoint monitoring, it’s necessary to look back to where we started and follow the evolution of how the current crop of tools came about. From simple logging to the current trends like SOC-as-a-Service, the tools have come a long way to adapt to modern security concerns.

Data Logging

In the late 1990s and early 2000s a number of log management vendors arrived on the scene, and it was right around that time that ArcSight was founded. It’s even possible that some businesses are still running that technology in their environment.

At the time, what ArcSight did was revolutionary. It collected all of the data, the junk that Operations and Security needed to search through, and made it available in one place. It was a very simple concept.

The problem from the very first moment log management came about was that the data was too big. Many were chasing that dream, thinking that if they could only get enough data (more data!) they could find the problems. That is still happening today.

If we look at ArcSight in 2000 and Chronicle 18 years later, we see that the feature set hasn’t really changed. There isn’t any magic that happened between 2000 and 2018. We’re two decades in and log management for the most part looks the same:

  • It’s complex
  • Way too much data
  • You’re probably not getting the right data

And in the end, all you’re doing is laying it onto disks and hoping you never need it.

SIEM Correlation: 2006-Present Day

At some point around seven years after data logging became popular, data correlation arrived. Data correlation is the idea that an event of one type, combined with an event of another type, indicates a particular problem.

For example, 10 failed logins followed by a successful login from a foreign country that you don’t normally work in means you may want to check that out. That’s correlation, and it continues to evolve.
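To make that concrete, here is a minimal sketch of what such a correlation rule looks like in code, written in Go purely for illustration. The field names, the ten-failure threshold, and the fifteen-minute window are assumptions for the example, not any vendor’s actual detection logic.

```go
package main

import (
	"fmt"
	"time"
)

// LoginEvent is an illustrative normalized login record.
type LoginEvent struct {
	User    string
	Success bool
	Country string
	Time    time.Time
}

// Correlator flags a burst of failed logins followed by a successful
// login from a country the user doesn't normally work in.
type Correlator struct {
	failWindow    time.Duration
	failThreshold int
	failures      map[string][]time.Time     // recent failed-login times per user
	homeCountries map[string]map[string]bool // countries each user normally logs in from
}

func (c *Correlator) Observe(e LoginEvent) bool {
	if !e.Success {
		c.failures[e.User] = append(c.failures[e.User], e.Time)
		return false
	}
	// Count failures inside the window preceding this successful login.
	recent := 0
	for _, t := range c.failures[e.User] {
		if e.Time.Sub(t) <= c.failWindow {
			recent++
		}
	}
	unusualCountry := !c.homeCountries[e.User][e.Country]
	c.failures[e.User] = nil // reset after a successful login
	return recent >= c.failThreshold && unusualCountry
}

func main() {
	c := &Correlator{
		failWindow:    15 * time.Minute,
		failThreshold: 10,
		failures:      map[string][]time.Time{},
		homeCountries: map[string]map[string]bool{"alice": {"US": true}},
	}
	now := time.Now()
	for i := 0; i < 10; i++ {
		c.Observe(LoginEvent{"alice", false, "US", now.Add(time.Duration(i) * time.Second)})
	}
	if c.Observe(LoginEvent{"alice", true, "RU", now.Add(time.Minute)}) {
		fmt.Println("alert: failed-login burst followed by login from an unusual country")
	}
}
```

Real correlation engines run on far richer schemas and sliding windows, but the core pattern of joining two event types across a time window is exactly this simple, and exactly this brittle.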

But these days, everyone wants to talk about Artificial Intelligence (AI) and Machine Learning (ML) and how great they are; you won’t hear that here. They’re not great. They’re dumber than my dog. And they’re not getting any better, not for a long time, despite some incremental progress and hot coverage in recent news that make them seem poised to take over everyone’s job.

Eventually, many vendors moved into this data correlation space and continue to rehash the concept. Today we know them as Antivirus (AV), Endpoint Protection Platform (EPP), and Endpoint Detection and Response (EDR) vendors.

AV/EPP/EDR

Unfortunately, we are in a worse position in 2023 than we were in 1987. The efficacy rates of these tools are horrific. If I were going to spend my time in what is quite possibly the worst place in cybersecurity, I would choose Endpoint Protection and EDR.

In last year’s presentation at the Innovate Cybersecurity Summit, I talked a lot about efficacy rates. The very best is still a 50% detection rate, meaning half the time the tool is just wrong or threats are completely missed. Remember, this 50% is the best it can do. Realistically, it performs closer to about 30%, which means the EDR space is finding only 3 out of 10 threats, attacks, or infections.

If we were wrong about that statistic, we could all go home. Job done. If EDR and EPP were as good as advertised, there would be no CISO conferences and we wouldn’t have jobs. The tools would take care of it and we would be done.

In 1987 McAfee emerged, and from 1987 to 2013 not much changed. In some ways, we’ve probably gone to a worse place: as an industry, we removed what we know about a specific file or connection and inserted behaviors in its place.

It was common to dismiss our little computers as not strong enough to make real-time decisions using all the data we already know, so instead the focus became building behaviors to tell us whether something is good or bad. We’re not going to look at the knowledge we already have about a file.

So, even if the world already knows a file is malware, the EPP vendors don’t take that into account in real time when deciding whether to act. Instead, they watch the behavior of the file until they can say, “oops, I’ve got to kill that.” That’s not good. And there’s no easy answer to fix this either.

We’re all here because these issues have not been solved. We’re not there yet and there’s still a lot of work to do.

NDR

If I look at Log Management, SIEM Correlation, AV, EPP, and EDR, then look at where we are headed, the last piece to talk about from a historical perspective is NDR (Network Detection and Response): your network devices.

We’ve been doing this for a long time. When I dug this up, I tried to remember when Zeek (formerly Bro) started and couldn’t. I was shocked: 1995. It’s been around forever.

Unfortunately the networking side is worse off than even EPP, and it’s not the fault of the network vendors. Keeping up with bandwidth is not really the problem; we have good, fast network cards, capture cards, and all kinds of interesting things there. The real problem is cryptography.

We encrypted everything from an endpoint and left the same technology on the network to try and decipher it. Unfortunately, when you encrypt everything on the network you’re left with extremely limited options for what you can do.

Some will say: “but I can upload certs and decrypt them.” Yes, right up until TLS 1.3 and then that went away. Now you’re in the position where you can’t man-in-the-middle this stuff. You can’t decrypt it. You’re putting terabytes of data across the wire and expecting to do something with network tools.

Here’s what we have today:

  • Log Correlation that hasn’t changed in 20 years
  • EPP that has probably gotten worse
  • Network Defense Tools hobbled by encryption

And what has that turned into?

XDR

It’s turned into XDR. Everyone’s excited about XDR, which is understandable in an industry where the promise of a magic bullet always gains momentum, especially in marketing. Here’s what we’re going to do with XDR:

We’re going to combine all that network data that’s useless with all that log data we can’t keep up with, then combine that with all the EDR data that has a 30% efficacy rate…and this is going to solve all our problems. This is what we’re building today. Everyone’s excited. This is fantastic. I hope the sarcasm is obvious enough, because it is the story of XDR.

I would argue that it’s all still just tools and if you don’t have the necessary people behind it, it’s pretty useless. It’s just more data that we can’t use in an actionable way, so don’t get caught up in the hype.

With that brief history lesson, we’re caught up to the present day with some idea of the endpoint protection landscape and how it evolved.

SOCaaS

The final entry in endpoint data collection history is the recent concept of SOC-as-a-Service. This adds people and processes atop the technology, which is good because if you don’t have the people and the processes you will need to find someone who does.

Where We’re Headed

The thing we have to accept is that technology changes. When you sit down to evaluate the technology of collecting the data there are some important considerations.

Will it scale to the level you need for your organization? There are some ugly and challenging things to deal with all the way down to:

  • What programming language will be used to build your data connectors?
  • Is it something I can operate?
  • Is it something that will be fast enough?

We’ve learned some things over the last year building data collectors for other groups. Over the last decade we’ve been writing them in both Java and Python, which were the popular languages for data collection and the predominant languages for picking up, parsing, and spitting out normalized data regardless of where it would be stored. It almost didn’t matter where it would be stored; the backend matters a lot less than how you capture it.

Recently we rewrote some of our Python collectors in Go. Why? We found a 100x speed increase on the same data collector, and in some cases a 1000x speed-up in the ability to capture and write data. So, when I look at the large infrastructures we are capturing data from, one of the questions that continually comes to mind is: which language are we building this in, and why?
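For context, the collector pattern itself is simple in any of these languages. Here is a bare-bones sketch in Go of the pick-up, parse, and normalize loop; the input format and the schema are placeholder assumptions, not a description of our production collectors.

```go
package main

import (
	"bufio"
	"encoding/json"
	"os"
	"strings"
	"time"
)

// NormalizedEvent is an illustrative common schema; the storage backend
// the events eventually land in matters less than this capture step.
type NormalizedEvent struct {
	Timestamp time.Time `json:"timestamp"`
	Source    string    `json:"source"`
	Message   string    `json:"message"`
}

func main() {
	in := bufio.NewScanner(os.Stdin)
	out := json.NewEncoder(os.Stdout)
	for in.Scan() {
		line := in.Text()
		// Assume a simple "source: message" input format for this sketch.
		source, msg, found := strings.Cut(line, ": ")
		if !found {
			source, msg = "unknown", line
		}
		// Emit one normalized JSON event per input line.
		out.Encode(NormalizedEvent{
			Timestamp: time.Now().UTC(),
			Source:    source,
			Message:   msg,
		})
	}
}
```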

That might seem like a concern that’s really off in the weeds, but at scale it could mean the difference between running 5 machines or 500 to capture data. And that has serious cost ramifications for capturing large amounts of data.

What are we doing to make sure we’re capturing the data at the right speed? That’s a very different question from whether we’re capturing the right data. Are we capturing it the right way? Is it efficient?

Whether we have the necessary data to make good decisions is a whole different question as well.

As for storage backend, it’s not much of a conversation because practical options are limited—it’s either Elastic or something using Hadoop, or one of these vendors has invented their own. Who are we kidding? It’s Elastic or Hadoop and they just don’t want to tell you.
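To illustrate how thin the storage call really is: assuming an Elastic backend, landing one normalized event is a single HTTP request. The local URL and index name below are assumptions for the sketch, and a real collector would batch through the bulk API rather than posting one document at a time.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// One already-normalized event as a JSON document.
	event := []byte(`{"timestamp":"2023-04-21T00:00:00Z","source":"fw01","message":"example"}`)

	// Index it with a single POST to the _doc endpoint.
	resp, err := http.Post(
		"http://localhost:9200/security-events/_doc", // assumed local cluster and index
		"application/json",
		bytes.NewReader(event),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("indexed, status:", resp.Status)
}
```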

Market Consolidation

What else is affecting this space? Market consolidation. Beyond technology changes, this is the next most important factor. The market will continue to consolidate, and the reasons are clear.

Driving Factors

Funding is slowing, for one. And some groups with parent companies are being combined. For example, we saw Fishtech and Herjavec get jammed together—why? Money was getting more expensive, and the private equity firm that owned them both said “you guys will get along great! Now you’re a new company.”

That’s going to continue to happen. You’re going to see a lot of consolidation in this space, both from a services/people perspective as well as a product perspective. A few things will drive it, the biggest being that money is more expensive as interest rates are trending up.

Without access to cheap funding, groups that are not in a position to continue operating on their own—or have aspirations of growing bigger than they can with the funding they have—are going to get acquired.

It’s been repeated over and over and over, this wave of acquisition, consolidation, investment in new companies and startups followed by acquisition and consolidation. This is not a new topic.

I think what is interesting is that inflation and interest rates are driving this rather than other factors. We haven’t had interest rates move like this in a couple of decades. And with money no longer being free, the collapse of organizations into each other is happening at a great rate, and the big guys are capitalizing on it.

Google acquired Mandiant at the fair price of $5 billion, but $2 billion was cash, so actually it was $3 billion for 1,600 to 1,700 people. That seems a little high, a little stiff to me.

Then they acquired Siemplify and invested in Cybereason, and that’s just Google’s activity. CrowdStrike went out and bought Humio and a handful of other groups.

All of the players are consolidating their investments. If you’re an EPP vendor you’re consolidating your investment with a SIEM vendor. If you’re a SIEM vendor you’re consolidating with EPP or something else.

SIEM/EPP Consolidation

On the SIEM/EPP side we’ve seen this for a long time. IBM entered the space with Q1 Labs. ArcSight was bought and sold half a dozen times. Humio was acquired by CrowdStrike. There are only a couple of unique factors driving it.

One is the funding side of it, but the other is that everyone is racing toward the same goal. Your SIEM and EPP vendors are really the same thing. They are data aggregators, that’s it. They just aggregate data.

Their function is purely to aggregate the data about your environments so they have access to it. If we take that view and look at who else aggregates your data and why, there is one other group that is key in this space…

Vulnerability Management. They aggregate all your data, too.

Many people focus on collecting firewall logs, but they’re useless. Firewall data is all encrypted these days. The best you’ll get from them is source port, destination port, source IP, and destination IP. Maybe some volumetric data. There are better ways to get that data.

It’s Just Collecting Data

So, we’ve got:

  • VM Players
  • EPP Players
  • SIEM Players

They are all vying for the same thing: who can collect the largest amount of your data? That’s really what the future looks like for this.

The biggest, most obvious examples are Qualys and Rapid7, who’ve both now entered the market. They are long-time Vulnerability Management players, and if you were to ask them what they’re working on, they’ll tell you two things:

  1. They really want you to use their patch management capability.
  2. They really want you to use their SIEM capabilities.

Why? It’s an opportunity to market another data collection service.

What we’ve seen over the last twenty years is that hoarding the data is strategic. It’s very difficult to move from one vendor to another and migrate the data. Very, very difficult.

And the data has value in and of itself. As an organization you will have to start considering how to evaluate the groups you might take on for SIEM, EPP, NDR, and XDR or whatever we’re going to call it in the future. How am I going to take those on and do I want them to have that amount of data?

And, unfortunately, the answer is yes, you do, for a few different reasons.

Market Specialization

You’re going to run into market specialization groups. If you’re a big operational technology (OT) shop, you probably know Dragos.

If you’re a financial shop, you might talk to Adlumin. Their whole world is creating datasets for those specific organizations in the finance space.

Are they good? Are they not good? I don’t know, yet. Too early to tell.

The other side of it is the cloud players. They all want all of your data. They don’t want just your email, they want everything you produce living in their shop. All of it. Like, bar none. They would like it if you didn’t own a single computer that doesn’t directly connect to them. That is the eventual goal: you won’t even store anything, you won’t even need a hard drive in the future. It’s going to be, “here’s your PC, it connects to my cloud.” That’s it.

In that space there’s obviously Sentinel from Microsoft, then Chronicle from Google, and Open Distro from Amazon. I don’t think they accurately address the problem today, but I think they will soon. These guys are all driven by the same thing.

Driven by Visibility

The major cloud security vendors are all going to be driven by having visibility into your environment. The term observability is what’s becoming common in this space.

If I think about the cloud vendors wanting to capture all of your data, it’s clear that’s a great way to capture a customer. But ultimately observability is what you’re looking for, and scale.

We know that having this stuff on-prem has been difficult in the past. We know that collecting the right kinds of data has been tough. We know consolidation in the market is happening and there’ll be a handful of players to do it.

But one of the key components that’s interesting in the space, and sort of the future for us, is datasets. Datasets should not be an island for individual corporations. If I can combine a thousand corporations’ data into one dataset, my threat hunting becomes more accurate. So, if you spin up your Sentinel environment, as an example, you’re only really scanning your own set of data and your own infrastructure. It lacks the benefit of knowledge about other environments.

The long-term view is we’ll stop separating individual datasets, and instead we’ll put them in the same place—much the same way that VirusTotal has become the repository for what we know about files. They’ve done a pretty good job at it. They get a lot of submissions and a lot of data every day and while I wouldn’t say to use just that as a hunting source, it also provides a great view into what’s happening in the market and what efficacy rates on EPP really are.

If you sit on enough file hashes, you can do some really interesting queries. But looking at that, visibility and observability are really key.
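As a sketch of the kind of query that becomes possible, here is a small Go example that asks VirusTotal’s v3 files endpoint how many engines flag a given hash. The endpoint and response fields follow VirusTotal’s public v3 API as I understand it; verify against the current documentation before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// vtReport holds just the detection counts from a VirusTotal v3 file report.
type vtReport struct {
	Data struct {
		Attributes struct {
			LastAnalysisStats map[string]int `json:"last_analysis_stats"`
		} `json:"attributes"`
	} `json:"data"`
}

func main() {
	hash := os.Args[1] // MD5, SHA-1, or SHA-256 of the file

	req, err := http.NewRequest("GET", "https://www.virustotal.com/api/v3/files/"+hash, nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("x-apikey", os.Getenv("VT_API_KEY"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var report vtReport
	if err := json.NewDecoder(resp.Body).Decode(&report); err != nil {
		panic(err)
	}

	stats := report.Data.Attributes.LastAnalysisStats
	fmt.Printf("%s: %d engines flagged malicious, %d did not detect it\n",
		hash, stats["malicious"], stats["undetected"])
}
```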

Back to VM, SIEM, and EPP vendors: they all have one thing in common. They want to own your data. That’s it, bar none. They don’t care if they stop attacks, they don’t care if they’re effective, they don’t care if they have any effect on your security problems. They want to own your data.

That is really the future of where we are going in this industry. That’s probably not the message you want to hear. But I think for any of us to be effective we have to focus on a few things, and these are the hard problems.

Unfortunately, it’s not more AI and ML—that’s just not going to be the answer.

Service and People Are Key

At this point, we’ve arrived at the question: “how can I make this mountain of data that does nothing by itself actually provide value in my security strategy?”

The answer is people and process. Specifically, building processes to utilize security data and staffing with qualified people who can build, operate, maintain and respond to what is detected in the data. The trajectory of how this unfolds will be driven by hiring practices and how staff is built up.

The bigger vendors have already figured that out, case in point:

  • Google acquired Mandiant
  • Herjavec and Fishtech merged
  • Proofpoint acquired a small DLP player to run their services
  • Sophos acquired a small SOC player to run their services

These things are all happening because people are the key commodity in the space. Managers and CISOs are all struggling to find staff right now. No one is as fully-staffed as they’d like.

Looking at the problem of collecting, consolidating, and storing data, as well as making it easier to search across multiple organizations: that is your vendor’s long-term goal. Your goal should be finding the right people to help comb through the mass of data and make use of it. That’s the current challenge and where the focus has to be right now. The technology of capturing the data is not the hard part anymore.

The answer to these challenges is not going to be AI and ML—probably not until we’re all retired. It may happen at some point, but it will be a while.

Conclusion

In this final section, I’ll offer a few distilled takeaways that should stick with you in your planning:

EPP efficacy: 50% is the best case, but it’s probably closer to 30%, and sometimes as low as 9% depending on the vendor you’ve chosen.

Legacy SIEM is easy to beat for value. If there is one thing you take away from this article, it is this: if you’re running a legacy SIEM, it’s easy to beat in today’s market.

Many people focus on collecting firewall logs, but they’re useless. Firewall data is all encrypted these days. The best you’ll get from them is source port, destination port, source IP, and destination IP. Maybe some volumetric data. There are better ways to get that data. This advice will be difficult to follow for regulated industries, but start capturing NetFlow data and stop capturing your firewall data. Nobody cares about it and it’s not going to solve any problems.

When your EPP tooling detects a problem, that shouldn’t necessarily be a relief. At that point you should be asking: “what were the other seven items it missed, and what should my investigation now look like?”

I prefer the “nuke and pave” strategy: when something gets hit, wipe it and start over. I know it’s controversial, but it’s a much easier and cleaner policy. I’m not going to say you can’t take an image of it for forensics, but you can never trust that computer again. If it was hit by something bad and you know your efficacy rate is 30% on a good day, how do you trust that it will be okay in the future? My advice is: don’t. Wipe it and move on. Your ops team probably doesn’t want to hear that, but that’s the model I run on.

We know Machine Learning and AI aren’t going to be able to handle these issues for a long time, despite how hot the news headlines are.

The one thing I want you to walk away with is: we’re all here because these issues have not been solved. We’re not there yet and there’s still a lot of work to do, despite a lot of conflicting messaging.

The author

Adam Gray is Chief Technology Officer at Novacoast, a cybersecurity services firm and managed services provider spread across the U.S. and U.K., and Chief Science Officer at Pillr, a dynamic cybersecurity solutions platform providing SOC-as-a-Service. This article was adapted from his presentation at the 2022 Innovate Cybersecurity Summit in Scottsdale, AZ.
