We spoke briefly with cyberthreat expert Elise Manna from Novacoast about the advent of AI-assisted attacks utilizing two new tools: WormGPT and BlackMamba. Is this a turning point for the security industry?
Eron Howard Hey, Elise. I appreciate you taking the time to chat today. Following your last Monthly Threat Briefing, I want to discuss the contents of your presentation. You know I’m interested in AI and its applications in security, and since your expertise is in threats and threat actors, it seems our areas of interest have intersected this month. It feels like a turning point where security experts may struggle to keep up with AI-enabled threats. Perhaps you could start by giving us an overview of what’s emerged.
Elise Manna The first tool, WormGPT, was created by an attacker, who is offering it as a subscription via Telegram. It facilitates the creation of sophisticated phishing emails, allowing a good amount of language control with an unlimited character limit. These features essentially allow someone to compose a high-quality malicious email in any language. The second is a piece of proof-of-concept malware named BlackMamba, created by researchers at HYAS. It’s essentially a keylogger that uses AI to generate its attack code at runtime, even during execution.
EH Let’s start with WormGPT. When we first experimented with ChatGPT, we realized that this could be used to create phishing emails. Groups like OpenAI have tried to block this type of activity on their platforms, but how have the attackers managed to utilize WormGPT in this way?
EM They have taken a GPT large language model and modified it, effectively bypassing the responsible ethical controls set up by OpenAI by creating their own version of the model. There’s also the related issue of what’s being called jailbreaking of ChatGPT or GPT inputs, also known as input manipulation or prompt engineering. This is equally dangerous and is being used widely, which has expanded the problem significantly. With those controls removed, they have complete freedom to manipulate it, and they’re just getting started. I believe there are already around 5,000 users.
EH That’s a good point. I’ve seen some groups effectively ‘jailbreak’ the system. They’re tricking the large language model into behaving in ways it has been trained not to.
EM Yes, AI is only just beginning to interpret and exhibit human-like emotions, and it’s already being used in social engineering campaigns with deepfakes, where the AI is leveraged for malicious purposes. WormGPT is essentially doing this in email, but it’s not going to stop there. I expect this technology to be developed further, or to have features added for writing malicious code.
EH So, we have AI being used for phishing and this will likely lead to increasingly complex phishing emails being sent in high volumes. Let’s discuss the second part. What happens when someone clicks on one of these phishing links?
EM The potential for AI in malicious code is both terrifying and exciting from a research standpoint. The researchers from HYAS were able to set a couple of realistic goals for the malicious code to evade detection, and they were successful in integrating this with the BlackMamba keylogger. We can extrapolate from the different functions they built into this and apply it to what we know about real-world malware families to predict which ones are most likely to use these techniques. This includes eliminating a lot of the command-and-control traffic that occurs at the beginning of an infection, and using benign domains for command-and-control traffic. It’s very difficult to defend against from the blue team side. And one of the more concerning aspects of this is AI debugging itself during malicious infections.
EH If I understand correctly, a payload gets on my machine, most likely in memory, and tries to extract data. If it fails, it goes out to OpenAI via HTTPS to a commonly used website and requests new code to bypass this issue?
EM That’s exactly right.
EH Does this spell the end for our industry?
EM I certainly hope not. I’d like to think that our AI will be able to combat malicious AI in the end. But this concept of AI being able to accurately fingerprint a system and dynamically make decisions on how to infect a device or account makes our jobs as defenders incredibly difficult. I think we need to apply our analytical minds to this issue and get ahead of the ways that this technology can be leveraged for malicious purposes.
EH What would AI on an endpoint even look for in this scenario where the malware is rewriting itself on the fly in memory?
EM There are certain technologies that monitor memory, even of legitimate processes, which I find intriguing. I’d love to see more Endpoint Detection and Response (EDR) systems inspect memory for anomalies and injections. Microsoft has tried to make this more difficult on Windows, but bypasses are still possible. I believe the acronym you’re thinking of is ASLR (Address Space Layout Randomization), which randomizes where a process’s code and data sit in memory to make exploitation harder. Another interesting method is behavioral biometrics, akin to the old mouse-jiggle tool used by many system administrators.
If a process is launched or a website is interacted with, but there’s no evidence of an actual user doing anything on the device, such as keystrokes or mouse movements, that could be suspicious. AI could use these patterns to learn what a typical user does. An endpoint could learn these behaviors from each individual user and then determine whether the behavior makes sense.
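To make the correlation Manna describes concrete, here is a minimal Python sketch. The event fields and the simple rule are illustrative assumptions, not features of any particular EDR product.

```python
# Minimal sketch: flag activity windows with no evidence of a human at the keyboard.
# The fields below (process launches, HTTP requests, keystrokes, mouse events) are
# hypothetical; a real EDR would populate them from its own telemetry.
from dataclasses import dataclass

@dataclass
class ActivityWindow:
    process_launches: int
    http_requests: int
    keystrokes: int
    mouse_events: int

def looks_unattended(win: ActivityWindow) -> bool:
    """True if the machine is 'doing things' with no sign of a user driving it."""
    machine_activity = win.process_launches + win.http_requests
    human_evidence = win.keystrokes + win.mouse_events
    return machine_activity > 0 and human_evidence == 0

# Example: a process launched and traffic went out, but no keyboard or mouse input.
suspicious = looks_unattended(
    ActivityWindow(process_launches=2, http_requests=5, keystrokes=0, mouse_events=0)
)
print(suspicious)  # True -> worth correlating with other signals before alerting
```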
EH Right, so AI that correlates human behavior on their machine with system activity and alerts when it’s unlikely that the activity is a result of normal human interaction.
EM Absolutely. Similar to the excitement when data science started being applied to security, AI could accelerate that shift. Beyond heuristics, which are based on known bad behavior, we need to move to identifying good behavior. Recognizing anomalous actions is crucial. If we can get our tools to adopt this approach, we’ll have a much better chance of combating AI threats.
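As a rough illustration of baselining “good” behavior rather than matching known-bad signatures, the sketch below uses scikit-learn’s IsolationForest. The per-session features and numbers are invented for the example.

```python
# Minimal sketch of "learn normal, flag the rest" with an isolation forest.
# The per-session features are invented for illustration; real features would come
# from endpoint telemetry (processes, network volume, input activity, hours, etc.).
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-session features: [processes launched, MB uploaded, hour of day]
normal_sessions = np.array([
    [12, 3, 9], [15, 5, 10], [10, 2, 14], [14, 4, 16], [11, 3, 11],
])

model = IsolationForest(contamination=0.05, random_state=0)
model.fit(normal_sessions)  # baseline learned from known-good sessions only

new_sessions = np.array([
    [13, 4, 10],   # resembles the user's usual pattern
    [90, 400, 3],  # heavy activity and upload volume at 3 a.m.
])
print(model.predict(new_sessions))  # 1 = fits the baseline, -1 = anomalous
```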
EH But with the BlackMamba research already available to attackers, what’s to prevent them from using this technique at scale, thereby bypassing our current defenses?
EM Basic best practices, such as application whitelisting and restricting unnecessary user permissions, can help prevent unknown attacks. I believe a stronger defensive posture is the only way, as we can’t entirely rely on detection capabilities.
EH Right. The idea of whitelisting all permitted processes on a machine, rather than just blacklisting unwanted ones, has been around for a while.
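For readers who haven’t worked with the default-deny model, a minimal sketch of hash-based allowlisting follows. In practice this is enforced by OS-level application control rather than a standalone script, and the hash value shown is a placeholder.

```python
# Minimal sketch of allowlisting: only executables whose hash is on an approved list
# may run; everything else is denied by default. Paths and hashes are placeholders.
import hashlib
from pathlib import Path

APPROVED_HASHES = {
    # SHA-256 digests of known-good binaries (placeholder value)
    "3f39d5c348e5b79d06e842c114e6cc571583bbf44e4b0ebfda1a01ec05745d43",
}

def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def is_allowed(executable: Path) -> bool:
    """Default-deny: anything not explicitly approved is blocked."""
    return sha256_of(executable) in APPROVED_HASHES
```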
EM So far, it’s the only thing I can think of. AI, if applied beneficially, can do a lot. Imagine apps that monitor your device usage and advise you to take a break, similar to those on our phones. An AI-powered endpoint agent integrated into an EDR could function in a similar manner, but for fingerprinting user behaviors.
EH So, you’re suggesting that User Behavior Analytics (UBA) could be integrated into EDRs to provide visibility into potentially anomalous behaviors?
EM Yes, and if possible, we should empower it at scale with AI. The logic and the tools already exist; we just need to scale them up. This could also help reduce alert fatigue from tools that are already deployed, by improving detections and reducing false positives.
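One possible way to turn per-event anomaly scores into fewer, higher-confidence alerts is to aggregate them per user before alerting, as in the hypothetical sketch below. The scores and threshold are illustrative, not values from any shipping product.

```python
# Minimal sketch of alert-fatigue reduction: instead of alerting on every anomalous
# event, accumulate per-user anomaly scores over a window and alert only when the
# aggregate crosses a threshold. Scores and threshold are illustrative assumptions.
from collections import defaultdict

ALERT_THRESHOLD = 3.0  # assumed tuning value

def aggregate_alerts(scored_events):
    """scored_events: iterable of (user, anomaly_score) pairs for one time window."""
    totals = defaultdict(float)
    for user, score in scored_events:
        totals[user] += score
    return [user for user, total in totals.items() if total >= ALERT_THRESHOLD]

events = [("alice", 0.4), ("alice", 0.3), ("bob", 1.2), ("bob", 1.1), ("bob", 1.5)]
print(aggregate_alerts(events))  # ['bob'] -- one alert instead of five
```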
EH Got it. Thanks for sharing this with us, Elise. Hope to hear some good news next time we talk.
The authors
Eron Howard is Chief Operating Officer at Novacoast, a cybersecurity services firm and managed services provider spread across the U.S. and U.K. Elise Manna is the Director of Advisory Services at Novacoast, specializing in threat response, analysis, hunting, penetration testing, and intelligence. She actively participates in the infosec community, including as a speaker and a volunteer.