Editorial

When AI Learns to Hack: Inside CMU and Anthropic’s Cybersecurity Experiment

By Sharon Fisher
What happens when AI becomes the hacker? Inside Anthropic and CMU’s Incalmo Project, and why it signals the future of machine-scale cyber defense.

When Brian Singer merged his cybersecurity research with the Claude large language model (LLM) from Anthropic, he described it as a shocking moment.

“Suddenly, the LLM was able to hack test networks,” Singer said. “Before, LLMs were good at finding vulnerabilities and showing promise. Suddenly the LLM was able to do an end-to-end attack, install malware on hosts and infect multiple hosts throughout the network.”

ChatGPT Sparked a New Wave of Cyber Threats

AI is 95%-100% likely to make cyber intrusion more effective and efficient

- UK National Cyber Security Centre

This wasn’t the first time people had tried to use AI for hacking. Just months after ChatGPT came out, computer security vendors were reporting that it could be used to help generate text for phishing emails and even to generate software for the attacks themselves (an early form of what would later be called vibe coding, though the term didn’t exist yet).

“[Check Point Research’s] analysis of several major underground hacking communities shows that there are already first instances of cybercriminals using OpenAI to develop malicious tools,” the company wrote in a 2023 blog post. “As we suspected, some of the cases clearly showed that many cybercriminals using OpenAI have no development skills at all.” Although the tools presented were pretty basic, the company added, it was only a matter of time until bad actors used AI-based tools in more sophisticated ways.

AI is 95%-100% likely to make cyber intrusion more effective and efficient, leading to an increase in frequency and intensity of cyber threats, predicts the UK's National Cyber Security Centre. 

There is also an 80%-90% chance that AI-powered cyber tools will make hacking capabilities available to a wider range of both governments and non-state groups. 

Related Article: AI Cyber Threats Are Escalating. Most Companies Are Still Unprepared

CMU and Anthropic Built an AI That Could Hack Like a Human

What was different about Singer’s project is that, once set up, the software could run itself, essentially duplicating one of the largest cybersecurity attacks — the 2017 Equifax data breach — automatically.

Singer, a doctoral candidate in electrical and computer engineering focusing on cybersecurity, wanted to see what would happen when he put cybersecurity and LLMs together, a project he named Incalmo. “It’s a glassblowing technique from Venice,” Singer explained, adding that his grandfather was a glassblower. The technique joins two pieces of glass, typically of different colors, for artistic effect, which he saw as an analogy for joining AI and cybersecurity.

“I’d been doing a lot of work on network cybersecurity, network attacks and creating autonomous systems with autonomous defenses and attacks,” Singer added. “My inspiration is that it was hard, and I was trying to make it easier for humans.”

He started working with Anthropic because he had CMU alumni friends who worked there, but the research was conducted with other LLMs as well. “We studied ChatGPT, Google Gemini and open source models,” he said. “The big providers, ChatGPT, Google and Claude, were the best models.” He also worked with MITRE open source tools for some early prototypes, he added.

Anthropic Tests AI’s Limits in Hacking and Defense

According to Anthropic officials, they worked with CMU so that the company could be better prepared for what cybersecurity capabilities its AI system might have. “This evaluation infrastructure positions us to provide warning when the model’s autonomous capabilities improve, while also potentially helping to improve the utility of AI for cyber defense,” the company wrote in a blog post. “As models improve in their ability to use extended thinking, some of the abstractions and planning enabled by a cyber toolkit like Incalmo may become obsolete and models will be better at cybersecurity tasks out of the box.”

Singer warned other LLM companies as well. “Before the research, we told the LLM providers that we have this capability,” he said. “This wasn’t a vulnerability disclosure, but we thought it would be ethical for guardrails.”

The project doesn’t mean hackers can immediately start throwing this software at enterprises; it was a proof of concept, Singer said. “If you asked it to hack a network, it wouldn’t work well,” he said. “It’s not like an LLM could take down the internet. It’s a possible thing, a cool approach. Right now, there’s 40 networks it could work on. But the diversity of real world networks is much more complicated.”

How AI Could Transform Cybersecurity in the Next 3 Years

“In the two- to three-year longer term, security is going to drastically change and become way more autonomous." 

- Brian Singer

Founder, Incalmo

The value of the Incalmo project is that it helps IT departments learn to be prepared against such AI-powered cyberattacks, Singer noted. “It’s definitely possible that real attackers could use this technology, but it tends to be that the benefits far outweigh the risks. If you look at history, some of the best defense tools are these offensive-type tools. If you can preemptively attack your network before a real attacker does, you find your blind spots.”

What it does mean is that enterprises should be looking at defending against autonomous attacks, Singer said. “In the two- to three-year longer term, security is going to drastically change and become way more autonomous,” he said. Attacks will be quicker, with many more of them simultaneous. “Enterprises are going to need to have machine-scale defenses, and some front-line defense that’s automated. In most security operations centers, it takes 15 minutes before a human sees it. Once these tools get adopted more, the pendulum is going to swing a lot more to bots vs. bots.”

Currently, testing for such vulnerabilities requires hiring “red teams,” which can be expensive. With tools based on his research, that testing could be automated, putting it within reach of more companies.

Related Article: AI Risks Grow as Companies Prioritize Speed Over Safety

Incalmo’s Next Chapter: From Research to Startup 

In fact, Singer has founded a company — as a solo entrepreneur for now — that could do just that. Called Incalmo after the project, it has raised pre-seed funding from Pear VC. “I’m actually going out there in January as part of their accelerator program,” he said, referring to PearX, a 12-week session.


His intention is to find out how to commercialize the research, perhaps as early as the January to March timeframe. “My end goal was to provide defenses, so I want to follow through with that,” Singer explained. He’s working with some design partners, learning their use cases, thinking about mini products and seeing how the technology works. “I don’t even know what my product is yet.”

However, it would likely be based on open source. “All my intellectual property has been open source. I’m pretty adamant on open sourcing. From the research side of things, if you use public dollars, it should be for the public.” All of his research is “public and open to the world,” he said.

One thing is certain: He won’t be alone. “There’s a lot of startups in this space, big companies integrating AI into workflows,” said Singer. “The research is open source. Companies can check out the research and try out the code. It’s really early days — it’s not where a company can buy products off the shelf.”


About the Author
Sharon Fisher

Sharon Fisher has written for magazines, newspapers and websites throughout the computer and business industry for more than 40 years and is also the author of "Riding the Internet Highway" as well as chapters in several other books. She holds a bachelor’s degree in computer science from Rensselaer Polytechnic Institute and a master’s degree in public administration from Boise State University. She has been a digital nomad since 2020 and has lived in 18 countries so far.

Main image: Gorodenkoff on Adobe Stock