[Webinar] Mythos and Project Glasswing: A Practical Look at the Future

Anthropic’s announcement of Project Glasswing and its new AI model Mythos has set off alarm bells across the security community. Touted as a powerful defensive tool for identifying vulnerabilities, Mythos is raising an important question: Are we empowering defenders or giving attackers a roadmap?

In this webinar, three seasoned cybersecurity and intelligence experts discussed what Mythos actually does, why it’s causing concern, and where the real risks (and non-risks) lie for security teams.

What you'll learn:

What Project Glasswing and Mythos actually are
Why vulnerability scanning at scale is sparking concern
How attackers might attempt to leverage tools like Mythos
Which concerns are valid vs hype, based on real-world security practices
How modern security platforms should already be designed for this reality, without needing major re-architecture or panic-driven changes

While we have a substantial update to our endpoint agent coming in Q2 that will absolutely help detect and contain AI attacks, there are a few platform-wide changes that are more broadly tackling new risks introduced by AI.

The first is a new AI Detection and Response capability. The first phase is answering the questions “what AI is running on my network, and who is using which AI suite/service/etc?”, all the way to being able to lock down AI to very low granularity.

The second is a bit more sensitive, but is targeted at the n-day problem, where it becomes a race between attackers launching AI-driven exploitation attacks, and the time it takes to patch networks. We have something very novel coming in this space, but we can’t quite talk about the details yet as it’s still in testing.

Field Effect MDR is the only tool I would recommend. I would argue that we were one of the first vendors leveraging AI throughout our entire platform and MDR service. We were actually recently granted a patent on our novel approach, which is a big reason we now have an 18 second MTTD with our endpoint agent. Other vendors use AI obviously, but we are now in a refinement phase to perfect our strategy, rather than just now (or recently) bringing it to market (like most vendors).

The bigger issue worth focusing on is that AI is enabling the creation of an enormous amount of software that hasn't gone through proper security evaluation. Vulnerabilities are being introduced to the wider software ecosystem faster than they're being fixed. From the information we currently have, it also looks like it will be more expensive to detect and fix vulnerabilities with AI than it is to have AI create software.

There will likely be a lot of churn in systems where AI is being used to find zero-days and then used to fix them. But there will still be zero-days that AI isn't good at finding, and the price for those types of bugs will likely increase.

In terms of defensive capacity, we've already seen a deluge of AI-generated bug reporting across the open-source community. The curl maintainer Daniel Stenberg is a useful litmus test here; curl has been through peak slop reporting all the way through to more recent improvements where legitimate bugs and vulnerabilities are being surfaced.

As models continuously improve, so will the bugs they find. Nothing breaks all at once, but the pace keeps increasing.

AI companies are addressing this in two ways:

Adding guardrails that prohibit prompts attempting to develop exploits or create worm-able code, while allowing vulnerability research; and
Limiting access to models like Mythos and GPT-5.4-Cyber

We’ve seen both bypassed recently, (1) saying you’re doing a CTF will bypass these protections and (2) the recent news of unauthorized Mythos access. The left hand isn’t doing a great job.

We do have our own cyber range that we use for all aspects of testing, including AI testing. We do use open-source agentic harnesses selectively, but we have matured our level of automation over the past 15 years to the point where both open-source and closed-source agentic AI is still catching up.

AI is a force multiplier, learn how to use it to your advantage and find ways to let it clone yourself. At the same time, the fundamentals still matter. A solid grounding in cybersecurity basics remains essential, because you need the right foundational knowledge to effectively leverage the AI tools available to you.

It increases the need for cybersecurity professionals, and makes it a good time to be investing in the space.

Think back to the wild-west Windows XP days, followed by the rapid build-up of exploit mitigation features, then the tides shifting back as exploitation scored some anti-mitigation wins, and so on. That cycle continues.

The Shadow Brokers leaks are worth reflecting on here too, and that situation was much worse, as those were highly productized zero-days that attackers could integrate immediately. It takes a long time to go from a bug you can hit 5% of the time in a test environment to a working chain with 98% coverage and minimal failure rates.

That said, equilibrium will likely be reached, and defenders have the initial edge on cost. It will be much cheaper to find and fix bugs than to find the rare few that can be productized into reliable zero-days.

AI doesn’t solve the problem of creativity and uniqueness, it solves the problem of scale. This is a new world filled with enormous opportunity to iterate incredibly fast, at scale.

I can’t speak to Anthropic’s position. If delegated agent identity becomes a priority, the likely approach would be tool integrations with user keychains, whether through MCP, a local extension, or the necessary apps.

It's also worth noting that this is an area where Field Effect's EDR capability and AIDR will help afford protections to how agents interface with the real world.

Cybersecurity has always been asymmetric, and AI doesn’t solve this asymmetry.

On the defensive side, AI will help:

Bug discovery in existing software at scale
Improve the code quality of new software
Automation across various domains (triage, SOC, etc.)

On the offensive side, AI will help:

Vulnerability discovery at scale and exploit development
Automation of reconnaissance and personalized social engineering

The bottom line is that existing software is vulnerable to models today, which currently favours the attacker. But as those bugs get patched and developers adopt the same models, newer code should become more resilient against these same attacks. Over time, the balance will shift toward defense.

According to Anthropic, while Mythos is great at vulnerability discovery, it still requires humans to create the mitigations. It tends to fixate on a single root cause and can confuse correlation with causation.

The same core protections still hold. AI doesn't produce novel attacks, so existing frameworks like FERPA remain relevant.

That's a conversation to have directly with Anthropic, OpenAI, and similar providers. In terms of open-weight models, I’d recommend looking at hosting GLM 5.1 or something similar that does well on CyberGym.

The discussion so far has focused mostly on software vulnerabilities, but AI tools are already having an impact across many other cybersecurity domains. On the offensive side, AI can help an attacker survey a network, propagate through it, and elevate privileges once they've gained a foothold. I’d expect similar capability improvements expected in network penetration testing, web app security, and beyond.

On the defensive side, frontier models are already being used to manage network configuration and security in meaningful ways. As has been said throughout: this is a tool that should be part of your workflow. If you're not using it to find weaknesses in your own defenses, you're leaving that work to attackers who already are or soon will be.

It looks like Anthropic has done a really good job of marketing this since we are all talking about it. But from reading through the information they have published so far, it seems like most of their claims, although maybe too alarmist, are legitimate.

Given the trend we’ve seen over the past eight months in the advancements of agentic models, their claims aren’t asinine. And it looks as though OpenAI is catching up to Mythos with GPT-5.5 if the benchmarks are right, which adds validity to their claims.

Things like bug bounty program websites and known security researchers’ laptops have always been high value targets for attackers trying to get information on vulnerabilities before they get patched. This doesn’t change. Like everything else, the need for good security controls just increases.

The very best hackers have always been able to bypass almost all security solutions. The new AI tools are providing an avenue for everyone to level up their skills, so mediocre hackers can now have the same skills that were previously only known to the very best. The theme here is that attackers are all going to be able to up their game (with enough money) and defenders are going to need to do the same.

That said, we’re not at the point where these models will find novel exploitation techniques. Practicing good cybersecurity hygiene in high-security environments is still effective.

At Field Effect, we don’t allow any customer data to be exposed to cloud models. For us, Mythos et al. enables development of analytics and tooling across the board, and it powers a tremendous amount of our automation. Only local open-weight models can be used to actually churn through datasets. This symbiosis between cloud frontier models and local open-weight models works well.

I’m sure this equation changes when you look at other vendors who are moving at machine speed.

Having a good understanding of the cybersecurity basics is still going to be very important, paired with learning how to take advantage of the AI tools available.

And this insight is valid across all domains impacted by AI: with great power comes great responsibility.

Even if it were globally available right now, there would still be significant costs to using it for exploit generation. It wouldn’t be complete mayhem right away. It's also worth noting that the stated purpose of Project Glasswing was to give key software companies a head start, allowing them to patch the most obvious vulnerabilities before attackers get their hands on Mythos.

The size of this model (and other frontier models) requires significant computer and energy to fully utilize. A leak would only really expand the group of nation states currently using it.

I have not seen any successful approach to this yet because AI is the first example of a cross-domain challenge. There are vendors that do zero trust applications or networking, but those approaches lack cross-domain completeness to truly apply the model to AI. This is precisely why we are building exactly this, and why Field Effect is the only vendor that can deliver it. We are the only vendor that has endpoint, network, cloud, DNS control and external monitoring in one suite of monitoring and protections. Any gaps in these capabilities leave a hole in the zero trust concept with AI. Plus, we don’t just process AI log files – we actually verify that AI is doing what its configured to do. It’s going to be a very exciting year.

Field Effect has not been given access to Glasswing (we tried from different business and social angles). However, I don’t think it matters at this point given OpenAI’s models (GPT-5.4-Cyber and GPT-5.5) are proving to be just as good and are much more accessible. We are actively investigating any impact from these new models, but thus far our instincts that they produce more of attack types that are already known, rather than new attack types, is being confirmed. I suspect that fast forward three months that Mythos and Glasswing will become over-hyped events in time.

Guardrails on usage are one option, and another is the approach OpenAI is taking with their comparable model: recording everything a user does to create trails of accountability. But it's been demonstrated repeatedly that AI guardrails can eventually be bypassed and models can be cloned or distilled, so we need to prepare for a world where this technology is available to everyone.

The broader takeaway: don't rely on Anthropic or anyone else to stop threat actors. Defenders need to stay on top of the latest technology to ensure there are no gaps for attackers to exploit.

Anthropic hasn't provided many details on this. If it were my company, I’d be nervous sending all proprietary source code over the internet for AI analysis. Perhaps they have some options to allow for some self hosted analysis so Microsoft could keep their proprietary code on premise.

That said, there's already been significant work on using AI to reverse engineer closed-source software and find vulnerabilities, and that research shows no signs of slowing with the release of newer models.

Tools like GhidraMCP are already highly effective with the latest frontier models, as are IDA, Binary Ninja, and radare2, all of which have MCP servers.

As we've seen with ARC-AGI-3, these models can't learn what they haven't been trained on. If defenders have sufficiently vetted their attack surface with Mythos, it's unlikely an attacker using the same model would find something new. Though not impossible.

None that we’re aware of.

For defenders, AI will enable bug discovery in existing software at scale, improve the quality of new software, and drive automation across various domains. For attackers, it will accelerate vulnerability discovery, help productize exploits, and automate reconnaissance and personalized social engineering. The human analyst's role shifts accordingly: less time on tedious work, more on judgment, context, and the things AI still can't reliably do.

All models, including Claude Opus, continuously improve with each new generation. Better models find better bugs and vulnerabilities. Field Effect will always be on top of that curve.

Info-stealer malware isn't new, and the point of ingress is almost always human. Add to that the rise of vibe coding, where developers are publishing secrets directly to public repos at increasing rates, and the exposure surface grows considerably. Developers need to be vigilant and practice good cybersecurity hygiene, and they can use these same models to help them do exactly that.

The data is already telling a clear story. Firefox has reported a significant uptick in CVEs year-over-year. Daniel Stenberg of curl reported a doubling of reporting this year, after a doubling last year. Higher volume, higher quality. That’s just to name two of the larger open-source projects currently impacted.

Beyond vulnerability discovery, we also know AI is being used for deep fake video calls to trick users into providing remote access to attackers. Both the quantity and quality of attacks are increasing thanks to generative AI.

Some AI vendors, including OpenAI, already require Know-Your-Customer (KYC) verification to access their more powerful models, with an additional layer through their Trusted Access program. Whether identity proofing actually works is another question altogether.

In any case, you can also bypass OpenAI’s KYC by using a middleman like OpenRouter.

In a defensive context, by utilizing PQC best practices. As a workforce enabler, quantum computing isn’t ready.

Prompt injection is used quite heavily. In Claude Code, for example, they have a multitude of scenarios where they inject into (modify) your prompt to get better results out of the model. This applies to other vendor’s software similarly.

OWASP has a good guide to protecting against malicious prompt injection. And obviously you can use Field Effect’s AIDR.

There is also prompt poisoning, which is similarly dangerous as it corrupts a model’s internal state.

Absolutely use these models to automate your workflows, but do it responsibly and put security precautions in place to protect your core IP.

The latest generation of frontier models are all broadly comparable. Do your research, know which ones perform best for your specific tasks, and don't lock yourself into a single model. Models are commodities, make sure you can pivot to another vendor as needed, whether for cost savings or quality improvements.

TL;DR: Pick the right model for the job, and don't be afraid to switch as the landscape evolves.

This one is a tough one because the right answer is "all software should be patched immediately," but that can be a daunting task and comes with organization complexity. I think getting operating system patches done as quickly as possible is a must-do, and then prioritizing third party software patching based on how critical particular machines are (i.e. Domain Controller vs workstation). Another factor should be business criticality - where is your data stored, what services are you running, etc. That should help with a heat map of where to start.

For what it's worth, we have something coming later in the year that will be a massive leap forward in solving this problem for SMBs. I just can't quite talk about it in full yet. But I agree with the urgency behind the question – it’s been a problem we have observed for years and we really want to help solve it.

It's ultimately an organization's choice whether to restrict AI use given the risks involved. Field Effect takes a strict AI tooling policy, only allowing vetted technologies. It’s a sound stance, because a lot of AI tooling is vibe-coded on previous generations of models. Mythos will find holes in those like Swiss cheese. Always be aware of the technologies you're introducing into your stack and the risks they bring with them.

Every software shop's engineering paradigm is different, but the approach that's worked well is pointing AI agents directly at your coding and design principles documentation and prompting them to follow it absolutely. This is a moment where every organization should be seriously assessing how they develop software securely, whether that means doubling down on safe development principles or relying on a memory-safe language.

On the automation side, there are plenty of tools available. If your project is hosted on GitHub, it can be as straightforward as setting up an OpenAI Codex code review on every PR, or instituting specific regression test guidelines.

Field Effect isn't currently part of the Glasswing initiative, but we do monitor CVEs and other evolving threats as they're reported. We map those against every client to rapidly identify exposures. These are then delivered to clients in real time via AROs.

Mythos and other models can help with infrastructure development in a number of ways: continuous configuration auditing of cloud accounts, authoring and reviewing Infrastructure as Code, and providing migration and modernization planning for legacy environments.

Matt Holland

Founder & Chief Executive Officer, field effect

After serving as a spy for Canada’s top national security agency for much of his career, Matt Holland went on to co-found, build, and successfully exit one of the world’s leading intelligence tradecraft companies. During that time, he also developed and delivered two mandatory training programs to more than a thousand Five Eyes developers across allied nations.

Matt has an established background and expertise in distributed computing, low level security systems and cross platform methodologies, and has a track record of commercialization and success with “hard problems”. Matt’s background directly contributes to Field Effect’s comprehensive cybersecurity solution, Field Effect MDR, and the aggressive growth of Field Effect worldwide.

Erik Egsgard

Principal Security Developer

With over 20 years of experience as a software developer and security researcher, Erik has uncovered vulnerabilities across a wide range of software and operating systems including Windows, macOS, iOS and Android.

Erik began his career in the private sector before joining Canada’s Communications Security Establishment (CSE), where he spent 15 years honing his expertise in advanced threat research. He later brought that experience to Linchpin Labs, focusing on cutting-edge vulnerability discovery and cross-platform endpoint development.

Recognized for his deep technical expertise, Erik has earned the CSE Award of Excellence and competed at the highest level of offensive security as part of the DEF CON CTF-winning Samurai team.

Sean Alexander

Principal Security Developer

Sean is an experienced software engineer and security analyst with over 15 years of experience in the cybersecurity industry. He has a BASc in Electrical & Computer Engineering and an MSc in Computer Science from Queen’s University.

He began his career at the Communications Security Establishment (CSE), spending four years in the Tailored Access Operations group, then moved to Linchpin Labs before ultimately joining Field Effect in 2016.

Sean started at Field Effect building a large-scale mobile test automation harness, but his primary focus has been developing and operationalizing the company’s endpoint security capabilities. He is the resident macOS and iOS expert and supports incident response and analysis as needed, while contributing across a variety of roles over the past decade.

CyberSecurity is our Priority

About Field Effect

Field Effect believes that businesses of all sizes deserve powerful cybersecurity solutions to protect them.

Our threat detection, monitoring, and response platform, along with our training and compliance products and services are the result of years of research and development by the brightest talents in the cybersecurity industry. Our solutions are purpose-built for SMEs and deliver sophisticated, easy-to-use and manage technology with actionable insights to keep you safe from cyber threats.

Visit Field Effect

Mythos and Project Glasswing: A Practical Look at the Future

Watch the webinar recording

Questions and answers

Speakers

Matt Holland

Erik Egsgard

Sean Alexander

About Field Effect

Complexity out. Clarity in.