Mythos - This Moment in AI

When I speak with audiences about technology, I often describe the current developments in generative AI as an inflection point in history, a marker of both its transformative and disruptive influence. It is a capability that ranks with Alan Turing’s concept of thinking machines, the transistor, the internet, and the personal computer: technology that made computing more relatable, if not simply more accessible.

Generative AI changes how we find information, make decisions, interact with computers, and even how we interact with each other. That makes the recent Mythos announcement so much more interesting. Perhaps it is a moment in history by itself.

Glasswing

With a code name reminiscent of a James Bond thriller, Project Glasswing is something most data professionals are well aware of as it relates to Anthropic’s Claude Mythos Preview. For others, the news may have landed in a more sensational manner, in reports with headlines like “AI Too Dangerous for Humans” or “AI Nightmare Waiting to Happen”. Each group, the pros and the everyday business community, is likely to think about this differently, and it makes me curious what we will see as a result of Project Glasswing.

The current (and most likely profitable) frontier in the AI race is all about code. Helping companies generate efficient code is a boost to productivity: it reduces development cycles, improves code reviews, and shortens the time to production. It may be the best boardroom case for AI that can demonstrate ROI. It is also the subject of much debate about whether AI is displacing entry-level careers and whether it might actually be creating more work. The case for ROI is still an open question, yet despite conflicting evidence it is driving rapid, hastily planned AI adoption across entire industries. No one wants to be left behind.

Claude Code

Consequence or Surprising Benefit?

What becomes of a system so good at learning how to code? The short answer is that when it learns how to develop code, it also learns how to correct itself. It teaches itself how not to write bad code in the same way it learns to write a good sonnet. There are structures, patterns that work, and in the case of code, testable outcomes. It also quickly learns to identify bad code and how to adjust. Of course, it does this at a scale and speed that humans cannot match. Code developers are rapidly becoming AI supervisors, code reviewers, and quality control engineers.


Creating tools and advanced capabilities to check code as it is written also has a surprising benefit. They work not just on the new code the model is writing; they can also be pointed at older, existing code. You can see where this is going. Mythos, like its predecessors Sonnet 4.6 and Opus 4.6, was used to test older, existing code for vulnerabilities, examples of bad code that could be exploited to disrupt a system. Anthropic reported that Opus 4.6 performed better than Sonnet 4.6 when tested against the Firefox JS Shell and could identify risks, but had a near-zero ability to exploit them. How did the newer Mythos do? It succeeded at exploitation 72.4% of the time across 250 attempts. That is not a typo. Anthropic’s models leaped from <1% to 72% in a single generation.

Optimists will remark on the incredible opportunity to harden cyber security across thousands of now-known vulnerabilities.

The realists will wonder how all this work will be done: how every patch will be identified and implemented before the expected release of Mythos later this year, and before another AI rival, perhaps lesser known and less scrupulous about its own technology, releases similar tools capable of exploitation.

Glasswing has been cautious about broadly sharing information. Here are a few highlights:

  • Treasury Secretary Bessent and Federal Reserve Chairman Powell held meetings with major financial institutions to review critical systems identified by Glasswing.
  • Mythos found a 27-year-old vulnerability in OpenBSD, believed to be one of the most reliable security-focused operating systems in the world and used for secure firewalls, network appliances, and server environments.
  • Threat actors already using AI can reverse engineer patches in 72 hours, shortening the timeline available to identify and implement security fixes.
  • Compute costs to find vulnerabilities were in the tens of thousands of dollars. That is lower than the cost of a traditional cyber security team, which highlights how little attackers need to spend and why defenders need to adopt AI.

A Lesson Learned

A quarter of a century ago, the world was in a virtual panic at the possibility that old, poorly written code, which favored brevity by storing the year in critical dates as two digits, would crash when the calendar reached the year 2000.

Everything from important banking systems to computer-controlled traffic lights and wristwatches warranted a review. Known as the Y2K Problem, the fix took years, not months or days. An entire industry was stood up to update and safeguard systems. An estimated $600 billion was spent, a staggering amount then, but relatively small compared to the current investments in AI.
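For readers who never saw the bug up close, here is a minimal sketch of how a two-digit year goes wrong, along with one common style of fix known as date windowing. The function names and the pivot value of 50 are my own assumptions for illustration, not taken from any particular legacy system.

```python
def years_elapsed_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive interval math on two-digit years, the shortcut behind Y2K."""
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Date windowing: map a two-digit year to a four-digit year.

    The pivot is an assumed cutoff: values at or above it are read as
    19xx, values below it as 20xx.
    """
    return 1900 + yy if yy >= pivot else 2000 + yy


def years_elapsed_windowed(start_yy: int, end_yy: int, pivot: int = 50) -> int:
    """Interval math after expanding both years with the same window."""
    return expand_year(end_yy, pivot) - expand_year(start_yy, pivot)


# A hypothetical account opened in 1985, checked in 1999 and again in 2000.
print(years_elapsed_two_digit(85, 99))  # 14, looks fine
print(years_elapsed_two_digit(85, 0))   # -85, the account is suddenly "negative" years old
print(years_elapsed_windowed(85, 0))    # 15, correct once the years are expanded
```

Windowing only postpones the problem to the edge of the chosen window, which is part of why so much Y2K remediation meant finding and reviewing every date field rather than applying a single patch.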


Generative AI is moving at an unprecedented rate, outpacing regulatory oversight and now, in the case of cyber security, perhaps growing faster than institutions can respond. Anthropic deserves recognition for its measured rollout and collaboration. Much like the Y2K event, some challenges of technology require shared information and shared solutions. It may be time to reflect on these lessons and consider a more centralized approach to managing AI.


Discover more from Derek W Gibson
