Your AI Just Became a Security Architect

How GPT-4 and Claude Are Changing the Way We Threat Model Apps—Before They’re Ever Built

Imagine this:
You describe your app in plain English—“It’s a mobile dashboard with login, some APIs, and admin settings.”

A minute later, your AI gives you a full threat model:

  • Here’s where an attacker might spoof identity

  • These APIs could leak sensitive data

  • This admin panel could be misused

  • And here’s how to fix it all, step-by-step

🤯 Wild, right? This used to take days of meetings and specialized security reviews.

Now, with tools like STRIDE GPT, IriusRisk’s Jeff, and internal copilots built on GPT-4 or Claude, threat modeling is becoming faster, smarter—and way more accessible.

In this post, I’ll show you:

  • Why threat modeling matters (even if you're not a security pro)

  • How LLMs are transforming secure app design

  • The tools, real-world examples, and red flags you need to know

  • And how you can use this tech right now to build safer products

📥 Plus, I’m giving subscribers exclusive access to my full research report on this AI-powered security shift.

👇 Grab it mid-way through the post—and let your next app ship safer.

💥 The Problem: Threat Modeling Is Broken in Most Teams

If you’ve ever been on a dev team, you’ve likely heard of “threat modeling.”

It’s the practice of thinking through:

  • What could go wrong in your system?

  • How could someone attack it?

  • How do you prevent or detect it?

The most widely used approach is Microsoft’s STRIDE framework:

  • Spoofing identity

  • Tampering with data

  • Repudiation (denying actions)

  • Information disclosure

  • Denial of service

  • Elevation of privilege
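
If it helps to make that concrete, here’s STRIDE as a checklist you could actually walk through in a design review. A minimal Python sketch; the guiding questions are my paraphrase, not official STRIDE wording:

```python
# STRIDE as a reusable review checklist. The questions are
# illustrative paraphrases, one per category.
STRIDE = {
    "Spoofing": "Can an attacker pretend to be another user or service?",
    "Tampering": "Can data be modified in transit or at rest?",
    "Repudiation": "Can someone deny an action because we lack audit logs?",
    "Information Disclosure": "Can data leak to someone who shouldn't see it?",
    "Denial of Service": "Can the system be made unavailable to real users?",
    "Elevation of Privilege": "Can a low-privilege user gain admin rights?",
}

for category, question in STRIDE.items():
    print(f"{category}: {question}")
```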

But here’s the truth: most teams either skip threat modeling entirely or do it once during initial design—and then forget about it.

Why? Because traditional threat modeling is slow, manual, and often left to overworked security architects. It requires diagrams, deep knowledge of attack vectors, and time that fast-moving teams don’t have.

Enter large language models (LLMs).
They’re changing the game—and democratizing threat modeling for everyone, from startups to the Fortune 500.

🤖 The AI Shift: From Hype to Hands-On Help

In the past two years, we’ve seen a surge in AI-powered security tools—specifically tools that use LLMs like GPT-4, Claude, or Gemini to automate parts of the threat modeling process.

Instead of needing deep security expertise, you can now:

  • Describe your app in plain English

  • Paste a code snippet, system diagram, or architecture outline

  • Let the AI generate threats, organize them by STRIDE, and suggest how to fix them
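
To make that concrete, here’s roughly what the pattern looks like under the hood. This is a minimal sketch using the OpenAI Python SDK, not what any specific tool ships; the prompt wording and app description are mine:

```python
# A minimal sketch of the LLM threat-modeling pattern (OpenAI SDK v1+).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

app_description = (
    "A mobile dashboard app: username/password login, a REST API "
    "serving analytics data, and an admin settings panel."
)

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a security architect. Given an application "
                "description, produce a threat model organized by the six "
                "STRIDE categories. For each threat, give a realistic "
                "attack scenario and a concrete mitigation."
            ),
        },
        {"role": "user", "content": app_description},
    ],
)

print(response.choices[0].message.content)
```

Swap in Claude or Gemini and the pattern is identical: a system prompt that pins the model to STRIDE, plus whatever description you’d give a colleague.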

And this isn’t a one-off demo. It’s being used in production by serious teams:

🧠 Pure Storage built “Threat Model Mentor GPT” to generate STRIDE-based threat models and attack trees in minutes
🏢 IriusRisk, a top enterprise platform, launched “Jeff”, an AI assistant that builds threat models from text or architecture diagrams
💻 Open-source tools like STRIDE GPT and AI Security Analyzer let devs self-serve threat models with GPT-4 or Claude under the hood

🧭 Why STRIDE Works So Well with LLMs

If you’re new to threat modeling, STRIDE is like a checklist for thinking about how your system could be attacked.

And it turns out: LLMs understand STRIDE extremely well. The framework is all over their training data, and when prompted properly, they:

  • Categorize threats under the six STRIDE types

  • Suggest realistic scenarios (e.g. "an attacker could spoof a user ID to bypass access control")

  • Propose actionable mitigations

For example:

Describe your mobile app backend to GPT-4, and you’ll get back:

  • Spoofing: Weak auth headers

  • Tampering: Unprotected API inputs

  • Information Disclosure: Misconfigured CORS policies

…plus advice like using JWTs, input validation, and access control layers.

That kind of speed and structure is invaluable—especially when you’re moving fast or don’t have a security expert on every team.
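
And if you want that structure in a form your tooling can consume (a risk register, auto-generated test cases), you can push the model toward strict JSON. A sketch, with a “threats” schema I made up for illustration:

```python
# Same idea, but forcing machine-readable output. The "threats"
# schema here is my own invention for illustration.
import json

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},  # JSON mode
    messages=[
        {
            "role": "system",
            "content": (
                "Respond with a JSON object containing a 'threats' array. "
                "Each entry needs 'category' (one of the six STRIDE types), "
                "'scenario', and 'mitigation'."
            ),
        },
        {
            "role": "user",
            "content": "Mobile dashboard with login, REST APIs, and an admin panel.",
        },
    ],
)

threat_model = json.loads(response.choices[0].message.content)
for threat in threat_model["threats"]:
    print(f"[{threat['category']}] {threat['scenario']}")
    print(f"    Mitigation: {threat['mitigation']}")
```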

🔍 What Tools Are Actually Doing This?

Let’s break down the landscape for both enterprise teams and indie builders:

1. STRIDE GPT (Open Source)

Built by Matt Adams, this tool uses GPT to generate STRIDE threat models based on app descriptions.
✅ Outputs STRIDE threats, DREAD risk scores, Gherkin security test cases
✅ Works with OpenAI, Claude, Gemini, and local models
✅ Popular in open-source security circles

2. IriusRisk “Jeff” (Enterprise Tool)

Launched in 2024, Jeff is an AI assistant that builds structured threat models from diagrams or text.
✅ Used by major enterprises
✅ Produces models in seconds
✅ Integrated into their platform for full risk tracking

3. AI Security Analyzer (Open Source)

Scans your codebase and generates a security design doc—threats, attack trees, and mitigations included.
✅ Created by researcher Marcin Niemiec
✅ Supports Python, Java, JS, and more
✅ Great for deeper white-box (code-level) analysis

4. Threat Model Mentor GPT (Internal Tool @ Pure Storage)

An internal chatbot for teams to describe their systems and get back STRIDE threats and attack trees.
✅ Helps engineers self-serve threat models
✅ Scales secure design across every sprint

🧠 So How Good Are These AI Threat Models?

Glad you asked.

Security professionals ran benchmarks (e.g. TM-Bench, GenAI scoring frameworks) comparing different models—GPT-4, Claude, Gemini, etc.—on the same scenario.

Results:
✅ Some models (Claude 3, GPT-4-turbo) scored 5/5 on threat completeness, STRIDE accuracy, and mitigations
✅ LLMs often caught subtle issues like prompt injection in AI apps—something a manual checklist might miss
✅ AI-generated attack trees were often visually clearer than human-drawn ones

But…

❌ Not all results are consistent
❌ Sometimes LLMs hallucinate “threats” that aren’t real
❌ They may miss domain-specific nuances (e.g. finance, healthcare regulations)

Bottom line: LLMs give you a strong first draft. It’s up to humans to review, tweak, and validate it.
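
You can automate the dumbest parts of that review, though. Here’s a hedged sketch of a sanity pass over AI-generated threats; the sample data and field names are mine, matching the hypothetical schema from the earlier sketch:

```python
# Flag AI-generated threats that obviously need human attention
# before anyone treats the model as done.
VALID_CATEGORIES = {
    "Spoofing", "Tampering", "Repudiation",
    "Information Disclosure", "Denial of Service",
    "Elevation of Privilege",
}

def review_flags(threat: dict) -> list[str]:
    """Return reasons this AI-generated threat needs human review."""
    flags = []
    if threat.get("category") not in VALID_CATEGORIES:
        flags.append("category is not a STRIDE type (possible hallucination)")
    if not threat.get("mitigation"):
        flags.append("no mitigation proposed")
    if len(threat.get("scenario", "")) < 20:
        flags.append("scenario too vague to act on")
    return flags

# Illustrative AI output; in practice this comes from the JSON sketch above.
threats = [
    {
        "category": "Spoofing",
        "scenario": "Attacker replays a stolen session token to act as an admin.",
        "mitigation": "Short-lived tokens with rotation and device binding.",
    },
    {"category": "Quantum Hacking", "scenario": "?", "mitigation": ""},
]

for threat in threats:
    for flag in review_flags(threat):
        print(f"REVIEW NEEDED: {threat['scenario'][:50]} ({flag})")
```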

🚧 Subscriber-Only: The Full “Threat Modelling with LLMs” PDF Report

If you want the full deep dive, I’ve compiled everything into an exclusive PDF, including:

✅ How GPT-4, Claude, and Gemini are used in real-world threat modeling
✅ Tools breakdown: STRIDE GPT, IriusRisk Jeff, AI Security Analyzer, Arrows (and more)
✅ Use cases from companies like Pure Storage, IriusRisk, and open-source contributors
✅ Benchmarks: Which LLMs scored best in threat modeling accuracy and mitigations
✅ Limitations explained: hallucinations, inconsistent results, privacy concerns

👉 Subscribe below to unlock the full PDF and download instantly.

Already a subscriber? You’ll see the download link below 👇
New here? Hit subscribe, confirm your email, and come right back.

🧠 Pro Tip: Add this newsletter’s email address to your Safe Senders list so you never miss future guides and updates. That’s where I’ll be sharing follow-ups on AI coding tools, agent frameworks, and security-first practices for modern builders.
