th3chris

Static Code Analysis in the AI Age: Obsolete or Indispensable?

AI · Code Quality · DevOps · SonarQube · .NET

Fable 5 has been out for a few days, and the refrain in every other dev thread is the same: code now appears faster than a human can read it. If a model stands up an entire service in a minute, why keep a static analyzer combing back over the same lines? Aren't SonarQube, SonarCloud and friends the fax machines of the dev-tool world, one step from extinction?

The short answer: no. The slightly longer one I'll argue here: precisely now, when AI multiplies code output, the deterministic quality gate becomes more valuable, not redundant. Whoever produces a lot has to verify a lot — with something that doesn't itself guess probabilistically.

In short: an LLM writes code; a static analyzer judges every line reproducibly, on every commit, by the same rule. One doesn't replace the other. The AI accelerates, Sonar verifies. And that very separation is what keeps AI code at production grade.

The wrong question

"AI or static analysis" is a false dichotomy. The two don't solve the same problem — they sit at different points in the pipeline and have fundamentally different properties.

An AI review is context-rich but probabilistic. It understands intent, spots "the code doesn't do what the comment promises," and reasons about business logic. But it isn't reproducible: the same diff, submitted twice, won't necessarily yield the same verdict. It usually sees only the diff, rarely the whole codebase. And it has no pass/fail a pipeline can hard-stop on.

A static analyzer is the opposite: context-poor but deterministic. Same rule, same line, same result — today, tomorrow, on every branch. It scans 100% of the codebase, not just the diff. It produces a hard quality gate that blocks in CI. And it records every measurement as a trend over months.

If both are needed, it pays to look closely at what each one contributes — and where one structurally fails while the other shines.

What an analyzer does that an LLM structurally can't

Four things no model, however good, can deliver by its nature:

Reproducibility. A statement like "this release contains no new Blocker-severity vulnerabilities" must come out the same every time you repeat it — otherwise it holds up neither in an audit, nor in a contract, nor in front of a customer. A deterministic scanner guarantees that. A sample from a probability model doesn't.

Full coverage. Sonar analyzes the entire tree on every run. An AI reviewer typically gets the diff and a slice of context — anything outside the window goes unchecked. In a mono-repo with seven backend services, "outside the window" is the default case, and that's exactly where a risk otherwise rides along unnoticed.

A hard gate. sonar.qualitygate.wait=true makes the pipeline wait for the verdict and abort when new issues breach the threshold. That's a binary, automatable decision that faulty code snags on before a human overlooks it. "The model is broadly okay with the PR" isn't.

Trend over time. Maintainability rating, coverage trajectory, security hotspots per sprint, duplication ratio — Sonar keeps these as a time series. An LLM has no project memory across sessions; it doesn't know whether tech debt has been climbing since March or falling. Sonar knows it to the day.
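The hard gate described above comes down to one boolean that a pipeline can act on. Here is a minimal Python sketch, assuming a payload shaped like the response of SonarQube's api/qualitygates/project_status endpoint (a projectStatus object with a status and a list of conditions). The sample values and the name gate_verdict are made up for illustration. In a scanner-driven setup, sonar.qualitygate.wait=true performs this check for you.

```python
import json

# Hypothetical sample, modeled on the assumed project_status response shape.
SAMPLE = json.loads("""
{
  "projectStatus": {
    "status": "ERROR",
    "conditions": [
      {"status": "OK", "metricKey": "new_coverage",
       "comparator": "LT", "errorThreshold": "80", "actualValue": "91.3"},
      {"status": "ERROR", "metricKey": "new_vulnerabilities",
       "comparator": "GT", "errorThreshold": "0", "actualValue": "2"}
    ]
  }
}
""")


def gate_verdict(payload):
    """Return (passed, failing_metric_keys) for a project_status payload."""
    project_status = payload["projectStatus"]
    failing = [
        condition["metricKey"]
        for condition in project_status.get("conditions", [])
        if condition.get("status") == "ERROR"
    ]
    # Only an explicit "OK" passes; any other or missing status blocks.
    passed = project_status.get("status") == "OK"
    return passed, failing
```

A CI step would call gate_verdict on the fetched payload and exit non-zero when passed is False. The point is that the decision is a plain comparison, so the same input gives the same verdict on every run.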

On top of that, the security dimension: taint analysis traces how untrusted input flows through the code into a SQL query or a shell call — across method boundaries, mapped to OWASP and CWE. That's reproducible, auditable security evidence, not "looks safe to me." And the more code is generated automatically, the more lines have to pass through exactly this filter.
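To make the taint idea concrete, here is a small Python sketch of the kind of flow such an analysis follows: untrusted input (the source) passes through a helper and ends up in a SQL query (the sink). The table, the function names and the input string are invented for illustration. It uses Python and sqlite3 instead of the .NET stack discussed here, purely to stay self-contained.

```python
import sqlite3


def make_db():
    """In-memory database with two example rows (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    return conn


def build_filter(name):
    # The tainted value crosses a function boundary here, which is why a
    # purely line-local rule would not connect source and sink.
    return f"name = '{name}'"


def find_user_unsafe(conn, name):
    # Sink: untrusted input is concatenated into the SQL text.
    query = "SELECT name FROM users WHERE " + build_filter(name)
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Same lookup with a bound parameter: the input stays data, never SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

Called with the input x' OR '1'='1, the unsafe variant matches every row while the safe variant matches none. A taint rule flags the first path because it can follow the value from the parameter through build_filter into the query string.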

AI writes fast — who reviews the volume?

Which brings us to the real lever. The four points above sound like "nice to have" as long as a human writes code at a manageable pace. With AI, that tips over.

When an AI-assisted team produces three to five times the code, reviewer capacity doesn't scale with it. Human review becomes the bottleneck, and the temptation to wave AI output through with "looks good, merge" grows. That's exactly when you need an objective, tireless gate that measures every single generated line by the same yardstick — whether a human wrote it at 3 a.m. or a model wrote it in 200 milliseconds. It isn't what slows AI speed down; it's what makes AI speed accountable in the first place. Without it, more output is simply more unverified risk.

I use AI heavily in my own development. But everything generated runs through the same SonarQube instance as hand-written code — same rules, same gate. That's not distrust of the AI. It's the very reason I can hand the AI that much code at all. What that looks like concretely is the next example.

In practice: a product of my own

One of my own products, an AI nutrition assistant, is a .NET mono-repo with seven backend services and four frontends (web, iOS, Android, watch). Each service has its own SonarQube project (identity, recipe, …), each app its own (web, ios, …), all running against a self-hosted SonarQube Developer Edition in the k3s cluster. Branch/PR analysis and taint analysis start at that edition; the free Community Build covers the main branch only. No code leaves my infrastructure, and for a privacy-sensitive use case that isn't a detail, it's a prerequisite.

The backend scan sits behind the test jobs in GitLab CI so the scanner picks up the coverage reports:

dotnet sonarscanner begin \
  /k:backend \
  /d:sonar.host.url="$SONAR_HOST_URL" \
  /d:sonar.cs.opencover.reportsPaths="**/coverage.opencover.xml" \
  /d:sonar.coverage.exclusions="**/Program.cs,**/Migrations/**,**/Configuration/*Options.cs" \
  /d:sonar.pullrequest.key="$CI_MERGE_REQUEST_IID" \
  /d:sonar.qualitygate.wait=true

On merge requests the quality gate blocks — but it judges only the MR's changes, not the whole backlog of debt. Pre-existing debt never holds up an unrelated PR. That's the "Clean as You Code" principle: new code must be clean, the legacy is paid down separately and on a plan (SonarSource: Clean as You Code).
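The principle fits in a few lines of Python: given all findings and the set of lines an MR touched, only findings on touched lines count toward the gate. The Issue type, the sample paths and rule keys, and the changed-lines mapping are hypothetical. SonarQube derives its new-code definition internally, so this only illustrates the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    path: str
    line: int
    rule: str


# Illustrative findings: one on legacy code, one on a line the MR changed.
ALL_ISSUES = [
    Issue("src/Legacy.cs", 120, "S1135"),
    Issue("src/Orders.cs", 42, "S2259"),
]

# Hypothetical diff summary: file path -> line numbers touched by the MR.
CHANGED = {"src/Orders.cs": {40, 41, 42, 43}}


def issues_on_new_code(issues, changed_lines):
    """Keep only issues that sit on lines changed in the merge request."""
    return [
        issue
        for issue in issues
        if issue.line in changed_lines.get(issue.path, set())
    ]


def gate_blocks(issues, changed_lines, max_new_issues=0):
    """The gate judges the MR's changes only; legacy findings are ignored."""
    return len(issues_on_new_code(issues, changed_lines)) > max_new_issues
```

With the sample data, the finding in src/Legacy.cs never blocks the MR, while the one on line 42 of src/Orders.cs does.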

The setup matches my .NET rules, which are tuned for quality anyway: Nullable=enable, TreatWarningsAsErrors=true, CA1031 as an error (no catching of general exception types), plus architecture tests that forbid *Service/*Manager/*Helper names and more than seven constructor parameters. Sonar is the layer that enforces this across the whole codebase and over time, not just locally at build. So far, so tidy. Until you discover this very setup can lie in two subtle ways.

Two stories from one repo

Here's the part the "just have the AI set it up" camp underrates. Standing up a scanner is trivial. Setting it up so it tells the truth is the real work — and the two costliest mistakes are invisible. They throw no error. They report green and lie.

Story 1: the incremental build that swallows analyzers. The Sonar job ran in the same workdir as the prior build job. An incremental dotnet build then skips compilation of already-built projects — and with it, the Roslyn analyzers for those projects never run. Their files silently dropped out of the analysis. Reported coverage was 46% instead of the real ~97%, varying run to run depending on what happened to be built already. Fix: --no-incremental is load-bearing here. One line, without which every number is fiction.

Story 2: the cache that reuses stale coverage. Sonar's analysis cache marked files "unchanged" when only their tests had changed — then pulled stale coverage for them. New tests never moved the PR coverage. You write tests, the number doesn't budge, you go hunting for the bug in the wrong code. Fix: sonar.analysisCache.enabled=false. Costs ~1 minute extra per run and returns correct numbers in exchange.
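Both stories share a symptom: the number on the dashboard disagrees with what the raw coverage report says. One way to catch that automatically is a cross-check, sketched here in Python under stated assumptions. The report is assumed to be in OpenCover format with a Summary element carrying numSequencePoints and visitedSequencePoints. The dashboard value is assumed to be fetched elsewhere and passed in. The 10-point tolerance is an arbitrary choice, since Sonar's coverage formula and exclusions differ from the raw report and some gap is normal. All function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical report; a real one would come from the test job.
SAMPLE_REPORT = """
<CoverageSession>
  <Summary numSequencePoints="1000" visitedSequencePoints="970" />
</CoverageSession>
"""


def raw_coverage_percent(opencover_xml):
    """Sequence-point coverage computed directly from the report summary."""
    summary = ET.fromstring(opencover_xml.strip()).find("Summary")
    if summary is None:
        raise ValueError("no <Summary> element found in report")
    total = int(summary.get("numSequencePoints", "0"))
    visited = int(summary.get("visitedSequencePoints", "0"))
    return 100.0 * visited / total if total else 0.0


def coverage_is_plausible(dashboard_percent, opencover_xml, tolerance=10.0):
    """False when the dashboard value drifts too far from the raw report."""
    raw = raw_coverage_percent(opencover_xml)
    return abs(dashboard_percent - raw) <= tolerance
```

Fed the numbers from Story 1, a dashboard value of 46 against a raw 97 fails the check, which turns a silent green into a visible red.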

That's the core: static analysis is valuable precisely because it doesn't guess — but that value lives and dies with the configuration. And the same AI that generates the code is not the authority that should sign off on its own verification setup. That's the craftsmanship side. The other side is governance — best seen in the enterprise.

In practice: an enterprise engagement

One of my enterprise engagements, an industrial group with many teams, shows the second half of the picture: not keeping one repo clean, but making quality repeatable across many teams. Here the .NET services run through Azure DevOps Pipelines, and the Sonar integration comes from a central, shared template repo (DevOps/sonar-templates) that every service pipeline pulls in as a resource:

resources:
  repositories:
  - repository: sonar_templates
    type: git
    name: 'DevOps/sonar-templates'
    ref: 'refs/heads/master'

The effect: quality profiles, gate thresholds and scanner setup are configured once, centrally — not copied into dozens of pipelines that slowly drift apart. One standard for all, changes happen in one place. A skipSonar parameter exists — as a deliberate, visible exception for special cases like hotfixes, not as a silent default where the check quietly disappears. And SonarLint ships with committed rule config (.sonarlint/) right in the IDE, so the same rules apply as you type, not only later in CI.

That's the difference between "a tool somebody once kicked off" and a maintained quality infrastructure: centrally versioned, consistent across teams, with audit trail and approvals around it. That layer is exactly what no model hands you on the side. Which closes the loop — to the question of how AI and Sonar actually work together.

Sonar and AI are complements, not competitors

Here's how I run both together, and how I recommend it:

|          | AI model (LLM) | Static analysis (Sonar) |
| --- | --- | --- |
| Strength | intent, logic, context, speed | determinism, full coverage, trend |
| Role | generate + explanatory review | hard gate + security evidence |
| Finds | "doesn't do what was meant" | "breaks rule X, new vulnerability Y" |
| Weakness | not reproducible, diff-scoped | doesn't grasp intent, false positives |

The AI generates and gives the fast, context-rich feedback. Sonar draws the deterministic, auditable line that holds in CI. Where Sonar produces false positives, the AI helps triage. Where the AI misses logic bugs, human review catches them — relieved by both tools, replaced by neither. More on the security side of this interplay in How AI agents leak credentials.

Conclusion

Each more capable generation of AI models doesn't make the question "do we still need static analysis?" go away; it makes it more pressing. The more code an AI produces, the more you need a reproducible, complete, incorruptible gate that measures every line by the same standard and hard-blocks in CI. Not instead of AI, but so that you can take responsibility for AI at scale.

And the two stories from that project are the punchline: the value of that gate hinges entirely on its configuration — it's only as honest as whoever wired it up. That's why I configure these tools by hand, keep them maintained over time, and deploy them deliberately to hold even AI-generated code at production grade and stop security holes before they reach a release.


If you want AI speed in your delivery without giving up the quality and security gate — and need a setup that tells the truth instead of just glowing green — let's talk.
