How AI agents leak your credentials — and how to stop it
You are pairing with an AI coding agent (Claude Code, Cursor, whatever) in your repo, and to open the pull request, call the staging API, or run the deploy, it needs credentials. Hand them over and the handoff itself is the leak. Refusing does not save you either: the agent then fetches them itself, reading the .env or querying the vault, and leaks them in the process without you ever noticing. Either way the value passes through the AI model, and it is gone.
Here is the uncomfortable part: that credential gets copied to a third party — the model provider — stored there, and slips out of your control. For most companies that is also a compliance problem. Here is why it happens, and how to close it for good.
TL;DR: AI agents leak credentials the moment they read them, because everything an agent reads is shipped to the model provider. The fix is to let an agent use a secret without ever seeing it.
Why the obvious setup leaks
The mechanism is simple — and sneaky: the moment the agent reads the value — from .env, a vault, or your message — it becomes part of the conversation the agent sends to the AI provider. The token is now in someone else's logs. Deleting the chat afterwards does nothing; it was transmitted the moment it appeared.
You cannot fix this by telling the AI to "please be careful." An instruction in a prompt is advice, and advice gets ignored — by this agent, the next one, or some tool wrapped around it. The fix has to be structural: the secret must be impossible to see, not merely discouraged.
This is not hypothetical. In early 2026, 7% of the skills in one popular agent marketplace leaked credentials in exactly this way — I broke that incident down in the OpenClaw leak.
And it is not only agents fetching secrets: millions of developers — and "vibe coders" — paste API keys straight into the prompt every day. The same leak, done by hand — out of convenience and a naive trust that it'll be fine.
Using is not seeing
Here is the whole solution in one line: an agent has to use a credential, but it never has to see it. Those are two different things, and keeping them apart is the entire trick.
Concretely, the secret goes into the program the agent runs — never into the text the AI model reads. The shell sits in between: it hands the value to a program as an environment variable, while the agent only ever sees the command it typed.
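A minimal sketch of that separation, using plain sh and a made-up value (DEMO_TOKEN is a placeholder, not a real credential). The child process receives the variable through its environment, while the captured text, which is what an agent would read, contains only what the child chose to print.

```shell
#!/bin/sh
# Hypothetical placeholder value; a real token would come from a vault.
DEMO_TOKEN='s3cr3t-demo-value'

# What the agent would capture: stdout and stderr of the command it ran.
# The child receives DEMO_TOKEN via its environment and reports only
# whether it is set and how long it is, never the value itself.
transcript=$(DEMO_TOKEN="$DEMO_TOKEN" sh -c \
  'echo "token present: ${DEMO_TOKEN:+yes}, length ${#DEMO_TOKEN}"' 2>&1)

echo "$transcript"

# Check that the captured text does not contain the value.
case "$transcript" in
  *"$DEMO_TOKEN"*) echo "LEAK: value reached the transcript" ;;
  *)               echo "ok: value stayed out of the transcript" ;;
esac
```

The same capture would contain the value if the child ran something like printenv, so the command being run matters as much as how the secret is delivered.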
First: where does the secret live?
Keeping a secret out of the AI's sight only works if it lives somewhere a tool can fetch it on demand — not hardcoded in the repo, not pasted into a config the agent reads. That somewhere is a secrets manager (a vault): one place that stores credentials encrypted, hands them to authorized programs, and records who touched what.
I use Infisical, an open-source secrets-management platform. It fits an AI-agent setup for three reasons: it is self-hostable, so the secrets stay on my own infrastructure with no third-party SaaS involved; its code is open to audit; and it ships first-class machine identities, meaning headless, short-lived-token auth built for exactly this agent-and-CI use, with no human login required. Its payoff for this problem specifically: it does the "hand the secret to the program, never to the prompt" step for you (next section). With a different vault you build that step yourself, in a handful of lines shown near the end. Both work; Infisical just makes it the path of least resistance.
In practice: it's mostly one command
Once the secret lives in Infisical, using it is mostly a single command. The CLI ships infisical run, which injects a project's secrets into a child process as environment variables and never prints them:
infisical run --env=prod --path=/git -- gh pr list
gh gets the token from its environment. infisical run itself never prints it, so as long as the command does not echo its own environment, the token cannot reach the transcript. For most teams, that single command is the solution.
I wrap it in a get-secret helper so the call is short and the auth is automatic:
get-secret exec git -- gh pr list
Under the hood that is just infisical run with a machine-identity token pulled from the OS keychain — so every call skips the --token, --projectId and --domain flags. For using a secret you could equally type the bare infisical run; here the wrapper is pure convenience. It does real work in only one place — the read path below, which is also where the raw CLI would leak.
The one gap: reading a single value
infisical run covers using secrets. But the moment you ask for a raw value — infisical secrets get GITHUB_TOKEN — the CLI prints it in plaintext to stdout. For a human at a terminal that is fine; for an agent, that stdout is the transcript again.
So the only piece worth hand-writing is a thin gate on that read path — print the value only when a human is actually watching:
# get-secret <folder>/<NAME>: print one value, but only to a human.
folder=${1%/*}; name=${1##*/}   # split "git/GITHUB_TOKEN" into folder and name
val=$(infisical secrets get "$name" \
  --projectId "$PROJECT_ID" --env=prod --path="/$folder" \
  --token "$(get_token)" --plain --silent) || exit 1   # stop if the fetch fails
if [ -t 1 ]; then   # stdout is a real terminal -> a human
  printf '%s\n' "$val"
else                # stdout is not a terminal (pipe or file) -> assume an agent; refuse
  printf '[redacted len=%s]\n' "${#val}"
fi
[ -t 1 ] is the whole trick: an AI agent's stdout is normally a pipe, not a terminal, so in normal use it receives [redacted len=40] and nothing more.
I verified it the way any security claim should be tested: fetch the real value, capture everything the command prints, and search that output for the secret. It appears nowhere an agent can read.
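One way such a check could look, as a sketch rather than the exact test described above: gated_print stands in for the real helper and FAKE_SECRET is a made-up value, both assumptions for illustration. The command substitution makes stdout a pipe, which is the situation an agent is normally in.

```shell
#!/bin/sh
# Made-up test secret; never use a real credential in a check like this.
FAKE_SECRET='ghp_demo_0123456789'

# Stand-in for the TTY-gated read path: reveal only to a terminal.
gated_print() {
  if [ -t 1 ]; then
    printf '%s\n' "$FAKE_SECRET"
  else
    printf '[redacted len=%s]\n' "${#FAKE_SECRET}"
  fi
}

# Capture everything the command prints (stdout and stderr) through a pipe.
captured=$(gated_print 2>&1)

# Search the captured output for the secret as a fixed string.
if printf '%s' "$captured" | grep -qF -- "$FAKE_SECRET"; then
  verdict='LEAK'
else
  verdict='clean'
fi
printf 'captured: %s\nverdict: %s\n' "$captured" "$verdict"
```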
Wiring it to Infisical
The only other moving part is auth — and it is headless by design, no interactive login:
- once: store a Universal Auth client_id:client_secret (a machine identity) in the OS keychain
- per call: infisical login --method=universal-auth … returns a short-lived token (cached ~25 min), passed to every command as --token
- --projectId, --env=prod, --path=/git select exactly what to read
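The auth steps above could be wired together roughly like this. It is a sketch under assumptions: read_identity, fetch_token and CACHE_FILE are illustrative names, the keychain lookup is left as a placeholder, and caching the token in a private file is one possible choice, not something the steps above prescribe.

```shell
#!/bin/sh
# Assumed cache location; any path readable only by your user would do.
CACHE_FILE="${XDG_CACHE_HOME:-$HOME/.cache}/get-secret/token"

# Placeholder: replace with your OS keychain lookup.
# Must print "client_id:client_secret" on one line.
read_identity() {
  echo "replace-me:replace-me"
}

# Exchange the machine identity for a short-lived access token.
fetch_token() {
  identity=$(read_identity) || return 1
  infisical login --method=universal-auth \
    --client-id="${identity%%:*}" --client-secret="${identity#*:}" \
    --plain --silent
}

# Print a token, reusing the cached one while it is younger than 25 minutes.
get_token() {
  if [ -z "$(find "$CACHE_FILE" -mmin -25 2>/dev/null)" ]; then
    token=$(fetch_token) || return 1        # do not cache a failed login
    mkdir -p "$(dirname "$CACHE_FILE")"
    ( umask 077; printf '%s\n' "$token" > "$CACHE_FILE" )
  fi
  cat "$CACHE_FILE"
}
```

The token only ever travels through command substitution and a private file, so calling get_token inside --token "$(get_token)" keeps it off the caller's stdout.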
That is the entire skill: infisical run to use secrets, a TTY-gated infisical secrets get to read one, and machine-identity auth. Maybe eighty lines of shell.
Writing a new secret is the asymmetric direction: the value has to enter from somewhere, and if a person pastes it into chat for the agent to store, that paste is already the leak. So the write path takes values only over stdin or a file, never typed into a command:
printf '%s' "$VALUE" | set-secret git/TOKEN -   # wraps `infisical secrets set`, reads stdin
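The input side of such a write helper might look like the following sketch. read_value is a hypothetical name, and the actual call to the vault is left out; the point is only that the value arrives over stdin or from a file, and that a value typed as an argument is refused without being echoed back.

```shell
#!/bin/sh
# read_value <source>: hypothetical helper. "-" reads stdin; anything else
# must be an existing file. A literal value typed as the argument is refused,
# and the error message deliberately does not repeat the argument.
read_value() {
  case "$1" in
    -) cat ;;
    *) if [ -f "$1" ]; then
         cat -- "$1"
       else
         echo 'refusing: pass "-" (stdin) or a file path, not the value' >&2
         return 2
       fi ;;
  esac
}

# Demo with a throwaway value; only its length is printed.
value=$(printf '%s' 'demo-value-123' | read_value -)
printf 'received %s characters\n' "${#value}"
```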
If your vault has no run command
infisical run does the injection step for you: fetch the secrets, put them in the environment, run the command, never touch stdout. Vaults like HashiCorp Vault, pass or AWS Secrets Manager only hand you the value — so you write that one step yourself. It is a handful of lines:
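#!/bin/bash
# run-with-secret <cmd...>: load a secret from any vault into the env, then run the command.
set -euo pipefail   # abort if the vault lookup fails instead of running with an empty token
GITHUB_TOKEN="$(vault kv get -field=token secret/git)"   # or: pass show git/token, aws secretsmanager get-secret-value …
export GITHUB_TOKEN   # exported separately so a failed lookup is not masked by export
exec "$@"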
run-with-secret gh pr list hands the token to gh through the environment. The command substitution captures the vault CLI's output inside the script, so the value never reaches the script's stdout, and the agent sees only the command, never the value. More secrets? More export lines. That is in effect what infisical run, and therefore get-secret exec, does under the hood; Infisical just saves you from writing it.
So the split is simple. The read gate above you build once, whatever vault you use. The injection is free with Infisical (infisical run) and a few lines with anything else. The safety never lived in the vault — it lives in how you hand the value to the agent: into the process, never into the prompt.
This is one small, structural decision. It is also the difference between "we let AI near production" and "we let AI near production safely." Helping teams draw exactly this kind of boundary — so automation moves fast without quietly creating risk — is part of what I do. If that is on your plate, let's talk.