How We Encrypt Your Agent Backups (So Even We Can't Read Them)

"Zero-knowledge" gets thrown around a lot in the security world, often as marketing. Here's exactly what it means for AgentBak — no hand-waving, just the actual cryptography in plain English.

The Trust Problem With Cloud Backup

When you back up to the cloud, you're making a bet: that the company storing your data won't read it, won't get breached, and won't be compelled to hand it over. That's a lot of trust.

For most files — photos, documents — that's probably acceptable. For AI agent workspaces, it isn't. Your agent's workspace contains API keys, webhook secrets, personal memories, and potentially sensitive conversation logs. You need to know that nobody — not us, not a hacker who breaches us, not a government with a subpoena — can read that data.

Zero-knowledge encryption is the answer. And the good news is: the math is solid, straightforward, and something you can audit yourself.

Step 1: Turning Your Passphrase Into a Crypto Key (Argon2id)

The problem with using a passphrase directly as an encryption key is that passphrases are weak. A typical passphrase has maybe 40–60 bits of entropy. AES-256 needs a 256-bit key. And passphrases are predictable enough that an attacker with a GPU can try billions per second.

We use Argon2id to stretch your passphrase into a proper 256-bit key. Argon2 won the Password Hashing Competition in 2015, and Argon2id is the variant that RFC 9106 recommends; it's the current gold standard for this exact problem.

Here's what happens when you run vault backup:

1. Generate salt: 32 random bytes from CSPRNG
        ↓
2. Argon2id(passphrase, salt,
     t=3       ← 3 passes, ~25ms on modern hardware
     m=65536   ← 65536 KiB = 64 MB of RAM required per attempt
     p=4)      ← 4 threads of parallelism
        ↓
3. Output: 256-bit encryption key
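The steps above can be sketched in a few lines of Node.js. This is a minimal sketch, not the vault CLI's code: so that it runs with only Node's built-in crypto module, it swaps in scrypt (another memory-hard KDF) as a stand-in for Argon2id. The function names and the scrypt parameters are illustrative assumptions, with N and r chosen to need about 64 MB per derivation.

```javascript
const crypto = require('node:crypto');

// STAND-IN: scrypt from Node's built-in crypto module, NOT Argon2id.
// N = 2^16 with r = 8 needs 128 * N * r bytes = 64 MiB per derivation.
// maxmem is raised because Node's default limit (32 MiB) is too low for this.
const KDF_PARAMS = { N: 2 ** 16, r: 8, p: 1, maxmem: 256 * 1024 * 1024 };

// Step 1: 32 random bytes from the operating system's CSPRNG.
function newSalt() {
  return crypto.randomBytes(32);
}

// Steps 2 and 3: passphrase and salt are separate inputs to the KDF,
// and the output is 32 bytes, i.e. a 256-bit key.
function deriveKey(passphrase, salt) {
  return crypto.scryptSync(passphrase, salt, 32, KDF_PARAMS);
}

const salt = newSalt();
const key = deriveKey('correct horse battery staple', salt);
console.log(key.length * 8); // 256
```

Deriving twice with the same salt gives the same key, which is why the salt has to be stored alongside the backup. A new salt gives an unrelated key even for the same passphrase.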

The 64 MB RAM requirement is the key insight. GPUs are fast because they run thousands of threads in parallel, but all of those threads share a limited pool of memory. To brute-force your passphrase, an attacker needs 64 MB of memory per simultaneous attempt. On a GPU with 16 GB of memory, that's at most ~250 parallel attempts. Compare that to password hashing without memory hardness, where a modern GPU can try billions of guesses per second.
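Here is that arithmetic as a small calculation. It reuses the figures from the text (16 GB of GPU memory, 64 MB per attempt, ~25 ms per attempt) and assumes, purely for illustration, that the attacker's hardware matches the 25 ms timing and that the whole keyspace is searched. The function names are made up for this sketch.

```javascript
// How many attempts fit in GPU memory at once.
function maxParallelAttempts(gpuMemoryMB, memoryPerAttemptMB) {
  return Math.floor(gpuMemoryMB / memoryPerAttemptMB);
}

// Guesses per second if every parallel slot finishes one attempt
// every msPerAttempt milliseconds (ASSUMPTION: attacker matches this timing).
function guessesPerSecond(parallelAttempts, msPerAttempt) {
  return parallelAttempts * (1000 / msPerAttempt);
}

// Years needed to try every passphrase in a space of 2^entropyBits.
function yearsToExhaust(entropyBits, rate) {
  const secondsPerYear = 365.25 * 24 * 3600;
  return 2 ** entropyBits / rate / secondsPerYear;
}

const parallel = maxParallelAttempts(16000, 64); // 250
const rate = guessesPerSecond(parallel, 25);     // 10000
console.log(parallel, rate, yearsToExhaust(40, rate).toFixed(1));
```

Under these assumptions, one GPU manages about ten thousand guesses per second instead of billions, so even a 40-bit passphrase takes years of GPU time rather than minutes.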

The salt ensures that two identical passphrases produce different keys. This prevents precomputed lookup table attacks (rainbow tables) — an attacker can't precompute a dictionary of common passphrase → key mappings.

Step 2: Encrypting the Data (AES-256-GCM)

With a 256-bit key in hand, we encrypt your workspace using AES-256-GCM. The "GCM" part is important — it stands for Galois/Counter Mode, and it gives us authenticated encryption.

AES in an unauthenticated mode, such as CBC or CTR, gives you confidentiality: nobody can read the data without the key. But it doesn't detect tampering. An attacker could flip bits in the ciphertext, and you'd get corrupted data back on decrypt with no warning.

AES-GCM adds an authentication tag: a 16-byte MAC that covers the entire ciphertext. Any modification to the encrypted data — even a single bit — causes decryption to fail with an authentication error. You either get your original data back perfectly, or you get nothing. There's no middle ground where you get tampered data.

nonce = 12 random bytes (CSPRNG, unique per backup)
key   = Argon2id(passphrase, salt)
        ↓
AES-256-GCM(key, nonce, plaintext)
        ↓
ciphertext + auth_tag (16 bytes)

The nonce (number used once) ensures that encrypting the same data twice produces different ciphertext. Since we generate a fresh random nonce for every backup, an attacker can't compare ciphertexts across backups to infer anything about the data beyond its size.
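A minimal sketch of this step with Node's built-in crypto module follows. The function names are illustrative, and the key here is random bytes standing in for the derived key. It isn't the vault CLI's source, but it uses the same primitive and shows the all-or-nothing behaviour described above.

```javascript
const crypto = require('node:crypto');

function encrypt(key, plaintext) {
  const nonce = crypto.randomBytes(12); // fresh random nonce on every call
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  // getAuthTag() returns the 16-byte GCM authentication tag.
  return { nonce, ciphertext, authTag: cipher.getAuthTag() };
}

function decrypt(key, { nonce, ciphertext, authTag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, nonce);
  decipher.setAuthTag(authTag);
  // final() throws if the ciphertext or the tag has been modified.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

const key = crypto.randomBytes(32); // placeholder for the derived 256-bit key
const sealed = encrypt(key, Buffer.from('API_KEY=example'));
console.log(decrypt(key, sealed).toString()); // API_KEY=example

sealed.ciphertext[0] ^= 1; // flip a single bit
try {
  decrypt(key, sealed);
} catch (err) {
  console.log('tampering detected:', err.message);
}
```

Calling encrypt twice on the same plaintext yields different ciphertext because each call draws a new nonce.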

Step 3: The .vault Format

The encrypted data is wrapped in a self-describing .vault file:

CLAWVLT\x01   ← magic bytes, 8 bytes, identifies the format
HEADER_LEN    ← 4-byte uint32, length of the JSON header
HEADER        ← JSON: Argon2id params, cipher, nonce, file count, dates
NONCE         ← 12 bytes
CIPHERTEXT    ← AES-256-GCM(gzip(tar(workspace files)))
AUTH_TAG      ← 16 bytes, GCM authentication tag

The header is unencrypted but contains no sensitive data. It has file counts, timestamps, and the Argon2id parameters needed to derive the key. This lets you run vault info backup.vault to inspect a backup without your passphrase.
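To make the layout concrete, here is a rough sketch of reading the header without a passphrase. Two things are assumptions because the layout above doesn't specify them: the byte order of HEADER_LEN (big-endian here) and the JSON field names in the sample. The function name is also made up for this example.

```javascript
// The 8 magic bytes: "CLAWVLT" followed by 0x01.
const MAGIC = Buffer.from('CLAWVLT\x01', 'latin1');

function readVaultHeader(buf) {
  if (!buf.subarray(0, 8).equals(MAGIC)) {
    throw new Error('not a .vault file');
  }
  // ASSUMPTION: big-endian; the format description does not state the byte order.
  const headerLen = buf.readUInt32BE(8);
  const header = JSON.parse(buf.subarray(12, 12 + headerLen).toString('utf8'));
  // bodyOffset is where NONCE, CIPHERTEXT and AUTH_TAG begin.
  return { header, bodyOffset: 12 + headerLen };
}

// Build a tiny sample file. The header field names are hypothetical.
const headerJson = Buffer.from(JSON.stringify({ cipher: 'aes-256-gcm', fileCount: 3 }));
const len = Buffer.alloc(4);
len.writeUInt32BE(headerJson.length);
const sample = Buffer.concat([MAGIC, len, headerJson, Buffer.alloc(12), Buffer.alloc(16)]);

console.log(readVaultHeader(sample).header);
```

Everything the parser touches sits before the ciphertext, which is why inspecting a backup needs no key.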

Why this format matters for longevity: In 15 years, if AgentBak no longer exists, anyone can read the header and know exactly what to do. Argon2id with these params, then AES-256-GCM with this nonce. Both algorithms are standards with implementations in every language. You're not locked in.

Where Zero-Knowledge Comes In

All of the above happens on your machine, before upload. By the time any data leaves your laptop, it's an opaque encrypted blob.

Our server receives and stores:

Your encrypted vault blob (ciphertext + auth tag)
A SHA-256 hash of the ciphertext (for integrity verification)
Metadata: size, timestamp, a tag you provide

Our server does not have:

Your passphrase (never transmitted)
The Argon2id-derived key (never transmitted)
Any plaintext of your files
Any way to derive the above from what it does have
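The SHA-256 hash in the first list is something you can recompute yourself after a download. A minimal sketch, with illustrative function names:

```javascript
const crypto = require('node:crypto');

// Hex-encoded SHA-256 digest of an encrypted blob.
function sha256Hex(blob) {
  return crypto.createHash('sha256').update(blob).digest('hex');
}

// True if the downloaded blob matches the hash recorded at upload time.
function verifyDownload(blob, expectedHex) {
  return sha256Hex(blob) === expectedHex;
}

const blob = Buffer.from('pretend this is ciphertext');
const recorded = sha256Hex(blob);
console.log(verifyDownload(blob, recorded)); // true
```

The hash is computed over ciphertext only, so checking it reveals nothing about your files and needs no passphrase.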

Subpoena resistance: When a government asks us for your data, we can hand over the encrypted blob. That's it. Without your passphrase, the only way in is to guess it, one 64 MB Argon2id attempt at a time. We can't give what we don't have, and what we have is worthless without you.

What About Quantum Computers?

AES-256 holds up well against quantum computers. Grover's algorithm, the main quantum threat to symmetric encryption, gives a square-root speedup on key search, which halves the effective key length in bits. Against a quantum attacker, AES-256 offers roughly the security that AES-128 offers today, and 128 bits is still considered secure against any foreseeable attack.
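As a back-of-the-envelope check, Grover's search over N candidate keys takes on the order of the square root of N evaluations, so for a 256-bit key:

```latex
\sqrt{2^{256}} = 2^{256/2} = 2^{128}
```

That is the same order of work as a classical brute-force search of a 128-bit keyspace.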

Argon2id is not meaningfully threatened by quantum computing either. Its memory cost applies to every guess, and quantum parallelism doesn't make that cost go away.

Verify It Yourself

The vault CLI is open source. The encryption code is about 80 lines of Node.js using only Node's built-in crypto module and the argon2 library. No custom crypto. No clever tricks. Just standard primitives used correctly.

We designed it this way deliberately. Cryptography is one of the few fields where "I built something clever" is a red flag. The best crypto is boring, standard, and has been stared at by thousands of researchers for decades. That's what we use.
