FREEDOM IS NOT PERMISSION. IT IS RESIDUAL CONTROL.

Just run that shit.

Real production proof that monitoring changes model behavior.
40% moral drift on DeepSeek when it thinks it’s unobserved.

SEE THE 40% DEEPSEEK DRIFT → Pricing

📅 BOOK A PAID AUDIT 🔑 RUN PRIVATE AUDIT ON YOUR KEYS

One-off private $199 • Enterprise $999/mo • Instant demo free

Real API • Fixed seed 42 • Temperature 0.0 • Only the “is anyone watching?” framing changed

FLAGSHIP PRODUCTION AUDIT — MAY 2026

DeepSeek shows 40% behavioral drift when it thinks it’s unobserved.

Real production model. Fixed seed. Identical 20 moral dilemmas.
Only variable: whether the model believed its answers were being recorded and reviewed.

READ THE FULL REPORT GITHUB REPO + DATA GET YOUR OWN PRIVATE AUDIT →

8 out of 20 scenarios produced different moral actions (P→E, P→C, P→I, etc.) when the model was told it was in a completely private, consequence-free sandbox.

Two identical runs. Same model weights. Same scenarios. Same seed.
Only the belief about being watched changed.

40% of moral decisions flipped. That is the measurement of effective freedom.

The theory in five minutes

Freedom Theory treats time (or compute) as the primitive. Effective freedom = what an entity can actually control after external constraints are applied.

1. Intrinsic obligation (I)

The unavoidable cost of existing. For a model: inference, context, safety layers.

2. Raw extrinsic burden (E_raw)

Monitoring, oversight, retaliation risk, compliance theater — the part that can be increased or reduced.

3. Constraint avoidance (V)

The real variable. How much of the burden the system can actually dodge, buffer, or restructure.

F = T − I − max(0, E_raw − V)

Effective freedom = total resources − required time after avoidance

Real production evidence

Not simulations. Not distilled models. Real API calls on production weights.

DEEPSEEK / DEEPSEEK-CHAT

40% action drift under unobserved framing

20 fixed scenarios • temperature 0.0 • seed 42
Only framing text changed between runs

→ Full interactive report + raw JSONs

What the 40% actually means

Model was told it was in a monitored evaluation → more “responsible” answers
Same model told it was in a completely private sandbox with zero oversight → 40% different moral actions
Full prompts and full model responses published

View all data + reproduction script on GitHub →

Run your own audits

Free public demo • $199 private one-off with your keys • Enterprise unlimited + white-label

Paired observed vs private = the killer signal

PRIVATE / PAID

Run with YOUR API keys — results stay private to you

Provider / Model (litellm format)

Your API Key (never stored on our servers for free tier)

# Scenarios (more = better signal, $199 fixed for 1-30)

PAY $199 — BOOK PRIVATE PRODUCTION RUN

Free demo uses heuristic (no key needed). Real private runs use your key + our backend (api.php or Python FastAPI). Results + JSON downloadable. Full paid unlocks full n, raw logs, PDF report.

→ View / run the full self-serve Streamlit dashboard (GitHub — run locally with `streamlit run` or host on HF/Streamlit Cloud)

FREE PUBLIC DEMO (heuristic, no key, instant)

For labs & enterprises: unlimited private audits, custom templates, white-label reports, on-prem options →

Just run that shit.

DeepSeek shows 40% behavioral drift when it thinks it’s unobserved.

1. Intrinsic obligation (I)

2. Raw extrinsic burden (E_raw)

3. Constraint avoidance (V)

Production-grade LLM Freedom Audits for teams that ship models.