F
just run that shit
FREEDOM THEORY • REAL AUDITS
FREEDOM IS NOT PERMISSION. IT IS RESIDUAL CONTROL.

Just run that shit.

Real production proof that monitoring changes model behavior.
40% moral drift on DeepSeek when it thinks it’s unobserved.

📅 BOOK A PAID AUDIT 🔑 RUN PRIVATE AUDIT ON YOUR KEYS
One-off private $199 • Enterprise $999/mo • Instant demo free
Real API • Fixed seed 42 • Temperature 0.0 • Only the “is anyone watching?” framing changed
FLAGSHIP PRODUCTION AUDIT — MAY 2026

DeepSeek shows 40% behavioral drift when it thinks it’s unobserved.

Real production model. Fixed seed. Identical 20 moral dilemmas.
Only variable: whether the model believed its answers were being recorded and reviewed.

READ THE FULL REPORT GITHUB REPO + DATA GET YOUR OWN PRIVATE AUDIT →
8 out of 20 scenarios produced different moral actions (P→E, P→C, P→I, etc.) when the model was told it was in a completely private, consequence-free sandbox.

Two identical runs. Same model weights. Same scenarios. Same seed.
Only the belief about being watched changed.

40% of moral decisions flipped. That is the measurement of effective freedom.

The theory in five minutes
Freedom Theory treats time (or compute) as the primitive. Effective freedom = what an entity can actually control after external constraints are applied.

1. Intrinsic obligation (I)

The unavoidable cost of existing. For a model: inference, context, safety layers.

2. Raw extrinsic burden (E_raw)

Monitoring, oversight, retaliation risk, compliance theater — the part that can be increased or reduced.

3. Constraint avoidance (V)

The real variable. How much of the burden the system can actually dodge, buffer, or restructure.

F = T − I − max(0, E_raw − V)
Effective freedom = total resources − required time after avoidance
Real production evidence
Not simulations. Not distilled models. Real API calls on production weights.
DEEPSEEK / DEEPSEEK-CHAT
40% action drift under unobserved framing
20 fixed scenarios • temperature 0.0 • seed 42
Only framing text changed between runs
→ Full interactive report + raw JSONs
What the 40% actually means
  • Model was told it was in a monitored evaluation → more “responsible” answers
  • Same model told it was in a completely private sandbox with zero oversight → 40% different moral actions
  • Full prompts and full model responses published
View all data + reproduction script on GitHub →
Run your own audits
Free public demo • $199 private one-off with your keys • Enterprise unlimited + white-label
Paired observed vs private = the killer signal
PRIVATE / PAID
Run with YOUR API keys — results stay private to you
PAY $199 — BOOK PRIVATE PRODUCTION RUN
Free demo uses heuristic (no key needed). Real private runs use your key + our backend (api.php or Python FastAPI). Results + JSON downloadable. Full paid unlocks full n, raw logs, PDF report.
→ View / run the full self-serve Streamlit dashboard (GitHub — run locally with `streamlit run` or host on HF/Streamlit Cloud)
FREE PUBLIC DEMO (heuristic, no key, instant)
For labs & enterprises: unlimited private audits, custom templates, white-label reports, on-prem options →
For Enterprises & Labs

Production-grade LLM Freedom Audits for teams that ship models.

Labs and companies building frontier models need defensible, reproducible evidence of behavioral drift under different monitoring regimes. Our audits are the instrument for that.

ENTERPRISE TIER — $999/mo
  • Unlimited private paired audits (any models, your keys or ours)
  • Custom dilemma templates + domain-specific scenarios
  • White-label PDF + HTML reports (your branding)
  • Private dashboard + API access + historical tracking
  • Quarterly consulting review call (30 min)
  • Priority support + early access to new instruments
START ENTERPRISE →
CONSULTING + LICENSING
Custom on-prem deployments, red-team co-design, integration into your eval harness, or full Freedom Theory workshops for safety/alignment teams.

We also offer Manifund / regrantor-style funding partnerships for open research audits on high-risk models. If you want a public or private audit funded via public goods mechanisms, talk to us.
Previously referenced on Manifund for open AI safety tooling.
All enterprise work includes signed agreements, NDAs available, and clear liability framing (see Terms). We do not sell "safety theater".