Pratyakṣa Context Engineering Harness
Long-context discipline for Claude Code, grounded in Vedic epistemology. Typed retrieval (Avacchedaka), conflict-resolution by sublation (bādha), witness invariants (Sākṣī), event-boundary compaction, and a 7-class hallucination taxonomy (Khyātivāda) — surfaced as 15 MCP tools, 3 skills, 3 agents, 4 slash commands, and 3 lifecycle hooks. Validated end-to-end on 7 public benchmarks plus a head-to-head SWE-bench Verified A/B.
⚡ Install in 30 seconds
Verified live on 2026-04-19. All three install paths below were
smoke-tested end-to-end (marketplace add → install → MCP handshake →
tools/list = 15 tools → mutating context_insert +
context_retrieve calls returning ok: true).
A Claude Code (recommended)
CLI · VS Code · Cursor (Claude extension) · desktop
# 0. One-time prerequisite curl -LsSf https://astral.sh/uv/install.sh | sh # 1. From any Claude Code session: /plugin marketplace add SharathSPhD/pratyaksha-context-eng-harness /plugin install pratyaksha-context-eng-harness@pratyaksha-context-eng-harness
claude plugin install → enabled, 15 tools surfacedB Local clone (offline / dev)
For air-gapped machines or when you want to hack on the plugin.
git clone https://github.com/SharathSPhD/pratyaksha-context-eng-harness.git \
~/.claude/plugins/pratyaksha-context-eng-harness
/plugin marketplace add ~/.claude/plugins/pratyaksha-context-eng-harness
/plugin install pratyaksha-context-eng-harness@pratyaksha-context-eng-harness
C Cursor / VS Code (MCP-only)
Skips skills/agents/hooks (those need Claude Code's loader); keeps the 15 MCP tools.
# ~/.cursor/mcp.json (or .vscode/mcp.json) { "mcpServers": { "pratyaksha": { "command": "uv", "args": ["run", "--no-project", "/abs/path/to/pratyaksha-context-eng-harness/mcp/server.py"] } } }
tools/list returns 15🎯 What problem does this solve?
Long context windows do not solve long-context problems. The failure modes that hurt agents in production are not "the window is too small" but topic drift, stale-claim retrieval, conflicting sources, discourse-boundary blindness, and silent confabulation. This plugin addresses each one with a discrete, auditable mechanism.
The five failure modes ↔ five mechanisms
| Failure mode | Mechanism | MCP tools | Hypothesis |
|---|---|---|---|
| Topic drift in retrieval | Avacchedaka-typed query (qualificand · qualifier · condition) | context_insert · context_retrieve |
H1 PASS |
| Stale / contradicted claims | Sublation (bādha) — never delete, demote precision | sublate_with_evidence · context_sublate |
H4 PASS |
| Conflicting sources | Pairwise conflict detection by qualifier match | detect_conflict |
H4 PASS |
| System-prompt drift | Sākṣī (witness) as a real system field |
set_sakshi · get_sakshi |
H5 PASS |
| Discourse-boundary blindness | Surprise-spike compaction at event boundaries | boundary_compact · compact |
H3 PASS |
| Silent confabulation | Khyātivāda 7-class typed error taxonomy | classify_khyativada |
H6 PASS |
| Token-budget blindness | Local cost ledger + advisory PreToolUse hook | budget_status · budget_record |
H7 PASS |
Three traditions, one harness
🤖 LLM engineering
FastMCP servers · prompt-cache-aware system fields · KV-cache friendly retrieval ordering · token-exact budget accounting via tiktoken o200k_base.
🧠 Cognitive neuroscience
Complementary Learning Systems (McClelland 1995) → store as hippocampus, window as neocortex. Event Segmentation Theory (Zacks 2007) → compact at prediction-failure boundaries, not arbitrary thresholds.
🕉️ Vedic epistemology
Navya-Nyāya → Avacchedaka typed limitors. Advaita Vedānta → bādha (sublation) and Sākṣī (witness). Nyāya error taxonomy → Khyātivāda 6+1-class hallucination ontology.
🏗️ System architecture
End-to-end flow: a user message enters Claude Code, the
Sākṣī invariant is injected as a real system
field, the Manas subagent drafts using typed retrieval, the
Buddhi subagent verifies and sublates contradictions, the
EventBoundaryCompactor compresses past turns at
surprise spikes, and the budget hook nudges before any
call would push you over your local token gauge.
┌─────────────────────────────────────────────────────────┐
user msg ───▶ │ Claude Code session # CLI / VS Code / Cursor / desktop
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Sākṣī (witness) prefix ≤500 tokens, stable │ │
│ │ pushed as REAL `system` field at every turn │ │
│ └──────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────┐ │
│ │ Manas fast / intuitive draft │ │
│ │ uses context_retrieve typed │ │
│ │ sets needs_buddhi if uncertain │ │
│ └─────────────────┬───────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────┐ │
│ │ Buddhi slow / deliberate │ │
│ │ re-fetches evidence │ │
│ │ sublate_with_evidence │ │
│ │ on contradiction │ │
│ └─────────────────┬───────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────┐ │
│ │ Pratyakṣa MCP server (FastMCP, stdio) │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ Avacchedaka store (in-process) │ │ │
│ │ │ (qualificand, qualifier, cond) │ │ │
│ │ │ precision ∈ [0,1], sublated_by │ │ │
│ │ └──────────────────────────────────┘ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ EventBoundaryCompactor │ │ │
│ │ │ surprise-spike detection │ │ │
│ │ └──────────────────────────────────┘ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ Khyātivāda 7-class classifier │ │ │
│ │ │ (heuristic; LLM in research) │ │ │
│ │ └──────────────────────────────────┘ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ Cost ledger + audit log (JSONL) │ │ │
│ │ │ ~/.cache/pratyaksha/audit.jsonl │ │ │
│ │ └──────────────────────────────────┘ │ │
│ └────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
▲ ▲
│ │
# PreToolUse hook (advisory) # Stop hook
# warns at 90%, 100% of budget # /compact-now
# optional strict-mode = deny # nudge if >75%
🛠️ 15 MCP tools across 6 families
All tools surface under the mcp__pratyaksha_mcp__*
namespace inside Claude Code, and under the pratyaksha.*
namespace inside Cursor/VS Code MCP. Every mutating call is appended
to ~/.cache/pratyaksha/audit.jsonl for replay and forensics.
📚 Avacchedaka store · 5 tools
context_insert— typed insertioncontext_retrieve— typed querycontext_get— by idcontext_sublate— manual demotionlist_qualificands— surface inventory
⚖️ Sublation · 2 tools
sublate_with_evidence— bādha with provenancedetect_conflict— pairwise on qualifier match
Idempotent and uuid-suffixed (no ms-collision).
🗜️ Compaction · 3 tools
boundary_compact— surprise-spike scopedcompact— manual scoped collapsecontext_window— current window snapshot
👁️ Witness (Sākṣī) · 2 tools
set_sakshi— pushed assystemfieldget_sakshi— current witness + token count
≤500 tokens enforced; never inlined into user content.
🏷️ Hallucination class · 1 tool
classify_khyativada— 7-class taxonomy
Anyathā · Ātma · Akhyāti · Asat · Anirvacanīya · Viparīta · none.
💰 Budget / observability · 2 tools
budget_status— local gauge + ledger summarybudget_record— append cost entry
🔬 Research hypotheses H1–H7
All seven hypotheses validated at α = 0.05 with multi-seed (n = 5) paired permutation tests against published benchmarks. Effect sizes Cohen-d ∈ [0.43, 1.91].
system message (not inlined into user content) reduces system-prompt drift over long sessions.🧪 Three-layer validation
7 hypotheses × 5 seeds × multiple model/dataset cells against RULER, HELMET, NoCha, HaluEval, TruthfulQA, FACTS-Grounding, SWE-bench Verified. Paired permutation tests with Bonferroni-style correction across the registry. Every effect direction matches pre-registration; effect sizes Cohen-d ∈ [0.43, 1.91].
Aider/SWE-bench-style multi-turn refactor where a stale dependency claim collides with a new evidence packet. The with-harness agent solved the refactor in 4 turns (vs. 9 for the unaided baseline) and sublation fired exactly once on the contradicted dependency claim — observable in the audit log.
| Cell | With harness | Without harness | Δ | Per-cell p |
|---|---|---|---|---|
| haiku-4-5 · 8k research | 0.687 | 0.480 | +20.7 pp | 0.012 |
| haiku-4-5 · 16k research | 0.731 | 0.521 | +21.0 pp | 0.008 |
| haiku-4-5 · 32k research | 0.752 | 0.547 | +20.5 pp | 0.006 |
| sonnet-4-6 · 8k research | 0.798 | 0.575 | +22.3 pp | 0.004 |
| sonnet-4-6 · 16k research | 0.821 | 0.610 | +21.1 pp | 0.003 |
| sonnet-4-6 · 32k research | 0.840 | 0.617 | +22.3 pp | 0.002 |
| Mean Δ | 0.772 | 0.558 | +21.3 pp | Stouffer-Z p < 1e-4 |
Stouffer-Z omnibus p < 1e-4 (naive
independence) and p < 1e-3 (correlation-corrected effective-N).
Headline P6-C cell uses --research-block-budget 512 tokens
(Section 10 of the preprint); the runner additionally supports 2 K, 4 K,
8 K, 16 K, and 32 K for budget-sensitivity studies.
📡 Live install smoke test (2026-04-19)
Real terminal output captured the moment v2.0.0 was published.
No mocks, no replays — these are the actual exit codes from the
current release artifact pulled fresh from
https://github.com/SharathSPhD/pratyaksha-context-eng-harness.
1. Validate manifests
$ claude plugin validate /tmp/pratyaksha-context-eng-harness Validating marketplace manifest: .../.claude-plugin/marketplace.json ✔ Validation passed
2. Add marketplace and install (Claude Code)
$ claude plugin marketplace add SharathSPhD/pratyaksha-context-eng-harness Cloning repository: https://github.com/SharathSPhD/pratyaksha-context-eng-harness.git Clone complete, validating marketplace… ✔ Successfully added marketplace: pratyaksha-context-eng-harness $ claude plugin install pratyaksha-context-eng-harness@pratyaksha-context-eng-harness ✔ Successfully installed plugin: pratyaksha-context-eng-harness@pratyaksha-context-eng-harness (scope: user) $ claude plugin list | grep -A4 pratyaksha ❯ pratyaksha-context-eng-harness@pratyaksha-context-eng-harness Version: 2.0.0 Scope: user Status: ✔ enabled
3. Boot MCP server, list tools, run mutating call
$ uv run --no-project mcp/server.py Installed 34 packages in 112ms INFO:pratyaksha:pratyaksha MCP server starting; cache=~/.cache/pratyaksha → initialize server=pratyaksha v1.27.0 OK → tools/list 15 tools: context_insert, context_retrieve, context_get, context_sublate, list_qualificands, sublate_with_evidence, detect_conflict, compact, boundary_compact, context_window, set_sakshi, get_sakshi, classify_khyativada, budget_status, budget_record → context_insert {"id":"f1","content":"The capital of France is Paris.", "precision":0.9,"qualificand":"geography", "qualifier":"capital","condition":"country=France"} {"ok": true} → context_retrieve {"qualificand":"geography","qualifier":"capital"} {"ok": true, "count": 1} → set_sakshi "You are answering a single user…" {"ok": true, "tokens": 16} → get_sakshi {"ok": true, "tokens": 16}
4. Cursor MCP path
$ python3 smoke_cursor_mcp.py cursor mcp config valid JSON → OK command: uv run --no-project ~/.claude/plugins/cache/.../mcp/server.py initialize → server=pratyaksha v1.27.0 tools/list → 15 tools exposed Cursor MCP install path → OK
Result. Both install paths verified live. Plugin loads
cleanly into Claude Code with all 4 components (skills, agents,
commands, hooks) registered and zero loader errors. MCP server
handshake completes in < 1 s after the first 30 s
uv warm-up.
⚙️ Plugin components
3 skills
system field.3 agents
needs_buddhi: true when uncertain.4 slash commands
/context-status | store state, qualificand surface, mean precisions, Sākṣī token count, recent ledger |
/sublate <id> … | manual bādha; refuses if newer precision does not strictly exceed older |
/budget | local gauge + ledger summary; supports last <n> and reset |
/compact-now | force boundary compaction over recent window with optional threshold & qualificand filter |
3 lifecycle hooks · advisory + fail-open
SessionStart | emits one-shot guidance to bootstrap the Sākṣī |
PreToolUse | warns at ≥90% / 100% of local budget; strict mode via env var = deny |
Stop | appends a /compact-now nudge if session spent ≥75% of budget |
All hooks fail open. A missing
gauge file, missing jq, or any transient failure silently
allows the underlying tool — hooks are advisory, not gating.
✅ Test coverage & code quality
✓ Unit + integration · 502 passing
Coverage spans every MCP tool, the harness aggregator, Bayesian fusion, the Khyātivāda classifier, the budget scheduler, the EventBoundaryCompactor, and the L3 SWE-bench A/B runner.
✓ ruff check · 0 violations
Clean across experiments/v2/ and the entire
shipped plugin/ tree. Zero unused imports, zero
f-string-without-placeholders, zero broad except.
✓ Plugin-shipped tests are self-contained
mcp/smoke_test.py ships in the release artifact and
can be run by users with no extra deps.
Critical correctness fixes that landed in v2
| ID | Issue | Fix | New regression test |
|---|---|---|---|
B1 |
sublate_with_evidence non-idempotent; repeated calls created new sublators silently |
Short-circuit when older.sublated_by is set; return already_sublated |
test_sublate_with_evidence_is_idempotent |
B2 |
new_id ms-collision when two sublations land in the same millisecond |
Append uuid4().hex[:8] suffix to ms-timestamp |
test_sublate_with_evidence_no_id_collision_within_one_ms |
B3 |
context_retrieve ignored the qualifier field, over-retrieving |
Substring match in _matches; empty qualifier still means "any" |
test_retrieve_respects_qualifier |
C1 |
Silent except Exception in _count_tokens swallowed real bugs |
Narrow to (ImportError, OSError, ValueError) with debug log |
n/a (audit-trail change) |
L1 |
Plugin install failed: Duplicate hooks file detected |
Drop redundant "hooks": "./hooks/hooks.json" — auto-discovered |
verified by claude plugin install against live GitHub |
📜 Development history
docs/REVIEW.md; all critical and must-fix items resolved (B1–B3, C1, paper-code alignment).None-tolerant table loop; scientific-notation formatting for tiny/huge floats.--research-block-budget 512 tokens; runner also supports 2 K → 32 K for sensitivity studies.HypothesisSpec, HypothesisOutcome, MultiSeedRunner.o200k_base.🧱 Self-containment guarantee
The shipped plugin tree contains zero runtime
dependencies on attractor-flow, ralph-loop,
vllm, mlflow, chromadb, or any
other heavy ML stack. The only Python imports are
mcp, pydantic, and tiktoken
— all auto-installed by uv via PEP 723 inline metadata.
Verified by audit
$ grep -rE "import (attractor_flow|ralph_loop|vllm|mlflow)" \ plugin/pratyaksha-context-eng-harness/ (no matches)
The Khyātivāda classifier in
mcp/server.py is a pure-Python heuristic that mirrors
the few-shot guardrails of the project's research-time classifier;
the LLM-backed equivalent lives in the parent research repo only.
Zip artifact size
pratyaksha-context-eng-harness-v2.0.0.zip · ~46 KB (22 files, no dependencies, no binaries).
Ships in the GitHub release alongside
pratyaksha-v2-preprint.pdf (~875 KB, 59 pages) and
SHA256SUMS for integrity verification.
📊 Status & links
✓ Shipped
v2.0.0 published 2026-04-19 to SharathSPhD/pratyaksha-context-eng-harness. MIT licensed.
✓ Smoke-tested
Live install verified through both Claude Code CLI and Cursor MCP. 15 tools surface and respond to mutating calls.
✓ All hypotheses confirmed
H1–H7 with effect sizes Cohen-d ∈ [0.43, 1.91]; SWE-bench Verified A/B 6/6 cells won.
Where to go next
| Plugin repo | github.com/SharathSPhD/pratyaksha-context-eng-harness |
| v2.0.0 release | github.com/.../releases/tag/v2.0.0 |
| Plugin zip | pratyaksha-context-eng-harness-v2.0.0.zip (~46 KB) |
| v2 preprint (59 pp) | pratyaksha-v2-preprint.pdf |
| Integrity | SHA256SUMS |
| Sister projects | triz-engine · attractor-flow |
Provenance. The Avacchedaka mechanism was discovered via the triz-engine plugin: posed as the contradiction "context must be simultaneously complete and selective", the engine returned Inventive Principle 3 — Local Quality, which mapped directly onto Navya-Nyāya's typed-limitor doctrine. The development workflow used attractor-flow for multi-agent orchestration. Neither dev-time tool ships in the plugin.