@mukundakatta/agentbudget - Token + dollar budget caps for AI agents; throws when an LLM call would push past the ceiling. Zero deps, drop into any provider SDK.
@mukundakatta/agentcast - Structured output for any LLM call. Validate the model's response, retry with the validation error as feedback, return typed data or throw after N attempts. Bring your own validator (zod/valibot/JSON Schema/anything). Zero deps.
@mukundakatta/agentcast-mcp - MCP server: structured-output enforcer. Extract JSON from messy LLM output, validate against a shape spec, and produce retry feedback when the model returns the wrong shape. Wraps @mukundakatta/agentcast.
@mukundakatta/agentfit - Fit your messages into the LLM context window. Token-aware truncation with multiple strategies (drop-oldest, drop-middle, priority), pluggable tokenizers, zero dependencies.
@mukundakatta/agentfit-mcp - MCP server for agentfit: token-aware message truncation. Fit a chat history into the model's context budget with drop-oldest, drop-middle, or priority strategies.
@mukundakatta/agentguard - Network egress firewall for AI agents: declarative allowlist of domains an agent's tools can fetch, throws or 403s on violation. Zero dependencies.
@mukundakatta/agentguard-mcp - MCP server for agentguard: declarative network-egress firewall for agent tools. Check whether a URL is allowed under a policy before any fetch.
@mukundakatta/agentkit - The agent reliability stack: fit, guard, snap, vet, cast. One install for all five sibling libraries (token-aware truncation, network-egress firewall, snapshot tests, tool-arg validation, structured-output enforcer).
@mukundakatta/agentmemory - Honest pull-model alternative to Anthropic Dreaming. Time-bucketed episodic store + on-demand summarizer for LLM agents. Reversible deletes, no silent context injection, no background consolidation. Includes Postgres adapter and Claude demo.
@mukundakatta/agentsnap - Snapshot tests for AI agents: record tool-call traces, diff against baselines, fail CI on regressions. Zero dependencies, drops into any test runner.
@mukundakatta/agentsnap-mcp - MCP server: snapshot tests for tool-call traces. Capture, normalize, and diff agent tool-use traces to catch silent regressions. Wraps @mukundakatta/agentsnap for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/agentvet - Validate LLM-generated tool args before execution. Wrap your tools with a schema; throws ToolArgError with an LLM-friendly retry hint when the model hallucinates wrong types. Zero deps.
@mukundakatta/agentvet-mcp - MCP server for agentvet: validate LLM-generated tool-call args before execution. Returns a typed error with an LLM-friendly retry hint when the args don't match.
@mukundakatta/citecite-mcp - MCP server for RAG citation marker [1] [2] injection, parsing, and stripping. Helps LLMs round-trip citations in retrieval-augmented outputs.
@mukundakatta/embspec-mcp - MCP server: embedding pipeline ops + drift detection for production RAG. Assert query encoder matches index manifest; compute retriever stability between two index versions on a frozen probe set.
@mukundakatta/emoji-mcp - MCP server: convert between emoji characters and shortcodes, with metadata.
@mukundakatta/encoding-mcp - MCP server: encode and decode HTML entities, URL-encoding, and Unicode escapes.
@mukundakatta/escape-mcp - MCP server: escape strings for safe use inside regex, shell, SQL, JSON, or HTML.
@mukundakatta/llm-output-sanitizer-mcp - MCP server exposing llm-output-sanitizer: strip dangerous HTML, SQL, and shell snippets from LLM output before rendering, executing, or persisting. Built for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/maskprompt-mcp - MCP server: redact PII (emails, phones, credit cards, SSNs, IPs, AWS keys, GitHub tokens, JWTs) from text before it leaves your control. Built for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/mathexpr-mcp - MCP server: evaluate mathematical expressions (mathjs-backed) with units and constants.
@mukundakatta/mcpcheck - Validate MCP (Model Context Protocol) config files for Claude, Cursor, Cline, Windsurf, and Zed. CLI + GitHub Action with SARIF output.
@mukundakatta/mdtable-mcp - MCP server: convert JSON arrays of objects to GitHub-flavored Markdown tables.
@mukundakatta/mime-mcp - MCP server: look up MIME content types from file extensions and vice versa.
@mukundakatta/pii-sentry-mcp - MCP server exposing pii-sentry: detect and redact PII (emails, phones, SSNs, credit cards, API keys) from text before AI processing. Built for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/pluralize-mcp - MCP server: convert English words between singular and plural; handles irregulars.
@mukundakatta/prompt-injection-shield-mcp - MCP server exposing prompt-injection-shield: scan untrusted text for prompt-injection signals, score risk, strip dangerous lines. Built for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/roman-mcp - MCP server: convert between Arabic numerals and Roman numerals (1-3999).
@mukundakatta/secretsniff-mcp - MCP server: scan text/code for accidentally-committed secrets (AWS/GitHub/Slack/Stripe keys, JWTs, RSA private keys, and high-entropy strings). Built for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/shellquote-mcp - MCP server for safe shell argument escaping (bash, sh, cmd.exe, PowerShell). Stops LLM-generated shell commands from breaking on quotes, spaces, $vars, and backslashes.
@mukundakatta/shlex-mcp - MCP server: split a shell-style command string into argv tokens (POSIX rules).
@mukundakatta/skillint - Linter for Claude Code skill files (SKILL.md). Validates YAML frontmatter, required fields, descriptions, and secret leaks. CLI + GitHub Action.
@mukundakatta/slug-mcp - MCP server: slugify strings for URLs/filenames. Unicode-aware, configurable separator, case folding.
@mukundakatta/sqlfmt-mcp - MCP server for deterministic SQL formatting. 16 dialects (Postgres, MySQL, BigQuery, Snowflake, Redshift, etc.).
@mukundakatta/stats-mcp - MCP server: descriptive statistics over a numeric array (mean, median, stddev, percentiles).
@mukundakatta/streamparse - Streaming JSON parser that yields partial valid trees as tokens arrive. Built for LLM tool-call payloads, structured output streams, and any place a regular JSON.parse waits too long.
@mukundakatta/streamparse-mcp - MCP server exposing streamparse: parse partial / truncated JSON, extract JSON from messy LLM output, validate JSON streams. Built for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/textsanity-mcp - MCP server: fast unicode/whitespace/encoding cleanup before LLM input. NFKC, zero-width strip, control strip, smart-punctuation conversion, emoji strip. Built for Claude Desktop, Cursor, Cline, Windsurf, and Zed.
@mukundakatta/xml-mcp - MCP server: convert between XML and JSON. Attribute-aware, CDATA-safe.
@mukundakatta/yaml-mcp - MCP server: convert between YAML and JSON. Preserves keys, arrays, nested structures.
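
Several entries in this list describe a validate-and-retry loop: validate the model's reply, feed the validation error back as retry feedback, and give up after N attempts. A minimal Python sketch of that general pattern follows. It is not the API of any package listed here; `cast_with_retry`, `CastError`, and the `call_llm` and `validate` callables are hypothetical names chosen for illustration, and the validator is assumed to signal failure by raising `ValueError`.

```python
import json
from typing import Any, Callable, Optional


class CastError(Exception):
    """Raised when no attempt produced output that passed validation."""


def cast_with_retry(
    call_llm: Callable[[str], str],      # hypothetical: prompt in, raw text out
    validate: Callable[[Any], Any],      # hypothetical: raises ValueError on bad data
    prompt: str,
    max_attempts: int = 3,
) -> Any:
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        # On a retry, append the previous validation error so the model can self-correct.
        full_prompt = prompt
        if feedback is not None:
            full_prompt = f"{prompt}\n\nYour previous reply was rejected: {feedback}"
        raw = call_llm(full_prompt)
        try:
            # json.JSONDecodeError is a subclass of ValueError, so it is caught too.
            return validate(json.loads(raw))
        except (ValueError, TypeError, KeyError) as exc:
            feedback = str(exc)
    raise CastError(f"no valid output after {max_attempts} attempts: {feedback}")
```

Passing the model call and the validator in as plain callables keeps the loop independent of any provider SDK or schema library, which matches the bring-your-own-validator idea in the descriptions above.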
PyPI packages (52 on pypi.org)
agent-loop-breaker-py - Detect repeated agent steps and stop runaway loops. Python port of @mukundakatta/agent-loop-breaker.
agent-regression-lens-py - Detect regressions between baseline and current AI agent runs. Python port of @mukundakatta/agent-regression-lens.
agent-run-diff - Compare baseline vs current agent runs and surface regressions as structured reasons: success loss, new errors, failed tool calls, output drift, step/latency/cost bloat.
agent-trajectory-replay-py - Replay and diff AI agent event trajectories for debugging regressions. Python port of @mukundakatta/agent-trajectory-replay.
agentcast-py - Structured output for any LLM call. Validate-and-retry loop for JSON responses; BYO LLM and validator. Python port of @mukundakatta/agentcast.
agentfit-py - Fit your messages into the LLM context window. Token-aware truncation with multiple strategies, pluggable tokenizers. Python port of @mukundakatta/agentfit.
agentguard-firewall - Network egress firewall for AI agents. Declarative allow/deny list of hosts your agent tools may reach. Python port of @mukundakatta/agentguard.
agentsnap-py - Snapshot tests for AI agents. Record an agent's tool-call trace, diff against a baseline, fail CI on regressions. Python port of @mukundakatta/agentsnap.
agentvet-py - Validate LLM-generated tool args before execution. Wraps tool functions with arg validation, raises ToolArgError with LLM-friendly retry hint. Python port of @mukundakatta/agentvet.
ai-eval-forge - Zero-dependency eval harness for LLM and agent regression testing. Scores outputs with exact, contains, regex, JSON, citation, and token-F1 checks. Compares two runs to flag regressions.
ai-supply-chain-manifest-py - Build and validate lightweight AI model / data / tool manifests. Python port of @mukundakatta/ai-supply-chain-manifest.
citation-integrity-check - Verify answer citations refer to supplied source ids and that cited sources actually support the claims. Python port of @mukundakatta/citation-integrity-check.
claude-commands-check - Linter for Claude Code slash-command files (.claude/commands/*.md). Validates YAML frontmatter, allowed-tools shape, description quality, and flags hardcoded secrets.
claude-hooks-check - Linter for Claude Code hooks configuration (the 'hooks' block of settings.json). Validates event names, matcher shape, command entries, and flags dangerous commands or hardcoded secrets.
claude-skill-check - Linter for Claude Code SKILL.md files. Validates YAML frontmatter, required fields, description length, and common secret patterns.
codex-skill-kit - Scaffold and validate Codex skills from the command line.
consent-redaction-log-py - Record consent-aware redactions for privacy review trails. Zero-dep Python port of @mukundakatta/consent-redaction-log.
context-drift-detector-py - Detect topic drift between user intent, retrieved context, and AI answers. Python port of @mukundakatta/context-drift-detector.
context-forge-py - Context engineering toolkit for ranking, packing, and risk-scanning RAG context. Python port of @mukundakatta/context-forge.
context-window-packer-py - Pack context chunks into a budget by relevance and priority. Python port of @mukundakatta/context-window-packer.
designlint-py - HTML/CSS accessibility and design linter: contrast, touch targets, headings, form labels, leaked secrets. Stdlib-only Python port of @mukundakatta/designlint.
embedding-dedupe - Deduplicate near-identical embedding records by cosine similarity. Pure Python, zero runtime deps. Python port of @mukundakatta/embedding-dedupe.
eval-dataset-smith-py - Generate balanced AI eval fixtures from source examples, bugs, docs, and policies. Python port of @mukundakatta/eval-dataset-smith.
eval-flake-detector - Detect flaky LLM eval cases across repeated runs. Pass-rate + standard-deviation per case, with per-case severity. Python port of @mukundakatta/eval-flake-detector.
hallucination-risk-meter - Estimate hallucination risk in LLM answers from uncertainty language, unsupported specifics, citations, and context coverage. Python port of @mukundakatta/hallucination-risk-meter.
jailbreak-corpus-mini-py - Small local jailbreak and prompt-injection fixture set for tests. Python port of @mukundakatta/jailbreak-corpus-mini.
kavach-py - Small, inspectable threat-scoring library for AI-app security monitoring. Zero-dep Python port of @mukundakatta/kavach.
llm-cost-guard-py - Estimate LLM request cost and enforce per-request or per-session budgets. Python port of @mukundakatta/llm-cost-guard.
llm-output-sanitizer-py - Sanitize LLM outputs before HTML, SQL, shell, or markdown sinks. Python port of @mukundakatta/llm-output-sanitizer.
llm-response-schema-lite-py - Tiny schema validator for structured LLM responses. Python port of @mukundakatta/llm-response-schema-lite.
llm-trace-sampler-py - Sample LLM traces by risk, errors, latency, and deterministic ids. Python port of @mukundakatta/llm-trace-sampler.
llm-usage-report - Parse LLM API response logs (Anthropic, OpenAI, Google) and generate token / cost reports. Supports a --alert-at budget alarm that exits non-zero when total cost exceeds a threshold. No framework adoption required.
mcp-config-check - Linter for MCP (Model Context Protocol) config files used by Claude Desktop, Cursor, Cline, Windsurf, and Zed. CLI + library API.
mcpcheck-py - Lint MCP config files for Claude Desktop, Claude Code, Cursor, Cline, Windsurf, and Zed. Stdlib-only Python port of @mukundakatta/mcpcheck.
mk-agentkit - The agent reliability stack in one install: agentfit + agentguard + agentsnap + agentvet + agentcast (Python ports).
model-fallback-planner-py - Plan model fallback chains from capability, cost, and health data. Python port of @mukundakatta/model-fallback-planner.
model-router-policy-py - Policy-based model routing by capability, cost, latency, and privacy. Python port of @mukundakatta/model-router-policy.
partial-json-stream - Streaming JSON parser that yields partial valid trees as tokens arrive. For LLM tool calls, structured outputs, and partial recovery.
pii-sentry-py - Detect and redact PII and secret-like values before logging or sending text to AI providers. Python port of @mukundakatta/pii-sentry.
prompt-injection-shield-py - Scan retrieved text for prompt-injection risk before adding it to model context. Python port of @mukundakatta/prompt-injection-shield.
prompt-token-trim-py - Trim prompt messages to fit a token budget while preserving priority. Python port of @mukundakatta/prompt-token-trim.
prompt-version-diff-py - Diff prompt templates and flag risky instruction changes. Python port of @mukundakatta/prompt-version-diff.
rag-quality-kit - Heuristic quality metrics for RAG retrieval and grounded answers. Python port of @mukundakatta/rag-quality-kit.
rag-staleness-auditor-py - Find stale RAG chunks by age, version, and freshness requirements. Python port of @mukundakatta/rag-staleness-auditor.
retrieval-acl-filter-py - Enforce document ACLs after retrieval and before prompting. Python port of @mukundakatta/retrieval-acl-filter.
semantic-cache-key - Stable semantic cache keys for LLM requests. Invariant to whitespace, casing, and key ordering; sensitive to model swaps, tool list, and retrieval context. Python port of @mukundakatta/semantic-cache-key.
skillint-py - Lint Claude Code SKILL.md files for frontmatter, required fields, descriptions, and hardcoded secrets. Stdlib-only Python port of @mukundakatta/skillint.
system-prompt-leak-scan - Detect system prompt leakage in LLM model outputs via known patterns, configured-prompt substring matching, and unique fingerprint phrases. Python port of @mukundakatta/system-prompt-leak-scan.
tool-call-contracts-py - Validate LLM tool-call payloads with small JSON-like contracts. Python port of @mukundakatta/tool-call-contracts.
tool-permission-gate-py - Policy-check agent tool calls before execution. Python port of @mukundakatta/tool-permission-gate.
tool-result-taint-py - Track untrusted tool output before it enters prompts or actions. Python port of @mukundakatta/tool-result-taint.
vector-poison-score - Score (query, document) pairs for vector/RAG poisoning signals: vector-text mismatch, instruction-like payloads, NaN, suspiciously round numbers. Python port of @mukundakatta/vector-poison-score.
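
The cache-key entry above describes keys that ignore whitespace, casing, and key ordering but change when the model or tool list changes. One way such a key could be derived is sketched below using only the standard library. This is an illustration under stated assumptions, not the listed package's implementation: `cache_key` and `_normalize_text` are hypothetical names, messages are assumed to be dicts with `role` and `content` strings, and treating the tool list as an unordered set is a choice made here for the example.

```python
import hashlib
import json
import re
from typing import Iterable, Mapping, Sequence


def _normalize_text(text: str) -> str:
    # Collapse runs of whitespace and fold case so cosmetic edits do not change the key.
    return re.sub(r"\s+", " ", text).strip().lower()


def cache_key(
    model: str,
    messages: Sequence[Mapping[str, str]],
    tools: Iterable[str] = (),
) -> str:
    payload = {
        "model": model,  # a model swap must produce a different key
        "messages": [
            {"role": m["role"], "content": _normalize_text(m["content"])}
            for m in messages
        ],
        "tools": sorted(tools),  # assumption: tool order is irrelevant, tool set is not
    }
    # sort_keys plus fixed separators gives one canonical byte string per payload.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two requests that differ only in spacing, letter case, or dict key order hash to the same 64-character hex string, while changing `model` or the set of tools yields a different one.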
Rust crates
agent-deadline - Cooperative per-task deadline primitive for AI agent workflows.
agent-decision-log - WHY-layer decision log for AI agents: record options considered, the option chosen, the rationale, and the outcome, then persist as JSONL.
agent-epoch-counter - Named counters that reset on epoch boundaries for per-session rate tracking
agent-error-recovery - Composable error recovery strategies for LLM agent tool calls
agent-event-bus - Tiny in-process pub/sub for agent loop events. Sync-only Rust mirror of the Python agent-event-bus library: register handlers by event name (or `*` for every event), emit, and remove via opaque Subscription handles. Subscriber panics are isolated so the bus keeps dispatching.
agent-event-emit - Structured event emitter for agent runs. Append-only, JSON-line-serializable events with monotonic ids, run id, and timestamps. Zero deps beyond serde_json.
agent-event-log - Structured lifecycle event log for AI agent runs
agent-feature-gate - Feature flags and capability gates for AI agent tool access
agent-lock - Cooperative execution lock to prevent concurrent agent runs on the same key
agent-memory-store - Simple key-value memory store for AI agents: persist facts across turns
agent-message-window - Sliding window of recent LLM conversation turns with paired-protection: never drop a tool_use without its tool_result sibling. Zero deps (serde_json only).
agentfit - Fit messages to an LLM context window. Token-aware truncation with pluggable tokenizers and multiple strategies.
agentguard - Network egress firewall for AI agent tools. Declarative domain allowlist; throws on violation. Optional reqwest-middleware integration.
agentidemp - Idempotency keys for LLM agent retries. Deterministic content-derived keys (UUIDv5 or sha256-hex) so retries dedupe at the provider.
agentprompt - LLM prompt templates with Jinja2 syntax. Render system/user/assistant turns into a typed message list, ready for the Anthropic or OpenAI SDK.
agentsnap - Snapshot tests for AI agent traces. Record once, replay-and-compare on every run; the agent equivalent of Jest snapshots.
agenttap - Wire-level prompt introspection for LLM SDK calls. See exactly what was sent, with credentials redacted by default.
agenttrace - Cost + latency aggregation for LLM agent runs. Group calls into named runs, get totals, p50/p95, and per-model breakdowns. Composes with cachebench.
agentvet - Validate LLM-generated tool args before execution. Throws a structured ToolArgError with LLM-friendly retry hints when the model hallucinates wrong types or missing fields.
annflat-core - Pure-Rust core for annflat: small in-memory flat-file ANN over f32 vectors.
bedrock-cost - Calculate AWS Bedrock invocation cost across vendors (Llama, Mistral, Cohere, Titan, AI21). Cross-region inference profile aware. No SDK dependency.
bm25-rerank - BM25 reranker for RAG: in-memory term-frequency reranking against a small candidate set. Stateless, zero deps.
bom-strip - Strip UTF-8/16/32 BOM bytes and stray U+FEFF code points from text before parsing or hashing. Zero deps.
cachebench - Prompt-cache observability for LLM APIs. Per-call hit ratio, cost saved, regression alerts. Anthropic, OpenAI, Bedrock.
char-token-est - Tokenless byte/char-based token-count estimator for LLM prompts. Per-model-family calibration for Claude, GPT, Gemini, Llama. Zero deps.
chunk-flush - Flush-on-newline buffer for streaming LLM output. Holds bytes until a newline or N millis pass, then yields a complete chunk. Zero deps.
citecite - Citation-marker [1] [2] injector + parser for RAG outputs. Round-trips between sources and rendered text.
claude-cost - Calculate Claude API call cost from a usage block. Cache-aware (cache_creation, cache_read), supports Anthropic API and AWS Bedrock model IDs, BYO pricing override. No SDK dependency.
claude-stream - Parse Anthropic's Server-Sent Events stream into typed events. No SDK dependency: feed it bytes, get back message_start, content_block_delta, etc.
code-chunk - Split source code into RAG-friendly chunks that respect function and class boundaries. Brace and indent-aware, language-agnostic heuristics. Zero deps.
content-cas - Content-addressed cache primitive: store bytes under their SHA-256 hex, retrieve by hex, atomic on-disk persistence. Zero deps.
conversation-codec - JSONL save/load for LLM conversation messages with optional per-message redaction. Zero deps beyond serde_json.
cosine-fast - Hot-loop cosine similarity for f32 slices. Auto-vectorized scalar core, optional precompute-norms helper. Zero deps.
cost-meter - Aggregate LLM API cost across providers, models, and time windows. Provider-agnostic; pairs with claude-cost, openai-cost, gemini-cost, bedrock-cost. No SDK dependency.
embed-key - Deterministic cache key for an embedding request: hash text + mix in provider, model, and dimensionality. So a cache survives model upgrades without false hits. Zero deps.
embedcache-core - Pure-Rust core for embedcache: a content-addressed embedding cache.
embedrank - Batched cosine, dot, L2 distance for f32 embeddings, with a heap-based top-k selector. No BLAS, no allocator surprises.
emoji-sanitize - Normalize or strip emoji-related Unicode (presentation selectors, variation selectors, zero-width joiners) from text before LLM input. Zero deps.
eval-flake-rs - Detect flaky LLM eval cases by tracking pass/fail across repeated runs. Returns per-case flip-rate and an overall flakiness score. Zero deps.
gemini-cost - Calculate Google Gemini API call cost from a usage block. Cache-aware, supports Gemini 2.5 Pro/Flash/Flash-Lite and 2.0 families, BYO pricing override. No SDK dependency.
gold-cmp - Pairwise comparison runner for gold-set LLM evals: A vs B winner counting, statistical-significance helper, win-rate summary. Zero deps.
homoglyph-detect - Detect Cyrillic/Greek lookalike chars masquerading as ASCII. For prompt-injection and phishing defense. Zero deps.
html-entity-fix - Decode HTML entities (those for &, <, >, ", and ', plus numeric refs) that LLMs sometimes emit in plain-text output. Zero deps.
json-pluck - Pluck a single value out of a serde_json::Value by dotted path or simple JSONPath. Lossy, forgiving, intended for LLM-emitted JSON.
json-streamparse-rs - Streaming JSON balance detector: feed bytes incrementally, ask whether the buffer currently holds a complete top-level value. String/escape aware. Zero deps.
latency-buckets - Streaming histogram + percentile estimator for LLM call latencies. Fixed log-scale buckets, O(1) record, p50/p90/p95/p99 in microseconds. Zero deps.
lineify - Turn a token-by-token stream into stable line events. Buffer until a newline arrives, then emit the whole line. Zero deps.
llm-budget-window - Time-windowed token + USD budget. Define multiple rolling windows (e.g. $5/minute, $100/day) and reject when any window's cap would be breached. Thread-safe, zero deps.
llm-circuit-breaker - Tiny circuit breaker for LLM API calls. Opens after N consecutive failures, half-opens to probe after a reset window, closes on success. Thread-safe, no async runtime lock-in.
llm-content-blocks - Typed fluent builder for Anthropic Messages-API content blocks (text, image, tool_use, tool_result, document). Emits the exact JSON shape the API expects. No SDK dependency.
llm-context-budget - Track and enforce context window budget for LLM conversations
llm-cost-cap - Pre-flight USD cost gate for LLM calls. Estimate input plus output cost from token counts and reject calls that would exceed a configured cap.
llm-fallback-chain - Multi-provider failover for LLM calls. Try provider A, fall back to B then C on failure.
llm-fallback-router - Multi-provider failover for LLM calls. Try Anthropic, fall back to OpenAI/Gemini/Bedrock on retryable errors. Per-attempt audit log. Zero runtime deps. BYO clients.
llm-json-extractor - Extract JSON from LLM outputs that may contain preamble or postamble text
llm-json-repair - Clean and parse JSON emitted by LLMs. Strips markdown fences, trailing commas, and surrounding prose so serde_json::from_str works.
llm-message-dedup - Remove duplicate or near-duplicate messages from LLM conversation history
llm-message-dispatch - Route incoming messages to named handlers based on keyword or prefix rules
llm-message-hash - Stable canonical hash of LLM request/message structures. Recursive key-sorting JSON canonicalization + sha256, with per-provider ignore-lists so semantically-equal Anthropic/OpenAI/Bedrock requests produce the same hash. Useful for cache keys and idempotency.
llm-mock - Deterministic mock LLM for testing agent pipelines
llm-model-selector - Select the best LLM model based on task requirements and budget
llm-provider-id - Normalize and parse LLM provider and model identifiers
llm-request-log - Structured log of LLM API requests with IDs, timing, and token counts
llm-response-cache - In-memory LRU cache for LLM responses keyed by request hash
llm-response-window - Sliding window of recent LLM responses for context tracking
llm-retry - Runtime-agnostic exponential backoff with full jitter for LLM API calls. Built-in retryable-error code lists for Anthropic, OpenAI, AWS Bedrock, Google. Sync + tokio. Pure-std core.
llm-sampling-params - Builder for LLM sampling parameters (temperature, top_p, top_k, stop sequences)
llm-stop-conditions - Composable stop conditions for LLM agent loops: max iters, USD, tokens, seconds, no-progress, custom.
llm-stop-token-guard - Detect if LLM output was truncated by a stop token or max-tokens limit
llm-structured-retry - Retry LLM calls by injecting previous error as a follow-up user message
llm-think-tag-strip - Strip <thinking>/<think> reasoning blocks from LLM output (Claude, DeepSeek, etc.)
llm-token-split - Split long text into overlapping chunks for LLM context windows
llm-tool-registry - Registry of available tools with schema for LLM agents
llm-turn-counter - Count LLM conversation turns with per-role breakdown and limit enforcement
llmfleet - Fleet-level batch dispatcher for LLM APIs. Pool requests across tasks, route to provider Batch APIs, save 50% on cost without rewriting your agent loops.
lru-tokens - LRU cache where eviction is weighted by token count, not entry count. Bound a prompt cache by tokens (or any other size unit) instead of N entries. Zero deps.
lshdedup-core - Pure-Rust core for lshdedup: MinHash + LSH near-duplicate detection.
markdown-chunk - Split Markdown into RAG-friendly chunks that respect heading hierarchy. Keeps each chunk under a soft char cap; never splits inside a fenced code block. Zero deps.
markdown-strip - Strip Markdown formatting (headers, bold, italic, links, code, blockquotes) to plain text. Conservative, fast, zero deps.
maskprompt-core - Pure-Rust core for maskprompt: PII redaction for LLM prompts.
mmr-rerank - Maximal Marginal Relevance reranker for RAG: diversify a set of retrieved documents by balancing query-relevance against pairwise novelty. Zero deps.
openai-cost - Calculate OpenAI API call cost from a usage block. Cache-aware (cached_input_tokens), supports GPT-5, GPT-4.1, o3, o4 model families, BYO pricing override. No SDK dependency.
otel-genai-bridge - Translate LLM telemetry attributes between OpenInference and OpenTelemetry GenAI semantic conventions. No telemetry SDK dependency.
output-sanitize-rs - Strip dangerous HTML/SQL/shell snippets from LLM output before render, query, or shell sinks. Rust port of @mukundakatta/llm-output-sanitizer. Zero deps.
prompt-cache-key - Stable Anthropic prompt-cache scope hashes. Given (system, tools, model), produce a deterministic key that survives benign reordering.
prompt-cache-warmer - Pre-warm Anthropic prompt cache before user traffic. Injects cache_control breakpoints, fires a tiny warmup call, optionally verifies the cache hit, and reports tokens, latency, and estimated cost. No SDK dependency.
prompt-eval-rubric - Score LLM outputs against named 0.0-1.0 rubrics. Rubrics rank; validators reject. Weighted aggregation, exception-isolated scoring.
prompt-fence-strip - Strip ```code fences```, leading prose, and trailing chatter from LLM output so the structured payload survives. Zero deps.
prompt-hash - Deterministic cache key for an LLM prompt: normalize whitespace, hash messages, mix in model + temperature. Pairs with semantic-cache-key. Zero deps.
prompt-inj-rs - Prompt-injection risk scanner. Rust port of @mukundakatta/prompt-injection-shield. Returns a 0-1 score plus per-rule findings. Zero deps.
prompt-length-guard - Enforce minimum and maximum length constraints on LLM prompts
prompt-part-builder - Build structured prompt parts: instruction, example, context, format, constraint
prompt-shield - Pattern-based prompt-injection detection for LLM apps.
prompt-token-counter - Approximate token counts for LLM messages, system prompts, and tools. Zero-config chars/4 heuristic, BYO tokenizer, content-block aware (text/image/tool_use/tool_result/document). One serde_json dep.
promptbudget - Token-budget-aware text truncation with multiple strategies. Bring-your-own tokenizer, no hard tiktoken dep.
promptver - Hash and version prompt templates so eval results, cache keys, and audit logs stay stable when templates change. Whitespace-normalized SHA-256. Zero deps.
ragdrift - Five-dimensional drift detection for production RAG systems. Re-export of ragdrift-core.
ragdrift-core - Pure-Rust core for ragdrift: 5-dimensional drift detection for RAG systems.
ragmetric - IR metrics for RAG retrieval evaluation: recall@k, MRR, NDCG@k, hit@k. Pure data ops, no model dependencies.
regex-pii-rs - Regex-only PII detector for emails, phones, SSNs, credit cards, and prefixed API keys. Rust port of pii-sentry. Zero deps.
rerank-blend - Blend N RAG reranker score streams (dense, BM25, cross-encoder) with configurable weights and rank-aware normalization. Zero deps.
rtl-flip-detect - Detect right-to-left override (U+202E) and other bidi-control characters that flip rendering of strings. Used in filename-spoof and prompt-injection attacks. Zero deps.
schema-coerce - Coerce LLM JSON values to a simple field-schema: string->int, bool, float; strip wrapper objects; fill defaults. Forgiving structured-output recovery.
secret-mask - Mask known secret patterns (API keys, JWTs, AWS access keys, GitHub tokens) in log lines before they reach stdout/files/sinks. Zero deps.
secretsniff-core - Pure-Rust core for secretsniff: source-code secret scanner.
snipsplit-core - Pure-Rust core for snipsplit: token-aware text chunker for RAG ingestion.
sse-frame - Streaming parser for Server-Sent Events frames as used by LLM APIs (OpenAI, Anthropic, Vertex). Push bytes, get back complete event records. Zero deps.
step-id - Stable IDs for agent steps: deterministic hash of (run_id, step_index, kind) so events and traces share keys across reruns. Zero deps.
stopstream - Streaming-safe stop-sequence detector for LLM token streams. Handles partial matches at chunk boundaries.
stream-chunkrec - Recombine LLM streaming token deltas into stable text. Buffers partial words, handles UTF-8 fragments across chunks. Zero deps.
textsanity-core - Pure-Rust core for textsanity: unicode/whitespace/encoding cleanup.
tiktoken-stream - Streaming token counter for partial LLM responses. Accumulates token count across chunks without holding the full text. Pluggable estimator function. Zero deps.
token-budget-pool - Shared token + dollar budget across concurrent LLM tasks. Thread-safe, returns BudgetExceeded when a record would push past a cap. Zero deps.
toklab-core - Pure-Rust core for toklab: bulk tokenizer + counter for OpenAI BPE encodings.
toml-repair - Repair messy TOML emitted by LLMs: strip markdown fences, normalize line endings, fix common quote-style slips, trim trailing whitespace. Zero deps.
tool-arg-coerce - Fix common type slips in LLM-generated tool arguments: string->int/float/bool, single-element array->scalar, null->default. Lossy and forgiving.
tool-arg-defaults - Apply per-tool default kwargs to LLM-generated tool calls. Caller-supplied values always win. Null is a real value, not 'use the default'. Operates over serde_json::Value.
tool-arg-fuzzy - Fuzzy-match LLM-generated args to JSON Schema enum values. Zero-Levenshtein cascade (exact / case-insensitive / prefix / substring) with ambiguity guard. coerce_enums rewrites enum-constrained string properties before strict validation.
tool-call-batcher - Queue multiple pending tool calls and flush them as a batch
tool-call-budgets - Per-tool call count caps for AI agents. Stops runaway tool loops before they hit the invoice.
tool-call-plan - Build and execute ordered tool call plans for LLM agents
tool-call-trace - Record and replay tool call traces for LLM agents
tool-input-sanitizer - Sanitize and normalize LLM tool call inputs before execution
tool-loop-break - Detect repeated agent tool invocations to break runaway loops. Tracks recent (tool, args_hash) tuples and signals a loop when count exceeds a threshold. Zero deps.
tool-loop-guard - Detect when an LLM agent gets stuck calling the same tool with the same args. Sliding-window guard that raises on repeated (tool_name, args) pairs. Custom key function support, zero heavy deps.
tool-output-truncate - Truncate tool output (file reads, command runs, search hits) before adding to LLM message history. Char-aware head/middle/tail strategies with a configurable elision marker. Zero deps.
tool-result-aggregator - Collect and merge multiple tool call results into a single JSON object
tool-result-cache - Content-addressable LRU cache for LLM agent tool calls. Same tool, same args -> same answer, returned from memory. Optional TTL, content-addressable on (tool_name, args) with canonical-JSON keys.
tool-retry-policy - Declarative retry policy for LLM tool calls: per-tool max-attempts, exponential backoff, jitter, retriable-error filter. Returns a sleep duration; you run the call. Zero deps.
tool-secret-scrubber - Strip secrets (API keys, JWTs, bearer tokens, AWS keys, etc.) from arbitrary JSON-like values before they hit your logs. Walks objects/arrays, preserves shape, never mutates input.
tool-side-effects-tag - Declare what an LLM agent tool actually does (read, write, idempotent, destructive, external, expensive, network) so the scheduler / retry layer can decide what's parallel-safe and retry-safe. Zero runtime deps; serde optional.
tool-timing-budget - Total wall-clock time budget across all tool calls in an agent run
trace-diff - Diff two agent traces semantically: align by event type + key, ignore timestamps and ids, return added/removed/changed steps. Zero deps beyond serde_json.
trace-redact - Redact sensitive fields (api keys, tokens, emails, phone numbers) from agent traces before exporting to OTel or a log sink. Zero deps.
vecnorm-core - Pure-Rust core for vecnorm: bulk vector ops on f32 matrices.
yaml-repair - Repair messy YAML emitted by LLMs: strip markdown fences, tabs->spaces, dedent leading whitespace, normalize line endings. Returns cleaned text suitable for any YAML parser. Zero deps.
zero-width-strip - Strip zero-width and bidi-control Unicode characters from text. Defends against invisible-payload prompt injection. Zero deps.
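
Two retry entries in this list mention exponential backoff with jitter, where the caller receives a sleep duration and runs the call itself. The sketch below shows the widely used "full jitter" variant in Python: the delay is drawn uniformly between zero and a capped exponential ceiling. It illustrates the technique only and is not the API of any crate above; `full_jitter_delay`, `retry`, the default `base` and `cap` values, and the choice of `ConnectionError` as the retryable error are all assumptions made for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def full_jitter_delay(
    attempt: int,
    base: float = 0.5,     # assumed base delay in seconds
    cap: float = 30.0,     # assumed upper bound in seconds
    rng: Callable[[], float] = random.random,
) -> float:
    # Ceiling doubles with each attempt (0-based) until it reaches the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a uniform point in [0, ceiling) instead of sleeping the full ceiling.
    return rng() * ceiling


def retry(
    call: Callable[[], T],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            # Re-raise on the final attempt; otherwise back off and try again.
            if attempt == max_attempts - 1:
                raise
            sleep(full_jitter_delay(attempt))
```

Injecting `rng` and `sleep` keeps the delay computation deterministic under test and leaves the choice of runtime (blocking or otherwise) to the caller.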
MCP Registry servers (20 on registry.modelcontextprotocol.io)