shivvr 🔪 v0.2.0

Ephemeral semantic embedding service.

Chunk text. Embed with GTR-T5-base (768d). Search by cosine similarity.
Fully in-memory. No disk. No state between restarts. GPU on Cloud Run.

Capabilities

Ingest

Sentence-boundary chunking plus GTR-T5-base embeddings (768d). Chunks are stored in an in-memory RwLock<HashMap>, so nothing is persisted.
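
To make "sentence-boundary chunking" concrete, here is a minimal Python sketch. It is not the service's implementation: the regex boundary rule, the greedy packing, and the max_chars parameter are all assumptions for illustration.

```python
import re


def chunk_sentences(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    Hypothetical helper: the boundary rule and size limit are assumptions,
    not the service's actual chunker.
    """
    # Naive boundary rule: split on whitespace that follows '.', '!' or '?'.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A sentence longer than max_chars is kept whole rather than cut mid-sentence, so chunks never break inside a sentence.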

Search

Cosine similarity with optional temporal decay weighting (decay_halflife_hours) and nearby context expansion.
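
The scoring described above can be sketched in Python as follows. The page only says that semantic and recency scores are blended, so the linear blend and the exponential half-life decay below are assumptions, as are the function names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recency(age_hours: float, decay_halflife_hours: float = 168.0) -> float:
    """Exponential decay: the score halves every decay_halflife_hours."""
    return 0.5 ** (age_hours / decay_halflife_hours)


def blended_score(
    similarity: float,
    age_hours: float,
    time_weight: float = 0.0,
    decay_halflife_hours: float = 168.0,
) -> float:
    """Assumed blend: (1 - w) * semantic + w * recency, with w = time_weight."""
    return (1.0 - time_weight) * similarity + time_weight * recency(
        age_hours, decay_halflife_hours
    )
```

With time_weight=0.0 (the default) the result is the plain cosine similarity, so decay only affects ranking when a caller opts in.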

Temp store

Named ephemeral vector stores with 2 hr TTL. Ideal for agent working memory that doesn't need to outlive a session.
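
A named store with a TTL could behave roughly like the Python sketch below. The TempStore class, its method names, and the choice to fix the expiry at creation time (rather than refresh it on each ingest) are assumptions; the page states only the 2 hr TTL.

```python
import time


class TempStore:
    """In-memory named stores that expire ttl_seconds after creation (assumed)."""

    def __init__(self, ttl_seconds: float = 2 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._stores: dict[str, tuple[float, list]] = {}  # name -> (expires_at, items)

    def _evict(self, now: float) -> None:
        # Drop every store whose expiry time has passed.
        for name in [n for n, (exp, _) in self._stores.items() if exp <= now]:
            del self._stores[name]

    def ingest(self, name: str, item) -> None:
        now = self._clock()
        self._evict(now)
        # A new store gets its expiry set once; later ingests do not extend it.
        expires_at, items = self._stores.get(name, (now + self._ttl, []))
        items.append(item)
        self._stores[name] = (expires_at, items)

    def remaining(self) -> dict[str, float]:
        """Seconds left before each live store expires."""
        now = self._clock()
        self._evict(now)
        return {name: exp - now for name, (exp, _) in self._stores.items()}
```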

Crypto

Per-agent orthogonal matrix rotation on embeddings. Cosine similarity preserved under encryption. Keys are in-memory only.
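
The reason cosine similarity survives this kind of encryption is that an orthogonal transform preserves dot products and norms. The Python sketch below illustrates the property with a lightweight stand-in: a rotation built from two Householder reflections. How the service actually generates its per-agent matrix is not described here, so make_key, encrypt, and decrypt are hypothetical names.

```python
import math
import random


def _reflect(v: list[float], x: list[float]) -> list[float]:
    """Apply the Householder reflection H = I - 2 v v^T / (v . v) to x."""
    scale = 2.0 * sum(a * b for a, b in zip(v, x)) / sum(a * a for a in v)
    return [xi - scale * vi for xi, vi in zip(x, v)]


def make_key(dim: int, seed: int) -> list[list[float]]:
    """Hypothetical key: two random reflection vectors (illustration only)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(2)]


def encrypt(key: list[list[float]], x: list[float]) -> list[float]:
    # Two reflections compose to a rotation (orthogonal, determinant +1).
    return _reflect(key[1], _reflect(key[0], x))


def decrypt(key: list[list[float]], y: list[float]) -> list[float]:
    # Each reflection is its own inverse, so undo them in reverse order.
    return _reflect(key[0], _reflect(key[1], y))


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
```

Searching over encrypted vectors therefore ranks results the same way as searching over the originals, provided the query is rotated with the same key.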

Dual embedding

The organize role uses the local GTR-T5-base model (768d, always free). The retrieve role uses OpenAI text-embedding-ada-002 (1536d); pass your own key per request or set OPENAI_API_KEY server-side.

Auth

nuts-auth RS256 JWT + ahp_ API tokens. organize is always free; retrieve requires a token. If NUTS_AUTH_JWKS_URL is unset, the service runs in open dev mode.

API

Method  Endpoint                     Description
GET     /health                      Status, model info, live counts
POST    /sessions/:id/ingest         Chunk + embed text into session
GET     /sessions/:id/search?q=...   Semantic search with optional decay
GET     /sessions/:id                Session metadata
DELETE  /sessions/:id                Delete session
GET     /temp                        List temp stores with TTL
POST    /temp/:name/ingest           Ingest into temp store (2 hr TTL)
GET     /temp/:name/search?q=...     Search temp store
DELETE  /temp/:name                  Delete temp store
POST    /agent/:id/register          Register per-agent orthogonal key
POST    /agent/:id/encrypt           Encrypt embeddings
POST    /agent/:id/decrypt           Decrypt embeddings
POST    /invert                      Reconstruct text from embedding vector

Quick start

# Ingest
curl -X POST https://shivvr.nuts.services/sessions/my-session/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "The harbor was quiet at dawn. Only the sound of halyards against aluminum masts.", "source": "journal"}'

# Search
curl "https://shivvr.nuts.services/sessions/my-session/search?q=morning+at+the+marina&n=5"

# Search with temporal decay (30% recency, 24h half-life)
curl "https://shivvr.nuts.services/sessions/my-session/search?q=marina&time_weight=0.3&decay_halflife_hours=24"

# Retrieve role with your own OpenAI key (no server key needed)
curl -X POST https://shivvr.nuts.services/sessions/my-session/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "Dense passage for retrieval.", "openai_api_key": "sk-..."}'

curl "https://shivvr.nuts.services/sessions/my-session/search?q=passage&role=retrieve&openai_api_key=sk-..."

# Temp store (expires in 2h)
curl -X POST https://shivvr.nuts.services/temp/scratch/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "Working notes for this agent session."}'

Search parameters

Param                 Default   Description
q                     required  Query text
n                     5         Number of results
role                  organize  organize (768d local) or retrieve (1536d OpenAI)
time_weight           0.0       Blend semantic + recency score (0–1)
decay_halflife_hours  168       Recency decay half-life in hours
include_nearby        false     Return temporally adjacent chunks
agent_id              -         Agent ID for encrypted search
openai_api_key        -         Per-request OpenAI key for retrieve role (overrides server key)
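
As an illustration of how these parameters combine into a query string, here is a small Python helper. search_url is a hypothetical name, and rendering booleans as lowercase true/false is an assumption about what the server accepts.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://shivvr.nuts.services"


def search_url(session_id: str, q: str, **params) -> str:
    """Build a GET /sessions/:id/search URL from the documented parameters."""
    query = {"q": q}
    for name, value in params.items():
        if value is None:
            continue  # omit unset parameters so server defaults apply
        # Assumption: booleans are sent as lowercase "true"/"false".
        query[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{BASE_URL}/sessions/{quote(session_id, safe='')}/search?{urlencode(query)}"
```

For example, search_url("my-session", "marina", time_weight=0.3, decay_halflife_hours=24) reproduces the temporal-decay URL from the quick start.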

Environment

Variable                Default                                  Description
PORT                    8080                                     Listen port
MODEL_PATH              models/gtr-t5-base.onnx                  GTR-T5-base ONNX embedder
TOKENIZER_PATH          models/tokenizer.json                    Tokenizer
OPENAI_API_KEY          -                                        Enables text-embedding-ada-002 retrieve role
OPENAI_EMBEDDING_MODEL  text-embedding-ada-002                   Override OpenAI model
NUTS_AUTH_JWKS_URL      -                                        Enable auth (open dev mode if unset)
NUTS_AUTH_VALIDATE_URL  https://auth.nuts.services/api/validate  API token validation endpoint

Stack

Layer               Choice
Runtime             Rust + Tokio + axum
Embedding           GTR-T5-base (768d) via ONNX Runtime 2.0; local, required
Retrieve embedding  text-embedding-ada-002 via OpenAI API; optional
Storage             Ephemeral RwLock<HashMap>; no disk, no volume mounts
GPU                 CUDA 12.6 via ort EP on Cloud Run L4; CPU fallback automatic
Auth                nuts-auth RS256 JWT + ahp_ API tokens; optional
Inversion           vec2text gtr-base (projection + T5 enc/dec); optional