Configuration
All runtime configuration lives in db.env (the sourced shell-env format), with tokens, allowed IPs, and tenant directories stored as plain-text files under $DB_ROOT.
db.env
Place db.env in the working directory from which you run shard-db (usually build/bin/db.env). It is loaded once at startup.
| Variable | Default | Description |
|---|---|---|
| DB_ROOT | ./db | Root directory for all data, indexes, metadata, and per-tenant subdirectories. |
| PORT | 9199 | TCP listen port. |
| TIMEOUT | 0 | Query timeout in seconds. 0 = disabled. Enforced via cooperative cancellation in scan loops. |
| LOG_DIR | ./logs | Directory for async logs (info-YYYY-MM-DD.log, error-YYYY-MM-DD.log, slow-YYYY-MM-DD.log). |
| LOG_LEVEL | 3 | 0=off · 1=errors · 2=warnings · 3=info · 4=debug. |
| LOG_RETAIN_DAYS | 7 | Auto-prune logs older than N days. 0 = keep forever. |
| INDEX_PAGE_SIZE | 4096 | B+ tree page size in bytes (power of 2, 1024–65536). |
| THREADS | 0 (auto) | Parallel-for worker pool that drives every parallel hot path (shard scans, indexed find/count/aggregate fan-out, parallel index builds, bulk-insert phase 2). 0 = 4 × nproc, minimum 4. |
| IO_THREADS | 0 (auto) | Separate I/O thread pool size for cache-bypassing operations (O_DIRECT scans, bulk-fetch). 0 = 4 × nproc. I/O threads wait on page faults and benefit from oversubscription without starving CPU-bound queries. |
| WORKERS | 0 (auto) | Server-worker pool that accepts connections and dispatches request handlers. 0 = auto (CPU count, minimum 4). |
| POOL_CHUNK | 0 (auto) | parallel_for submission chunk size. 0 = nproc. Tasks are enqueued in chunks of this many; larger chunks reduce queue-lock contention but serialise concurrent submitters. Rarely needs tuning. |
| GLOBAL_LIMIT | 100000 | Default limit applied when a query omits one. Per-query limit is not clamped; pass any value to override. |
| MAX_CONCURRENT_QUERIES | 0 (auto) | Hard cap on queries in-flight simultaneously. 0 = max(4, min(nproc, 32)). When the cap is reached, additional requests get {"error":"server at capacity","max_concurrent_queries":N} and the client should retry. Worst-case query-buffer RAM = MAX_CONCURRENT_QUERIES × QUERY_BUFFER_MB. |
| MAX_REQUEST_SIZE | 33554432 (32 MB) | Maximum JSON request size per line. Oversized requests get {"error":"Request too large (max N bytes)"}. Every connection allocates a read buffer of this size, so total read-buffer memory = open connections × MAX_REQUEST_SIZE. |
| FCACHE_MAX | 4096 | Unified shard-mmap cache capacity (entries). Strict allow-list: {4096, 8192, 12288, 16384}. Invalid values fall back to the default with a warning. See Tuning. |
| BT_CACHE_MAX | derived | Not configurable as of 2026.05.1. Derived as FCACHE_MAX / 4 (so {1024, 2048, 3072, 4096}). Setting it in db.env emits a stderr warning and is ignored. |
| QUERY_BUFFER_MB | 256 | Per-query intermediate buffer cap. With MAX_CONCURRENT_QUERIES bounding fan-in, worst-case RAM stays predictable. Auto-tunes upward on big-RAM hosts with low slot counts (floor: 256 MB). |
| INDEX_BUILD_BUDGET_MB | 1024 | Peak per-pass memory budget for reindex / multi-field add-index / migrate's phase-2 rebuild. Floor 64 MB. Multi-field builds group fields into passes that fit this cap; an oversized single field still runs alone. See Tuning → INDEX_BUILD_BUDGET_MB. |
| DB_ODIRECT_BUF_MB | 32 | O_DIRECT buffer size per worker in MB. Each parallel worker reads shard data in chunks of this size using cache-bypassing pread. Peak O_DIRECT RAM is approximately 2 × DB_ODIRECT_BUF_MB × IO_THREADS. Larger chunks reduce syscall overhead on fast NVMe; smaller chunks reduce peak memory. |
| DISABLE_LOCALHOST_TRUST | 0 | Default: 127.0.0.1/::1 bypasses auth (assumes a trusted loopback proxy). Set to 1 for strict mode (tokens required even same-host). |
| TOKEN_CAP | 1024 | Open-addressed bucket count for the token store. Bump to 4096+ if you run thousands of tokens across scopes. |
| SLOW_QUERY_MS | 500 | Log queries slower than N ms to slow-*.log and the in-memory ring (stats endpoint). 0 = disable. Minimum 100 ms. |
| RANDOM_SEQ_COST_RATIO | 8 | Planner cost model: random-read penalty vs sequential scan. Higher values prefer full-scan; the planner chooses fetch-and-check when matches < live_count / RANDOM_SEQ_COST_RATIO. Tuning knob for workload-specific I/O patterns. |
| WARMUP | async | Cache warm-up on startup. async = detached thread (first queries race cache population); sync = block startup until complete; off = skip (rely on lazy populate). Primes OS page cache for kf + index shards so the first user query enjoys warm caches. |
| AUTO_VACUUM | 0 | 1 = enable a background thread that periodically polls vacuum-check's recommendation logic and runs plain vacuum on objects that meet the thresholds. Never auto-runs --compact or --splits; both need an exclusive objlock for a long rebuild window and stay manual. |
| AUTO_VACUUM_INTERVAL_SEC | 3600 | Auto-vacuum poll cadence in seconds. Floor 60. Sleep is sliced into 1-second chunks so SIGTERM brings the thread down within a second. |
| VACUUM_RECOMMEND_TOMBSTONE_PCT | 10 | Tombstone ratio at which vacuum-check flags an object for cleanup (deleted * 100 ≥ total * N). Also drives auto-vacuum when enabled, so manual and automated paths share the same threshold. |
| VACUUM_RECOMMEND_MIN_DELETED | 1000 | Absolute floor on deleted count below which vacuum-check does not recommend cleanup, even if the percentage clears. Prevents tiny objects from triggering vacuum overhead that exceeds the work saved. |
| TLS_ENABLE | 0 | 1 = require TLS 1.3 on PORT; plaintext clients rejected at handshake. See Operations → Deployment → Native TLS. |
| TLS_CERT / TLS_KEY | (empty) | Server cert + private key paths (PEM). Required when TLS_ENABLE=1. |
| TLS_CA | (empty) | Client-side CA bundle for verifying the server (defaults to OS trust store). |
| TLS_SKIP_VERIFY | 0 | Client-side: 1 skips server cert verify (dev only; emits a stderr warning). |
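The two vacuum thresholds and the planner ratio in the table are plain arithmetic comparisons, sketched below in Python. This is an illustration rather than shard-db's code: the function names are hypothetical, it assumes the percentage test and the absolute floor are combined with AND, and it assumes the planner comparison uses ordinary (non-integer) division.

```python
def vacuum_recommended(deleted: int, total: int,
                       tombstone_pct: int = 10, min_deleted: int = 1000) -> bool:
    """Return True when both documented vacuum-check thresholds are met.

    Assumption: the two conditions are ANDed together.
    """
    # VACUUM_RECOMMEND_MIN_DELETED: absolute floor on the deleted count.
    if deleted < min_deleted:
        return False
    # VACUUM_RECOMMEND_TOMBSTONE_PCT: deleted * 100 >= total * N.
    return deleted * 100 >= total * tombstone_pct


def prefers_fetch_and_check(matches: int, live_count: int, ratio: int = 8) -> bool:
    """Planner rule from the table: fetch-and-check when
    matches < live_count / RANDOM_SEQ_COST_RATIO, otherwise full scan."""
    return matches < live_count / ratio
```

With the defaults, an object holding 10,000 records with 1,000 tombstones sits exactly on both thresholds and is flagged, while 999 tombstones in a 1,000-record object is not, because the absolute floor is not reached.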
Example:
# db.env
export DB_ROOT="../db"
export PORT=9199
export TIMEOUT=30
export LOG_DIR="../logs"
export LOG_LEVEL=3
export LOG_RETAIN_DAYS=14
export INDEX_PAGE_SIZE=4096
export THREADS=0
export IO_THREADS=0
export WORKERS=0
export GLOBAL_LIMIT=100000
export MAX_CONCURRENT_QUERIES=0
export MAX_REQUEST_SIZE=33554432
export FCACHE_MAX=4096
# BT_CACHE_MAX is no longer configurable — derived as FCACHE_MAX / 4
export QUERY_BUFFER_MB=256
export INDEX_BUILD_BUDGET_MB=1024
export DB_ODIRECT_BUF_MB=32
export TOKEN_CAP=1024
export DISABLE_LOCALHOST_TRUST=0
export SLOW_QUERY_MS=500
export RANDOM_SEQ_COST_RATIO=8
# Startup warmup — primes caches on startup
export WARMUP=async
# Auto-vacuum — opt-in. Same thresholds drive `vacuum-check` recommendations.
export AUTO_VACUUM=0
export AUTO_VACUUM_INTERVAL_SEC=3600
export VACUUM_RECOMMEND_TOMBSTONE_PCT=10
export VACUUM_RECOMMEND_MIN_DELETED=1000
# Native TLS — leave TLS_ENABLE=0 unless terminating TLS in-process
export TLS_ENABLE=0
export TLS_CERT=""
export TLS_KEY=""
export TLS_CA=""
export TLS_SKIP_VERIFY=0
Every variable is optional; defaults apply when the file or a specific export is missing. Changes require a server restart.
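To see what the "0 (auto)" defaults and the memory formulas in the table work out to on a given host, the rules can be written down as follows. This is a rough Python sketch of the documented arithmetic only, not the server's implementation; `resolve_auto` and `worst_case_mb` are hypothetical helper names, and the result ignores the upward auto-tuning of QUERY_BUFFER_MB mentioned in the table.

```python
FCACHE_ALLOWED = (4096, 8192, 12288, 16384)


def resolve_auto(nproc, threads=0, io_threads=0, workers=0,
                 max_concurrent=0, fcache_max=4096):
    """Apply the documented '0 = auto' rules for a host with nproc CPUs."""
    if fcache_max not in FCACHE_ALLOWED:
        fcache_max = 4096  # invalid values fall back to the default
    return {
        "THREADS": threads or max(4, 4 * nproc),
        "IO_THREADS": io_threads or 4 * nproc,
        "WORKERS": workers or max(4, nproc),
        "MAX_CONCURRENT_QUERIES": max_concurrent or max(4, min(nproc, 32)),
        "FCACHE_MAX": fcache_max,
        "BT_CACHE_MAX": fcache_max // 4,  # derived, never set directly
    }


def worst_case_mb(cfg, query_buffer_mb=256, odirect_buf_mb=32):
    """Worst-case RAM in MB, using the two formulas given in the table."""
    return {
        "query_buffers": cfg["MAX_CONCURRENT_QUERIES"] * query_buffer_mb,
        "odirect": 2 * odirect_buf_mb * cfg["IO_THREADS"],  # approximate
    }
```

For example, `worst_case_mb(resolve_auto(8))` gives 2048 MB for query buffers and roughly 2048 MB for O_DIRECT buffers on an 8-CPU host with every value left at its default.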
Tenant directories: dirs.conf
Every data query must include a dir parameter (e.g., "dir":"acme"). That directory must be listed in $DB_ROOT/dirs.conf — one tenant name per line.
The default dir is auto-registered on first use. Add tenants by editing the file and restarting the server, or at runtime via create-object, which creates and registers the tenant path automatically if it is new.
Queries for unregistered dirs return {"error":"Unknown dir: <name>"}.
See Concepts → Multi-tenancy for the isolation model.
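The registration check described above can be mirrored on the client or in a test harness. The Python sketch below assumes dirs.conf contains nothing but tenant names (blank lines are skipped), and the helper names `load_dirs` and `check_dir` are hypothetical. How the server treats a request with no dir at all is not stated here, so the sketch simply reports it as an unknown empty name.

```python
from pathlib import Path


def load_dirs(db_root):
    """Read $DB_ROOT/dirs.conf: one tenant name per line."""
    path = Path(db_root) / "dirs.conf"
    if not path.exists():
        return set()
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}


def check_dir(request, registered):
    """Return the documented error object for an unregistered dir, else None."""
    # Assumption: a missing "dir" is treated like an unknown, empty name.
    name = request.get("dir", "")
    if name not in registered:
        return {"error": f"Unknown dir: {name}"}
    return None
```

A request such as `{"dir": "acme"}` passes only when `acme` appears on its own line in dirs.conf.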
API tokens: tokens.conf
$DB_ROOT/tokens.conf — one token per line. Any request with "auth":"<token>" that matches a line is accepted from any IP.
Tokens are loaded at startup and refreshed when add-token / remove-token JSON modes are used (see Operations → Deployment).
IP allowlist: allowed_ips.conf
$DB_ROOT/allowed_ips.conf — one IP per line. Requests from these IPs bypass the token check.
Localhost (127.0.0.1 and ::1) is trusted by default — you don't need to add it. Use the allowlist for sidecar processes or trusted services on other hosts.
Precedence
A request is authenticated if either:
- The client IP is on allowed_ips.conf, or
- The request carries "auth":"<token>" and the token is in tokens.conf.
Both lists live under $DB_ROOT so they travel with the data root.
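Put together with the localhost rule from the previous section, the decision can be sketched as a single predicate. This Python version is an assumption-laden illustration, not the server's logic: it covers only the global tokens.conf and allowed_ips.conf (the per-tenant and per-object token files are ignored), and `is_authenticated` is a hypothetical name.

```python
LOOPBACK = {"127.0.0.1", "::1"}


def is_authenticated(client_ip, token, allowed_ips, tokens,
                     disable_localhost_trust=False):
    """Return True if the request passes either documented check."""
    # Localhost is trusted by default; DISABLE_LOCALHOST_TRUST=1 turns that off.
    if not disable_localhost_trust and client_ip in LOOPBACK:
        return True
    # Any IP listed in allowed_ips.conf bypasses the token check.
    if client_ip in allowed_ips:
        return True
    # Otherwise the request must carry a token listed in tokens.conf.
    return token is not None and token in tokens
```

The order of the checks does not change the outcome, since any one of them passing is sufficient.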
Schema: schema.conf and per-object fields.conf
- $DB_ROOT/schema.conf: one line per object, in the form dir:object:splits:max_key:2:streams[:auto_key=<mode>]. Auto-managed by create-object; don't edit by hand. The literal 2 is the engine-version slot, kept on disk for forward compatibility; this version refuses any other value at load (legacy v1 objects must be upgraded via 2026.05.4's ./migrate first). streams is the number of seg-write streams (derived from nproc at create time, immutable for the object's life unless vacuum self-heals a streams-mismatch).
- $DB_ROOT/<dir>/<object>/fields.conf: typed field definitions, one per line, in the form name:type[:size|P,S][:default=...]. Also auto-managed (via create-object, add-field, remove-field, rename-field).
See Concepts → Typed records for the on-disk layout and all type definitions.
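For read-only tooling that inspects the catalog, a schema.conf line can be split on colons as in the Python sketch below. It is a sketch under stated assumptions: tenant and object names contain no colon, the numeric slots are plain integers, and the sample lines and the `auto_key` value used with it are made up for illustration. `SchemaEntry` and `parse_schema_line` are hypothetical names, not shard-db APIs.

```python
from dataclasses import dataclass
from typing import Optional

ENGINE_VERSION = "2"  # the literal engine-version slot described above


@dataclass
class SchemaEntry:
    dir: str
    object: str
    splits: int
    max_key: int
    streams: int
    auto_key: Optional[str] = None


def parse_schema_line(line):
    """Parse dir:object:splits:max_key:2:streams[:auto_key=<mode>]."""
    parts = line.strip().split(":")
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    dir_, obj, splits, max_key, version, streams = parts[:6]
    if version != ENGINE_VERSION:
        raise ValueError(f"unsupported engine version {version!r}")
    auto_key = None
    if len(parts) == 7:
        key, sep, mode = parts[6].partition("=")
        if key != "auto_key" or not sep:
            raise ValueError(f"unexpected trailing field {parts[6]!r}")
        auto_key = mode
    return SchemaEntry(dir_, obj, int(splits), int(max_key), int(streams), auto_key)
```

Since the file is auto-managed, a parser like this should only ever read it; changes belong to create-object and the other modes listed above.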
Storage layout
$DB_ROOT/
  .shard-db.lock              # flock guard; prevents two daemons sharing this root
  tokens.conf                 # Global API tokens
  allowed_ips.conf            # Trusted IPs
  dirs.conf                   # Allowed tenant directories
  schema.conf                 # Object catalog (one line per object)
  <dir>/                      # Per-tenant directory
    tokens.conf               # Per-tenant tokens (optional)
    <object>/
      tokens.conf             # Per-object tokens (optional)
      fields.conf             # Typed field schema
      metadata/
        sequences/            # Per-named-sequence counter files
      data/                   # v2 slotcask engine; see Concepts → Storage model
        kf/
          NNN.kf              # Keyfile shards (24B header + 24B slot array)
        streams/
          NNN/                # One subdir per write stream (nproc-derived)
            NNNNNN.dat        # Append-only segment files (128 MB rotation)
      indexes/
        index.conf            # List of indexed fields
        <field>/              # Per-field directory (per-shard btree layout)
          NNN.idx             # Sharded B+ tree files, index_splits_for(splits) of them
        <a>+<b>/              # Composite index, '+' joined name
          NNN.idx
      backup/
        <YYYYMMDD-HHMMSS>/    # Per-backup snapshots created by `backup` mode
          ...
      files/                  # Stored files (put-file)
        <filename>            # Flat; basename is the lookup key
logs/
  info-YYYY-MM-DD.log
  error-YYYY-MM-DD.log
  slow-YYYY-MM-DD.log
The slotcask engine separates keys from values:
- data/kf/NNN.kf: keyfile shards. Each is a 24-byte file header (SKF1 magic + version + live count + tombstone count) followed by a packed array of 24-byte slot headers. Slot count per shard is tiered on splits: 1M at splits ≤ 16, 256K at splits ≤ 128, 128K at splits ≤ 1024, 64K at splits ≤ 4096. Shards auto-resplit in place (doubling capacity) at 75 % fill, up to a per-shard ceiling of 16M slots.
- data/streams/NNN/NNNNNN.dat: append-only segment files. Each stream has its own subdirectory; segments rotate at 128 MB. Writers in different streams don't contend. The number of streams is fixed at create-object time from nproc (≤ 8 → nproc; ≤ 16 → 8; else 16).
Each indexed field is split into index_splits_for(splits) btree files — non-linear curve in src/db/types.h: 8→2, 16→4, 32→4, 64→8, 128→16, 256→16, 512→32, 1024→64, 2048→64, 4096→128. Writes route by record hash to a single idx-shard; reads fan out across all shards in parallel.
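The three sizing rules above (slot tiers, stream count, and index shard count) are all small lookups. The Python sketch below restates them for capacity planning; it is not taken from src/db/types.h. It assumes "1M" and "K" mean binary units (2^20 and 2^10), and it only knows the index mapping for the splits values listed above. Apart from `index_splits_for`, the function names are hypothetical.

```python
# (splits ceiling, initial slots per keyfile shard); binary units assumed.
SLOT_TIERS = [(16, 1 << 20), (128, 256 << 10), (1024, 128 << 10), (4096, 64 << 10)]

# splits -> number of per-field btree files, as listed in the text.
INDEX_SPLITS = {8: 2, 16: 4, 32: 4, 64: 8, 128: 16, 256: 16,
                512: 32, 1024: 64, 2048: 64, 4096: 128}


def initial_slots_per_shard(splits):
    """Slot count a keyfile shard starts with, before any auto-resplit."""
    for ceiling, slots in SLOT_TIERS:
        if splits <= ceiling:
            return slots
    raise ValueError("splits above 4096 is not covered by the documented tiers")


def streams_for(nproc):
    """Write-stream count fixed at create-object time."""
    if nproc <= 8:
        return nproc
    if nproc <= 16:
        return 8
    return 16


def index_splits_for(splits):
    """Btree files per indexed field; raises KeyError for unlisted values."""
    return INDEX_SPLITS[splits]
```

For instance, an object created with splits = 256 on a 12-CPU host would start with 128K slots per keyfile shard, 8 write streams, and 16 btree files per indexed field.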
Pre-2026.05.5 installs with legacy v1 (probe-into-slot, data/NNN.bin shard files with Zone A + Zone B layout) must first upgrade to 2026.05.4 and run that release's ./migrate to convert objects to the slotcask layout — 2026.05.5+ refuses v1 objects at load.
Next
- Operations → Deployment: systemd unit, native TLS or reverse-proxy TLS, log rotation.
- Operations → Tuning: when to raise THREADS, FCACHE_MAX, etc.