find¶

The workhorse query mode. Filters records against a criteria array, with optional projection, sorting, pagination, and joins.

Shape¶

{
  "mode": "find",
  "dir": "<dir>",
  "object": "<obj>",
  "criteria": [ ... ],

  "offset": 0,
  "limit": 50,
  "fields": "name,email",
  "excludedKeys": "bot1,bot2",
  "order_by": "<field>",
  "order": "asc",
  "format": "rows",
  "join": [ ... ]
}

Required: mode, dir, object, criteria.

Parameters¶

Field	Type	Default	Meaning
`criteria`	array	required	AND-combined criterion objects (see Operators). Can be empty `[]` — returns all records.
`offset`	int	0	Skip the first N matches.
`limit`	int	`GLOBAL_LIMIT`	Max records to return.
`fields`	string OR array	all	Projection. `"name,email"` or `["name","email"]`.
`excludedKeys`	string OR array	none	Skip these keys from results. Comma-separated or array.
`order_by`	string	—	Sort by this field; matches are buffered and sorted before pagination.
`order`	`"asc"` or `"desc"`	`"asc"`	Sort direction when `order_by` is set.
`format`	`"rows"`	JSON objects	Return tabular `{"columns":[...],"rows":[...]}`. Ignored when `join` is present (always tabular).
`join`	array	none	See joins.

Response¶

Default (JSON records):

[
  {"key":"u1","value":{"name":"Alice","email":"a@x.com","age":30}},
  {"key":"u2","value":{"name":"Bob","email":"b@x.com","age":22}}
]

With "format":"rows":

{
  "columns": ["key","name","email","age"],
  "rows": [
    ["u1","Alice","a@x.com",30],
    ["u2","Bob","b@x.com",22]
  ]
}

Rows form is ~30% smaller on the wire and drops directly into spreadsheets / charting libraries.

With "format":"csv" (raw CSV, not JSON):

key,name,email,age
u1,Alice,a@x.com,30
u2,Bob,b@x.com,22

Plain text body — no JSON wrapping. Optional "delimiter":"|" picks a single-char separator (\t literal for tab; defaults to ,). Values containing the delimiter or " are RFC-4180-quoted with internal " doubled; newlines inside values are replaced with a space so one physical line equals one logical row. Errors still come as JSON {"error":"..."}. Not compatible with join (returns {"error":"format=csv is not supported with join"}). Same shape works on fetch and aggregate.

Criteria shape¶

Each element of criteria is a JSON object:

{"field":"<field-name>","op":"<operator>","value":"<value>"}

For between:

{"field":"age","op":"between","value":"18","value2":"65"}

Multiple criteria at the top level are AND-combined. Compose OR with a {"or":[...]} sub-node:

"criteria":[
  {"field":"status","op":"eq","value":"paid"},
  {"or":[
    {"field":"region","op":"eq","value":"us"},
    {"field":"total","op":"gte","value":"1000"}
  ]}
]

Reads as: status = 'paid' AND (region = 'us' OR total >= 1000).

{"or":[...]} — branch matches if any child matches.
{"and":[...]} — explicit AND sub-branch, useful when nesting.
Flat arrays remain implicit AND (zero change to existing queries).
Max nesting depth is 16. Empty or:[] / and:[] returns {"error":"empty or/and"}.

Execution paths¶

The planner selects automatically:

Shape	Example	Strategy
Single indexed leaf	`[A]` where A is indexed	Primary-indexed-leaf scan.
Pure AND, 2+ indexed leaves on rangeable ops	`[{a=x}, {b>10}]`	PRIMARY_INTERSECT — walk each leaf's btree into a `KeySet`, intersect candidate hash sets, skip per-record fetch for `count`. Walks most-selective-first. Eligible ops: eq, lt, gt, lte, gte, between, in, starts. Caps at `MAX_INTERSECT_LEAVES=8`.
Pure AND, mixed	`[{a=x}, {bio contains x}]`	Primary-indexed-leaf for the indexed sibling; the rest post-filter via `criteria_match_tree`.
AND + OR, AND sibling indexed	`[{a=x}, {or:[{b},{c}]}]`	AND leaf drives candidates; OR sub-tree evaluated per record. OR children don't need indexes.
Pure OR, every child indexed	`[{or:[{a=x},{b=y}]}]`	Per-child B+ tree lookups unioned into a lock-free `KeySet`. Sublinear — no shard scan. Pure-OR `count` returns `\\|KeySet\\|` directly.
OR with any non-indexed child	`[{or:[{a=x},{b=y}]}]` (b not indexed)	Full parallel shard scan + tree match.

Hybrid (non-indexed AND + fully-indexed OR sub-tree) uses the KeySet as primary-candidate source and applies the AND siblings as a post-filter.

Per-shard btree layout: each indexed field is sharded into splits/4 btree files at <obj>/indexes/<field>/<NNN>.idx. Reads fan out across all shards via the parallel-for pool; writes route by record hash to one shard.

Operators¶

38 operators. Every operator uses an index when the field is indexed (with the exceptions noted below for full-scan ops). Composite indexes (field1+field2) only assist eq, starts, and between-via-prefix; substring/range ops on composites fall back to leaf scan.

Equality, range, set membership¶

Operator	Description	Applies to	Example
`eq` / `equal`	Exact match	all types	`{"field":"status","op":"eq","value":"paid"}`
`neq` / `not_equal`	Not equal	all	`{"field":"status","op":"neq","value":"void"}`
`lt` / `less`	Strictly less than	numeric, date, datetime	`{"field":"age","op":"lt","value":"65"}`
`gt` / `greater`	Strictly greater than	numeric, date, datetime	`{"field":"age","op":"gt","value":"18"}`
`lte` / `less_eq`	`<=`	numeric, date, datetime	`{"field":"score","op":"lte","value":"999"}`
`gte` / `greater_eq`	`>=`	numeric, date, datetime	`{"field":"score","op":"gte","value":"100"}`
`between`	Inclusive range	numeric, date, datetime, varchar (lexicographic)	`{"field":"age","op":"between","value":"18","value2":"65"}`
`in`	Value in CSV set	all	`{"field":"status","op":"in","value":"active,pending"}`
`nin` / `not_in`	Value not in set	all	`{"field":"role","op":"nin","value":"bot,test"}`
`exists`	Field present / non-empty	all	`{"field":"phone","op":"exists"}`
`nexists` / `not_exists`	Field missing / empty	all	`{"field":"deleted_at","op":"nexists"}` (forces full scan — missing fields aren't in the index)

Varchar matching — case-sensitive (default)¶

Operator	Description	Notes
`like`	Wildcard — `%` or `*`	Indexed shortcut: `"foo"` (no `%`) → point lookup; `"foo%"` → prefix range; `"%foo"` / `"%foo%"` → leaf scan
`nlike` / `not_like`	Wildcard negated	Leaf scan
`contains`	Substring match	Leaf scan with per-entry filter
`ncontains` / `not_contains`	Not substring	Leaf scan
`starts` / `starts_with`	Prefix match	Indexed prefix range scan
`ends` / `ends_with`	Suffix match	Leaf scan

Varchar matching — case-insensitive (ASCII tolower)¶

ilike, not_ilike, icontains, not_icontains, istarts, iends — same semantics as their case-sensitive counterparts but tolower per byte before compare. Always leaf scan (no prefix shortcut).

Length filters (varchar only — answered from btree leaf entry's vlen, no record fetch)¶

Operator	Example
`len_eq`	`{"field":"name","op":"len_eq","value":"5"}`
`len_neq`	`{"field":"name","op":"len_neq","value":"5"}`
`len_lt`, `len_gt`, `len_lte`, `len_gte`	`{"field":"bio","op":"len_lt","value":"10"}`
`len_between`	`{"field":"name","op":"len_between","value":"3","value2":"8"}`

Field-vs-field (compare two fields on the same record — full scan only)¶

Operator	Example
`eq_field`, `neq_field`	`{"field":"createdAt","op":"eq_field","value":"updatedAt"}`
`lt_field`, `gt_field`, `lte_field`, `gte_field`	`{"field":"used","op":"gt_field","value":"limit"}`

The RHS is per-record so no btree shortcut is possible — these always full-scan.

Regex (POSIX extended regex on varchar — full scan only)¶

Operator	Example
`regex`	`{"field":"sku","op":"regex","value":"^[A-Z]{2}-[0-9]{4}$"}`
`not_regex`	`{"field":"phone","op":"not_regex","value":"^\\+1"}`

Compiled once at criteria-parse time; matched per record with REG_STARTEND on the hot path. Indexed fields could in theory walk leaves only, but the per-entry regexec cost dominates the saving from skipping record fetch — kept on the full-scan path.

Value formatting¶

varchar — raw string.
int / long / short — numeric string ("30", "1000").
double — numeric string with optional decimal ("3.14").
bool — "true" / "false".
date — "yyyyMMdd" (e.g., "20260418").
datetime — "yyyyMMddHHmmss" (e.g., "20260418153012").
numeric — decimal string matching the declared scale ("1500.75" for numeric:12,2).

Types are enforced: passing a non-numeric string to gt on an int field returns an error.

Sorting¶

order_by buffers every match into memory, sorts, then slices by offset / limit. Cost is O(matches * log matches) — fine for thousands of matches, not for millions.

Numeric types (int, long, short, double, numeric, date, datetime, bool, byte) sort numerically.
varchar sorts lexicographically.
order: "asc" is the default; "desc" flips.
Not compatible with join — tabular join output doesn't sort.

For streaming through large results in an arbitrary order, use fetch with keyset cursor pagination instead.

Cursor pagination (keyset)¶

For deep pagination on large result sets, offset-based paging pays O(matches) per page (the buffer-sort path above). A cursor driven off an indexed order_by field is O(limit) regardless of page depth.

// Page 1 — signal cursor pagination with cursor:null (or cursor:{})
{"mode":"find","dir":"t","object":"orders",
 "criteria":[{"field":"status","op":"eq","value":"paid"}],
 "order_by":"amount","order":"asc","limit":100,"cursor":null}

// Response wraps rows and emits the next-page cursor
{"rows":[...], "cursor":{"amount":"500.00","key":"ord_4912"}}

// Page N+1 — hand back the previous page's cursor verbatim
{"mode":"find", ..., "order_by":"amount","limit":100,
 "cursor":{"amount":"500.00","key":"ord_4912"}}

// Last page returns "cursor":null

Rules: - order_by field must be indexed — cursor queries reject otherwise with a clear error. - Cursor tie-breaks on hash16(primary_key) when multiple rows share the same order_by value, so pagination is stable across ties. - cursor:null or cursor:{} in the request opts into cursor mode (page 1, walks from start/end). - Omitting cursor entirely keeps backward-compat behaviour (unwrapped array, buffer-sort for order_by). - format:"csv" and join are not supported with cursor — use the non-cursor find path for those. - ASC + DESC both supported; the server-side k-way merge across per-shard btree iterators reconstructs global order.

Projection¶

fields narrows the returned columns. Supports either CSV or JSON array:

"fields": "name,email"
"fields": ["name","email"]

key is always included in the default form; in rows format, key is the first column.

Excluding keys¶

excludedKeys drops matching records by key. Useful for "everything except these":

{"mode":"find","dir":"default","object":"users",
 "criteria":[{"field":"active","op":"eq","value":"true"}],
 "excludedKeys":"bot1,bot2,system"}

Indexed vs full-scan¶

The planner picks an index automatically. Rules of thumb:

Indexed path — used when any criterion's field has a matching index (single-field or the leading component of a composite). B+ tree scan + record re-filter. 1–3 ms on 1 M records.
Full scan — used when no criterion is indexed. Parallel per-shard walk over Zone A + Zone B for candidates. 2–3 ms on 1 M records because Zone A stays in page cache.

See Concepts → Indexes for the selection logic.

Recipes¶

Pagination with filter and projection¶

{"mode":"find","dir":"acme","object":"invoices",
 "criteria":[
   {"field":"status","op":"eq","value":"paid"},
   {"field":"paid_at","op":"gte","value":"20260318000000"}
 ],
 "fields":["id","customer","total","paid_at"],
 "order_by":"paid_at","order":"desc",
 "offset":0,
 "limit":50}

"Not in" a set¶

{"mode":"find","dir":"default","object":"users",
 "criteria":[{"field":"role","op":"nin","value":"bot,spam,deleted"}]}

Field presence¶

{"mode":"find","dir":"default","object":"users",
 "criteria":[
   {"field":"phone","op":"exists"},
   {"field":"deleted_at","op":"nexists"}
 ]}

Count only (no records returned)¶

Use count for this — same criteria shape, zero record materialization.

Joins¶

Any find query can enrich results by joining other objects. See joins.