find¶
The workhorse query mode. Filters records against a criteria array, with optional projection, sorting, pagination, and joins.
Shape¶
{
"mode": "find",
"dir": "<dir>",
"object": "<obj>",
"criteria": [ ... ],
"offset": 0,
"limit": 50,
"fields": "name,email",
"excludedKeys": "bot1,bot2",
"order_by": "<field>",
"order": "asc",
"format": "rows",
"join": [ ... ]
}
Required: mode, dir, object, criteria.
Parameters¶
| Field | Type | Default | Meaning |
|---|---|---|---|
criteria |
array | required | AND-combined criterion objects (see Operators). Can be empty [] — returns all records. |
offset |
int | 0 | Skip the first N matches. |
limit |
int | GLOBAL_LIMIT |
Max records to return. |
fields |
string OR array | all | Projection. "name,email" or ["name","email"]. |
excludedKeys |
string OR array | none | Skip these keys from results. Comma-separated or array. |
order_by |
string | — | Sort by this field; matches are buffered and sorted before pagination. |
order |
"asc" or "desc" |
"asc" |
Sort direction when order_by is set. |
total |
bool | false | When cursor pagination is active, also return the total match count in a single round-trip. Adds O(1) metadata cost. |
format |
"rows" / "csv" / "dict" |
JSON array | See response shapes below. Ignored when join is present (always tabular). |
join |
array | none | See joins. |
Response¶
Default (JSON array of records):
[
{"key":"u1","value":{"name":"Alice","email":"a@x.com","age":30}},
{"key":"u2","value":{"name":"Bob","email":"b@x.com","age":22}}
]
With "format":"rows":
{
"columns": ["key","name","email","age"],
"rows": [
["u1","Alice","a@x.com",30],
["u2","Bob","b@x.com",22]
]
}
Rows form is ~30% smaller on the wire and drops directly into spreadsheets / charting libraries.
With "format":"dict" (new in 2026.05.1):
{
"u1":{"name":"Alice","email":"a@x.com","age":30},
"u2":{"name":"Bob","email":"b@x.com","age":22}
}
Dict form gives O(1) lookup by primary key on the client side and round-trips with bulk-insert's dict shape. Combine with cursor and the envelope becomes {"results":{...},"cursor":...}. Rejected with join (joins force tabular). With order_by, dict iteration order is parser-dependent — use the default array or format:"rows" if you need strict client-side iteration order.
With "format":"csv": raw CSV text (no JSON envelope). See overview.
With "format":"csv" (raw CSV, not JSON):
Plain text body — no JSON wrapping. Optional "delimiter":"|" picks a single-char separator (\t literal for tab; defaults to ,). Values containing the delimiter or " are RFC-4180-quoted with internal " doubled; newlines inside values are replaced with a space so one physical line equals one logical row. Errors still come as JSON {"error":"..."}. Combines with join (2026.05.1+) — emits the same column-prefixed tabular table as the JSON shape, just CSV-encoded. Same shape works on fetch and aggregate.
Criteria shape¶
Each element of criteria is a JSON object:
For between:
Multiple criteria at the top level are AND-combined. Compose OR with a {"or":[...]} sub-node:
"criteria":[
{"field":"status","op":"eq","value":"paid"},
{"or":[
{"field":"region","op":"eq","value":"us"},
{"field":"total","op":"gte","value":"1000"}
]}
]
Reads as: status = 'paid' AND (region = 'us' OR total >= 1000).
{"or":[...]}— branch matches if any child matches.{"and":[...]}— explicit AND sub-branch, useful when nesting.- Flat arrays remain implicit AND (zero change to existing queries).
- Max nesting depth is 16. Empty
or:[]/and:[]returns{"error":"empty or/and"}.
Execution paths¶
The planner selects automatically:
| Shape | Example | Strategy |
|---|---|---|
| Single indexed leaf | [A] where A is indexed |
Primary-indexed-leaf scan. |
| Pure AND, 2+ indexed leaves on rangeable ops | [{a=x}, {b>10}] |
PRIMARY_INTERSECT — walk each leaf's btree into a KeySet, intersect candidate hash sets, skip per-record fetch for count. Walks most-selective-first. Eligible ops: eq, lt, gt, lte, gte, between, in, starts. Caps at MAX_INTERSECT_LEAVES=8. |
| Pure AND, mixed | [{a=x}, {bio contains x}] |
Primary-indexed-leaf for the indexed sibling; the rest post-filter via criteria_match_tree. |
| AND + OR, AND sibling indexed | [{a=x}, {or:[{b},{c}]}] |
AND leaf drives candidates; OR sub-tree evaluated per record. OR children don't need indexes. |
| Pure OR, every child indexed | [{or:[{a=x},{b=y}]}] |
Per-child B+ tree lookups unioned into a lock-free KeySet. Sublinear — no shard scan. Pure-OR count returns \|KeySet\| directly. |
| OR with any non-indexed child | [{or:[{a=x},{b=y}]}] (b not indexed) |
Full parallel shard scan + tree match. |
Hybrid (non-indexed AND + fully-indexed OR sub-tree) uses the KeySet as primary-candidate source and applies the AND siblings as a post-filter.
Per-shard btree layout: each indexed field is sharded into index_splits_for(splits) btree files at <obj>/indexes/<field>/<NNN>.idx (non-linear curve: 8→2, 16→4, 32→4, 64→8, 128→16, 256→16, 512→32, 1024→64, 2048→64, 4096→128). Reads fan out across all idx-shards via the parallel-for pool; writes route by record hash to one shard.
Operators¶
38 operators. Every operator uses an index when the field is indexed (with the exceptions noted below for full-scan ops). Composite indexes (field1+field2) only assist eq, starts, and between-via-prefix; substring/range ops on composites fall back to leaf scan.
Equality, range, set membership¶
| Operator | Description | Applies to | Example |
|---|---|---|---|
eq / equal |
Exact match | all types | {"field":"status","op":"eq","value":"paid"} |
neq / not_equal |
Not equal | all | {"field":"status","op":"neq","value":"void"} |
lt / less |
Strictly less than | numeric, date, datetime | {"field":"age","op":"lt","value":"65"} |
gt / greater |
Strictly greater than | numeric, date, datetime | {"field":"age","op":"gt","value":"18"} |
lte / less_eq |
<= |
numeric, date, datetime | {"field":"score","op":"lte","value":"999"} |
gte / greater_eq |
>= |
numeric, date, datetime | {"field":"score","op":"gte","value":"100"} |
between |
Inclusive range | numeric, date, datetime, varchar (lexicographic) | {"field":"age","op":"between","value":"18","value2":"65"} |
in |
Value in CSV set | all | {"field":"status","op":"in","value":"active,pending"} |
nin / not_in |
Value not in set | all | {"field":"role","op":"nin","value":"bot,test"} |
exists |
Field present / non-empty | all | {"field":"phone","op":"exists"} |
nexists / not_exists |
Field missing / empty | all | {"field":"deleted_at","op":"nexists"} (forces full scan — missing fields aren't in the index) |
Varchar matching — case-sensitive (default)¶
| Operator | Description | Notes |
|---|---|---|
like |
Wildcard — % or * |
Indexed shortcut: "foo" (no %) → point lookup; "foo%" → prefix range; "%foo" / "%foo%" → leaf scan |
nlike / not_like |
Wildcard negated | Leaf scan |
contains |
Substring match | Leaf scan with per-entry filter |
ncontains / not_contains |
Not substring | Leaf scan |
starts / starts_with |
Prefix match | Indexed prefix range scan |
ends / ends_with |
Suffix match | Leaf scan |
Varchar matching — case-insensitive (ASCII tolower)¶
ilike, not_ilike, icontains, not_icontains, istarts, iends — same semantics as their case-sensitive counterparts but tolower per byte before compare. Always leaf scan (no prefix shortcut).
Length filters (varchar only — answered from btree leaf entry's vlen, no record fetch)¶
| Operator | Example |
|---|---|
len_eq |
{"field":"name","op":"len_eq","value":"5"} |
len_neq |
{"field":"name","op":"len_neq","value":"5"} |
len_lt, len_gt, len_lte, len_gte |
{"field":"bio","op":"len_lt","value":"10"} |
len_between |
{"field":"name","op":"len_between","value":"3","value2":"8"} |
Field-vs-field (compare two fields on the same record — full scan only)¶
| Operator | Example |
|---|---|
eq_field, neq_field |
{"field":"createdAt","op":"eq_field","value":"updatedAt"} |
lt_field, gt_field, lte_field, gte_field |
{"field":"used","op":"gt_field","value":"limit"} |
The RHS is per-record so no btree shortcut is possible — these always full-scan.
Regex (POSIX extended regex on varchar — full scan only)¶
| Operator | Example |
|---|---|
regex |
{"field":"sku","op":"regex","value":"^[A-Z]{2}-[0-9]{4}$"} |
not_regex |
{"field":"phone","op":"not_regex","value":"^\\+1"} |
Compiled once at criteria-parse time; matched per record with REG_STARTEND on the hot path. Indexed fields could in theory walk leaves only, but the per-entry regexec cost dominates the saving from skipping record fetch — kept on the full-scan path.
Value formatting¶
- varchar — raw string.
- int / long / short — numeric string (
"30","1000"). - double — numeric string with optional decimal (
"3.14"). - bool —
"true"/"false". - date —
"yyyyMMdd"(e.g.,"20260418"). - datetime —
"yyyyMMddHHmmss"(e.g.,"20260418153012"). - numeric — decimal string matching the declared scale (
"1500.75"fornumeric:12,2).
Types are enforced: passing a non-numeric string to gt on an int field returns an error.
Sorting¶
order_by buffers every match into memory, sorts, then slices by offset / limit. Cost is O(matches * log matches) — fine for thousands of matches, not for millions.
- Numeric types (
int,long,short,double,numeric,date,datetime,bool,byte) sort numerically. varcharsorts lexicographically.order: "asc"is the default;"desc"flips.- Not compatible with
join— tabular join output doesn't sort.
For streaming through large results in an arbitrary order, use fetch with keyset cursor pagination instead.
Cursor pagination (keyset)¶
For deep pagination on large result sets, offset-based paging pays O(matches) per page (the buffer-sort path above). A cursor driven off an indexed order_by field is O(limit) regardless of page depth.
Basic cursor usage¶
// Page 1 — signal cursor pagination with cursor:null (or cursor:{})
{"mode":"find","dir":"t","object":"orders",
"criteria":[{"field":"status","op":"eq","value":"paid"}],
"order_by":"amount","order":"asc","limit":100,"cursor":null}
// Response wraps rows and emits the next-page cursor
{"rows":[...], "cursor":{"amount":"500.00","key":"ord_4912"}}
// Page N+1 — hand back the previous page's cursor verbatim
{"mode":"find", ..., "order_by":"amount","limit":100,
"cursor":{"amount":"500.00","key":"ord_4912"}}
// Last page returns "cursor":null
Cursor with total count (total)¶
To get both the page results and the total match count in a single request:
// Page 1 — add "total":true to get the total count
{"mode":"find","dir":"t","object":"orders",
"criteria":[{"field":"status","op":"eq","value":"paid"}],
"order_by":"amount","order":"asc","limit":100,"cursor":null,"total":true}
// Response includes both the page and the total count
{"rows":[...], "cursor":{"amount":"500.00","key":"ord_4912"}, "total":5234}
The total field contains the complete match count against the criteria, allowing UI pagination controls to render "Page 1 of N" immediately. total has negligible cost (O(1) metadata read) and only applies when cursor pagination is active (cursor:null or cursor:{...}).
Cursor rules¶
order_byfield must be indexed — cursor queries reject otherwise with a clear error.- Cursor tie-breaks on
hash16(primary_key)when multiple rows share the sameorder_byvalue, so pagination is stable across ties. cursor:nullorcursor:{}in the request opts into cursor mode (page 1, walks from start/end).- Omitting
cursorentirely keeps backward-compat behaviour (unwrapped array, buffer-sort fororder_by). format:"csv"andjoinare not supported with cursor — use the non-cursorfindpath for those.- ASC + DESC both supported; the server-side k-way merge across per-shard btree iterators reconstructs global order.
totalis only meaningful with active cursor pagination; it is ignored in non-cursor mode.
Projection¶
fields narrows the returned columns. Supports either CSV or JSON array:
key is always included in the default form; in rows format, key is the first column.
Excluding keys¶
excludedKeys drops matching records by key. Useful for "everything except these":
{"mode":"find","dir":"default","object":"users",
"criteria":[{"field":"active","op":"eq","value":"true"}],
"excludedKeys":"bot1,bot2,system"}
Indexed vs full-scan¶
The planner picks an index automatically. Rules of thumb:
- Indexed path — used when any criterion's field has a matching index (single-field or the leading component of a composite). B+ tree scan + record re-filter. 1–3 ms on 1 M records.
- Full scan — used when no criterion is indexed. Parallel per-kf-shard slot walk; each live slot's segment payload is fetched only when the in-kf hash +
match_typed()agree on a candidate. The kf slot array stays in page cache so the bulk of the scan is hash-rejection without touching segments.
See Concepts → Indexes for the selection logic.
Recipes¶
Pagination with filter and projection¶
{"mode":"find","dir":"acme","object":"invoices",
"criteria":[
{"field":"status","op":"eq","value":"paid"},
{"field":"paid_at","op":"gte","value":"20260318000000"}
],
"fields":["id","customer","total","paid_at"],
"order_by":"paid_at","order":"desc",
"offset":0,
"limit":50}
"Not in" a set¶
{"mode":"find","dir":"default","object":"users",
"criteria":[{"field":"role","op":"nin","value":"bot,spam,deleted"}]}
Field presence¶
{"mode":"find","dir":"default","object":"users",
"criteria":[
{"field":"phone","op":"exists"},
{"field":"deleted_at","op":"nexists"}
]}
Count only (no records returned)¶
Use count for this — same criteria shape, zero record materialization.
Joins¶
Any find query can enrich results by joining other objects. See joins.