pgvector
VectorField
dorm.contrib.pgvector.VectorField
Bases: Field[list]
Column storing a fixed-length float vector.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dimensions | int | required vector length. The column is declared | required |
The Python type is list[float] on read; on write we accept
list / tuple / numpy.ndarray / pgvector's own
Vector (anything iterable that yields numeric values). The
value goes out as the right wire format for the active backend:
- pgvector: "[v1,v2,…]" text form.
- sqlite-vec: packed little-endian float32 BLOB.
get_db_prep_value(value: Any) -> Any
Adapt value to the wire format the active backend expects.
Detects the backend by peeking at the model's default
connection wrapper (dorm.db.connection.get_connection()).
That works for the common case of a single DATABASES
alias; multi-database setups should make sure
VectorField is only used on tables routed to a
consistent backend.
Returns:
| Type | Description |
|---|---|
| Any | |
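The two wire formats that get_db_prep_value targets can be sketched with the standard library alone. This is an illustration under stated assumptions, not dorm's actual implementation: the helper names to_pgvector_text and to_sqlite_vec_blob are hypothetical.

```python
import struct
from typing import Iterable


def to_pgvector_text(values: Iterable[float]) -> str:
    """Render a vector in the "[v1,v2,...]" text form (hypothetical helper)."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def to_sqlite_vec_blob(values: Iterable[float]) -> bytes:
    """Pack a vector as little-endian float32, 4 bytes per element (hypothetical helper)."""
    floats = [float(v) for v in values]
    return struct.pack(f"<{len(floats)}f", *floats)


embedding = (1, 2.5, 3)  # any iterable of numbers works
text_form = to_pgvector_text(embedding)
blob_form = to_sqlite_vec_blob(embedding)
```

Note that the BLOB form stores float32, so values that are not exactly representable in 32 bits lose precision on the round trip.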
Distance expressions
dorm.contrib.pgvector.L2Distance
Bases: _VectorDistance
Euclidean (L2) distance.
- pgvector: col <-> %s. Pair with vector_l2_ops.
- sqlite-vec: vec_distance_L2(col, %s).
- libsql: vector_distance_l2(col, vector32(?)).
Smaller = more similar.
dorm.contrib.pgvector.CosineDistance
Bases: _VectorDistance
Cosine distance (1 - cosine_similarity).
- pgvector: col <=> %s. Pair with vector_cosine_ops.
- sqlite-vec: vec_distance_cosine(col, %s).
- libsql: vector_distance_cos(col, vector32(?)).
Smaller = more similar. On L2-normalised embeddings this ranks rows
identically to MaxInnerProduct (cosine distance is then
1 - inner_product) and works on every backend.
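The relationship between cosine distance and the negated inner product on L2-normalised vectors can be checked numerically. The sketch below uses plain Python rather than any database backend, and all function names are illustrative.

```python
import math


def normalise(v):
    """Scale v to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    """1 - cosine_similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def negated_inner_product(a, b):
    """-inner_product, the quantity MaxInnerProduct orders by."""
    return -sum(x * y for x, y in zip(a, b))


a = normalise([3.0, 4.0])
b = normalise([1.0, 0.0])

# On unit vectors the two measures differ only by the constant 1,
# so sorting by either one gives the same order.
gap = cosine_distance(a, b) - negated_inner_product(a, b)
```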
dorm.contrib.pgvector.MaxInnerProduct
Bases: _VectorDistance
Negated inner product.
- pgvector: col <#> %s. Pair with vector_ip_ops.
- sqlite-vec: not supported; sqlite-vec doesn't ship a negated-inner-product function.
- libsql: not supported today; fall back to CosineDistance over L2-normalised embeddings.
pgvector returns -inner_product so that ORDER BY ASC
still puts the most-similar rows first.
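Why the negation matters for ascending order can be seen in a small in-memory sketch. The row names and vectors are made up for illustration, and sorted() stands in for ORDER BY ASC.

```python
def negated_inner_product(a, b):
    """-inner_product: larger inner products become smaller sort keys."""
    return -sum(x * y for x, y in zip(a, b))


query = [1.0, 0.0]
rows = {
    "close": [0.9, 0.1],
    "medium": [0.5, 0.5],
    "far": [-1.0, 0.0],
}

# Ascending sort on the negated value puts the largest inner product first.
ranked = sorted(rows, key=lambda name: negated_inner_product(rows[name], query))
```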
Index helpers
dorm.contrib.pgvector.HnswIndex
Bases: _VectorIndexBase
HNSW (Hierarchical Navigable Small World) index for pgvector.
Tuning knobs you'll most often touch:
m=— graph fan-out. Default 16. Higher = better recall + bigger index. Range typically 4-64.ef_construction=— build-time search depth. Default 64. Higher = better recall + slower build.- Query-time recall vs latency is controlled by
SET hnsw.ef_search = N(default 40); not part of the index definition itself.
Build time grows roughly linearly with both row count and ef_construction;
expect minutes for a million rows even on fast hardware.
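For orientation, the knobs above map onto pgvector's SQL roughly as follows. This is a sketch that only builds the statements as strings; the table name items and column name embedding are hypothetical, and the SQL that HnswIndex actually emits may differ.

```python
def hnsw_index_sql(table, column, opclass="vector_l2_ops", m=16, ef_construction=64):
    """Build a pgvector HNSW CREATE INDEX statement.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} {opclass}) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)})"
    )


def ef_search_sql(n=40):
    """Build the session-level setting that trades recall for latency."""
    return f"SET hnsw.ef_search = {int(n)}"


create_stmt = hnsw_index_sql("items", "embedding", m=32, ef_construction=128)
tune_stmt = ef_search_sql(100)
```

The index definition and the query-time setting are separate statements, which is why ef_search can be changed per session without rebuilding the index.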
dorm.contrib.pgvector.IvfflatIndex
Bases: _VectorIndexBase
IVFFlat (Inverted File with Flat compression) index for pgvector.
Required tuning knob:
- lists= sets the number of cluster centroids. Rule of thumb: rows / 1000 for under 1M rows, sqrt(rows) for larger tables. The index needs at least one row per list at build time, so populate the table before creating the index.
Query-time recall is tuned via SET ivfflat.probes = N (1
to lists; default 1). Higher = better recall + slower.
Build is faster and the on-disk footprint smaller than HNSW, but recall plateaus lower. Use HNSW unless build time is a real constraint.
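The lists= rule of thumb can be written as a small helper. suggest_lists is a hypothetical name, not part of dorm, and the result is only a starting point for tuning.

```python
import math


def suggest_lists(row_count: int) -> int:
    """Starting value for lists=: rows / 1000 under 1M rows, sqrt(rows) above."""
    if row_count < 1_000_000:
        return max(1, row_count // 1000)
    return max(1, math.isqrt(row_count))


small_table = suggest_lists(200_000)
large_table = suggest_lists(4_000_000)
```

The max(1, ...) guard keeps the result valid for very small tables, where rows / 1000 would round down to zero.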
VectorExtension
dorm.contrib.pgvector.VectorExtension
Bases: Operation
Migration operation that enables vector search on the target DB — pgvector on PostgreSQL, sqlite-vec on SQLite.
Runs idempotently:
- PostgreSQL: CREATE EXTENSION IF NOT EXISTS "vector" forwards, DROP EXTENSION IF EXISTS "vector" backwards. The extension persists on the server.
- SQLite: loads sqlite-vec into the migration's connection AND registers a hook on the wrapper so every future connection (re-opens, new threads, new processes after restart) loads it automatically. The hook key (_vec_extension_enabled) is a wrapper attribute, not a DB row, so a process restart needs to hit the hook again, either by re-running the migration or by importing load_sqlite_vec_extension from app startup. The generated migration file is the recommended trigger because it lives in source control next to the model.
Typical layout::

    # 0001_enable_pgvector.py
    from dorm.contrib.pgvector import VectorExtension

    operations = [VectorExtension()]
Generate with dorm makemigrations --enable-pgvector <app>.