pgvector

VectorField

dorm.contrib.pgvector.VectorField

Bases: Field[list]

Column storing a fixed-length float vector.

Parameters:

dimensions (int, required): Vector length. The column is declared vector(dimensions) on PostgreSQL and BLOB on SQLite (the size is enforced in Python on both backends because SQLite's BLOB has no length constraint).

The Python type is list[float] on read; on write we accept list / tuple / numpy.ndarray / pgvector's own Vector — anything iterable that yields numeric values. The value goes out as the right wire format for the active backend:

  • pgvector: "[v1,v2,…]" text form.
  • sqlite-vec: packed little-endian float32 BLOB.
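The two wire formats listed above can be illustrated with the standard library alone. This is a minimal sketch, not the field's actual implementation: the helper names (as_floats, to_pgvector_text, to_float32_blob, from_float32_blob) are hypothetical and exist only for this example.

```python
from __future__ import annotations

import struct
from typing import Iterable


def as_floats(values: Iterable[float], dimensions: int) -> list[float]:
    """Coerce to list[float] and enforce the declared length in Python."""
    floats = [float(v) for v in values]
    if len(floats) != dimensions:
        raise ValueError(f"expected {dimensions} values, got {len(floats)}")
    return floats


def to_pgvector_text(values: Iterable[float]) -> str:
    """Render the "[v1,v2,...]" text form that pgvector parses."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def to_float32_blob(values: Iterable[float]) -> bytes:
    """Pack as consecutive little-endian float32 values, 4 bytes each."""
    floats = [float(v) for v in values]
    return struct.pack(f"<{len(floats)}f", *floats)


def from_float32_blob(blob: bytes) -> list[float]:
    """Inverse of to_float32_blob."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


vec = as_floats((1, 0.5, -2.0), dimensions=3)
text = to_pgvector_text(vec)   # "[1.0,0.5,-2.0]"
blob = to_float32_blob(vec)    # 12 bytes: 3 values x 4 bytes
```

Note that float32 packing is lossy for values that are not exactly representable in 32 bits, so a round trip through the BLOB form may not return the original Python floats bit for bit.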

get_db_prep_value(value: Any) -> Any

Adapt value to the wire format the active backend expects.

Detects the backend by peeking at the model's default connection wrapper (dorm.db.connection.get_connection()). That works for the common case of a single DATABASES alias; multi-database setups should make sure VectorField is only used on tables routed to a consistent backend.

Returns:

Type: Any, one of:

  • bytes for SQLite: packed little-endian float32, what sqlite-vec stores natively.
  • str for PostgreSQL: [v1,…] text form, what pgvector accepts even without the pgvector Python package installed.

Distance expressions

dorm.contrib.pgvector.L2Distance

Bases: _VectorDistance

Euclidean (L2) distance.

  • pgvector: col <-> %s. Pair with vector_l2_ops.
  • sqlite-vec: vec_distance_L2(col, %s).
  • libsql: vector_distance_l2(col, vector32(?)).

Smaller = more similar.
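For reference, the quantity these operators compute can be written in plain Python. This sketch only illustrates the maths and the "smaller = more similar" ordering; the function name l2_distance and the sample rows are made up for the example and are not part of dorm.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [0.0, 0.0]
rows = {"near": [1.0, 0.0], "far": [3.0, 4.0]}

# Ascending order puts the most similar row first.
ranked = sorted(rows, key=lambda name: l2_distance(rows[name], query))
```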

dorm.contrib.pgvector.CosineDistance

Bases: _VectorDistance

Cosine distance (1 - cosine_similarity).

  • pgvector: col <=> %s. Pair with vector_cosine_ops.
  • sqlite-vec: vec_distance_cosine(col, %s).
  • libsql: vector_distance_cos(col, vector32(?)).

Smaller = more similar. On L2-normalised embeddings this ranks rows in the same order as MaxInnerProduct, and it works on every backend.
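The relationship to the negated inner product on L2-normalised embeddings can be checked numerically. The sketch below is plain Python with hypothetical helper names; it shows that for unit vectors the cosine distance equals 1 plus the negated inner product, so both produce the same ascending order.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    """Scale v to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def negated_inner_product(a, b):
    return -dot(a, b)


query = normalise([1.0, 0.0])
rows = {
    "same": normalise([2.0, 0.0]),
    "diagonal": normalise([1.0, 1.0]),
    "opposite": normalise([-1.0, 0.0]),
}

by_cosine = sorted(rows, key=lambda k: cosine_distance(rows[k], query))
by_ip = sorted(rows, key=lambda k: negated_inner_product(rows[k], query))
```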

dorm.contrib.pgvector.MaxInnerProduct

Bases: _VectorDistance

Negated inner product.

  • pgvector: col <#> %s. Pair with vector_ip_ops.
  • sqlite-vec: not supported — sqlite-vec doesn't ship a negated-inner-product function.
  • libsql: not supported today; fall back to CosineDistance over L2-normalised embeddings.

pgvector returns -inner_product so that ORDER BY ASC still puts the most-similar rows first.
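The sign flip, and the reason the cosine fallback calls for L2-normalised embeddings, can be seen in a small plain-Python sketch. The row names and vectors are invented for illustration: on raw vectors a long, off-axis vector can outrank a perfectly aligned short one, and normalising removes that magnitude effect.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def negated_inner_product(a, b):
    # Negating turns "largest inner product" into "smallest value",
    # so an ascending sort returns the best match first.
    return -dot(a, b)


query = [1.0, 0.0]
rows = {"short_aligned": [1.0, 0.0], "long_off_axis": [10.0, 10.0]}

raw_order = sorted(rows, key=lambda k: negated_inner_product(rows[k], query))
unit_order = sorted(
    rows, key=lambda k: negated_inner_product(normalise(rows[k]), query)
)
```

Here raw_order starts with long_off_axis (inner product 10 against 1), while unit_order starts with short_aligned, which is the order a cosine distance would give.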

Index helpers

dorm.contrib.pgvector.HnswIndex

Bases: _VectorIndexBase

HNSW (Hierarchical Navigable Small World) index for pgvector.

Tuning knobs you'll most often touch:

  • m= — graph fan-out. Default 16. Higher = better recall + bigger index. Range typically 4-64.
  • ef_construction= — build-time search depth. Default 64. Higher = better recall + slower build.
  • Query-time recall vs latency is controlled by SET hnsw.ef_search = N (default 40); this is a session setting, not part of the index definition itself.

Build time grows roughly linearly with row count and with ef_construction; expect minutes for a million rows even on fast hardware.
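To make the knobs concrete, the sketch below builds the raw SQL that pgvector documents for an HNSW index and for the ef_search setting. It is an illustration only: the helper names, the table items and the column embedding are hypothetical, and the exact statement HnswIndex emits may differ.

```python
def hnsw_index_sql(table, column, opclass="vector_l2_ops", m=16, ef_construction=64):
    """Build a pgvector-style CREATE INDEX statement for an HNSW index."""
    for name in (table, column, opclass):
        # Crude guard for this sketch; real code should quote identifiers.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} {opclass}) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)})"
    )


def ef_search_sql(n):
    """Session-level setting that trades query latency for recall."""
    return f"SET hnsw.ef_search = {int(n)}"


ddl = hnsw_index_sql("items", "embedding")
tuning = ef_search_sql(100)
```

The operator class has to match the distance expression used in queries, for example vector_cosine_ops when ordering by CosineDistance.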

dorm.contrib.pgvector.IvfflatIndex

Bases: _VectorIndexBase

IVFFlat (Inverted File with Flat compression) index for pgvector.

Required tuning knob:

  • lists= — number of cluster centroids. Rule of thumb: rows / 1000 for under 1M rows, sqrt(rows) for larger tables. The index needs at least one row per list at build time, so populate the table before creating the index.

Query-time recall is tuned via SET ivfflat.probes = N (1 to lists; default 1). Higher = better recall + slower.
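The rule of thumb for lists and the valid range for probes reduce to a few lines of arithmetic. The function names below are hypothetical and the result is only a starting point to benchmark against, not a value the library computes for you.

```python
import math


def suggested_lists(row_count):
    """rows / 1000 under 1M rows, sqrt(rows) for larger tables."""
    if row_count <= 0:
        raise ValueError("populate the table before creating the index")
    if row_count < 1_000_000:
        return max(1, row_count // 1000)
    return max(1, math.isqrt(row_count))


def clamp_probes(probes, lists):
    """Keep ivfflat.probes inside its valid range of 1 to lists."""
    return min(max(1, probes), lists)


small = suggested_lists(200_000)     # 200
large = suggested_lists(4_000_000)   # 2000
probes = clamp_probes(500, small)    # 200, capped at the number of lists
```

The two formulas agree at exactly one million rows (1000 lists either way), so the recommendation does not jump at the threshold.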

Build is faster and the on-disk footprint smaller than HNSW, but recall plateaus lower. Use HNSW unless build time is a real constraint.

VectorExtension

dorm.contrib.pgvector.VectorExtension

Bases: Operation

Migration operation that enables vector search on the target DB — pgvector on PostgreSQL, sqlite-vec on SQLite.

Runs idempotently:

  • PostgreSQL: runs CREATE EXTENSION IF NOT EXISTS "vector" forwards and DROP EXTENSION IF EXISTS "vector" backwards. The extension persists on the server.
  • SQLite: loads sqlite-vec into the migration's connection and registers a hook on the wrapper so that later connections (re-opens, new threads) load it automatically. The hook key (_vec_extension_enabled) is a wrapper attribute, not a DB row, so it does not survive a process restart; a new process has to register the hook again, either by re-running the migration or by using load_sqlite_vec_extension from app startup. The generated migration file is the recommended trigger because it lives in source control next to the model.

Typical layout:

# 0001_enable_pgvector.py
from dorm.contrib.pgvector import VectorExtension
operations = [VectorExtension()]

Generate with dorm makemigrations --enable-pgvector <app>.