
dorm.contrib.sharding

Hash-based horizontal sharding.

See Sharding for recipes.

API

dorm.contrib.sharding.HashShardRouter

Routing entry for settings.DATABASE_ROUTERS.

Configuration::

from dorm.contrib.sharding import HashShardRouter
from myapp.models import Order, Customer

DATABASES = {
    "default": {...},
    "shard_0": {...},
    "shard_1": {...},
    "shard_2": {...},
    "shard_3": {...},
}
DATABASE_ROUTERS = [
    HashShardRouter(num_shards=4, shard_models={Order, Customer}),
]

Inside the request handler::

from dorm.contrib.sharding import with_shard_key

with with_shard_key(request.user.tenant_id):
    order = Order.objects.create(...)

Queries against sharded models raise RuntimeError when no shard key is pinned; silently falling back to default would scatter rows across shards inconsistently. Non-sharded models are passed through to the next router by returning None.

dorm.contrib.sharding.with_shard_key(key: Any)

Pin key as the active shard key for the enclosing block.

Any sharded query issued inside the block (sync or async) is routed to shard_for(key, …). The pin is per-task (asyncio) / per-thread context — it does not bleed between requests.

dorm.contrib.sharding.get_shard_key() -> Any | None

Return the currently-pinned shard key, or None.

dorm.contrib.sharding.shard_for(key: Any, num_shards: int, *, aliases: list[str] | None = None, salt: bytes = _DEFAULT_SALT) -> str

Return the database alias for key across num_shards shards.

Default alias names are shard_0 … shard_<N-1>; pass aliases to override (length must equal num_shards). The hash is deterministic across processes / Python versions because it uses a keyed BLAKE2b digest, not Python's built-in hash().

dorm.contrib.sharding.for_each_shard(func: Callable[[str], Any], *, num_shards: int, aliases: list[str] | None = None) -> dict[str, Any]

Run func(alias) against every shard alias in turn and return {alias: result}. Use for fan-out queries (count() of a global table that lives on every shard, etc.). Sequential — wrap the body in threads/asyncio yourself if parallelism is needed.