Seeds

Seeds provide reproducible random number generation across experiments. The seed system ensures that runs with the same seed bundle produce identical results.

metalab.seeds

Seeds module: Explicit RNG control with deterministic derivation.

Provides:

  • SeedBundle: Manages root seed and derived sub-seeds
  • SeedPlan: Generates seed bundles for replicates
  • seeds(): Factory for creating seed plans

SeedBundle dataclass

SeedBundle(root_seed: int, replicate_index: int | None = None)

Bundle of seeds for reproducible random number generation.

All randomness in an operation should be derived from this bundle, ensuring reproducibility given the same SeedBundle.

Attributes:

  • root_seed (int): The base seed for all derivations.
  • replicate_index (int | None): The replicate number (if part of a SeedPlan).

derive

derive(name: str) -> int

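Derive a sub-seed deterministically from the root seed, the given name, and the replicate index.

Uses SHA-256 so that derived seeds are stable across platforms and processes, unlike Python's built-in hash(), which is salted per process for strings.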

Parameters:

  • name (str, required): A unique name for this sub-seed (e.g., "sampling", "init").

Returns:

  • int: A 64-bit integer seed.

Example:

from metalab.seeds import SeedBundle

bundle = SeedBundle(root_seed=42, replicate_index=0)
seed1 = bundle.derive("sampling")
seed2 = bundle.derive("initialization")  # a different name gives a different sub-seed
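
For intuition, a name-based derivation of this kind can be sketched with the standard library alone. This is an illustration under stated assumptions, not metalab's implementation: the helper name derive_seed, the "root:replicate:name" key layout, and the choice to keep the first 8 bytes of the digest are all hypothetical.

```python
import hashlib
from typing import Optional


def derive_seed(root_seed: int, name: str, replicate_index: Optional[int] = None) -> int:
    """Hypothetical helper: hash root + replicate + name into a 64-bit integer."""
    # Assumed key layout; the real encoding used by metalab may differ.
    key = f"{root_seed}:{replicate_index}:{name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Keep the first 8 bytes (64 bits) of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big")


sampling_seed = derive_seed(42, "sampling", replicate_index=0)
init_seed = derive_seed(42, "initialization", replicate_index=0)
```

Because the result depends only on the hashed inputs, calling the helper again with the same arguments returns the same integer on any machine, while changing the name or the replicate index changes the result.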

for_preprocessing classmethod

for_preprocessing(base_seed: int) -> SeedBundle

Create a SeedBundle for preprocessing steps.

Use this when you need reproducible randomness during data preprocessing, before the experiment runs. The preprocessing seed is derived from the base seed using a "preprocessing" namespace, ensuring it doesn't collide with replicate seeds.

Include the seed in the preprocessed filename so changing it automatically triggers new preprocessing (cache miss).

Parameters:

  • base_seed (int, required): The experiment's base seed (same value you pass to metalab.seeds(base=...)). This ensures preprocessing and experiment runs share the same seed hierarchy.

Returns:

  • SeedBundle: A SeedBundle for use in preprocessing code.

Example:

import metalab
from metalab.seeds import SeedBundle

BASE_SEED = 42

# Use for preprocessing (before metalab.run)
seeds = SeedBundle.for_preprocessing(BASE_SEED)
rng = seeds.numpy("train_test_split")
# my_split and data are placeholders for your own splitting function and dataset
train, test = my_split(data, rng=rng)

# Include seed in filename for automatic cache invalidation
output_path = f"./cache/processed_seed{BASE_SEED}.h5ad"

# Same base seed for experiment
exp = metalab.Experiment(
    context=MyContext(data=metalab.FilePath(output_path)),
    seeds=metalab.seeds(base=BASE_SEED, replicates=5),
    # ... remaining Experiment arguments
)
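
The filename convention in the example above can be turned into a simple cache check. The sketch below uses only the standard library; the helper name cache_path and the directory layout are assumptions for illustration, not part of metalab.

```python
from pathlib import Path


def cache_path(base_seed: int, cache_dir: str = "./cache") -> Path:
    """Hypothetical helper: build a seed-specific path for preprocessed data."""
    return Path(cache_dir) / f"processed_seed{base_seed}.h5ad"


path_a = cache_path(42)
path_b = cache_path(43)

# A different seed maps to a different file, so stale output is never reused.
needs_preprocessing = not path_a.exists()
```

Changing the base seed changes the path, so the existence check fails and preprocessing runs again with the new seed.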

from_dict classmethod

from_dict(data: dict[str, Any]) -> SeedBundle

Create a SeedBundle from a dictionary.

Parameters:

  • data (dict[str, Any], required): Dictionary with root_seed and optional replicate_index.

Returns:

  • SeedBundle: A SeedBundle instance.

numpy

numpy(name: str = 'default') -> Any

Create a NumPy Generator instance seeded from this bundle.

Requires numpy to be installed (optional dependency).

Parameters:

  • name (str, default 'default'): Name for the derived seed.

Returns:

  • Any: A seeded numpy.random.Generator instance.

Raises:

  • ImportError: If numpy is not installed.

Example:

from metalab.seeds import SeedBundle

bundle = SeedBundle(root_seed=42)
rng = bundle.numpy("sampling")
values = rng.random(100)  # 100 floats drawn uniformly from [0, 1)

numpy_seed

numpy_seed(name: str = 'default') -> int

Return a NumPy-safe integer seed derived from this bundle.

Parameters:

  • name (str, default 'default'): Name for the derived seed.

Returns:

  • int: A 32-bit non-negative integer seed suitable for NumPy.
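
A 32-bit seed matters because some NumPy entry points, such as the legacy numpy.random.seed, only accept integers between 0 and 2**32 - 1. One way to fold a wider seed into that range is to keep its low 32 bits. The helper name to_uint32 and the masking approach are assumptions for illustration; metalab may reduce the seed differently.

```python
def to_uint32(seed: int) -> int:
    """Hypothetical helper: keep the low 32 bits of an integer seed."""
    return seed & 0xFFFFFFFF


wide_seed = 0x1234_5678_9ABC_DEF0  # stand-in for a 64-bit derived seed
narrow_seed = to_uint32(wide_seed)
```

Masking always yields a value in the accepted range, including for negative inputs, at the cost of discarding the upper bits.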

rng

rng(name: str = 'default') -> stdlib_random.Random

Create a stdlib Random instance seeded from this bundle.

Parameters:

  • name (str, default 'default'): Name for the derived seed.

Returns:

  • Random: A seeded random.Random instance.

Example:

from metalab.seeds import SeedBundle

bundle = SeedBundle(root_seed=42)
rng = bundle.rng("sampling")
value = rng.random()  # float in [0.0, 1.0)
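
The point of named streams is that each component draws from its own generator, so extra draws in one component do not shift the values seen by another. The sketch below shows this with plain random.Random and hard-coded stand-in seeds; the constants and the draw helper are hypothetical and take the place of seeds a bundle would derive.

```python
import random

# Stand-in seeds; in practice these would come from a SeedBundle.
SAMPLING_SEED = 1001
INIT_SEED = 2002


def draw(seed: int, n: int) -> list:
    """Return n floats from a fresh generator seeded with the given value."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first_run = draw(SAMPLING_SEED, 3)
second_run = draw(SAMPLING_SEED, 3)  # same seed, same sequence
other_stream = draw(INIT_SEED, 3)    # different seed, independent sequence
```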

to_dict

to_dict() -> dict[str, Any]

Serialize the bundle to a dictionary.

Returns:

  • dict[str, Any]: A dictionary representation of the bundle.
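
The to_dict and from_dict pair is meant for round-tripping a bundle through plain data. The sketch below shows the idea with a stand-in dataclass. BundleRecord is hypothetical, and the assumption that to_dict uses the same keys that from_dict accepts (root_seed, replicate_index) is not confirmed by this page.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class BundleRecord:
    """Hypothetical stand-in with the same two fields as SeedBundle."""
    root_seed: int
    replicate_index: Optional[int] = None


record = BundleRecord(root_seed=42, replicate_index=0)
as_dict = asdict(record)           # plain dict, safe to store or log
rebuilt = BundleRecord(**as_dict)  # reconstruct an equal object from the dict
```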

SeedPlan

SeedPlan(base: int, replicates: int = 1)

Plan for generating seed bundles across replicates.

Each replicate gets a unique SeedBundle with a distinct replicate_index, allowing for reproducible replication studies.

Example:

from metalab.seeds import SeedPlan

plan = SeedPlan(base=42, replicates=3)
for bundle in plan:
    # bundle.replicate_index: 0, 1, 2
    rng = bundle.numpy()
    ...

Initialize the seed plan.

Parameters:

  • base (int, required): The base seed for all bundles.
  • replicates (int, default 1): Number of replicates to generate.

base_seed property

base_seed: int

The base seed for this plan.

replicates property

replicates: int

The number of replicates in this plan.

__getitem__

__getitem__(index: int) -> SeedBundle

Get the SeedBundle for a specific replicate index.

__iter__

__iter__() -> Iterator[SeedBundle]

Yield a SeedBundle for each replicate.

__len__

__len__() -> int

Return the number of replicates.
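
Together, __getitem__, __iter__, and __len__ make a plan behave like a read-only sequence of bundles. The stand-in below illustrates that protocol. MiniBundle and MiniPlan are hypothetical, simplified classes; details such as bounds checking and how each bundle's root seed is set are assumptions and may differ in the real SeedPlan.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class MiniBundle:
    """Hypothetical stand-in for SeedBundle."""
    root_seed: int
    replicate_index: Optional[int] = None


class MiniPlan:
    """Hypothetical stand-in for SeedPlan: one bundle per replicate index."""

    def __init__(self, base: int, replicates: int = 1) -> None:
        self.base = base
        self.replicates = replicates

    def __len__(self) -> int:
        return self.replicates

    def __getitem__(self, index: int) -> MiniBundle:
        # Assumed behavior: reject indices outside 0..replicates-1.
        if not 0 <= index < self.replicates:
            raise IndexError(index)
        return MiniBundle(self.base, index)

    def __iter__(self) -> Iterator[MiniBundle]:
        return (self[i] for i in range(self.replicates))


plan = MiniPlan(base=42, replicates=3)
indices = [bundle.replicate_index for bundle in plan]
```

Indexing and iteration return equal bundles for the same replicate, so a single replicate can be re-run by index without looping over the whole plan.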

from_manifest_dict classmethod

from_manifest_dict(manifest: dict[str, Any]) -> 'SeedPlan'

Reconstruct a SeedPlan from a manifest dict.

Parameters:

  • manifest (dict[str, Any], required): Dict with "base" and "replicates" fields.

Returns:

  • SeedPlan: A SeedPlan with the same configuration.

to_manifest_dict

to_manifest_dict() -> dict[str, Any]

Return a JSON-serializable dict representation for experiment manifests.
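
Because the manifest form is a plain dict, it can be written to and read back from JSON without custom encoders. The sketch below uses a literal dict in place of the value returned by to_manifest_dict. Only the "base" and "replicates" fields are documented here, so treating them as the complete manifest is an assumption.

```python
import json

# Stand-in for plan.to_manifest_dict(); the real dict may carry more fields.
manifest = {"base": 42, "replicates": 3}

encoded = json.dumps(manifest, sort_keys=True)  # text suitable for a manifest file
restored = json.loads(encoded)                  # dict to pass to from_manifest_dict
```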

seeds

seeds(base: int, replicates: int = 1) -> SeedPlan

Create a SeedPlan for generating seed bundles.

Parameters:

  • base (int, required): The base seed for reproducibility.
  • replicates (int, default 1): Number of replicates.

Returns:

  • SeedPlan: A SeedPlan that yields SeedBundle instances.

Example:

from metalab import Experiment, seeds

seed_plan = seeds(base=42, replicates=3)

# Use with an experiment
exp = Experiment(
    name="my_exp",
    seeds=seed_plan,
    # ... remaining Experiment arguments
)

# Or iterate directly
for bundle in seed_plan:
    rng = bundle.numpy("sampling")
    ...

metalab.SeedBundle and metalab.SeedPlan

SeedBundle and SeedPlan are also exposed at the top level of the metalab package. Their signatures, attributes, and methods are identical to the metalab.seeds entries documented above.