Seeds¶
Seeds provide reproducible random number generation across experiments. The seed system ensures that runs with the same seed bundle produce identical results.
metalab.seeds ¶
Seeds module: Explicit RNG control with deterministic derivation.
Provides:
- SeedBundle: Manages root seed and derived sub-seeds
- SeedPlan: Generates seed bundles for replicates
- seeds(): Factory for creating seed plans
SeedBundle dataclass ¶
Bundle of seeds for reproducible random number generation.
All randomness in an operation should be derived from this bundle, ensuring reproducibility given the same SeedBundle.
Attributes:
| Name | Type | Description |
|---|---|---|
| root_seed | int | The base seed for all derivations. |
| replicate_index | int \| None | The replicate number (if part of a SeedPlan). |
derive ¶
Derive a sub-seed deterministically from root + name + replicate.
Uses SHA-256 for cross-platform stability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | A unique name for this sub-seed (e.g., "sampling", "init"). | required |

Returns:
| Type | Description |
|---|---|
| int | A 64-bit integer seed. |
Example:
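A minimal sketch of name-based derivation with SHA-256, not metalab's actual implementation. The function `derive_seed`, the `root:replicate:name` message layout, and the choice of the first 8 digest bytes are assumptions made for illustration; only the use of SHA-256 and the 64-bit result come from the description above.

```python
import hashlib
from typing import Optional


def derive_seed(root_seed: int, name: str, replicate_index: Optional[int] = None) -> int:
    """Hypothetical stand-in for SeedBundle.derive (message layout is an assumption)."""
    # Combine root seed, replicate index, and name into one byte string.
    message = f"{root_seed}:{replicate_index}:{name}".encode("utf-8")
    digest = hashlib.sha256(message).digest()
    # Keep the first 8 bytes to obtain an unsigned 64-bit integer.
    return int.from_bytes(digest[:8], "big")


sampling_seed = derive_seed(42, "sampling", replicate_index=0)
init_seed = derive_seed(42, "init", replicate_index=0)
```

Because the digest depends only on the inputs, the same root seed, name, and replicate index give the same sub-seed on any platform, while a different name gives an unrelated one.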
for_preprocessing classmethod ¶
Create a SeedBundle for preprocessing steps.
Use this when you need reproducible randomness during data preprocessing, before the experiment runs. The preprocessing seed is derived from the base seed using a "preprocessing" namespace, ensuring it doesn't collide with replicate seeds.
Include the seed in the preprocessed filename so changing it automatically triggers new preprocessing (cache miss).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base_seed | int | The experiment's base seed (same value you pass to metalab.seeds(base=...)). This ensures preprocessing and experiment runs share the same seed hierarchy. | required |

Returns:
| Type | Description |
|---|---|
| SeedBundle | A SeedBundle for use in preprocessing code. |
Example:

    import metalab
    from metalab import SeedBundle

    BASE_SEED = 42

    # Use for preprocessing (before metalab.run)
    seeds = SeedBundle.for_preprocessing(BASE_SEED)
    rng = seeds.numpy("train_test_split")
    train, test = my_split(data, rng=rng)  # my_split and data are user-defined

    # Include seed in filename for automatic cache invalidation
    output_path = f"./cache/processed_seed{BASE_SEED}.h5ad"

    # Same base seed for experiment
    exp = metalab.Experiment(
        context=MyContext(data=metalab.FilePath(output_path)),
        seeds=metalab.seeds(base=BASE_SEED, replicates=5),
        ...
    )
from_dict classmethod ¶
Create a SeedBundle from a dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, Any] | Dictionary with root_seed and optional replicate_index. | required |

Returns:
| Type | Description |
|---|---|
| SeedBundle | A SeedBundle instance. |
numpy ¶
Create a NumPy Generator instance seeded from this bundle.
Requires numpy to be installed (optional dependency).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name for the derived seed (default: "default"). | 'default' |

Returns:
| Type | Description |
|---|---|
| Any | A seeded numpy.random.Generator instance. |

Raises:
| Type | Description |
|---|---|
| ImportError | If numpy is not installed. |
Example:
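A sketch of how a derived seed can feed a NumPy Generator, assuming numpy is installed. `make_generator` and `same_draws` are hypothetical helpers, not part of metalab; `numpy.random.default_rng` is the real NumPy call they wrap. With metalab itself, the Generator would come from a call such as `bundle.numpy("sampling")`.

```python
def make_generator(seed: int):
    """Hypothetical helper: build a numpy Generator, failing clearly without numpy."""
    try:
        import numpy as np  # optional dependency, imported lazily
    except ImportError as exc:
        raise ImportError("numpy is required to create a Generator") from exc
    return np.random.default_rng(seed)


def same_draws(seed: int, count: int = 5) -> bool:
    """Check that two generators built from one seed produce identical draws."""
    first = make_generator(seed).integers(0, 100, size=count)
    second = make_generator(seed).integers(0, 100, size=count)
    return first.tolist() == second.tolist()
```

Importing numpy inside the function keeps it an optional dependency: code that never asks for a Generator never triggers the import.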
numpy_seed ¶
Return a NumPy-safe integer seed derived from this bundle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name for the derived seed (default: "default"). | 'default' |

Returns:
| Type | Description |
|---|---|
| int | A 32-bit non-negative integer seed suitable for NumPy. |
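Legacy NumPy seeding (`numpy.random.seed`, `RandomState`) accepts integers only in the range 0 to 2**32 - 1, which is why a 64-bit derived seed needs narrowing. A sketch of one way to narrow it, assuming a simple bit mask; the reduction metalab actually applies is not stated here, and `to_numpy_seed` is a hypothetical name.

```python
def to_numpy_seed(seed64: int) -> int:
    """Hypothetical helper: keep the low 32 bits of a 64-bit seed."""
    return seed64 & 0xFFFFFFFF


wide_seed = 0xDEADBEEFCAFEF00D  # stand-in for a 64-bit derived seed
narrow_seed = to_numpy_seed(wide_seed)
```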
rng ¶
Create a stdlib Random instance seeded from this bundle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name for the derived seed (default: "default"). | 'default' |

Returns:
| Type | Description |
|---|---|
| Random | A seeded random.Random instance. |
Example:
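A sketch using only the standard library, showing what a seeded random.Random provides: two instances built from the same seed replay the same sequence. The literal seed stands in for a value derived from the bundle; with metalab the instance would come from a call such as `bundle.rng("shuffle")`, where "shuffle" is only an illustrative name.

```python
import random

derived_seed = 1234567890123456789  # stand-in for a seed derived from the bundle

first = random.Random(derived_seed)
second = random.Random(derived_seed)

items_a = list(range(10))
items_b = list(range(10))
first.shuffle(items_a)
second.shuffle(items_b)

# Same seed, same shuffle order; the global random module state is untouched.
```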
to_dict ¶
Serialize the bundle to a dictionary.
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | A dictionary representation of the bundle. |
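A sketch of the round trip that to_dict and from_dict make possible, for example when a bundle is written to JSON and restored later. `BundleSketch` is a hypothetical stand-in, not the real SeedBundle; the key names follow the attribute names above, and the exact dictionary layout metalab emits is an assumption.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class BundleSketch:
    """Hypothetical stand-in mirroring SeedBundle's two attributes."""

    root_seed: int
    replicate_index: Optional[int] = None

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "BundleSketch":
        # replicate_index is optional, so fall back to None when absent.
        return cls(
            root_seed=data["root_seed"],
            replicate_index=data.get("replicate_index"),
        )


original = BundleSketch(root_seed=42, replicate_index=3)
restored = BundleSketch.from_dict(json.loads(json.dumps(original.to_dict())))
```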
SeedPlan ¶
Plan for generating seed bundles across replicates.
Each replicate gets a unique SeedBundle with a distinct replicate_index, allowing for reproducible replication studies.
Example:

    from metalab import SeedPlan

    plan = SeedPlan(base=42, replicates=3)
    for bundle in plan:
        # bundle.replicate_index: 0, 1, 2
        rng = bundle.numpy()
        ...
Initialize the seed plan.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base | int | The base seed for all bundles. | required |
| replicates | int | Number of replicates to generate. | 1 |
__getitem__ ¶
Get the SeedBundle for a specific replicate index.
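A sketch of the iteration and indexing behaviour described above, using a hypothetical `PlanSketch` class that yields `(root_seed, replicate_index)` pairs in place of real SeedBundle objects. The bounds check and the `__len__` method are assumptions; the source only states that each replicate gets a distinct replicate_index.

```python
from typing import Iterator, Tuple


class PlanSketch:
    """Hypothetical stand-in for SeedPlan."""

    def __init__(self, base: int, replicates: int = 1) -> None:
        self.base = base
        self.replicates = replicates

    def __len__(self) -> int:
        return self.replicates

    def __getitem__(self, index: int) -> Tuple[int, int]:
        # Reject indices outside the planned replicates (assumed behaviour).
        if not 0 <= index < self.replicates:
            raise IndexError(f"replicate index {index} out of range")
        return (self.base, index)

    def __iter__(self) -> Iterator[Tuple[int, int]]:
        return (self[i] for i in range(self.replicates))


plan = PlanSketch(base=42, replicates=3)
indices = [replicate_index for _, replicate_index in plan]
```

Indexing is useful when a single replicate has to be re-run in isolation: asking for one index yields the same seeds that replicate received in the full sweep.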
from_manifest_dict classmethod ¶
Reconstruct SeedPlan from manifest dict.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| manifest | dict[str, Any] | Dict with "base" and "replicates" fields. | required |

Returns:
| Type | Description |
|---|---|
| SeedPlan | A SeedPlan with the same configuration. |
to_manifest_dict ¶
Return a JSON-serializable dict representation for experiment manifests.
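A sketch of how a manifest entry with "base" and "replicates" fields survives a JSON round trip. The two helper functions are hypothetical stand-ins for to_manifest_dict and from_manifest_dict; the real manifest may carry additional fields.

```python
import json
from typing import Any, Dict, Tuple


def plan_to_manifest(base: int, replicates: int) -> Dict[str, Any]:
    """Hypothetical stand-in for SeedPlan.to_manifest_dict."""
    return {"base": base, "replicates": replicates}


def plan_from_manifest(manifest: Dict[str, Any]) -> Tuple[int, int]:
    """Hypothetical stand-in for SeedPlan.from_manifest_dict."""
    return (int(manifest["base"]), int(manifest["replicates"]))


manifest_text = json.dumps(plan_to_manifest(base=42, replicates=5), sort_keys=True)
base, replicates = plan_from_manifest(json.loads(manifest_text))
```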