Results¶

The results module provides interfaces for querying and analyzing experiment outcomes. Use these classes to access run records, metrics, and artifacts after execution completes.

metalab.Results ¶

Results(store: Store, records: list[RunRecord])

Collection of experiment runs with querying and access capabilities.

Results wraps a Store and provides convenient access to:

- Individual Run objects via indexing
- Tabular view of results
- Filtering by status, tags, or parameters

Example:

result = metalab.run(experiment)  # stores in ./runs/{name} by default

# Access individual runs
run = result[0]
print(run.metrics)
artifact = run.artifact("summary")

# Get tabular view
df = result.table(as_dataframe=True)

# Filter results
successful = result.successful
filtered = result.filter(gene="KLF1")

# Export
result.to_csv("./output/results.csv")

# Display summary
result.display()

Initialize the Results collection.

Parameters:

Name	Type	Description	Default
`store`	`Store`	The store containing artifacts.	required
`records`	`list[RunRecord]`	List of RunRecords from the experiment.	required

failed `property` ¶

failed: Results

Get only failed runs.

records `property` ¶

records: list[RunRecord]

Get all run records (raw dataclass form).

runs `property` ¶

runs: list[Run]

Get all runs as Run objects.

store `property` ¶

store: 'Store'

The store containing the run data.

successful `property` ¶

successful: Results

Get only successful runs.

__getitem__ ¶

__getitem__(index: int) -> Run

__getitem__(index: slice) -> Results

__getitem__(index: int | slice) -> Run | Results

Get a run by index or a slice of results.

Parameters:

Name	Type	Description	Default
`index`	`int \| slice`	Integer index or slice.	required

Returns:

Type	Description
`Run \| Results`	Run for integer index, Results for slice.
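The integer-versus-slice behaviour described above follows the usual Python sequence pattern. The sketch below illustrates that pattern with a hypothetical `MiniResults` class holding plain strings; it is not metalab's implementation, only an illustration of the documented contract.

```python
from __future__ import annotations

from typing import Iterator


class MiniResults:
    """Hypothetical stand-in for Results; holds strings instead of Run objects."""

    def __init__(self, items: list[str]) -> None:
        self._items = list(items)

    def __getitem__(self, index: int | slice) -> str | MiniResults:
        # A slice yields a new collection of the same type ...
        if isinstance(index, slice):
            return MiniResults(self._items[index])
        # ... while an integer yields a single element.
        return self._items[index]

    def __iter__(self) -> Iterator[str]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)


results = MiniResults(["run-a", "run-b", "run-c"])
first = results[0]   # a single element
head = results[:2]   # still a MiniResults, so it can be sliced or iterated again
```

Returning the collection type from a slice is what allows expressions such as `result[:10].table()` to keep working on a subset.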

__iter__ ¶

__iter__() -> Iterator[Run]

Iterate over runs.

__len__ ¶

__len__() -> int

Return the number of runs.

compute_derived ¶

compute_derived(metrics: 'list[DerivedMetricFn]', *, overwrite: bool = False, progress: bool = False) -> None

Compute derived metrics for all runs and persist to store.

This method computes derived metrics from run artifacts/params/metrics and stores them in /derived/{run_id}.json for each run.

Parameters:

Name	Type	Description	Default
`metrics`	`'list[DerivedMetricFn]'`	List of derived metric functions. Each function receives a Run object and returns dict[str, Metric].	required
`overwrite`	`bool`	If True, recompute even if derived metrics exist.	`False`
`progress`	`bool`	Show progress bar (requires rich).	`False`

Example

def final_loss(run: Run) -> dict[str, Metric]:
    loss_history = run.artifact("loss_history")
    return {"final_loss": float(loss_history[-1])}

results.compute_derived([final_loss])
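Because a derived metric function only needs an object with the Run accessors it calls, it can be checked in isolation before being passed to `compute_derived`. The sketch below assumes a hypothetical `StubRun` class that mimics only `artifact()`; the `loss_history` artifact name follows the example above, and the `best_loss` key is illustrative.

```python
from typing import Any


class StubRun:
    """Hypothetical stand-in for Run, exposing only artifact()."""

    def __init__(self, artifacts: dict[str, Any]) -> None:
        self._artifacts = artifacts

    def artifact(self, name: str) -> Any:
        return self._artifacts[name]


def loss_stats(run: Any) -> dict[str, float]:
    """Derived metric function: one run in, a flat dict of named values out."""
    history = run.artifact("loss_history")
    return {
        "final_loss": float(history[-1]),
        "best_loss": float(min(history)),
    }


# Exercise the function on stub data before running it over real results.
stub = StubRun({"loss_history": [0.9, 0.4, 0.5]})
stats = loss_stats(stub)
```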

display ¶

display(*, group_by: list[str] | None = None, show_summary: bool = True) -> None

Display results summary to console.

Uses rich if available; otherwise falls back to plain text.

Parameters:

Name	Type	Description	Default
`group_by`	`list[str] \| None`	Optional metric keys to group results by.	`None`
`show_summary`	`bool`	Show overall summary statistics.	`True`

Example

result.display()
result.display(group_by=["gene", "perturbation_value"])

filter ¶

filter(status: str | Status | None = None, tags: list[str] | None = None, **params: Any) -> Results

Filter results by criteria.

Parameters:

Name	Type	Description	Default
`status`	`str \| Status \| None`	Filter by status ("success", "failed", "cancelled").	`None`
`tags`	`list[str] \| None`	Filter by tags (all must be present).	`None`
`**params`	`Any`	Filter by metric values.	`{}`

Returns:

Type	Description
`Results`	A new Results with filtered runs.

Example

# Filter by status
successful = result.filter(status="success")

# Filter by metric values
filtered = result.filter(gene="KLF1", perturbation_value=100)

# Chain filters
runs = result.filter(status="success").filter(gene="KLF1")
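The parameter table suggests that criteria combine with AND semantics: the status must match, every listed tag must be present, and each keyword value must be equal. The sketch below models that reading on plain dicts. The `matches` helper, the row layout, and the sample values are hypothetical, and metalab's actual matching rules may differ.

```python
from __future__ import annotations

from typing import Any


def matches(row: dict[str, Any], status: str | None = None,
            tags: list[str] | None = None, **criteria: Any) -> bool:
    """Return True if a row satisfies every given criterion (assumed AND semantics)."""
    if status is not None and row["status"] != status:
        return False
    # Every requested tag must be present on the row.
    if tags is not None and not set(tags) <= set(row["tags"]):
        return False
    # Keyword criteria are compared by equality against the row's values.
    return all(row["values"].get(key) == want for key, want in criteria.items())


# Hypothetical rows standing in for runs.
rows = [
    {"status": "success", "tags": ["pilot"], "values": {"gene": "KLF1"}},
    {"status": "failed", "tags": ["pilot"], "values": {"gene": "KLF1"}},
    {"status": "success", "tags": [], "values": {"gene": "GATA1"}},
]

kept = [r for r in rows if matches(r, status="success", gene="KLF1")]
```

Under this reading, chaining two filters is equivalent to passing both criteria in a single call.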

from_store `classmethod` ¶

from_store(store: Store, experiment_id: str | None = None) -> Results

Load results from a store.

Parameters:

Name	Type	Description	Default
`store`	`Store`	The store to load from.	required
`experiment_id`	`str \| None`	Optional filter by experiment ID.	`None`

Returns:

Type	Description
`Results`	Results containing the loaded runs.

Example

from metalab.store import FileStore

store = FileStore("./runs/my_experiment")
results = Results.from_store(store)

load ¶

load(run_id: str, artifact_name: str) -> Any

Load an artifact from a run by run_id.

Prefer using run.artifact(name) for cleaner access:

result[0].artifact("summary")

Parameters:

Name	Type	Description	Default
`run_id`	`str`	The run identifier.	required
`artifact_name`	`str`	The name of the artifact.	required

Returns:

Type	Description
`Any`	The deserialized artifact.

Raises:

Type	Description
`FileNotFoundError`	If the artifact doesn't exist.

summary ¶

summary() -> dict[str, Any]

Get a summary of the results.

Returns:

Type	Description
`dict[str, Any]`	Dict with counts and basic statistics.

table ¶

table(as_dataframe: bool = False) -> list[dict[str, Any]] | Any

Get results as a table.

Parameters:

Name	Type	Description	Default
`as_dataframe`	`bool`	If True, return a pandas DataFrame (requires pandas).	`False`

Returns:

Type	Description
`list[dict[str, Any]] \| Any`	List of dicts by default, or DataFrame if as_dataframe=True.

Raises:

Type	Description
`ImportError`	If as_dataframe=True but pandas is not installed.
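When pandas is not installed, the default list-of-dicts form can still be aggregated with the standard library. The rows below are hypothetical and their column names (`run_id`, `gene`, `score`) are assumptions; the real columns depend on the experiment.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows in the list-of-dicts shape returned when as_dataframe=False.
rows = [
    {"run_id": "r1", "gene": "KLF1", "score": 0.5},
    {"run_id": "r2", "gene": "KLF1", "score": 1.0},
    {"run_id": "r3", "gene": "GATA1", "score": 2.0},
]

# Group the score column by gene, then average each group.
by_gene: dict[str, list[float]] = defaultdict(list)
for row in rows:
    by_gene[row["gene"]].append(row["score"])

mean_score = {gene: mean(scores) for gene, scores in by_gene.items()}
```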

to_csv ¶

to_csv(path: str | Path, *, include_fingerprints: bool = False, timestamp: bool = False) -> Path

Export results to a CSV file.

Parameters:

Name	Type	Description	Default
`path`	`str \| Path`	Output path. If a directory, generates a timestamped filename.	required
`include_fingerprints`	`bool`	Include fingerprint columns.	`False`
`timestamp`	`bool`	Add a timestamp to the filename if path is a file.	`False`

Returns:

Type	Description
`Path`	Path to the written CSV file.

Raises:

Type	Description
`ImportError`	If pandas is not installed.

to_dataframe ¶

to_dataframe(*, include_params: bool = True, include_metrics: bool = True, include_record: bool = True, include_derived: bool = False, derived_metrics: 'list[DerivedMetricFn] | None' = None, artifact_reducers: dict[str, ArtifactReducer | ContextAwareReducer] | None = None, progress: bool = False) -> Any

Export results to a pandas DataFrame with optional artifact reduction.

This method provides flexible DataFrame export with:

- Resolved parameters (prefixed with 'param_')
- Captured metrics
- Record metadata (run_id, status, duration, etc.)
- Persisted derived metrics (from /derived/{run_id}.json)
- On-the-fly derived metrics via reducer functions

Parameters:

Name	Type	Description	Default
`include_params`	`bool`	Include params_resolved columns (prefixed with 'param_').	`True`
`include_metrics`	`bool`	Include metrics columns.	`True`
`include_record`	`bool`	Include record fields (run_id, status, duration, etc.).	`True`
`include_derived`	`bool`	Include persisted derived metrics from /derived/.	`False`
`derived_metrics`	`'list[DerivedMetricFn] \| None'`	List of derived metric functions for on-the-fly computation (not persisted). Each function receives a Run object and returns dict[str, Metric].	`None`
`artifact_reducers`	`dict[str, ArtifactReducer \| ContextAwareReducer] \| None`	(Deprecated, use derived_metrics) Dict mapping artifact name to reducer function.	`None`
`progress`	`bool`	Show progress bar when loading artifacts (requires rich).	`False`

Returns:

Type	Description
`Any`	pandas DataFrame with the requested columns.

Raises:

Type	Description
`ImportError`	If pandas is not installed.

Example (include persisted derived metrics):

df = results.to_dataframe(include_derived=True)

Example (on-the-fly derived metric):

def final_loss(run):
    return {"final_loss": run.artifact("loss_history")[-1]}

df = results.to_dataframe(derived_metrics=[final_loss])

Example (artifact reducer - legacy):

def reduce_history(arr):
    return {"final": arr[:, -1].mean(), "best": arr.min()}

df = results.to_dataframe(
    artifact_reducers={"history": reduce_history}
)
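Since `artifact_reducers` is marked deprecated in favour of `derived_metrics`, an existing plain reducer can be wrapped as a function of a run. The sketch below is one possible adapter: `as_derived_metric` and `StubRun` are hypothetical names, it covers only reducers that take the artifact as their single argument, and the resulting column names may differ from those the legacy option produces.

```python
from typing import Any, Callable


class StubRun:
    """Hypothetical stand-in for Run, exposing only artifact()."""

    def __init__(self, artifacts: dict[str, Any]) -> None:
        self._artifacts = artifacts

    def artifact(self, name: str) -> Any:
        return self._artifacts[name]


def as_derived_metric(
    artifact_name: str, reducer: Callable[[Any], dict[str, Any]]
) -> Callable[[Any], dict[str, Any]]:
    """Wrap a plain artifact reducer as a function that takes a run."""

    def derived(run: Any) -> dict[str, Any]:
        # Load the named artifact from the run and hand it to the reducer.
        return reducer(run.artifact(artifact_name))

    return derived


def reduce_history(values: list[float]) -> dict[str, float]:
    return {"final": values[-1], "best": min(values)}


history_metrics = as_derived_metric("history", reduce_history)
out = history_metrics(StubRun({"history": [3.0, 1.0, 2.0]}))
```

The wrapped function has the same shape as the derived metric functions shown earlier, so it could be passed in a `derived_metrics` list.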

metalab.Run ¶

Run(record: RunRecord, store: Store)

A single experiment run with access to its metrics and artifacts.

The Run object wraps a RunRecord and provides convenient access to:

- Run metadata (run_id, status, timestamps)
- Metrics captured during the run
- Artifacts stored for the run
- Experiment-level metadata via the experiment property

Example:

result = metalab.run(experiment)
run = result[0]  # Get first run

# Access metrics
print(run.metrics)
print(run.status)

# Access experiment metadata
print(run.experiment.metadata)

# Load artifacts
summary = run.artifact("summary")
for desc in run.artifacts():
    print(f"  {desc.name}: {desc.kind}")

Initialize the Run wrapper.

Parameters:

Name	Type	Description	Default
`record`	`RunRecord`	The underlying RunRecord.	required
`store`	`Store`	The store containing artifacts.	required

context_fingerprint `property` ¶

context_fingerprint: str

Fingerprint of the context used.

derived `property` ¶

derived: dict[str, Any]

Derived metrics computed post-hoc.

These are stored separately from the run record in /derived/{run_id}.json. Returns an empty dict if no derived metrics exist.

duration_ms `property` ¶

duration_ms: int

Run duration in milliseconds.

error `property` ¶

error: dict[str, Any] | None

Error information if the run failed.

experiment `property` ¶

experiment: ExperimentInfo

Experiment-level information including metadata.

Lazily loads from the experiment manifest on first access. If the manifest is not found, returns minimal info extracted from the experiment_id.

Returns:

Type	Description
`ExperimentInfo`	ExperimentInfo with experiment metadata.

Example:

# Access user-defined metadata
group_labels = run.experiment.metadata.get("group_labels")
markov_iter = run.experiment.metadata.get("markov_iter", 3)

experiment_id `property` ¶

experiment_id: str

The experiment identifier.

finished_at `property` ¶

finished_at: datetime

When the run finished.

metrics `property` ¶

metrics: dict[str, Any]

Metrics captured during the run.

params `property` ¶

params: dict[str, Any]

Resolved parameters for this run.

params_fingerprint `property` ¶

params_fingerprint: str

Fingerprint of the parameters used.

record `property` ¶

record: RunRecord

Access the underlying RunRecord.

run_id `property` ¶

run_id: str

The unique run identifier.

seed_fingerprint `property` ¶

seed_fingerprint: str

Fingerprint of the seeds used.

started_at `property` ¶

started_at: datetime

When the run started.

status `property` ¶

status: Status

The run status (success, failed, cancelled).

tags `property` ¶

tags: list[str]

Tags associated with the run.

artifact ¶

artifact(name: str) -> Any

Load an artifact by name.

Parameters:

Name	Type	Description	Default
`name`	`str`	The artifact name.	required

Returns:

Type	Description
`Any`	The deserialized artifact.

Raises:

Type	Description
`FileNotFoundError`	If the artifact doesn't exist.
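Because a missing artifact raises `FileNotFoundError`, code that loops over many runs may want a fallback rather than an abort. The sketch below shows one way to do that; `artifact_or_default` and `StubRun` are hypothetical helpers, with the stub raising the documented exception type.

```python
from typing import Any


class StubRun:
    """Hypothetical stand-in for Run that raises like the documented artifact()."""

    def __init__(self, artifacts: dict[str, Any]) -> None:
        self._artifacts = artifacts

    def artifact(self, name: str) -> Any:
        if name not in self._artifacts:
            raise FileNotFoundError(name)
        return self._artifacts[name]


def artifact_or_default(run: Any, name: str, default: Any = None) -> Any:
    """Load an artifact, returning a default when it does not exist."""
    try:
        return run.artifact(name)
    except FileNotFoundError:
        return default


run = StubRun({"summary": {"n": 3}})
present = artifact_or_default(run, "summary")
missing = artifact_or_default(run, "loss_history", default=[])
```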

artifacts ¶

artifacts() -> list[ArtifactDescriptor]

List available artifacts for this run.

Returns:

Type	Description
`list[ArtifactDescriptor]`	List of artifact descriptors.

data ¶

data(name: str) -> Any

Load structured result data by name.

Structured data is stored via capture.data() and is optimized for fast access by derived metric functions. With PostgresStore, data is stored inline in the database (as JSON). With FileStore, data is stored in JSON files at results/{run_id}/{name}.json.

Parameters:

Name	Type	Description	Default
`name`	`str`	The data name.	required

Returns:

Type	Description
`Any`	The data object. Arrays are returned as numpy arrays if shape/dtype metadata is available.

Raises:

Type	Description
`KeyError`	If the data doesn't exist.

Example:

# In a derived metric function
import numpy as np

def compute_metrics(run: Run) -> dict:
    matrix = run.data("transition_matrix")
    # Sparsity is the fraction of zero entries
    return {"sparsity": 1.0 - np.count_nonzero(matrix) / matrix.size}

list_data ¶

list_data() -> list[str]

List available structured data names for this run.

Returns:

Type	Description
`list[str]`	List of data names.

metalab.ExperimentInfo `dataclass` ¶

ExperimentInfo(experiment_id: str, name: str = '', version: str = '', description: str = '', metadata: dict[str, Any] = dict(), tags: list[str] = list())

Experiment-level information accessible from a Run.

This provides access to experiment metadata without needing to load the full experiment manifest repeatedly.

Attributes:

Name	Type	Description
`experiment_id`	`str`	The experiment identifier (name:version).
`name`	`str`	The experiment name.
`version`	`str`	The experiment version.
`description`	`str`	Human-readable description.
`metadata`	`dict[str, Any]`	User-defined metadata dict.
`tags`	`list[str]`	List of tags for categorization.

from_experiment_id `classmethod` ¶

from_experiment_id(experiment_id: str) -> ExperimentInfo

Create minimal ExperimentInfo from just an experiment_id.

Used as a fallback when the manifest is not available.

Parameters:

Name	Type	Description	Default
`experiment_id`	`str`	The experiment identifier (name:version).	required

Returns:

Type	Description
`ExperimentInfo`	ExperimentInfo with minimal data extracted from the ID.
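Given the documented `name:version` form of the identifier, the fallback can be pictured as a simple split. The sketch below is an assumption about that behaviour, not metalab's code: `MiniInfo` is a reduced, hypothetical stand-in for ExperimentInfo, and splitting on the last colon is a guess.

```python
from dataclasses import dataclass


@dataclass
class MiniInfo:
    """Hypothetical, reduced stand-in for ExperimentInfo."""

    experiment_id: str
    name: str = ""
    version: str = ""


def parse_experiment_id(experiment_id: str) -> MiniInfo:
    # Assumption: the version follows the last colon.
    name, sep, version = experiment_id.rpartition(":")
    if not sep:
        # No colon at all: treat the whole ID as the name, with no version.
        return MiniInfo(experiment_id=experiment_id, name=experiment_id)
    return MiniInfo(experiment_id=experiment_id, name=name, version=version)


info = parse_experiment_id("my_experiment:1.0")
```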

from_manifest `classmethod` ¶

from_manifest(manifest: dict[str, Any]) -> ExperimentInfo

Create ExperimentInfo from an experiment manifest dict.

Parameters:

Name	Type	Description	Default
`manifest`	`dict[str, Any]`	The experiment manifest dictionary.	required

Returns:

Type	Description
`ExperimentInfo`	ExperimentInfo populated from the manifest.
