Capture

The Capture interface allows operations to emit metrics, artifacts, files, and logs during execution. All captured data is automatically associated with the current run.

metalab.Capture

Capture(store: Store, run_id: str, registry: SerializerRegistry | None = None, allow_pickle: bool = False, artifact_dir: Path | None = None, worker_id: str | None = None)

Interface for capturing metrics, artifacts, and logs during a run.

The Capture object is created by the executor and passed to the operation. Logs are streamed to the store in real time.

Example:

import logging

import metalab


@metalab.operation
def my_operation(context, params, seeds, capture, runtime):
    capture.log("Starting operation")

    # Subscribe to third-party library logs
    capture.subscribe_logger("dynamo")
    capture.subscribe_logger("sklearn", level=logging.WARNING)

    # Capture scalar metrics
    capture.metric("accuracy", 0.95)
    capture.log_metrics({"loss": 0.05, "epoch": 10})

    # Capture artifacts (serialized automatically)
    capture.artifact("predictions", predictions_array, kind="numpy")

    # Capture a file you generated
    capture.file("plot", "/tmp/plot.png", kind="image")

    capture.log("Operation completed")
    # No return needed - success is implicit

Initialize the capture interface.

Parameters:

Name          Type                       Description                                                          Default
store         Store                      The store to persist artifacts to.                                   required
run_id        str                        The ID of the current run.                                           required
registry      SerializerRegistry | None  Serializer registry (created if None).                               None
allow_pickle  bool                       Whether to allow pickle serialization.                               False
artifact_dir  Path | None                Directory for temporary artifact files.                              None
worker_id     str | None                 Worker identifier for log messages (e.g., "thread:2", "process:3").  None

artifacts property

artifacts: list[ArtifactDescriptor]

Get the captured artifact descriptors.

logger property

logger: Logger

Get the Python logger for this run.

Use this for direct access to Python's logging API.

Example:

capture.logger.info("Using standard logging API")
capture.logger.exception("Caught error")  # call inside an except block; the traceback is included automatically

metrics property

metrics: dict[str, Any]

Get the captured metrics.

results property

results: list[dict[str, Any]]

Get the captured structured results.

__enter__

__enter__() -> 'Capture'

Enter context manager.

__exit__

__exit__(exc_type: Any, exc_val: Any, exc_tb: Any) -> None

Exit context manager, ensuring finalize() is called.
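
The context-manager behaviour described above can be illustrated without metalab. In this minimal sketch, `SketchCapture` is a hypothetical stand-in (not the real class) that mimics only the documented contract: `__enter__` returns the object, and `__exit__` calls `finalize()` and returns None, so an exception raised inside the `with` block still propagates to the caller.

```python
from __future__ import annotations

from typing import Any


class SketchCapture:
    """Hypothetical stand-in for Capture; not the metalab implementation."""

    def __init__(self) -> None:
        self.metrics: dict[str, Any] = {}
        self.finalized = False

    def metric(self, name: str, value: Any) -> None:
        self.metrics[name] = value

    def finalize(self) -> dict[str, Any]:
        self.finalized = True
        return {"metrics": dict(self.metrics)}

    def __enter__(self) -> "SketchCapture":
        return self

    def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None:
        # Always finalize; returning None lets any exception propagate.
        self.finalize()


def run_with_failure() -> tuple[bool, bool, dict[str, Any]]:
    """Run a failing block and report (raised, finalized, metrics)."""
    raised = False
    capture = SketchCapture()
    try:
        with capture:
            capture.metric("accuracy", 0.95)
            raise RuntimeError("simulated failure")
    except RuntimeError:
        raised = True
    return raised, capture.finalized, capture.metrics
```

In this sketch the metric recorded before the failure is still available after the block exits, which is the point of finalizing in `__exit__`.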

artifact

artifact(name: str, obj: Any, *, kind: str | None = None, format: str | None = None, metadata: dict[str, Any] | None = None) -> ArtifactDescriptor

Capture an artifact by serializing an object.

The serializer is selected automatically based on the object type, unless kind is explicitly specified.

Parameters:

Name      Type                   Description                                                  Default
name      str                    The artifact name.                                           required
obj       Any                    The object to serialize.                                     required
kind      str | None             Explicit serializer kind (e.g., "json", "numpy", "pickle").  None
format    str | None             Explicit format (usually inferred from serializer).          None
metadata  dict[str, Any] | None  Additional metadata to attach.                               None

Returns:

Type                Description
ArtifactDescriptor  The ArtifactDescriptor for the saved artifact.
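
One way to picture type-based serializer selection is a small registry that is searched in order unless an explicit kind is given. The sketch below is an assumption about the general technique, not metalab's SerializerRegistry: `REGISTRY`, `select_kind`, and `serialize` are hypothetical names, and the way `allow_pickle` gates the "pickle" kind is a guess based on the parameter description.

```python
from __future__ import annotations

import json
import pickle
from typing import Any, Callable

Serializer = Callable[[Any], bytes]


def _to_json(obj: Any) -> bytes:
    return json.dumps(obj, sort_keys=True).encode("utf-8")


# Hypothetical registry: kind -> (type check, serializer), searched in order.
REGISTRY: dict[str, tuple[Callable[[Any], bool], Serializer]] = {
    "json": (lambda o: isinstance(o, (dict, list, str, int, float, bool)), _to_json),
    "pickle": (lambda o: True, pickle.dumps),
}


def select_kind(obj: Any, kind: str | None = None, allow_pickle: bool = False) -> str:
    """Return the explicit kind if given, else the first kind that accepts obj."""
    if kind is None:
        for name, (accepts, _) in REGISTRY.items():
            if name == "pickle" and not allow_pickle:
                continue  # pickle is opt-in in this sketch
            if accepts(obj):
                kind = name
                break
    if kind is None or kind not in REGISTRY:
        raise TypeError(f"no serializer for {type(obj).__name__} (kind={kind!r})")
    if kind == "pickle" and not allow_pickle:
        raise ValueError("pickle serialization is disabled")
    return kind


def serialize(obj: Any, kind: str | None = None, allow_pickle: bool = False) -> tuple[str, bytes]:
    """Serialize obj and report which kind was used."""
    chosen = select_kind(obj, kind, allow_pickle)
    return chosen, REGISTRY[chosen][1](obj)
```

Here a dict is routed to "json" automatically, while a set is rejected unless pickle is explicitly allowed.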

data

data(name: str, obj: Any, *, metadata: dict[str, Any] | None = None) -> None

Capture structured result data.

Data is stored in Postgres for fast access by derived metrics and future Atlas visualization. Unlike artifacts, data is stored inline in the database (as JSON), not as separate files.

Parameters:

Name      Type                   Description                                                       Default
name      str                    The data name.                                                    required
obj       Any                    The data object (numpy array, list, dict, or JSON-serializable).  required
metadata  dict[str, Any] | None  Optional metadata to attach.                                      None

Supported types:
  • numpy arrays (converted to nested lists, shape/dtype preserved)
  • lists, nested lists
  • dicts (JSON-serializable)
  • scalars (int, float, str, bool)

Example:

# Store a transition matrix for derived metric computation
capture.data("transition_matrix", matrix)

# Store a dictionary of scores
capture.data("gene_scores", {"TP53": 0.8, "BRCA1": 0.6})

# Store intermediate data with metadata
capture.data("embeddings", embeddings, metadata={"dim": 128})
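
To make the "converted to nested lists, shape/dtype preserved" note concrete, here is a minimal sketch of such a conversion. The function name `to_json_payload` and the payload keys (`value`, `shape`, `dtype`) are assumptions for illustration; the layout metalab actually stores is not documented here. Arrays are detected by duck typing so the sketch runs without numpy installed.

```python
from __future__ import annotations

import json
from typing import Any


def to_json_payload(obj: Any) -> dict[str, Any]:
    """Convert obj to a JSON-friendly payload (hypothetical layout)."""
    # Duck-type array-likes: numpy arrays expose tolist(), shape and dtype.
    if hasattr(obj, "tolist") and hasattr(obj, "shape") and hasattr(obj, "dtype"):
        payload = {
            "value": obj.tolist(),
            "shape": list(obj.shape),
            "dtype": str(obj.dtype),
        }
    else:
        payload = {"value": obj}
    # Fail early if anything is not JSON-serializable.
    json.dumps(payload)
    return payload
```

Passing a numpy array would take the first branch; lists, dicts, and scalars pass through unchanged, and anything that `json.dumps` cannot encode raises a TypeError.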

figure

figure(name: str, fig: Any, *, format: str = 'png', dpi: int = 150, bbox_inches: str = 'tight', metadata: dict[str, Any] | None = None, close: bool = True) -> ArtifactDescriptor

Capture a matplotlib figure as an image artifact.

This is a convenience method that handles saving the figure to a temporary file and capturing it, eliminating boilerplate.

Parameters:

Name         Type                   Description                                                  Default
name         str                    The artifact name.                                           required
fig          Any                    A matplotlib Figure object.                                  required
format       str                    Image format (default: "png"). Options: png, pdf, svg, jpg.  'png'
dpi          int                    Resolution in dots per inch (default: 150).                  150
bbox_inches  str                    Bounding box (default: "tight").                             'tight'
metadata     dict[str, Any] | None  Additional metadata to attach.                               None
close        bool                   Whether to close the figure after saving (default: True).    True

Returns:

Type                Description
ArtifactDescriptor  The ArtifactDescriptor for the saved artifact.

Example:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 4, 9])
ax.set_title("My Plot")

capture.figure("my_plot", fig)  # Saves and closes figure

file

file(name: str, path: str | Path, *, kind: str | None = None, metadata: dict[str, Any] | None = None) -> ArtifactDescriptor

Capture a file that was generated by the operation.

Use this for files you've already written (e.g., plots, reports).

Parameters:

Name      Type                   Description            Default
name      str                    The artifact name.     required
path      str | Path             Path to the file.      required
kind      str | None             The kind of artifact.  None
metadata  dict[str, Any] | None  Additional metadata.   None

Returns:

Type                Description
ArtifactDescriptor  The ArtifactDescriptor for the saved artifact.

finalize

finalize() -> dict[str, Any]

Finalize capture and return collected data.

This is called by the executor in a finally block to ensure partial results are captured even on failure. Cleans up logging handlers, converts stepped metrics to data entries, and uploads logs for remote stores.

Returns:

Type            Description
dict[str, Any]  Dict containing metrics, artifacts, results, and stepped_metrics.
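
The "called in a finally block" pattern can be sketched from the caller's side. Everything below is illustrative: `SketchCapture`, `execute`, and the `record` layout are hypothetical, and the stand-in's `finalize()` returns only metrics rather than the full dict described above.

```python
from __future__ import annotations

from typing import Any, Callable


class SketchCapture:
    """Hypothetical stand-in for Capture; not the metalab implementation."""

    def __init__(self) -> None:
        self._metrics: dict[str, Any] = {}

    def metric(self, name: str, value: Any) -> None:
        self._metrics[name] = value

    def finalize(self) -> dict[str, Any]:
        return {"metrics": dict(self._metrics)}


def execute(operation: Callable[[SketchCapture], None]) -> dict[str, Any]:
    """Run operation and always return whatever was captured."""
    capture = SketchCapture()
    record: dict[str, Any] = {"status": "success", "error": None}
    try:
        operation(capture)
    except Exception as exc:  # record the failure instead of losing the run
        record["status"] = "failed"
        record["error"] = repr(exc)
    finally:
        # Runs on success and on failure, so partial results are kept.
        record["captured"] = capture.finalize()
    return record


def flaky_operation(capture: SketchCapture) -> None:
    """Example operation that records one metric and then fails."""
    capture.metric("epoch", 3)
    raise ValueError("diverged")
```

Calling `execute(flaky_operation)` in this sketch yields a record marked as failed that still contains the `epoch` metric captured before the error.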

flush

flush() -> None

Flush any buffered log content to disk.

log

log(message: str, level: str = 'info') -> None

Log a message for this run.

Messages are streamed immediately to the run's log file, enabling real-time visibility (e.g., via tail -f).

Parameters:

Name     Type  Description                                                          Default
message  str   The log message.                                                     required
level    str   Log level - "debug", "info", "warning", "error" (default: "info").   'info'

Example:

capture.log("Starting optimization")
capture.log(f"Iteration {i}: loss={loss:.4f}")
capture.log("Convergence failed", level="warning")
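
For readers curious how a string level and immediate file streaming can fit together, the sketch below uses only the standard library. It is an assumption about the general approach, not metalab's code: `make_run_logger`, `log`, and the log line format are hypothetical. It relies on the fact that `logging.FileHandler` flushes after every record, which is what makes `tail -f` style monitoring possible.

```python
from __future__ import annotations

import logging
import tempfile
from pathlib import Path


def make_run_logger(log_path: Path, run_id: str) -> logging.Logger:
    """Create a per-run logger that writes to log_path (hypothetical helper)."""
    logger = logging.getLogger(f"sketch.run.{run_id}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def log(logger: logging.Logger, message: str, level: str = "info") -> None:
    """Map a level name such as "warning" onto the logging API."""
    levelno = getattr(logging, level.upper())
    logger.log(levelno, message)


def demo() -> list[str]:
    """Write two messages and read them back while the handler is still open."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "run.log"
        logger = make_run_logger(path, "demo")
        log(logger, "Starting optimization")
        log(logger, "Convergence failed", level="warning")
        # FileHandler flushes after each record, so the file is readable now.
        lines = path.read_text(encoding="utf-8").splitlines()
        for handler in list(logger.handlers):
            handler.close()
            logger.removeHandler(handler)
        return lines
```

The file is read before the handler is closed, which shows that each message reaches disk as soon as it is logged.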

log_metrics

log_metrics(values: dict[str, Any], step: int | None = None) -> None

Capture multiple metrics at once.

Parameters:

Name    Type            Description                                       Default
values  dict[str, Any]  Dict of metric names to values.                   required
step    int | None      Optional step number (for time-series metrics).   None

metric

metric(name: str, value: float | int | str | bool, step: int | None = None) -> None

Capture a scalar metric.

Parameters:

Name   Type                      Description                                       Default
name   str                       The metric name.                                  required
value  float | int | str | bool  The metric value (must be a scalar).              required
step   int | None                Optional step number (for time-series metrics).   None
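
The difference between a plain metric and a stepped (time-series) metric can be illustrated with an in-memory stand-in. How metalab stores the two internally is not documented here; the only grounded fact is that `finalize()` reports both `metrics` and `stepped_metrics`. `MetricStore` and its record layout are hypothetical.

```python
from __future__ import annotations

from typing import Any

SCALAR_TYPES = (float, int, str, bool)


class MetricStore:
    """Hypothetical in-memory stand-in; not the metalab implementation."""

    def __init__(self) -> None:
        self.metrics: dict[str, Any] = {}
        self.stepped_metrics: list[dict[str, Any]] = []

    def metric(self, name: str, value: Any, step: int | None = None) -> None:
        if not isinstance(value, SCALAR_TYPES):
            raise TypeError(f"metric {name!r} must be a scalar, got {type(value).__name__}")
        if step is None:
            self.metrics[name] = value  # one final value per name
        else:
            # Keep every (step, value) pair so a curve can be rebuilt later.
            self.stepped_metrics.append({"name": name, "step": step, "value": value})

    def log_metrics(self, values: dict[str, Any], step: int | None = None) -> None:
        for name, value in values.items():
            self.metric(name, value, step=step)
```

In this sketch, calling `log_metrics({"loss": 0.5}, step=0)` once per epoch accumulates a series, while a call without `step` overwrites the single summary value for that name.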

subscribe_logger

subscribe_logger(name: str, level: int = logging.DEBUG) -> None

Subscribe to a named logger to capture its output.

This attaches the capture's file handler to the specified logger, capturing all its log messages to the run's log file. Works even if the logger has propagate=False (like dynamo-release).

Parameters:

Name   Type  Description                                                 Default
name   str   The logger name (e.g., "dynamo", "sklearn", "tensorflow").  required
level  int   Minimum log level to capture (default: DEBUG).              DEBUG

Example:

import logging

# Capture all dynamo logs
capture.subscribe_logger("dynamo")

# Capture sklearn warnings and above
capture.subscribe_logger("sklearn", level=logging.WARNING)

# Now any logging from these libraries is captured
import dynamo as dyn
dyn.tl.dynamics(adata)  # Logs captured automatically
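
Why attaching a handler directly works even with propagate=False can be shown with the standard library alone. This is a sketch of the mechanism, not metalab's implementation: `subscribe`, `unsubscribe`, and the logger name `sketch_thirdparty` are hypothetical, a `StringIO` buffer stands in for the run's log file, and lowering the target logger's level is an assumption about how a `level` argument might be honoured.

```python
from __future__ import annotations

import io
import logging


def subscribe(handler: logging.Handler, name: str, level: int = logging.DEBUG) -> None:
    """Attach handler directly to the named logger (hypothetical helper)."""
    target = logging.getLogger(name)
    # Lower the logger's threshold if it would otherwise drop these records.
    if target.getEffectiveLevel() > level:
        target.setLevel(level)
    if handler not in target.handlers:
        target.addHandler(handler)


def unsubscribe(handler: logging.Handler, name: str) -> None:
    """Detach handler from the named logger."""
    logging.getLogger(name).removeHandler(handler)


def demo() -> str:
    buffer = io.StringIO()  # stands in for the run's log file
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(name)s:%(levelname)s:%(message)s"))

    lib_logger = logging.getLogger("sketch_thirdparty")
    lib_logger.propagate = False  # records never reach the root logger

    subscribe(handler, "sketch_thirdparty", level=logging.INFO)
    lib_logger.info("fitting model")
    unsubscribe(handler, "sketch_thirdparty")
    lib_logger.info("not captured")
    return buffer.getvalue()
```

Because the handler sits on the library's own logger, it sees records before propagation is considered, so a root-level handler is not needed. One side effect of this sketch is that it changes the third-party logger's level, which a real implementation may prefer to restore on unsubscribe.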

unsubscribe_logger

unsubscribe_logger(name: str) -> None

Unsubscribe from a named logger.

Parameters:

Name  Type  Description                           Default
name  str   The logger name to unsubscribe from.  required