Running Tools for one experiment

apebench.run_experiment

run_experiment(
    *,
    scenario: str,
    task: str,
    net: str,
    train: str,
    start_seed: int,
    num_seeds: int,
    **scenario_kwargs
) -> tuple[pd.DataFrame, eqx.Module]

Execute a single experiment.

All arguments are keyword-only. The main arguments below are required; any additional keyword arguments are forwarded to the chosen scenario, and which ones are valid depends on that scenario.

Arguments:

  • scenario: The scenario to run, must be a key in apebench.scenarios.scenario_dict.
  • task: The task to run, can be "predict" or "correct;XX" where "XX" is the correction mode, e.g., "correct;sequential".
  • net: The network to use, must be a compatible descriptor of a network architecture, e.g., "Conv;34;10;relu".
  • train: The training methodology to use, i.e., how reference solver and emulator interact during training. One-step supervised training is achieved by "one".
  • start_seed: The first seed. The experiment uses the consecutive seeds start_seed, ..., start_seed + num_seeds - 1.
  • num_seeds: The number of seeds to run (in parallel). For many 1D scenarios at realistic resolutions (num_points ~ 200), doing ten seeds in parallel is usually fine for modern GPUs. For scenarios in 2D and 3D at realistic resolutions, this likely has to be set to 1 and seed processing must be done sequentially via run_study.

Returns:

  • data: The pandas.DataFrame containing the results of the experiment. Will contain the columns seed, scenario, task, net, train, scenario_kwargs, the metrics, losses and sample rollouts. Can be further post-processed, e.g., via apebench.melt_metrics.
  • trained_neural_stepper_s: Equinox modules containing the trained neural emulators. Note that if num_seeds is greater than 1, weight arrays have a leading (seed-)batch axis.
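The note on num_seeds suggests running fewer seeds per call when memory is tight. A minimal sketch of how one multi-seed configuration could be split into smaller ones is shown below. The helper split_seeds is hypothetical and not part of apebench, the scenario name "diff_adv" is assumed to be a key in apebench.scenarios.scenario_dict, and how the resulting configurations are handed to run_study is not shown because its interface is not documented here.

```python
def split_seeds(config, seeds_per_run):
    """Split one multi-seed config into configs covering fewer seeds each.

    Hypothetical helper for illustration; not part of apebench.
    """
    if seeds_per_run < 1:
        raise ValueError("seeds_per_run must be at least 1")
    start = config["start_seed"]
    stop = start + config["num_seeds"]  # exclusive upper bound
    chunks = []
    for chunk_start in range(start, stop, seeds_per_run):
        chunk = dict(config)  # shallow copy so the original stays untouched
        chunk["start_seed"] = chunk_start
        chunk["num_seeds"] = min(seeds_per_run, stop - chunk_start)
        chunks.append(chunk)
    return chunks


# Keyword arguments in the shape run_experiment expects.
config = {
    "scenario": "diff_adv",    # assumed scenario key
    "task": "predict",
    "net": "Conv;34;10;relu",
    "train": "one",
    "start_seed": 0,
    "num_seeds": 10,
}

# One seed per run, e.g. for 2D or 3D scenarios at realistic resolutions.
single_seed_configs = split_seeds(config, seeds_per_run=1)

# With apebench installed, each entry could be used as:
#   data, nets = apebench.run_experiment(**single_seed_configs[0])
```

Each resulting dictionary keeps the same scenario, task, net and train settings and differs only in start_seed and num_seeds.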
Source code in apebench/_run.py
def run_experiment(
    *,
    scenario: str,
    task: str,
    net: str,
    train: str,
    start_seed: int,
    num_seeds: int,
    **scenario_kwargs,
) -> tuple[pd.DataFrame, eqx.Module]:
    """
    Execute a single experiment.

    All arguments are keyword-only. The main arguments below are required; any
    additional keyword arguments are forwarded to the chosen scenario, and
    which ones are valid depends on that scenario.

    **Arguments:**

    * `scenario`: The scenario to run, must be a key in
        `apebench.scenarios.scenario_dict`.
    * `task`: The task to run, can be `"predict"` or `"correct;XX"`
        where `"XX"` is the correction mode, e.g., `"correct;sequential"`.
    * `net`: The network to use, must be a compatible descriptor of a
        network architecture, e.g., `"Conv;34;10;relu"`.
    * `train`: The training methodology to use, i.e., how reference
        solver and emulator interact during training. One-step supervised
        training is achieved by `"one"`.
    * `start_seed`: The integer at which the list of seeds starts.
    * `num_seeds`: The number of seeds to run (in parallel). For many 1D
        scenarios at realistic resolutions (`num_points` ~ 200), doing ten seeds
        in parallel is usually fine for modern GPUs. For scenarios in 2D and 3D
        at realistic resolutions, this likely has to be set to 1 and seed
        processing must be done sequentially via `run_study`.

    **Returns:**

    * `data`: The `pandas.DataFrame` containing the results of the
        experiment. Will contain the columns `seed`, `scenario`, `task`, `net`,
        `train`, `scenario_kwargs`, the metrics, losses and sample rollouts. Can
        be further post-processed, e.g., via `apebench.melt_metrics`.
    * `trained_neural_stepper_s`: Equinox modules containing the
        trained neural emulators. Note that if `num_seeds` is greater than 1,
        weight arrays have a leading (seed-)batch axis.
    """
    scenario = scenario_dict[scenario](**scenario_kwargs)

    data, trained_neural_stepper_s = scenario(
        task_config=task,
        network_config=net,
        train_config=train,
        start_seed=start_seed,
        num_seeds=num_seeds,
    )

    if len(scenario_kwargs) == 0:
        data["scenario_kwargs"] = "{}"
    else:
        data["scenario_kwargs"] = str(scenario_kwargs)

    return data, trained_neural_stepper_s
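Because the source above stores scenario_kwargs as a string via str(...), recovering the original dictionary from the scenario_kwargs column takes an extra parsing step. The sketch below mirrors that encoding with plain Python and decodes it with ast.literal_eval. Both helper names are hypothetical, the key num_points is only an assumed example of a scenario argument, and the decoding only works when all values are plain literals (numbers, strings, tuples, and so on).

```python
import ast


def encode_scenario_kwargs(scenario_kwargs):
    """Mirror how run_experiment fills the scenario_kwargs column."""
    if len(scenario_kwargs) == 0:
        return "{}"
    return str(scenario_kwargs)


def decode_scenario_kwargs(text):
    """Turn the stored string back into a dict (literal values only)."""
    return ast.literal_eval(text)


encoded = encode_scenario_kwargs({"num_points": 64})  # assumed example key
decoded = decode_scenario_kwargs(encoded)
```

On a results DataFrame, the same decoding could be applied per row, for example with data["scenario_kwargs"].apply(ast.literal_eval).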

apebench.get_experiment_name

get_experiment_name(
    *,
    scenario: str,
    task: str,
    net: str,
    train: str,
    start_seed: int,
    num_seeds: int,
    **scenario_kwargs
) -> str

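Produce a unique name for an experiment. The name joins the scenario, any scenario keyword arguments (as comma-separated key=value pairs), the task, net and train descriptors, and the inclusive seed range, separated by double underscores.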

Source code in apebench/_run.py
def get_experiment_name(
    *,
    scenario: str,
    task: str,
    net: str,
    train: str,
    start_seed: int,
    num_seeds: int,
    **scenario_kwargs,
) -> str:
    """
    Produce a unique name for an experiment.
    """
    additional_infos = []
    for key, value in scenario_kwargs.items():
        additional_infos.append(f"{key}={value}")
    if len(additional_infos) > 0:
        additional_infos = ",".join(additional_infos)
        additional_infos = f"__{additional_infos}__"
    else:
        additional_infos = "__"

    # The seed range in the name is inclusive: start_seed to start_seed + num_seeds - 1.
    end_seed = start_seed + num_seeds
    experiment_name = f"{scenario}{additional_infos}{task}__{net}__{train}__{start_seed}-{end_seed - 1}"
    return experiment_name
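To make the naming scheme concrete, the sketch below re-implements the logic of the source above as a standalone function, so it runs without apebench installed. The function name experiment_name_sketch is hypothetical, and the scenario name "diff_adv" and the keyword num_points are assumed example values.

```python
def experiment_name_sketch(
    *, scenario, task, net, train, start_seed, num_seeds, **scenario_kwargs
):
    """Standalone copy of the naming logic, for illustration only."""
    infos = ",".join(f"{key}={value}" for key, value in scenario_kwargs.items())
    # Without scenario kwargs, scenario and task are separated by a single "__".
    middle = f"__{infos}__" if infos else "__"
    last_seed = start_seed + num_seeds - 1  # inclusive end of the seed range
    return f"{scenario}{middle}{task}__{net}__{train}__{start_seed}-{last_seed}"


plain_name = experiment_name_sketch(
    scenario="diff_adv",  # assumed scenario key
    task="predict",
    net="Conv;34;10;relu",
    train="one",
    start_seed=0,
    num_seeds=10,
)
# plain_name == "diff_adv__predict__Conv;34;10;relu__one__0-9"

custom_name = experiment_name_sketch(
    scenario="diff_adv",
    task="predict",
    net="Conv;34;10;relu",
    train="one",
    start_seed=0,
    num_seeds=10,
    num_points=64,  # assumed scenario keyword argument
)
# custom_name == "diff_adv__num_points=64__predict__Conv;34;10;relu__one__0-9"
```

Two calls produce the same name only if all arguments match, including the order in which scenario keyword arguments are passed.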