Utilities for Scraping Datasets from APEBench
Use these functions to procedurally scrape datasets from APEBench for use outside of the APEBench ecosystem, e.g., for training and testing supervised models in PyTorch, or in JAX with deep learning frameworks other than Equinox.
APEBench is designed to procedurally generate its data with fixed random seeds by relying on JAX's explicit treatment of randomness.
However, this determinism can only be relied on if the code is executed with the same JAX version and on the same backend (likely also with the same driver version). Beyond that, some low-level CUDA routines are non-deterministic by default (for performance reasons); this behavior can be deactivated.
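As a minimal sketch of how the CUDA non-determinism might be switched off: one commonly cited switch is the XLA flag `--xla_gpu_deterministic_ops=true`, passed through the `XLA_FLAGS` environment variable. Whether this flag covers every non-deterministic routine for a given JAX and CUDA version is an assumption, and the helper name `enable_deterministic_gpu_ops` is hypothetical.

```python
import os

# Assumption: this XLA flag requests deterministic GPU kernels.
DETERMINISTIC_FLAG = "--xla_gpu_deterministic_ops=true"


def enable_deterministic_gpu_ops():
    """Append the deterministic-ops flag to XLA_FLAGS if it is not set yet.

    Call this before JAX is imported, since XLA reads the variable
    when the backend is initialized.
    """
    existing = os.environ.get("XLA_FLAGS", "")
    if DETERMINISTIC_FLAG not in existing.split():
        os.environ["XLA_FLAGS"] = (existing + " " + DETERMINISTIC_FLAG).strip()
    return os.environ["XLA_FLAGS"]


enable_deterministic_gpu_ops()
```

Deterministic kernels can be slower, so it may be reasonable to enable the flag only while scraping the data.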
apebench.scraper.scrape_data_and_metadata
scrape_data_and_metadata(
    folder: str = None,
    *,
    scenario: str,
    name: str = "auto",
    **scenario_kwargs
)
Produce train data, test data, and metadata for a given scenario. Optionally write them to disk.
Arguments:
folder (str, optional): Folder to save the data and metadata to. If None, returns the data and metadata as JAX arrays and a dictionary, respectively.
scenario (str): Name of the scenario to produce data for. Must be one of apebench.scenarios.scenario_dict.
name (str, optional): Name of the scenario. If "auto", the name is automatically generated based on the scenario and its additional arguments.
**scenario_kwargs: Additional arguments to pass to the scenario. All attributes of a scenario can be modified by passing them as keyword arguments.
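The following sketch shows one way the in-memory variant (folder left as None) might be used to prepare data for a supervised one-step model. Several points are assumptions rather than documented behavior: that the function returns a (train data, test data, metadata) triple in that order, that "diff_adv" is a valid key of apebench.scenarios.scenario_dict, and that the arrays are laid out with samples on axis 0 and time on axis 1. The helpers `one_step_pairs` and `scrape_in_memory` are hypothetical names introduced for illustration.

```python
import numpy as np


def one_step_pairs(trajectories):
    """Split trajectories into (input, target) pairs of consecutive states.

    Assumes axis 0 indexes samples and axis 1 indexes time; all remaining
    axes (e.g. channels and space) are left untouched.
    """
    inputs = trajectories[:, :-1]
    targets = trajectories[:, 1:]
    # Merge the sample and time axes into a single batch axis.
    inputs = inputs.reshape((-1,) + inputs.shape[2:])
    targets = targets.reshape((-1,) + targets.shape[2:])
    return inputs, targets


def scrape_in_memory(scenario="diff_adv"):
    """Scrape a scenario without writing to disk (requires apebench)."""
    # Imported lazily so that the helper above works without apebench.
    import apebench

    # Assumption: with folder=None the call returns this triple.
    train_data, test_data, metadata = apebench.scraper.scrape_data_and_metadata(
        scenario=scenario
    )
    return np.asarray(train_data), np.asarray(test_data), metadata


# Stand-in array with 2 samples, 4 time steps, 1 channel and 8 grid points,
# used here in place of real scraped data.
dummy = np.arange(2 * 4 * 1 * 8, dtype=np.float32).reshape(2, 4, 1, 8)
x, y = one_step_pairs(dummy)
```

The resulting NumPy arrays can then be handed to the framework of choice, for example wrapped into tensors for a PyTorch data loader.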
Source code in apebench/_scraper.py