Custom Data
To use data that has already been split by an external source, use the custom_data method. It lets you supply the training and validation datasets directly instead of having the Splitter generate them.
Usage Examples
Note: If using Colab, please see the runtime setup guide.
Splitter:
  external_split:
    method: custom_data
    filepath:
      ori: benchmark://adult-income_ori          # Training set
      control: benchmark://adult-income_control  # Validation set
    schema:
      ori: benchmark://adult-income_schema
      control: benchmark://adult-income_schema

This example demonstrates custom_data usage with other modules:
- Splitter: Uses custom_data to load pre-split datasets
- Other modules: Loader, Synthesizer, and Evaluator are used together for a complete evaluation workflow
ℹ️ The filepath parameter supports all Loader formats, including the benchmark:// protocol and regular file paths.
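
As a sketch of the regular-file-path case, the configuration below swaps the benchmark:// references from the example above for local files. The experiment name local_split and the paths data/train.csv and data/validation.csv are hypothetical placeholders, not files shipped with the project. The schema block is left out here only to keep the sketch short; whether it can be omitted is an assumption, so keep it if your data needs explicit schemas.

```yaml
Splitter:
  local_split:                       # hypothetical experiment name
    method: custom_data
    filepath:
      ori: data/train.csv            # hypothetical path to the training set
      control: data/validation.csv   # hypothetical path to the validation set
```

Both keys under filepath mirror the benchmark example: ori points to the training set and control to the validation set.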