LoaderAdapter

LoaderAdapter handles data loading and automatically resolves the benchmark:// protocol, downloading benchmark datasets and schema files when they are referenced.

Class Architecture

classDiagram

    class LoaderAdapter {
        +config: dict
        +loader: Loader
        +benchmarker: Benchmarker
        +__init__(config)
        +run(input)
        +get_result() DataFrame
        +get_metadata() Schema
        -_handle_benchmark_protocol()
        -_init_loader()
    }

    class Loader {
        +filepath: str
        +schema: Schema
        +load() tuple[DataFrame, Schema]
    }

    class Benchmarker {
        +config: BenchmarkerConfig
        +download()
        +get_filepath()
    }

    class BenchmarkerConfig {
        +benchmark_name: str
        +filepath_raw: str
    }

    class BenchmarkerRequests {
        +download()
    }

    LoaderAdapter ..> Loader : uses for data loading
    LoaderAdapter ..> Benchmarker : uses for benchmark protocol
    Benchmarker --> BenchmarkerConfig : has
    Benchmarker ..> BenchmarkerRequests : creates

    %% Style definitions
    class LoaderAdapter {
        <<Main Class>>
    }
    style LoaderAdapter fill:#E6E6FA

    class Loader {
        <<Core Module>>
    }
    style Loader fill:#4169E1,color:#fff

    class Benchmarker {
        <<Benchmark Handler>>
    }
    style Benchmarker fill:#9370DB,color:#fff

    style BenchmarkerConfig fill:#FFE4E1
    style BenchmarkerRequests fill:#FFE4E1

    note for LoaderAdapter "1. Detects benchmark:// protocol\n2. Uses Benchmarker to download\n3. Uses Loader with local path"

Legend:
Light purple box: LoaderAdapter main class
Blue box: Core loading module
Purple box: Benchmark dataset handler module
Light pink box: Configuration classes
..>: Dependency relationship
-->: Has relationship

Main Features

Unified interface for data loading
Automatic detection and handling of benchmark:// protocol for both data and schema
Integration of Loader and Benchmarker functionality
Returns data and Schema metadata
Supports CSV data files and YAML schema files

Method Reference

`__init__(config: dict)`

Initializes a LoaderAdapter instance with automatic benchmark:// protocol handling.

Parameters:

config: dict, required
- Configuration parameter dictionary
- Must contain the filepath key
- The filepath and schema values may use the benchmark:// protocol
- Optional parameters include:
  - schema: Schema file path
  - nrows: Load only specified number of rows (for quick testing)
  - delimiter, encoding, header, etc. (pandas read parameters)
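
The configuration rules above can be illustrated with a small check. This is only a sketch: `validate_loader_config` and `BENCHMARK_PREFIX` are hypothetical names introduced for illustration and are not part of PETsARD. It assumes that protocol detection is a simple prefix test.

```python
BENCHMARK_PREFIX = "benchmark://"  # protocol prefix described in this document


def validate_loader_config(config: dict) -> dict:
    """Hypothetical helper: enforce the required key and flag benchmark paths."""
    if "filepath" not in config:
        raise KeyError("config must contain a 'filepath' key")
    # Report, for each path-like entry that is present, whether it uses
    # the benchmark:// protocol (assumed here to be a plain prefix test).
    return {
        key: str(config[key]).startswith(BENCHMARK_PREFIX)
        for key in ("filepath", "schema")
        if key in config
    }


flags = validate_loader_config({
    "filepath": "benchmark://adult-income",
    "schema": "schemas/user.yaml",
    "nrows": 1000,
})
print(flags)
```

Other keys such as nrows or delimiter are ignored by this check; in the real adapter they are documented above as pandas read parameters.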

`run(input: dict)`

Executes data loading, including automatic benchmark dataset download.

Parameters:

input: dict, required
- Input parameter dictionary
- LoaderAdapter typically receives an empty dictionary {}

Returns: No direct return value. Use get_result() and get_metadata() to get results.

`get_result()`

Gets the loaded data.

Returns:

pd.DataFrame: Loaded data

`get_metadata()`

Gets the data’s Schema metadata.

Returns:

Schema: Data metadata
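
The run-then-fetch contract described in this method reference can be sketched with a stand-in class. `SketchAdapter` is hypothetical and does not reproduce the PETsARD implementation: it stores placeholder rows instead of reading a file, and raising an error when a getter is called before run() is an assumption rather than documented behavior.

```python
class SketchAdapter:
    """Hypothetical stand-in that mimics the run-then-fetch contract."""

    def __init__(self, config: dict):
        self.config = config
        self._data = None
        self._metadata = None

    def run(self, input: dict) -> None:
        # A real adapter would read the file here; this sketch only
        # builds placeholder rows so the example stays self-contained.
        nrows = self.config.get("nrows", 3)
        self._data = [{"id": i} for i in range(nrows)]
        self._metadata = {"columns": ["id"], "rows": nrows}

    def get_result(self):
        if self._data is None:  # assumed behavior, see lead-in
            raise RuntimeError("call run() before get_result()")
        return self._data

    def get_metadata(self):
        if self._metadata is None:  # assumed behavior, see lead-in
            raise RuntimeError("call run() before get_metadata()")
        return self._metadata


adapter = SketchAdapter({"filepath": "data/users.csv", "nrows": 2})
returned = adapter.run({})       # run() stores results instead of returning them
data = adapter.get_result()
metadata = adapter.get_metadata()
```

Keeping results on the instance is what lets run() return nothing while both getters remain valid until the next run() call.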

Usage Example

from petsard.adapter import LoaderAdapter

# Regular file loading
adapter = LoaderAdapter({
    "filepath": "data/users.csv",
    "schema": "schemas/user.yaml"
})

# Using nrows parameter for quick testing
adapter = LoaderAdapter({
    "filepath": "data/large_dataset.csv",
    "schema": "schemas/data.yaml",
    "nrows": 1000  # Load only first 1000 rows
})

# Or using benchmark:// protocol
# adapter = LoaderAdapter({
#     "filepath": "benchmark://adult-income",
#     "schema": "benchmark://adult-income_schema"
# })

# Execute loading
adapter.run({})

# Get results
data = adapter.get_result()
metadata = adapter.get_metadata()

Workflow

1. Protocol Detection: Check whether filepath or schema uses the benchmark:// protocol
2. Benchmarker Processing (benchmark:// paths only)
- Download the files locally
- Verify the SHA-256 checksum (a mismatch produces a warning)
- Convert the benchmark:// paths to local paths
3. Data Loading: Load the data and metadata from the local paths
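
The workflow steps above can be sketched with the standard library alone. Everything named here is an assumption for illustration: `resolve_path`, `sha256_of`, the `fetch` callable, and the `expected_sha256` mapping are hypothetical and are not PETsARD APIs. The real download is performed by Benchmarker, which this sketch replaces with a caller-supplied function.

```python
import hashlib
import logging
from pathlib import Path
from typing import Callable

BENCHMARK_PREFIX = "benchmark://"
logger = logging.getLogger(__name__)


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_path(
    path: str,
    cache_dir: Path,
    expected_sha256: dict,
    fetch: Callable[[str], bytes],
) -> str:
    """Return a local path, fetching benchmark files on first use."""
    # Step 1: protocol detection (assumed to be a prefix test).
    if not path.startswith(BENCHMARK_PREFIX):
        return path
    name = path[len(BENCHMARK_PREFIX):]
    local = cache_dir / name
    # Step 2a: download only when the file is not already cached.
    if not local.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        local.write_bytes(fetch(name))
    # Step 2b: a checksum mismatch only warns, it does not abort loading.
    if sha256_of(local) != expected_sha256.get(name):
        logger.warning("SHA-256 mismatch for %s", name)
    # Step 2c: hand the local path to the loader (step 3).
    return str(local)
```

Passing the download function in as a parameter keeps the sketch testable without network access; a real implementation would bind it to the benchmark registry.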

Notes

This is an internal API, not recommended for direct use
Prefer using YAML configuration files and Executor
Benchmark files are cached after first download
Results are cached until next run() call
