Evaluator API
Synthetic data quality evaluation module, providing privacy risk measurement, data quality assessment, and machine learning utility analysis.
Class Architecture
classDiagram
    class Evaluator {
        EvaluatorConfig config
        string method
        __init__(method, **kwargs)
        create()
        eval(data) EvalResult
    }
    class EvaluatorConfig {
        string method
        dict params
        string module_path
        string class_name
    }
    class EvalResult {
        float global
        dict details
        DataFrame report
    }
    %% Privacy Risk Evaluators
    class Anonymeter {
        int n_attacks
        int n_cols
        eval() EvalResult
    }
    class SinglingOutEvaluator {
        eval() EvalResult
    }
    class LinkabilityEvaluator {
        list aux_cols
        eval() EvalResult
    }
    class InferenceEvaluator {
        string secret
        list aux_cols
        eval() EvalResult
    }
    %% Data Quality Evaluators
    class SDMetrics {
        string report_type
        eval() EvalResult
    }
    class DiagnosticReport {
        eval() EvalResult
    }
    class QualityReport {
        eval() EvalResult
    }
    %% ML Utility Evaluators
    class MLUtility {
        string task_type
        string target
        string experiment_design
        string resampling
        eval() EvalResult
    }
    class ClassificationUtility {
        list metrics
        eval() EvalResult
    }
    class RegressionUtility {
        list metrics
        eval() EvalResult
    }
    class ClusteringUtility {
        int n_clusters
        eval() EvalResult
    }
    %% Statistical Evaluator
    class StatsEvaluator {
        list stats_method
        string compare_method
        eval() EvalResult
    }
    %% Custom Evaluator
    class CustomEvaluator {
        string module_path
        string class_name
        eval() EvalResult
    }
    %% Input Data
    class InputData {
        DataFrame ori
        DataFrame syn
        DataFrame control
    }
    %% Relationships
    Evaluator *-- EvaluatorConfig
    Evaluator ..> EvalResult
    %% Inheritance for Privacy
    Anonymeter <|-- SinglingOutEvaluator
    Anonymeter <|-- LinkabilityEvaluator
    Anonymeter <|-- InferenceEvaluator
    %% Inheritance for Quality
    SDMetrics <|-- DiagnosticReport
    SDMetrics <|-- QualityReport
    %% Inheritance for ML Utility
    MLUtility <|-- ClassificationUtility
    MLUtility <|-- RegressionUtility
    MLUtility <|-- ClusteringUtility
    %% Dependencies
    Evaluator ..> Anonymeter
    Evaluator ..> SDMetrics
    Evaluator ..> MLUtility
    Evaluator ..> StatsEvaluator
    Evaluator ..> CustomEvaluator
    %% Data flow
    InputData ..> Evaluator
    %% Styling
    style Evaluator fill:#e6f3ff,stroke:#4a90e2,stroke-width:3px
    style EvaluatorConfig fill:#f3e6ff,stroke:#9966cc,stroke-width:2px
    style EvalResult fill:#f3e6ff,stroke:#9966cc,stroke-width:2px
    style Anonymeter fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style SinglingOutEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style LinkabilityEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style InferenceEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style SDMetrics fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style DiagnosticReport fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style QualityReport fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style MLUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style ClassificationUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style RegressionUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style ClusteringUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style StatsEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style CustomEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
    style InputData fill:#e6ffe6,stroke:#66cc66,stroke-width:2px
Legend:
- Blue boxes: Main classes
- Orange boxes: Subclass implementations
- Light purple boxes: Configuration and data classes
- Light green boxes: Input data
- <|-- : Inheritance relationship
- *-- : Composition relationship
- ..> : Dependency relationship
- --> : Data flow
Basic Usage
from petsard import Evaluator
# Privacy risk assessment
evaluator = Evaluator('anonymeter-singlingout')
evaluator.create()
eval_result = evaluator.eval({
    'ori': train_data,
    'syn': synthetic_data,
    'control': test_data
})
privacy_risk = eval_result['global']
# Data quality assessment
evaluator = Evaluator('sdmetrics-qualityreport')
evaluator.create()
eval_result = evaluator.eval({
    'ori': train_data,
    'syn': synthetic_data
})
quality_score = eval_result['global']
# Machine learning utility assessment (new version)
evaluator = Evaluator('mlutility', task_type='classification', target='income')
evaluator.create()
eval_result = evaluator.eval({
    'ori': train_data,
    'syn': synthetic_data,
    'control': test_data
})
ml_utility = eval_result['global']
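The statistical evaluator follows the same pattern. As noted under Notes below, 'stats' needs only the ori and syn datasets; this is a minimal sketch assuming the same dict-style result access as the examples above:
# Statistical difference comparison (no control dataset needed)
evaluator = Evaluator('stats')
evaluator.create()
eval_result = evaluator.eval({
    'ori': train_data,
    'syn': synthetic_data
})
stats_score = eval_result['global']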
Constructor (__init__)
Initializes an evaluator instance.
Syntax
def __init__(
    method: str,
    **kwargs
)
Parameters
method : str, required
- Evaluation method name
- Supported methods:
  - Privacy Risk Assessment:
    - 'anonymeter-singlingout': Singling out risk
    - 'anonymeter-linkability': Linkability risk
    - 'anonymeter-inference': Inference risk
  - Data Quality Assessment:
    - 'sdmetrics-diagnosticreport': Data diagnostic report
    - 'sdmetrics-qualityreport': Data quality report
  - Machine Learning Utility Assessment (Legacy):
    - 'mlutility-classification': Classification utility (multiple models)
    - 'mlutility-regression': Regression utility (multiple models)
    - 'mlutility-cluster': Clustering utility (K-means)
  - Machine Learning Utility Assessment (New, Recommended):
    - 'mlutility': Unified interface (requires the task_type parameter)
  - Statistical Assessment:
    - 'stats': Statistical difference comparison
  - Default Method:
    - 'default': Uses sdmetrics-qualityreport
  - Custom Method:
    - 'custom_method': Custom evaluator
kwargs : dict, optional
- Additional parameters for specific evaluators
- Which keys apply depends on the evaluation method (see the sketch after this list):
  - MLUtility Parameters:
    - task_type: Task type ('classification', 'regression', 'clustering')
    - target: Target column name
    - experiment_design: Experiment design approach
    - resampling: Imbalanced data handling method
  - Anonymeter Parameters:
    - n_attacks: Number of attack attempts
    - n_cols: Number of columns per query
    - secret: Column to be inferred (inference risk)
    - aux_cols: Auxiliary information columns (linkability risk)
  - Custom Method Parameters:
    - module_path: Custom module path
    - class_name: Custom class name
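A minimal sketch of passing method-specific kwargs, using the parameter names above. The column names and the aux_cols grouping are illustrative placeholders, and the custom module path and class name are hypothetical:
# Linkability risk with an explicit attack budget; splitting aux_cols into
# two column groups follows the Anonymeter convention (assumption; adjust
# the column names and grouping to your own schema)
evaluator = Evaluator(
    'anonymeter-linkability',
    n_attacks=2000,
    aux_cols=[['age', 'zip'], ['income']]
)

# Custom evaluator loaded from your own module (hypothetical path and class)
evaluator = Evaluator(
    'custom_method',
    module_path='eval/my_evaluator.py',
    class_name='MyEvaluator'
)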
Return Value
- Evaluator
- Initialized evaluator instance
Usage Examples
from petsard import Evaluator
# Default evaluation
evaluator = Evaluator('default')
evaluator.create()
eval_result = evaluator.eval({
    'ori': original_data,
    'syn': synthetic_data
})
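Since the Notes below recommend the unified MLUtility interface over the legacy per-task methods, a side-by-side sketch may help; 'income' is a placeholder target column, and passing target to the legacy method is an assumption:
# Legacy: the task type is encoded in the method name
evaluator = Evaluator('mlutility-classification', target='income')

# New, recommended: one method name plus an explicit task_type
evaluator = Evaluator('mlutility', task_type='classification', target='income')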
Supported Evaluation Types
Please refer to PETsARD YAML documentation for details.
Notes
- Method Selection: Choose the evaluation method that suits your needs; different methods focus on different aspects
- Data Requirements: Different evaluation methods require different input data combinations
  - Anonymeter and MLUtility: Require all three datasets (ori, syn, control)
  - SDMetrics and Stats: Require only two datasets (ori and syn)
- Best Practice: Use YAML configuration files rather than the Python API directly
- Method Call Order: create() must be called before eval()
- MLUtility Version: Prefer the new MLUtility interface (with task_type) over the legacy per-task interfaces
- Documentation Note: This documentation is for internal development team reference only; backward compatibility is not guaranteed