Field Combinations
Define which values may be paired across different fields, using an allowlist approach: a rule only affects the source values it explicitly lists.
Usage Examples
field_combinations:
  -
    - education: income   # Define field mapping: education (source) -> income (target)
    -
      Doctorate:          # When education is Doctorate
        - '>50K'          # income can only be '>50K'
      Masters:            # When education is Masters
        - '>50K'          # income can be '>50K'
        - '<=50K'         # or '<=50K'
Allowlist Effect Example:
Given rule:
field_combinations:
  -
    - education: income
    - Doctorate:
        - '>50K'
| education | income | Result |
|---|---|---|
| Doctorate | >50K | ✅ Keep (matches rule) |
| Doctorate | <=50K | ❌ Filter (violates rule) |
| Masters | >50K | ➖ Unaffected (rule not applicable) |
| Masters | <=50K | ➖ Unaffected (rule not applicable) |
| Bachelor | >50K | ➖ Unaffected (rule not applicable) |
| Bachelor | <=50K | ➖ Unaffected (rule not applicable) |
Important: The rule only affects rows whose source value is explicitly listed (here, Doctorate); all other rows are retained.
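The keep/filter decision in the table can be sketched in a few lines of Python. This is only an illustration of the allowlist semantics described here, not the library's implementation; the helper name passes_allowlist and the row layout (plain dictionaries) are assumptions made for this example.

```python
# Hypothetical helper, not the library's implementation: it only mirrors the
# allowlist semantics described above for one source -> target mapping.

def passes_allowlist(row, source, target, allowed):
    """Return True if the row is kept under a single-field allowlist rule."""
    value = row.get(source)
    if value not in allowed:
        # The source value is not listed in the rule, so the rule does not apply.
        return True
    # The source value is listed: the target must be one of the allowed values.
    return row.get(target) in allowed[value]


# Same rule as in the example: Doctorate may only pair with '>50K'.
rule = {"Doctorate": [">50K"]}

rows = [
    {"education": "Doctorate", "income": ">50K"},   # matches the rule
    {"education": "Doctorate", "income": "<=50K"},  # violates the rule
    {"education": "Masters", "income": ">50K"},     # rule not applicable
    {"education": "Masters", "income": "<=50K"},    # rule not applicable
]

kept = [r for r in rows if passes_allowlist(r, "education", "income", rule)]
for r in kept:
    print(r["education"], r["income"])
```

Only the Doctorate row with '<=50K' is dropped; the Masters rows pass through because Masters is not listed in the rule.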
Syntax Format
Field combination constraints allow you to define value domain relationships between different fields, ensuring that field combinations in synthetic data conform to real-world logical specifications.
Supported Combination Types
- Single Field Mapping: Constraints based on a single field’s value
- Multi-Field Mapping: More complex constraints considering multiple fields’ values simultaneously
Allowlist Mechanism Explanation
Based on the example above (education → income):
- Constrained values:
  - When education = 'Doctorate', income can only be '>50K'
  - When education = 'Masters', income can be '>50K' or '<=50K'
- Unconstrained values:
  - income for other education values like 'Bachelors', 'HS-grad', etc. is not restricted
  - Data with education other than Doctorate and Masters are always retained, regardless of their income value
ℹ️ Implementation Limitations: In the current implementation, field combination constraints use an allowlist approach and only support explicitly listed value combinations. Numeric fields can be enumerated for valid values, but logical comparisons using comparison operators (>, <, >=, <=), as in field constraints, are not yet supported.
Single Field Mapping Syntax
-
  - source_field_name: target_field_name
  -
    source_value1:
      - target_value1
      - target_value2
    source_value2:
      - target_value3
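Because comparison operators are not supported here, a numeric condition has to be written out as explicit source values. The sketch below shows one way to generate such an enumerated mapping before writing it into the configuration. The helper enumerate_rule, the age range, and the 'Never-married' target are hypothetical and chosen only for illustration.

```python
# Hypothetical helper: expands a numeric range into the explicit
# source_value -> [target_values] mapping that the allowlist syntax expects.

def enumerate_rule(values, targets):
    """Map every listed source value to the same list of allowed targets."""
    return {value: list(targets) for value in values}


# Stand-in for the unsupported condition "age < 20" on data whose minimum
# age is 17 (field names and values are assumptions for this example).
young_rule = enumerate_rule(range(17, 20), ["Never-married"])
print(young_rule)
# {17: ['Never-married'], 18: ['Never-married'], 19: ['Never-married']}
```

Each generated key corresponds to one source_value entry in the single field mapping syntax, and each value is already in the required list form.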
Multi-Field Mapping Syntax
-
  -
    - source_field_name1
    - source_field_name2
    : target_field_name
  -
    - source_value1
    - source_value2
    :
      - target_value1
      - target_value2
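The multi-field case can be read as the same allowlist check with a tuple of source values as the lookup key. The sketch below is an illustration under that assumption, not the library's implementation; the helper name, the tuple-keyed rule layout, and the workclass field with its values are hypothetical.

```python
# Hypothetical sketch: a multi-field rule applies only when ALL source fields
# match one listed combination; the tuple-keyed layout is an assumption.

def passes_multi_allowlist(row, sources, target, allowed):
    """Return True if the row is kept under a multi-field allowlist rule."""
    key = tuple(row.get(name) for name in sources)
    if key not in allowed:
        # This combination of source values is not listed: rule does not apply.
        return True
    return row.get(target) in allowed[key]


sources = ("education", "workclass")              # hypothetical source fields
allowed = {("Doctorate", "Private"): [">50K"]}    # one listed combination

rows = [
    {"education": "Doctorate", "workclass": "Private", "income": ">50K"},
    {"education": "Doctorate", "workclass": "Private", "income": "<=50K"},
    {"education": "Doctorate", "workclass": "State-gov", "income": "<=50K"},
]

decisions = [passes_multi_allowlist(r, sources, "income", allowed) for r in rows]
print(decisions)  # [True, False, True]
```

The third row is kept even though its income is '<=50K', because the pair (Doctorate, State-gov) is not listed and the rule therefore does not apply to it.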
Important Notes
- Uses allowlist approach: only checks explicitly listed value combinations
- Null values (NA) are not affected by rules
- String values require exact matching (case-sensitive)
- Target values must use list format: [value] or [value1, value2]
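The notes above have practical consequences that are easy to miss, in particular that a differently cased value silently falls outside the rule. A minimal sketch, assuming that a null in either the source or the target field means the rule does not apply (the notes do not say which field is meant); the helper names are hypothetical and this is not the library's implementation.

```python
import math

# Hypothetical helpers illustrating the notes: exact (case-sensitive)
# matching, nulls left untouched, and targets always given as a list.

def is_na(value):
    """Treat None and float NaN as null (NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def passes_allowlist(row, source, target, allowed):
    src, tgt = row.get(source), row.get(target)
    if is_na(src) or is_na(tgt):
        return True   # assumption: a null in either field is never filtered
    if src not in allowed:
        return True   # exact string comparison, so 'doctorate' != 'Doctorate'
    return tgt in allowed[src]


allowed = {"Doctorate": [">50K"]}   # target values are always a list

cases = {
    "exact match, allowed target": {"education": "Doctorate", "income": ">50K"},
    "exact match, other target": {"education": "Doctorate", "income": "<=50K"},
    "different case": {"education": "doctorate", "income": "<=50K"},
    "null source": {"education": None, "income": "<=50K"},
    "null target": {"education": "Doctorate", "income": float("nan")},
}

results = {
    label: passes_allowlist(row, "education", "income", allowed)
    for label, row in cases.items()
}
for label, keep in results.items():
    print(f"{label}: {'keep' if keep else 'filter'}")
```

Only the exact-match row with a disallowed target is filtered. The lowercase 'doctorate' row is kept, which means inconsistent casing in the data can let rows slip past a rule; normalizing values before applying constraints avoids this.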