validator Module
Data validation functions for ensuring data quality in epidemiological analyses.
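This module provides validators for 2x2 contingency tables, proportions, confidence levels, sample sizes, positive values, pandas DataFrames and Series, date series, numeric arrays, and model parameters, plus a convergence check for iterative algorithms.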
Exceptions
- exception episia.core.validator.ValidationError
Bases: ValueError
Custom exception for validation errors.
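Because ValidationError derives from ValueError, code that already catches ValueError will also catch validation failures. The snippet below is a minimal sketch of that relationship. It defines a local stand-in class rather than importing episia, and `parse_attack_rate` and `safe_parse` are hypothetical helpers used only for illustration.

```python
class ValidationError(ValueError):
    """Local stand-in mirroring the documented base class (ValueError)."""


def parse_attack_rate(text):
    """Hypothetical helper: parse a string and require a value in [0, 1]."""
    rate = float(text)  # raises a plain ValueError for non-numeric text
    if not 0.0 <= rate <= 1.0:
        raise ValidationError(f"attack rate must be between 0 and 1, got {rate}")
    return rate


def safe_parse(text):
    """Return the parsed rate, or None if parsing or validation fails."""
    try:
        return parse_attack_rate(text)
    except ValueError:  # also catches ValidationError, its subclass
        return None
```

Catching the narrower ValidationError instead lets a caller tell validation failures apart from other ValueError sources.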
Functions
- episia.core.validator.validate_2x2_table(a, b, c, d, allow_zero=True)
Validate 2x2 contingency table values.
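The checks this function performs are not listed here, so the following is only a stand-in sketch of what a 2x2 table validator commonly verifies. The local `ValidationError` class, the name `check_2x2_table`, and the reading of `allow_zero` (zero cells are rejected when it is `False`) are assumptions for illustration, not the episia implementation.

```python
import math


class ValidationError(ValueError):
    """Local stand-in for episia.core.validator.ValidationError."""


def check_2x2_table(a, b, c, d, allow_zero=True):
    """Hypothetical check: every cell is a finite, non-negative number."""
    cells = {"a": a, "b": b, "c": c, "d": d}
    for label, value in cells.items():
        # bool is a subclass of int, so rule it out explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValidationError(f"cell {label} must be numeric, got {value!r}")
        if not math.isfinite(value):
            raise ValidationError(f"cell {label} must be finite, got {value!r}")
        if value < 0:
            raise ValidationError(f"cell {label} must be non-negative, got {value!r}")
        if value == 0 and not allow_zero:
            raise ValidationError(f"cell {label} is zero but allow_zero=False")
    return a, b, c, d
```

Rejecting zero cells can matter for measures such as the odds ratio, where a zero cell leads to division by zero.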
- episia.core.validator.validate_proportion(value, name='proportion', allow_boundary=True)
Validate that a value is a valid proportion (0-1).
- Returns:
Validated proportion
- Raises:
ValidationError – If value is invalid
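As a rough illustration of how the `allow_boundary` flag might behave, the sketch below accepts the closed interval [0, 1] by default and the open interval (0, 1) otherwise. This interpretation, the local `ValidationError` class, and the name `check_proportion` are assumptions; the actual episia behaviour may differ.

```python
import math


class ValidationError(ValueError):
    """Local stand-in for episia.core.validator.ValidationError."""


def check_proportion(value, name="proportion", allow_boundary=True):
    """Hypothetical check: value converts to a float inside the unit interval."""
    try:
        p = float(value)
    except (TypeError, ValueError):
        raise ValidationError(f"{name} must be numeric, got {value!r}") from None
    if math.isnan(p):
        raise ValidationError(f"{name} must not be NaN")
    if allow_boundary:
        in_range = 0.0 <= p <= 1.0  # 0 and 1 are accepted
    else:
        in_range = 0.0 < p < 1.0    # 0 and 1 are rejected
    if not in_range:
        raise ValidationError(f"{name} must lie between 0 and 1, got {p}")
    return p
```

Passing a descriptive `name`, such as "attack rate", makes the resulting error message point at the offending input.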
- episia.core.validator.validate_confidence_level(confidence, name='confidence level')
Validate confidence level (0 < confidence < 1).
- Returns:
Validated confidence level
- Raises:
ValidationError – If confidence is invalid
- episia.core.validator.validate_sample_size(n, name='sample size', min_size=1)
Validate sample size.
- Returns:
Validated sample size
- Raises:
ValidationError – If sample size is invalid
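The criteria for a valid sample size are not stated above. One plausible reading, sketched here under that assumption, is that `n` must be a whole number no smaller than `min_size`. The local `ValidationError` class, the name `check_sample_size`, and the decision to accept whole-number floats are illustrative choices, not documented episia behaviour.

```python
class ValidationError(ValueError):
    """Local stand-in for episia.core.validator.ValidationError."""


def check_sample_size(n, name="sample size", min_size=1):
    """Hypothetical check: n is a whole number and at least min_size."""
    if isinstance(n, bool):
        raise ValidationError(f"{name} must be an integer, got {n!r}")
    if isinstance(n, float) and n.is_integer():
        n = int(n)  # accept whole-number floats such as 100.0
    if not isinstance(n, int):
        raise ValidationError(f"{name} must be an integer, got {n!r}")
    if n < min_size:
        raise ValidationError(f"{name} must be at least {min_size}, got {n}")
    return n
```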
- episia.core.validator.validate_dataframe(df, required_columns=None, min_rows=1, allow_nan=False)
Validate pandas DataFrame for epidemiological analysis.
- Returns:
Validated DataFrame
- Raises:
ValidationError – If DataFrame is invalid
- Return type:
DataFrame
- episia.core.validator.validate_binary_variable(series, name='binary variable')
Validate that a series contains only binary values (0/1 or True/False).
- Returns:
Validated series
- Raises:
ValidationError – If series is invalid
- Return type:
Series
- episia.core.validator.validate_date_series(dates, name='date series')
Validate date series for time series analysis.
- Returns:
Validated DatetimeIndex
- Raises:
ValidationError – If dates are invalid
- Return type:
DatetimeIndex
- episia.core.validator.validate_numeric_array(array, name='numeric array', min_length=1, allow_nan=False, allow_inf=False)
Validate numeric array.
- Returns:
Validated numpy array
- Raises:
ValidationError – If array is invalid
- episia.core.validator.validate_model_parameters(params, required_params, param_types)
Validate model parameters.
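The argument types are not documented here. The sketch below assumes `params` is a dict of parameter values, `required_params` is a list of names that must be present, and `param_types` maps names to expected Python types. Those shapes, the local `ValidationError` class, the name `check_model_parameters`, and the SIR-style parameter names are all hypothetical.

```python
class ValidationError(ValueError):
    """Local stand-in for episia.core.validator.ValidationError."""


def check_model_parameters(params, required_params, param_types):
    """Hypothetical check: required keys exist and values have expected types."""
    missing = [key for key in required_params if key not in params]
    if missing:
        raise ValidationError(f"missing required parameters: {', '.join(missing)}")
    for key, expected in param_types.items():
        # Only type-check parameters that were actually supplied
        if key in params and not isinstance(params[key], expected):
            raise ValidationError(
                f"parameter {key!r} must be {expected.__name__}, "
                f"got {type(params[key]).__name__}"
            )
    return params


# Illustrative parameters for an SIR-style model
sir_params = {"beta": 0.3, "gamma": 0.1, "population": 1000}
checked = check_model_parameters(
    sir_params,
    required_params=["beta", "gamma"],
    param_types={"beta": float, "gamma": float, "population": int},
)
```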
- episia.core.validator.check_convergence(values, tolerance=1e-06, max_iterations=1000, iteration=0)
Check whether an iterative algorithm has converged.
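The convergence criterion and return value are not documented here. A common approach, sketched below as an assumption, compares the last two iterates against `tolerance` and raises once `max_iterations` is reached without convergence. The local `ValidationError` class and the name `has_converged` are hypothetical, and the Newton iteration is only a demonstration workload.

```python
class ValidationError(ValueError):
    """Local stand-in for episia.core.validator.ValidationError."""


def has_converged(values, tolerance=1e-6, max_iterations=1000, iteration=0):
    """Hypothetical check: True when the last step changed by less than tolerance."""
    if len(values) >= 2 and abs(values[-1] - values[-2]) < tolerance:
        return True
    if iteration >= max_iterations:
        raise ValidationError(f"no convergence after {max_iterations} iterations")
    return False


# Demonstration: Newton's method for the square root of 2
history = [1.0]
for i in range(1, 1001):
    x = history[-1]
    history.append(0.5 * (x + 2.0 / x))
    if has_converged(history, iteration=i):
        break
```

Passing the current `iteration` lets the check fail loudly instead of looping forever when the sequence does not settle.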
- episia.core.validator.validate_positive(value, name='value', strict=True)
Validate that a value is positive.
- Returns:
Validated positive value
- Raises:
ValidationError – If value is not positive
Examples
Validating a 2x2 contingency table:
from episia.core.validator import validate_2x2_table
# Valid table
a, b, c, d = validate_2x2_table(40, 10, 20, 30)
# This would raise ValidationError
# validate_2x2_table(-1, 10, 20, 30) # Negative value
Validating a proportion:
from episia.core.validator import validate_proportion
p = validate_proportion(0.75, name="attack rate")
# p = validate_proportion(1.2)  # Would raise ValidationError (value above 1)
Validating a DataFrame:
import pandas as pd
from episia.core.validator import validate_dataframe
df = pd.DataFrame({'cases': [10, 20, 30], 'date': ['2023-01-01', '2023-01-02', '2023-01-03']})
df = validate_dataframe(df, required_columns=['cases', 'date'])