samplesize Module
Sample size and power calculations for epidemiological studies.
This module provides functions for calculating required sample sizes and statistical power for common epidemiological study designs: cohort studies, case-control studies, cross-sectional studies, and diagnostic test studies.
Classes
- class episia.stats.samplesize.StudyDesign(value)
Bases: Enum
Types of epidemiological study designs.
- CASE_CONTROL = 'case_control'
- COHORT = 'cohort'
- CROSS_SECTIONAL = 'cross_sectional'
- DIAGNOSTIC = 'diagnostic'
- class episia.stats.samplesize.TestType(value)
Bases: Enum
Types of statistical tests.
- ONE_SIDED = 'one_sided'
- TWO_SIDED = 'two_sided'
Functions
- episia.stats.samplesize.sample_size_risk_ratio(risk_unexposed=None, risk_ratio=None, power=0.8, alpha=0.05, test_type=TestType.TWO_SIDED, r=1.0, design_effect=1.0, *, p0=None, rr_expected=None, **kwargs)
- episia.stats.samplesize.sample_size_risk_difference(risk_unexposed, risk_difference, power=0.8, alpha=0.05, test_type=TestType.TWO_SIDED, r=1.0, **kwargs)
Calculate sample size for cohort study based on risk difference.
- Parameters:
- Returns:
SampleSizeResult object
- Return type:
- episia.stats.samplesize.sample_size_odds_ratio(proportion_exposed_controls, odds_ratio, power=0.8, alpha=0.05, test_type=TestType.TWO_SIDED, r=1.0, **kwargs)
Calculate sample size for case-control study (odds ratio).
- Parameters:
- Returns:
SampleSizeResult object
- Return type:
Example
>>> # Case-control study: OR=2.0, 30% exposure in controls
>>> result = sample_size_odds_ratio(0.3, 2.0)
>>> print(result.n_cases)
146 # cases needed
- episia.stats.samplesize.sample_size_sensitivity_specificity(expected_sens, expected_spec, precision, alpha=0.05, prevalence=None, which='both', **kwargs)
Calculate sample size for diagnostic test studies.
- Parameters:
expected_sens (float) – Expected sensitivity
expected_spec (float) – Expected specificity
precision (float) – Desired half-width of the confidence interval (e.g. 0.05 for ±0.05)
alpha (float) – Type I error rate
prevalence (float | None) – Disease prevalence (required for sensitivity/specificity)
which (str) – 'sensitivity', 'specificity', or 'both'
- Returns:
SampleSizeResult object
- Return type:
Example
>>> # Validate test with sens=0.9, spec=0.85, CI width ±0.05
>>> result = sample_size_sensitivity_specificity(0.9, 0.85, 0.05, prevalence=0.1)
>>> print(result.n_total)
246 # total subjects needed
- episia.stats.samplesize.sample_size_single_proportion(expected_proportion, precision, alpha=0.05, design_effect=1.0, **kwargs)
Calculate sample size for estimating a single proportion.
Used for cross-sectional studies, prevalence surveys, etc.
- Parameters:
- Returns:
SampleSizeResult object
- Return type:
- episia.stats.samplesize.power_calculation(n_per_group=None, n_cases=None, n_controls=None, risk_unexposed=None, risk_ratio=None, odds_ratio=None, proportion_exposed_controls=None, alpha=0.05, test_type=TestType.TWO_SIDED, r=1.0, design=StudyDesign.COHORT, **kwargs)
Calculate statistical power for a given sample size.
- Parameters:
n_per_group (float | None) – Sample size per group (for cohort studies)
n_cases (float | None) – Number of cases (for case-control)
n_controls (float | None) – Number of controls (for case-control)
risk_unexposed (float | None) – Risk in unexposed group
risk_ratio (float | None) – Expected risk ratio
odds_ratio (float | None) – Expected odds ratio
proportion_exposed_controls (float | None) – Proportion exposed in controls
alpha (float) – Type I error rate
test_type (TestType) – Two-sided (TestType.TWO_SIDED) or one-sided (TestType.ONE_SIDED) test
r (float) – Group ratio
design (StudyDesign) – Study design
- Returns:
SampleSizeResult object with calculated power
- Return type:
- episia.stats.samplesize.fleiss_correction(n_uncorrected, continuity_correction=True)
Apply Fleiss continuity correction to sample size.
- episia.stats.samplesize.design_effect_deff(intraclass_correlation, average_cluster_size)
Calculate design effect for cluster randomized trials.
DEFF = 1 + (m - 1) * ρ, where m is the average cluster size and ρ is the intraclass correlation.
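The formula can be checked by hand with a few lines of standalone Python. This is a minimal sketch of the arithmetic only, not the library's implementation; the function name design_effect_sketch and the input values are hypothetical.

```python
def design_effect_sketch(icc: float, mean_cluster_size: float) -> float:
    """Design effect for clustered sampling: DEFF = 1 + (m - 1) * rho."""
    if mean_cluster_size < 1:
        raise ValueError("mean_cluster_size must be at least 1")
    return 1.0 + (mean_cluster_size - 1.0) * icc


# Hypothetical inputs: clusters of 20 subjects, intraclass correlation 0.05.
deff = design_effect_sketch(icc=0.05, mean_cluster_size=20)

# A simple-random-sample size of 400 (also hypothetical) inflated for clustering.
inflated_n = 400 * deff
print(round(deff, 2), round(inflated_n))
```

With these inputs the design effect is 1 + 19 × 0.05 = 1.95, so a sample of 400 under simple random sampling grows to 780 under clustering.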
- episia.stats.samplesize.calculate_sample_size(design, parameters, **kwargs)
Comprehensive sample size calculation function.
- Parameters:
design (str | StudyDesign) – Study design ('cohort', 'case_control', etc.)
parameters (Dict) – Dictionary of parameters specific to the design
**kwargs – Additional arguments passed to specific functions
- Returns:
SampleSizeResult object
- Return type:
Example
>>> params = {
...     'risk_unexposed': 0.1,
...     'risk_ratio': 2.0,
...     'power': 0.8,
...     'alpha': 0.05
... }
>>> result = calculate_sample_size('cohort', params)
Examples
Cohort study sample size:
from episia.stats.samplesize import sample_size_risk_ratio, TestType

# Detect RR=2.0 with baseline risk=0.1, power=0.8, α=0.05
result = sample_size_risk_ratio(
    risk_unexposed=0.1,
    risk_ratio=2.0,
    power=0.8,
    alpha=0.05,
    test_type=TestType.TWO_SIDED
)
print(result)  # Sample size: 199 per group
print(f"Total participants needed: {result.n_total:.0f}")
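The figure of 199 per group quoted in the comment above is what the textbook normal-approximation formula for comparing two proportions gives without a continuity correction. The sketch below is an assumption about the kind of calculation involved, not a copy of episia's code: the function name n_per_group_sketch is hypothetical, and it covers only equal group sizes (r = 1) and a two-sided test.

```python
import math
from statistics import NormalDist


def n_per_group_sketch(risk_unexposed, risk_ratio, power=0.8, alpha=0.05):
    """Per-group sample size for two risks, equal groups, two-sided test."""
    p0 = risk_unexposed
    p1 = risk_unexposed * risk_ratio  # risk in the exposed group
    if not 0 < p1 < 1 or p1 == p0:
        raise ValueError("exposed risk must lie in (0, 1) and differ from p0")
    p_bar = (p0 + p1) / 2  # pooled risk under equal allocation
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Standard error terms under the null and alternative hypotheses.
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    return math.ceil((null_term + alt_term) ** 2 / (p1 - p0) ** 2)


print(n_per_group_sketch(0.1, 2.0))  # 199 with these inputs
```

Whether the library applies further adjustments (design effect, unequal allocation, continuity correction) is not covered by this sketch.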
Case-control study:
from episia.stats.samplesize import sample_size_odds_ratio

result = sample_size_odds_ratio(
    proportion_exposed_controls=0.3,
    odds_ratio=2.0,
    power=0.8,
    r=2  # Two controls per case
)
print(f"Cases needed: {result.n_cases:.0f}")
print(f"Controls needed: {result.n_controls:.0f}")
Diagnostic test study:
from episia.stats.samplesize import sample_size_sensitivity_specificity

result = sample_size_sensitivity_specificity(
    expected_sens=0.9,
    expected_spec=0.85,
    precision=0.05,
    prevalence=0.1
)
print(f"Total subjects: {result.n_total:.0f}")
Cross-sectional survey:
from episia.stats.samplesize import sample_size_single_proportion

result = sample_size_single_proportion(
    expected_proportion=0.5,
    precision=0.03,
    design_effect=1.5
)
print(f"Sample size: {result.n_total:.0f}")
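For orientation, the usual precision-based formula for a single proportion is n = z² · p(1 − p) / d², multiplied by the design effect. The standalone sketch below applies it to the same inputs. It is an assumption about the general approach, not episia's implementation, and the function name n_single_proportion_sketch is hypothetical; the library may round or adjust differently.

```python
import math
from statistics import NormalDist


def n_single_proportion_sketch(p, precision, alpha=0.05, design_effect=1.0):
    """Sample size to estimate a proportion p within +/- precision."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Size needed under simple random sampling.
    n_srs = z ** 2 * p * (1 - p) / precision ** 2
    # Inflate for the sampling design, then round up to whole subjects.
    return math.ceil(n_srs * design_effect)


print(n_single_proportion_sketch(0.5, 0.03, design_effect=1.5))  # 1601
```

Using p = 0.5 maximises p(1 − p), so it yields the most conservative sample size when the true prevalence is unknown.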
Power calculation:
from episia.stats.samplesize import power_calculation, StudyDesign

power_result = power_calculation(
    n_per_group=150,
    risk_unexposed=0.1,
    risk_ratio=2.0,
    design=StudyDesign.COHORT
)
print(f"Achieved power: {power_result.power:.3f}")
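Power for a fixed sample size can be approximated by inverting the usual two-proportion sample size formula. The sketch below does this for equal group sizes and a two-sided test; it is a hedged illustration of the normal approximation, not episia's implementation, and the function name power_two_risks_sketch is hypothetical.

```python
import math
from statistics import NormalDist


def power_two_risks_sketch(n_per_group, risk_unexposed, risk_ratio, alpha=0.05):
    """Approximate power to detect a risk ratio with equal group sizes."""
    p0 = risk_unexposed
    p1 = risk_unexposed * risk_ratio  # risk in the exposed group
    p_bar = (p0 + p1) / 2  # pooled risk under equal allocation
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = math.sqrt(2 * p_bar * (1 - p_bar))
    se_alt = math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    # Solve the sample size formula for z_beta, then convert to a probability.
    z_beta = (abs(p1 - p0) * math.sqrt(n_per_group) - z_alpha * se_null) / se_alt
    return NormalDist().cdf(z_beta)


print(f"{power_two_risks_sketch(150, 0.1, 2.0):.3f}")
```

Under this approximation, 150 subjects per group give a power of about 0.68 for these inputs, short of the conventional 0.8 target; the library's result may differ if it uses another method.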