io Module
Input/output functions for epidemiological data.
This module provides functions for reading and writing epidemiological
data in various formats with automatic format detection and validation.
Functions
-
episia.data.io.read_csv(path, low_memory=True, **kwargs)[source]
Read CSV file into Dataset.
- Parameters:
path (str | Path) – Path to CSV file
low_memory (bool) – Optimize memory usage
**kwargs – Additional arguments for pd.read_csv
- Returns:
Dataset object
- Return type:
Dataset
-
episia.data.io.read_excel(path, sheet_name=0, low_memory=True, **kwargs)[source]
Read Excel file into Dataset.
- Parameters:
path (str | Path) – Path to Excel file
sheet_name (str | int | List | None) – Sheet to read
low_memory (bool) – Optimize memory usage
**kwargs – Additional arguments for pd.read_excel
- Returns:
Dataset object
- Return type:
Dataset
-
episia.data.io.read_parquet(path, low_memory=True, **kwargs)[source]
Read Parquet file into Dataset.
- Parameters:
path (str | Path) – Path to Parquet file
low_memory (bool) – Optimize memory usage
**kwargs – Additional arguments for pd.read_parquet
- Returns:
Dataset object
- Return type:
Dataset
-
episia.data.io.from_pandas(df, low_memory=True)[source]
Create Dataset from pandas DataFrame.
- Parameters:
df (pd.DataFrame) – pandas DataFrame to convert
low_memory (bool) – Optimize memory usage
- Returns:
Dataset object
- Return type:
Dataset
-
episia.data.io.from_dict(data, low_memory=True, **kwargs)[source]
Create Dataset from dictionary.
- Parameters:
data (Dict) – Dictionary of data
low_memory (bool) – Optimize memory usage
**kwargs – Additional arguments for pd.DataFrame
- Returns:
Dataset object
- Return type:
Dataset
-
episia.data.io.from_records(records, low_memory=True, **kwargs)[source]
Create Dataset from list of records.
- Parameters:
records (List[Dict]) – List of dictionaries
low_memory (bool) – Optimize memory usage
**kwargs – Additional arguments for pd.DataFrame.from_records
- Returns:
Dataset object
- Return type:
Dataset
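from_dict takes column-oriented data, while from_records takes row-oriented data. The sketch below shows how the two shapes relate, using only the standard library. The helper records_to_columns is hypothetical and not part of episia, and filling missing keys with None is an assumption made for this illustration, not documented behavior of either function.

```python
from typing import Any, Dict, List


def records_to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into a column-oriented dictionary.

    Hypothetical helper for illustration only; not part of episia.
    Keys missing from a record are filled with None.
    """
    # Collect column names in first-seen order.
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for key in record:
            columns.setdefault(key, [])
    # Append one value per column for every record.
    for record in records:
        for key, values in columns.items():
            values.append(record.get(key))
    return columns


# Row-oriented input, the shape from_records expects.
records = [
    {"date": "2023-01-01", "cases": 10},
    {"date": "2023-01-02", "cases": 15},
]

# Column-oriented output, the shape from_dict expects.
columns = records_to_columns(records)
print(columns)  # {'date': ['2023-01-01', '2023-01-02'], 'cases': [10, 15]}
```

Either shape should describe the same table, so the choice usually depends on how the data arrives: line lists tend to come as records, aggregated series as columns.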
-
episia.data.io.read_surveillance_format(path, format_type='auto', low_memory=True, **kwargs)[source]
Read surveillance data in standard formats.
- Parameters:
path (str | Path) – Path to surveillance data file
format_type (str) – Format type ('sidesp', 'who', 'ecdc', 'auto')
low_memory (bool) – Optimize memory usage
**kwargs – Additional arguments
- Returns:
Dataset object
- Return type:
Dataset
-
episia.data.io.detect_format(path)[source]
Detect file format from extension or content.
- Parameters:
path (str | Path) – Path to file
- Returns:
Detected format string
- Return type:
str
-
episia.data.io.export_dataset(dataset, path, format='auto', **kwargs)[source]
Export Dataset to file.
- Parameters:
dataset (Dataset) – Dataset to export
path (str | Path) – Output path
format (str) – Output format ('csv', 'excel', 'parquet', 'auto')
**kwargs – Additional arguments for writer
- Return type:
None
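How format='auto' is resolved is not spelled out here. One plausible reading is that an explicit format wins and 'auto' falls back to the file extension of path. The sketch below illustrates that reading. The function resolve_export_format, the EXTENSION_FORMATS table, and the ValueError on unknown extensions are all assumptions for illustration, not episia's actual implementation.

```python
from pathlib import Path

# Assumed extension-to-format table; the real mapping in episia may differ.
EXTENSION_FORMATS = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".parquet": "parquet",
}


def resolve_export_format(path, format="auto"):
    """Return the explicit format, or infer one from the file extension.

    Hypothetical helper for illustration only; not part of episia.
    """
    # An explicit format always takes precedence over the extension.
    if format != "auto":
        return format
    suffix = Path(path).suffix.lower()
    try:
        return EXTENSION_FORMATS[suffix]
    except KeyError:
        raise ValueError(
            f"Cannot infer export format from extension {suffix!r}"
        ) from None


print(resolve_export_format("output.csv"))                    # csv
print(resolve_export_format("output.dat", format="parquet"))  # parquet
```

Under this reading, passing format explicitly is the safer choice whenever the output path has an unusual or missing extension.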
Examples
Reading data:
from episia.data.io import read_csv, read_excel, from_pandas, from_dict
# Read CSV
ds = read_csv("surveillance_data.csv")
# Read Excel
ds = read_excel("surveillance_data.xlsx", sheet_name="Weekly")
# Create from pandas DataFrame
import pandas as pd
df = pd.DataFrame({'cases': [10, 20, 30]})
ds = from_pandas(df)
# Create from dictionary
data = {'date': ['2023-01-01', '2023-01-02'], 'cases': [10, 15]}
ds = from_dict(data)
Exporting data:
from episia.data.io import export_dataset
# Export to CSV
export_dataset(ds, "output.csv")
# Export to Excel with options
export_dataset(ds, "output.xlsx", sheet_name="Results", index=False)
Format detection:
from episia.data.io import detect_format
fmt = detect_format("data.csv") # Returns 'csv'
fmt = detect_format("data.xlsx") # Returns 'excel'
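detect_format is described as working from the extension or the content, but the content checks are not documented. As a rough sketch of what content-based detection can look like, the example below inspects the first four bytes of a file: Parquet files begin with the bytes PAR1, and .xlsx files are ZIP archives that begin with PK\x03\x04. The function sniff_format, its fallback table, and the 'unknown' return value are hypothetical and do not describe episia's actual logic. A ZIP signature alone does not prove a file is an Excel workbook.

```python
import tempfile
from pathlib import Path


def sniff_format(path):
    """Guess a format from leading bytes, falling back to the extension.

    Hypothetical helper for illustration only; not part of episia.
    """
    path = Path(path)
    with path.open("rb") as handle:
        head = handle.read(4)
    if head == b"PAR1":
        # Parquet files start with this 4-byte magic number.
        return "parquet"
    if head == b"PK\x03\x04":
        # ZIP local file header; .xlsx workbooks are ZIP archives.
        return "excel"
    # Text formats have no signature, so rely on the extension instead.
    fallback = {".csv": "csv", ".xls": "excel"}
    return fallback.get(path.suffix.lower(), "unknown")


# A Parquet payload saved under a misleading .csv name is still recognized.
with tempfile.TemporaryDirectory() as tmp:
    mislabeled = Path(tmp) / "export.csv"
    mislabeled.write_bytes(b"PAR1" + b"\x00" * 8)
    print(sniff_format(mislabeled))  # parquet
```

Checking content before the extension helps when files are renamed or exported by tools that pick misleading names.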