Epidemiological Indicators

Most epidemiological analyses start with simple measurements, or indicators, computed from the raw data. The epigraphhub.analysis.epistats module provides some useful functions to calculate such indicators.

import pandas as pd
from epigraphhub.analysis import epistats as es

Calculating the posterior prevalence distribution

The number of cases of a disease in a population can be modeled as a binomial random variable :math:`Bin(n,p)`, where the parameter :math:`p` is the prevalence (the fraction of the population that are cases) and the parameter :math:`n` is the population size. If we have a prior guess about this proportion, represented as a :math:`Beta(a,b)` distribution, a conjugate Bayesian analysis gives the posterior distribution of the prevalence at a point in time as :math:`Beta(a + cases, b + n - cases)`.

# Using a vague prior Beta(1,1)
pdist = es.posterior_prevalence(pop_size=1e6,positives=10000,a=1,b=1)
pdist.mean()
0.010000979998040003
pdist.std()
9.950342051571256e-05
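
The conjugate update described above can be sketched directly with scipy.stats.beta. This is a minimal illustration, not the library's implementation: the function name posterior_prevalence_sketch is hypothetical, and it only assumes the Beta(a + cases, b + n - cases) posterior stated in the text.

```python
from scipy import stats


def posterior_prevalence_sketch(pop_size, positives, a=1.0, b=1.0):
    """Return the Beta posterior of the prevalence (hypothetical helper).

    Assumes a Beta(a, b) prior and a binomial likelihood, so the posterior
    is Beta(a + positives, b + pop_size - positives).
    """
    return stats.beta(a + positives, b + pop_size - positives)


# Same inputs as the example above, with a vague Beta(1, 1) prior
pdist_sketch = posterior_prevalence_sketch(pop_size=1_000_000, positives=10_000)
post_mean = pdist_sketch.mean()
post_std = pdist_sketch.std()

# Central 95% credible interval for the prevalence
low, high = pdist_sketch.interval(0.95)
```

Because the result is a frozen scipy distribution, credible intervals and quantiles come for free in addition to the mean and standard deviation.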

Incidence rate

Incidence is defined as the number of new cases in a population over a period of time, typically 1 year. The incidence rate is usually scaled to 100k people to facilitate comparisons between localities with different populations.

ir = es.incidence_rate([1000, 5000, 10000], [5, 5, 5])
pd.DataFrame({'population': [1000, 5000, 10000],
              'cases': [5, 5, 5],
              'incidence_rate': ir
              })
   population  cases  incidence_rate
0        1000      5           500.0
1        5000      5           100.0
2       10000      5            50.0
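
The arithmetic behind these values can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the library's code: the name incidence_rate_sketch and the scaling parameter are hypothetical, and the default of 100,000 is taken from the scaling described in the text.

```python
import numpy as np


def incidence_rate_sketch(pop_size, new_cases, scaling=100_000):
    """New cases per `scaling` inhabitants (hypothetical helper)."""
    pop = np.asarray(pop_size, dtype=float)
    cases = np.asarray(new_cases, dtype=float)
    # Divide cases by population, then rescale to a common population size
    return cases / pop * scaling


rates = incidence_rate_sketch([1000, 5000, 10000], [5, 5, 5])
```

With the same number of cases, the smallest locality has the highest rate, which is exactly the comparison the scaling is meant to make visible.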

Relative Risk or Risk ratio

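The relative risk, or risk ratio, is the ratio between the risk of contracting a disease in a group exposed to a risk factor and the risk in an unexposed (control) group. It is calculated from the results of a controlled experiment with exposed and control groups.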

result = es.risk_ratio(exposed_cases=27, exposed_total=122, control_cases=44, control_total=487)
result
RelativeRiskResult(relative_risk=2.4495156482861398, exposed_cases=27, exposed_total=122, control_cases=44, control_total=487)
result.confidence_interval(confidence_level=0.95)
ConfidenceInterval(low=1.5836990926700116, high=3.7886786315466354)
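
For reference, the same quantities can be computed by hand. The sketch below uses only the standard library and assumes the common log-scale (Katz) normal approximation for the confidence interval; the name risk_ratio_sketch is hypothetical, and this is not presented as the library's own implementation.

```python
import math
from statistics import NormalDist


def risk_ratio_sketch(exposed_cases, exposed_total,
                      control_cases, control_total,
                      confidence_level=0.95):
    """Relative risk with a log-scale normal CI (hypothetical helper)."""
    # Risk in each group is cases / group size; the relative risk is their ratio
    rr = (exposed_cases / exposed_total) / (control_cases / control_total)
    # Standard error of log(RR) under the Katz approximation
    se = math.sqrt(1 / exposed_cases - 1 / exposed_total
                   + 1 / control_cases - 1 / control_total)
    # Two-sided normal quantile for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    log_rr = math.log(rr)
    return rr, math.exp(log_rr - z * se), math.exp(log_rr + z * se)


rr, ci_low, ci_high = risk_ratio_sketch(27, 122, 44, 487)
```

An interval that excludes 1 indicates that the risk in the exposed group differs from the risk in the control group at the chosen confidence level.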