Epidemiological Indicators
Most epidemiological analyses start with simple measurements, or indicators, computed from the raw data. The epigraphhub.analysis.epistats module provides functions to calculate some of these indicators.
import pandas as pd
from epigraphhub.analysis import epistats as es
Calculating the posterior prevalence distribution
The number of cases of a disease in a population can be modeled as a binomial random variable :math:`Bin(n, p)`, where the population size is the parameter :math:`n` and the prevalence, that is, the fraction of the population that is a case, is the parameter :math:`p`. If we have a prior guess about this proportion, represented as a :math:`Beta(a, b)` distribution, a conjugate Bayesian analysis gives the posterior distribution of the prevalence at a point in time as :math:`Beta(a + cases, b + n - cases)`.
# Using a vague prior Beta(1,1)
pdist = es.posterior_prevalence(pop_size=1e6,positives=10000,a=1,b=1)
pdist.mean()
0.010000979998040003
pdist.std()
9.950342051571256e-05
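The conjugate update can also be checked by hand. The following is a minimal sketch that uses only the standard library and the closed-form mean and variance of a Beta distribution. The helper names `beta_posterior_params` and `beta_mean_std` are hypothetical and are not part of epigraphhub; how `es.posterior_prevalence` is implemented internally is not shown here.

```python
import math


def beta_posterior_params(pop_size, positives, a=1.0, b=1.0):
    """Hypothetical helper: Beta(a, b) prior plus binomial data gives
    a Beta(a + positives, b + pop_size - positives) posterior."""
    return a + positives, b + pop_size - positives


def beta_mean_std(alpha, beta):
    """Closed-form mean and standard deviation of Beta(alpha, beta)."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1))
    return mean, math.sqrt(var)


# Same inputs as the example above: vague Beta(1, 1) prior,
# 10,000 positives in a population of one million.
alpha, beta = beta_posterior_params(pop_size=1e6, positives=10000)
post_mean, post_std = beta_mean_std(alpha, beta)
print(alpha, beta)
print(post_mean, post_std)
```

With these inputs the posterior is Beta(10001, 990001), and the printed mean and standard deviation should agree with the values of `pdist.mean()` and `pdist.std()` shown above.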
Incidence rate
Incidence is defined as the number of new cases in a population over a period of time, typically one year. The incidence rate is usually scaled to 100,000 people to facilitate comparisons between localities with different populations.
ir = es.incidence_rate([1000, 5000, 10000], [5, 5, 5])
pd.DataFrame({
    'population': [1000, 5000, 10000],
    'cases': [5, 5, 5],
    'incidence_ratio': ir,
})
| | population | cases | incidence_ratio |
|---|---|---|---|
| 0 | 1000 | 5 | 500.0 |
| 1 | 5000 | 5 | 100.0 |
| 2 | 10000 | 5 | 50.0 |
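The values in the table are consistent with dividing cases by population and multiplying by 100,000. A minimal pure-Python sketch of that arithmetic follows; `incidence_per_100k` is a hypothetical name used for illustration and is not the library function.

```python
def incidence_per_100k(population, cases, scale=100_000):
    """Hypothetical helper: cases per `scale` people for each locality."""
    # Multiply before dividing so these small integer inputs stay exact.
    return [c * scale / p for p, c in zip(population, cases)]


rates = incidence_per_100k([1000, 5000, 10000], [5, 5, 5])
print(rates)  # [500.0, 100.0, 50.0]
```

The same five cases give very different rates depending on the population size, which is why the scaled rate, rather than the raw count, is used to compare localities.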
Relative Risk or Risk ratio
The relative risk, or risk ratio, is the ratio between the risk of contracting a disease in a group exposed to a risk factor and the risk in an unexposed control group. It is calculated from the results of a controlled experiment with exposed and control groups.
result = es.risk_ratio(exposed_cases=27, exposed_total=122, control_cases=44, control_total=487)
result
RelativeRiskResult(relative_risk=2.4495156482861398, exposed_cases=27, exposed_total=122, control_cases=44, control_total=487)
result.confidence_interval(confidence_level=0.95)
ConfidenceInterval(low=1.5836990926700116, high=3.7886786315466354)
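To make the calculation explicit, here is a minimal standard-library sketch of the risk ratio together with a log-scale (Katz) confidence interval. The function name `relative_risk_with_ci` is hypothetical, and the assumption that `es.risk_ratio` uses this same interval method is mine; it is supported only by the fact that the numbers below agree with the output above.

```python
import math
from statistics import NormalDist


def relative_risk_with_ci(exposed_cases, exposed_total,
                          control_cases, control_total,
                          confidence_level=0.95):
    """Hypothetical helper: risk ratio with a log-method confidence interval."""
    # Risk in each group is cases divided by group size.
    rr = (exposed_cases / exposed_total) / (control_cases / control_total)
    # Approximate standard error of log(rr).
    se = math.sqrt(1 / exposed_cases - 1 / exposed_total
                   + 1 / control_cases - 1 / control_total)
    # Two-sided normal quantile, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high


rr, low, high = relative_risk_with_ci(27, 122, 44, 487)
print(rr, low, high)
```

A relative risk above 1 with an interval that excludes 1, as in this example, suggests that the exposed group has a higher risk than the control group.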