Calculate binomial confidence intervals in summary table - lvphj/epydemiology GitHub Wiki

phjCalculateBinomialConfInts()

df = epy.phjCalculateBinomialConfInts(phjDF,
                                      phjSuccVarName = None,
                                      phjFailVarName = None,
                                      phjTotalVarName = None,
                                      phjBinomialConfIntMethod = 'normal',
                                      phjAlpha = 0.05,
                                      phjPrintResults = False)

Description

This function calculates the proportion and its binomial confidence interval for each row of a summary table. It was originally written to be called by the phjCalculateBinomialProportions() function, but it can also be called directly if a summary table already exists.
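The page does not state the formula behind the default 'normal' method. The figures in the example output on this page are consistent with the standard normal-approximation (Wald) interval, so the calculation for each row can be sketched as follows. This is an assumption, not a statement of the library's internals. Here x is the success count, f is the failure count, and alpha corresponds to phjAlpha:

```latex
n = x + f, \qquad
\hat{p} = \frac{x}{n}, \qquad
\hat{p} \;\pm\; z_{1-\alpha/2}\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
```

With alpha = 0.05, the critical value z is approximately 1.96. When n = 0 the proportion is undefined, which matches the NaN values in the last row of the example output.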

Function parameters

Exceptions Raised

Returns

Other Notes

None

Example

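import pandas as pd
import epydemiology as epy

# Summary table with one row per year: counts of successes and failures.
phjTempDF = pd.DataFrame({'year':[2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017,2018],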
                          'success':[109,77,80,57,29,31,29,19,10,16,6,8,4,0],
                          'failure':[784-109,840-77,715-80,780-57,743-29,743-31,752-29,645-19,509-10,562-16,471-6,471-8,472-4,0-0],
                          #'total':[784,840,715,780,743,743,752,645,509,562,471,471,472,0]
                         })

print('Original dataframe\n')
print(phjTempDF)
print('\n')

phjPropDF = epy.phjCalculateBinomialConfInts(phjDF = phjTempDF,
                                             phjSuccVarName = 'success',
                                             phjFailVarName = 'failure',
                                             phjTotalVarName = None,
                                             phjBinomialConfIntMethod = 'normal',
                                             phjAlpha = 0.05,
                                             phjPrintResults = False)
 
print('Dataframe of confidence intervals\n')
print(phjPropDF)

Produces the following output:

Original dataframe

    year  success  failure
0   2005      109      675
1   2006       77      763
2   2007       80      635
3   2008       57      723
4   2009       29      714
5   2010       31      712
6   2011       29      723
7   2012       19      626
8   2013       10      499
9   2014       16      546
10  2015        6      465
11  2016        8      463
12  2017        4      468
13  2018        0        0


Dataframe of confidence intervals

    year  success  failure  obs      prop  95CI_llim  95CI_ulim  95CI_lint  \
0   2005      109      675  784  0.139031   0.114813   0.163249   0.024218   
1   2006       77      763  840  0.091667   0.072153   0.111180   0.019514   
2   2007       80      635  715  0.111888   0.088782   0.134994   0.023106   
3   2008       57      723  780  0.073077   0.054812   0.091342   0.018265   
4   2009       29      714  743  0.039031   0.025105   0.052957   0.013926   
5   2010       31      712  743  0.041723   0.027345   0.056100   0.014378   
6   2011       29      723  752  0.038564   0.024802   0.052326   0.013762   
7   2012       19      626  645  0.029457   0.016409   0.042506   0.013049   
8   2013       10      499  509  0.019646   0.007590   0.031703   0.012057   
9   2014       16      546  562  0.028470   0.014720   0.042220   0.013750   
10  2015        6      465  471  0.012739   0.002611   0.022867   0.010128   
11  2016        8      463  471  0.016985   0.005316   0.028655   0.011669   
12  2017        4      468  472  0.008475   0.000205   0.016744   0.008270   
13  2018        0        0    0       NaN        NaN        NaN        NaN   

    95CI_uint  
0    0.024218  
1    0.019514  
2    0.023106  
3    0.018265  
4    0.013926  
5    0.014378  
6    0.013762  
7    0.013049  
8    0.012057  
9    0.013750  
10   0.010128  
11   0.011669  
12   0.008270  
13        NaN
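
As a cross-check on the table above, the following sketch recomputes a single row using only the Python standard library. It assumes that the 'normal' method is the standard normal-approximation (Wald) interval. The helper name phjWaldCheck and its dictionary keys are hypothetical and are not part of epydemiology.

```python
import math
from statistics import NormalDist


def phjWaldCheck(successes, failures, alpha=0.05):
    """Hypothetical helper (not part of epydemiology): Wald interval for one row."""
    obs = successes + failures
    if obs == 0:
        # No observations: the proportion is undefined, so every statistic is NaN.
        nan = float('nan')
        return {'obs': 0, 'prop': nan, 'llim': nan, 'ulim': nan,
                'lint': nan, 'uint': nan}

    prop = successes / obs
    # Two-sided critical value of the standard normal (about 1.96 for alpha = 0.05).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(prop * (1 - prop) / obs)

    return {'obs': obs,
            'prop': prop,
            'llim': prop - margin,   # lower confidence limit
            'ulim': prop + margin,   # upper confidence limit
            'lint': margin,          # distance from prop down to the lower limit
            'uint': margin}          # distance from prop up to the upper limit


# Recompute the 2005 row (109 successes, 675 failures) for comparison with the table.
for name, value in phjWaldCheck(109, 675).items():
    print(name, round(value, 6))
```

Under this assumption the lower and upper interval widths are always equal, which agrees with the identical 95CI_lint and 95CI_uint columns in the output. The Wald interval can perform poorly when the proportion is close to 0 or 1; the 2017 row, whose lower limit is only just above zero, illustrates the kind of case where this matters.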