# DBCI
**Repository Path**: deep-psp/DBCI
## Basic Information
- **Project Name**: DBCI
- **Description**: No description available
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-11-08
- **Last Updated**: 2024-07-23
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Confidence Intervals for Difference of Binomial Proportions
[](https://github.com/DeepPSP/DBCI/actions/workflows/run-pytest.yml)
[](https://github.com/DeepPSP/DBCI/actions/workflows/random-test.yml)
[](https://codecov.io/gh/DeepPSP/DBCI)
[](https://pypi.org/project/diff-binom-confint/)
[](https://dbci.readthedocs.io/en/latest/?badge=latest)
[](https://github.com/DeepPSP/DBCI/actions/workflows/docs-publish.yml)
[](https://pypistats.org/packages/diff-binom-confint)
[](license)


[](https://diff-binom-confint.streamlit.app/)
Computation of confidence intervals for binomial proportions and for difference of binomial proportions.
\[[GitHub Pages](https://deeppsp.github.io/DBCI/)\] \[[Read the Docs](http://dbci.readthedocs.io/)\]
:rocket: **NEW** :rocket: **Streamlit** support! See [here](https://diff-binom-confint.streamlit.app/) for an app deployed on [Streamlit Community Cloud](https://share.streamlit.io/).
## Installation
Run
```bash
python -m pip install diff-binom-confint
```
or install the latest version in [GitHub](https://github.com/DeepPSP/DBCI/) using
```bash
python -m pip install git+https://github.com/DeepPSP/DBCI.git
```
or git clone this repository and install locally via
```bash
cd DBCI
python -m pip install .
```
## `Numba` accelerated version
Install using
```bash
python -m pip install diff-binom-confint[acc]
```
## Usage examples
```python
from diff_binom_confint import compute_difference_confidence_interval
n_positive, n_total = 84, 101
ref_positive, ref_total = 89, 105
confint = compute_difference_confidence_interval(
n_positive,
n_total,
ref_positive,
ref_total,
conf_level=0.95,
method="wilson",
)
```
## Implemented methods
### Confidence intervals for binomial proportions
Click to view!
| Method (type) | Implemented |
|-------------------|--------------------|
| wilson | :heavy_check_mark: |
| wilson-cc | :heavy_check_mark: |
| wald | :heavy_check_mark: |
| wald-cc | :heavy_check_mark: |
| agresti-coull | :heavy_check_mark: |
| jeffreys | :heavy_check_mark: |
| clopper-pearson | :heavy_check_mark: |
| arcsine | :heavy_check_mark: |
| logit | :heavy_check_mark: |
| pratt | :heavy_check_mark: |
| witting | :heavy_check_mark: |
| mid-p | :heavy_check_mark: |
| lik | :heavy_check_mark: |
| blaker | :heavy_check_mark: |
| modified-wilson | :heavy_check_mark: |
| modified-jeffreys | :heavy_check_mark: |
### Confidence intervals for difference of binomial proportions
Click to view!
| Method (type) | Implemented |
|-----------------------------|--------------------|
| wilson | :heavy_check_mark: |
| wilson-cc | :heavy_check_mark: |
| wald | :heavy_check_mark: |
| wald-cc | :heavy_check_mark: |
| haldane | :heavy_check_mark: |
| jeffreys-perks | :heavy_check_mark: |
| mee | :heavy_check_mark: |
| miettinen-nurminen | :heavy_check_mark: |
| true-profile | :heavy_check_mark: |
| hauck-anderson | :heavy_check_mark: |
| agresti-caffo | :heavy_check_mark: |
| carlin-louis | :heavy_check_mark: |
| brown-li | :heavy_check_mark: |
| brown-li-jeffrey | :heavy_check_mark: |
| miettinen-nurminen-brown-li | :heavy_check_mark: |
| exact | :x: |
| mid-p | :x: |
| santner-snell | :x: |
| chan-zhang | :x: |
| agresti-min | :x: |
| wang | :x: |
| pradhan-banerjee | :x: |
## Creating report
One can use the `make_risk_report` function to create a report of the confidence intervals for difference of binomial proportions.
```python
from diff_binom_confint import make_risk_report
# df_train and df_test are pandas.DataFrame providing the data
table = make_risk_report((df_train, df_test), target = "binary_target")
# or if df_data is a pandas.DataFrame containing both training and testing data
table = make_risk_report(df_data, target = "binary_target")
```
For more details, see corresponding documenation. The produced table is similar to the following:
Click to view!

## References
1. [SAS](https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf)
2. [PASS](https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/PASS/Confidence_Intervals_for_the_Difference_Between_Two_Proportions.pdf)
3. [statsmodels.stats.proportion](https://www.statsmodels.org/devel/_modules/statsmodels/stats/proportion.html)
4. [scipy.stats._binomtest](https://github.com/scipy/scipy/blob/main/scipy/stats/_binomtest.py)
5. [corplingstats](https://corplingstats.wordpress.com/2019/04/27/correcting-for-continuity/)
6. [DescTools.StatsAndCIs](https://github.com/AndriSignorell/DescTools/blob/master/R/StatsAndCIs.r)
7. [Newcombee](https://onlinelibrary.wiley.com/doi/10.1002/(SICI)1097-0258(19980430)17:8%3C873::AID-SIM779%3E3.0.CO;2-I)
## NOTE
[Reference 1](#ref1) has errors in the description of the methods `Wilson CC`, `Mee`, `Miettinen-Nurminen`.
The correct computation of `Wilson CC` is given in [Reference 5](#ref5).
The correct computation of `Mee`, `Miettinen-Nurminen` are given in the **code blocks** in [Reference 1](#ref1)
## Test data
[Test data](test/test-data/) are
1. taken (with slight modification, e.g. the `upper_bound` of `miettinen-nurminen-brown-li` method in the [edge case file](test/test-data/example-10-10-vs-0-20.csv)) from [Reference 1](#ref1) for automatic test of the correctness of the implementation of the algorithms.
2. generated using [DescTools.StatsAndCIs](#ref6) via
```R
library("DescTools")
library("data.table")
results = data.table()
for (m in c("wilson", "wald", "waldcc", "agresti-coull", "jeffreys",
"modified wilson", "wilsoncc", "modified jeffreys",
"clopper-pearson", "arcsine", "logit", "witting", "pratt",
"midp", "lik", "blaker")){
ci = BinomCI(84,101,method = m)
new_row = data.table("method" = m, "ratio"=ci[1], "lower_bound" = ci[2], "upper_bound" = ci[3])
results = rbindlist(list(results, new_row))
}
fwrite(results, "./test/test-data/example-84-101.csv") # with manual slight adjustment of method names
```
3. taken from [Reference 7](#ref7) (Table II).
The filenames has the following pattern:
```python
# for computing confidence interval for difference of binomial proportions
"example-(?P[\\d]+)-(?P[\\d]+)-vs-(?P[\\d]+)-(?P[\\d]+)\\.csv"
# for computing confidence interval for binomial proportions
"example-(?P[\\d]+)-(?P[\\d]+)\\.csv"
```
Note that the out-of-range values (e.g. `> 1`) are left as empty values in the `.csv` files.
## Known Issues
1. Edge cases incorrect for the method `true-profile`.