The non-profit making medication information freely accessible and reliable using AI.

About The Project

What is OpenPIL

OpenPIL is a non-profit organisation with an AI at its core. The AI, developed and maintained by Malik Ahmed (MPharm), extracts essential drug information from Summary of Product Characteristics (SmPC) documents. An SmPC holds the key information that doctors and pharmacists use when making prescribing decisions. To use the OpenPIL AI, you write one line of code containing the path to the SmPC .pdf file. It then processes the natural language in the document using datasets curated by Malik, sourced from copyright-free libraries (see References below), and reports the active substances, active excipients, formulation, drug-drug interactions, and drug-class interactions. The OpenPIL team of clinical advisors took about 1 hour on average per SmPC to extract that information manually into an Excel spreadsheet; the AI takes approx. 4 minutes for a medium-length SmPC document.

Why is this important

Currently, this essential clinical medication information is highly privatised, which restricts access for the healthcare technology developers who need it to create ground-breaking products for patients. This restriction holds back healthcare technology and indirectly puts people's health at greater risk. It is of particular concern for those in developing and war-torn countries, whose access to up-to-date medicinal information is limited, even though it doesn't have to be. The aim of making the OpenPIL AI open-source is to accelerate the development of affordable drug databases and healthcare technology around the world!

(back to top)

Getting Started

These are the instructions to install the OpenPIL AI locally and start analysing Summary of Product Characteristics documents (.pdf). NOTE: The AI currently only works for SmPCs in European format.

Installation

The OpenPIL AI is really easy to install. Simply type the below command into your terminal.

pip install OpenPIL

If this doesn't work, make sure you have the dependencies listed below.

Dependencies

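You will need a recent version of Python 3. Python itself cannot be installed or upgraded with pip, so install it from python.org or your system's package manager. You can then upgrade pip with:

python -m pip install --upgrade pip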

You will need the following modules (nltk, PyPDF2, pdftotext):

pip install nltk

pip install PyPDF2

pip install pdftotext

All other modules are part of the Python 3 standard library and should already be available. They are listed here in case you are missing any:

re
string
math
ctypes
sys
platform
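
To see which of the modules above are missing before installing OpenPIL, a short check like the following may help. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name and is not part of OpenPIL.

```python
import importlib.util

# Third-party and standard-library modules listed in this README.
REQUIRED = ["nltk", "PyPDF2", "pdftotext",
            "re", "string", "math", "ctypes", "sys", "platform"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Any name it reports from the first three entries can be installed with pip as shown above.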

Usage

The OpenPIL AI requires only one line of code to run, so it's really easy! Here is how to set it up in a Python environment.

from OpenPIL import OpenPIL

data = OpenPIL.AI("/path/to/the/SmPC.pdf")

print(data)

Approx. 4 minutes later, you should see this in your Python terminal!

Compiling positive class interactions...
Compiling negative class interactions...
Compiling caution classes...
Compiling caution drugs...
Compiling positive interaction drugs...
Compiling negative interaction drugs...
SmPC Complete!
{
    'SMPC NAME': '/path/to/the/SmPC.pdf', 
    'BRAND NAME': "drug's brand name",
    'ACTIVE SUBSTANCE(S)': ['array of all active substances in drug'], 
    'ACTIVE EXCIPIENT(S)': ['array of all active excipients in drug'], 
    'FORMULATION': ['form of drug e.g. tablet'], 
    'INTERACTIVE DRUG CLASSES': ['array of any drug-classes that interact with the drug'], 
    'INTERACTIVE DRUGS': ['comprehensive array of all drugs that interact, including those contained within each drug-class that interacts'],
    'CAUTIONS': ['array of drugs that are cautioned for use']
}
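
Assuming `OpenPIL.AI` returns a plain Python dictionary with the keys shown above (the printed output suggests this, but it is an assumption), individual fields can be read directly. The sketch below uses a hand-written placeholder record rather than real OpenPIL output, and `interacts_with` is a hypothetical helper, not part of the package.

```python
# Placeholder record mirroring the key layout printed above; the values
# are invented for illustration and are NOT real OpenPIL output.
record = {
    "SMPC NAME": "/path/to/the/SmPC.pdf",
    "BRAND NAME": "ExampleBrand",
    "ACTIVE SUBSTANCE(S)": ["example substance"],
    "ACTIVE EXCIPIENT(S)": ["example excipient"],
    "FORMULATION": ["tablet"],
    "INTERACTIVE DRUG CLASSES": ["example drug class"],
    "INTERACTIVE DRUGS": ["Drug A", "drug b"],
    "CAUTIONS": ["drug c"],
}


def interacts_with(record, drug_name):
    """Case-insensitive check of a drug name against 'INTERACTIVE DRUGS'."""
    wanted = drug_name.strip().lower()
    return any(wanted == listed.strip().lower()
               for listed in record.get("INTERACTIVE DRUGS", []))


print(record["BRAND NAME"])
print(interacts_with(record, "drug a"))
print(interacts_with(record, "drug z"))
```

The comparison is case-insensitive because the capitalisation of extracted names is not specified here.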

And that's it! Get a group of Summary of Product Characteristics documents in .pdf format stored locally, run a simple for-loop through them, sit back 🪑😎, wait, and then BOOM 💥🤯! Your very own clinical drug-information database!
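
The for-loop described above could look something like this. It is a sketch only: `build_database` is a hypothetical helper, and the extraction function is passed in as an argument so the loop does not depend on OpenPIL internals.

```python
from pathlib import Path


def build_database(pdf_dir, extract):
    """Run `extract` on every .pdf in `pdf_dir` and collect the results.

    `extract` is any callable that takes a file path string and returns
    the extracted record, for example OpenPIL.AI as used in this README.
    Files that raise an exception are skipped and recorded in `failed`.
    """
    database = {}
    failed = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        try:
            database[pdf_path.name] = extract(str(pdf_path))
        except Exception as exc:  # keep going if one document fails
            failed.append((pdf_path.name, repr(exc)))
    return database, failed
```

With OpenPIL installed, this might be called as `database, failed = build_database("smpc_pdfs", OpenPIL.AI)` after `from OpenPIL import OpenPIL`, where `smpc_pdfs` is a placeholder folder name. At roughly 4 minutes per document, a large folder will take a while.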

Please note that the accuracy and reliability of the output have not been fully tested yet, although OpenPIL are working on a research paper that will verify the current results. OpenPIL therefore makes no guarantees as to the safety of the information extracted, and does not recommend its use in clinical practice. The Apache License 2.0 applies.

(back to top)

Datasets

The datasets used for the OpenPIL AI were curated by Malik Ahmed and they are as follows:

(back to top)

Development

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

1. Fork the Project
2. Create your Feature Branch (git checkout -b feature/CoolFeature)
3. Commit your Changes (git commit -m 'Add some CoolFeature')
4. Push to the Branch (git push origin feature/CoolFeature)
5. Open a Pull Request

(back to top)

License

Distributed under the Apache License 2.0. See LICENSE.txt for more information.

(back to top)

Contact

Malik Ahmed - [email protected]

Project Link: https://github.com/OpenPIL/OpenPIL

(back to top)

References

Listed below are all the resources used to compile the OpenPIL AI datasets, with their respective licensing information as of January 27 2022.

drugNameDataset.py was compiled by extracting the drug and supplement names listed by the European Medicines Agency, OpenFDA NDC (CC0), Drugs@FDA (CC0), NHS BSA (Open Government Licence), and the Netherlands Medicines Agency (Re-use of Government Information Act).
drugClassSynonymDataset.py was compiled using ChEBI, listed under 'CC0' for 'Synonyms' in the User Manual.
drugClassDataset.py was compiled using the OpenFDA NDC API (CC0) and the OpenFDA Drugs@FDA API (CC0).
malik_similarity_algorithm.c includes two sources of external code: the Jaro-Winkler distance algorithm (GNU General Public License v3 or later) and the Ratcliff/Obershelp distance algorithm (the Unlicense).

All project code other than that mentioned above was written by Malik Ahmed and is hereby placed under the Apache License 2.0.

(back to top)
