OpenPIL is an AI that extracts clinical drug information from Summary of Product Characteristic documentation (official drug information documents). It compiles a list of active-substances, active-excipients, formulation, drug-drug interactions and drug-class interactions. It is used to compile the worlds largest and most up-to-date drug informa…

gobbletown/OpenPIL

The non-profit making medication information freely accessible and reliable using AI.



Table of Contents

  1. About The Project
  2. Getting Started
  3. Datasets
  4. Development
  5. Contributing
  6. License
  7. Contact
  8. References

About The Project

What is OpenPIL

OpenPIL is a non-profit organisation with an AI at its core. The AI, maintained and developed by Malik Ahmed (MPharm), extracts essential drug information from Summary of Product Characteristics (SmPC) documents. These are the official drug documents that hold the information doctors and pharmacists rely on when making prescribing decisions. Using the OpenPIL AI takes one line of code and a path to the SmPC .pdf file. It processes the natural language in the document using datasets curated by Malik, sourced from copyright-free libraries (see References below), and reports the active substances, active excipients, formulation, drug-drug interactions, and drug-class interactions. Extracting that information by hand into an Excel spreadsheet took the OpenPIL team of clinical advisors about 1 hour per SmPC on average; the AI takes approximately 4 minutes for a medium-length SmPC document, which is roughly 15 times faster.

Why is this important

Currently, this essential clinical medication information is highly privatised, which restricts access for the healthcare technology developers who need it to create ground-breaking products for patients. This restriction holds back healthcare technology and indirectly puts people's health at greater risk. It is of particular concern for people in developing and war-torn countries, whose access to up-to-date medicinal information is limited even though it doesn't have to be. The aim of making the OpenPIL AI open source is to accelerate the development of affordable drug databases and healthcare technology around the world!

(back to top)

Getting Started

These are the instructions to install the OpenPIL AI locally and start analysing Summary of Product Characteristics documents (.pdf). NOTE: The AI currently only works for SmPCs in the European format.

Installation

The OpenPIL AI is easy to install. Type the command below into your terminal.

pip install OpenPIL

If this doesn't work, make sure you have the dependencies listed below.

Dependencies

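You will need a recent version of Python 3. Python itself cannot be installed or upgraded with pip, so install it from python.org or your system's package manager, then confirm the version:

python3 --version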

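You will need the following modules (nltk, PyPDF2, pdftotext). Note that pdftotext builds against the Poppler library, so you may need to install Poppler's development files on your system first: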

pip install nltk
pip install PyPDF2
pip install pdftotext

All other modules are part of the Python 3 standard library and should need no installation. They are listed here in case you are missing any:

  • re
  • string
  • math
  • ctypes
  • sys
  • platform

Usage

The OpenPIL AI requires only one line of code to run. Here is how to set it up in a Python environment.

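from OpenPIL import OpenPIL

# Run the extraction on a single SmPC document (this takes a few minutes)
data = OpenPIL.AI("/path/to/the/SmPC.pdf")

# Show the extracted drug information
print(data)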

Approximately 4 minutes later, you should see output like this in your Python terminal:

Compiling positive class interactions...
Compiling negative class interactions...
Compiling caution classes...
Compiling caution drugs...
Compiling positive interaction drugs...
Compiling negative interaction drugs...
SmPC Complete!
{
    'SMPC NAME': '/path/to/the/SmPC.pdf', 
    'BRAND NAME': "drug's brand name",
    'ACTIVE SUBSTANCE(S)': ['array of all active substances in drug'], 
    'ACTIVE EXCIPIENT(S)': ['array of all active excipients in drug'], 
    'FORMULATION': ['form of drug e.g. tablet'], 
    'INTERACTIVE DRUG CLASSES': ['array of any drug-classes that interact with the drug'], 
    'INTERACTIVE DRUGS': ['comprehensive array of all drugs that interact, including those contained within each drug-class that interacts'],
    'CAUTIONS': ['array of drugs that are cautioned for use']
}
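
The printed result looks like a plain Python dictionary, so it can presumably be queried with ordinary dictionary operations. The sketch below assumes that, and assumes the key names shown in the output above; `check_drug` and the sample record are hypothetical and contain placeholder names, not real clinical data.

```python
def check_drug(record: dict, drug_name: str) -> dict:
    """Report whether drug_name appears in a record's interaction or caution lists."""
    wanted = drug_name.strip().lower()

    def listed(key: str) -> bool:
        # .get() tolerates records where a key is missing or empty.
        return any(wanted == str(item).strip().lower() for item in record.get(key, []))

    return {
        "interacts": listed("INTERACTIVE DRUGS"),
        "cautioned": listed("CAUTIONS"),
    }


# Hypothetical record for illustration only; the names are placeholders.
example = {
    "BRAND NAME": "ExampleBrand",
    "INTERACTIVE DRUGS": ["Drug A", "Drug B"],
    "CAUTIONS": ["Drug C"],
}

print(check_drug(example, "drug a"))
```

The comparison ignores case and surrounding spaces, which is a design choice for this sketch rather than something OpenPIL is known to do.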

And that's it! Get a group of Summary of Product Characteristics documents stored locally in .pdf format, run a simple for-loop through them, sit back 🪑😎, wait, and then BOOM 💥🤯! Your very own clinical drug-information database!
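
The for-loop mentioned above could look something like the sketch below. It assumes `OpenPIL.AI` returns a dictionary per document, as the example output suggests; `build_database` and its `extract` parameter are hypothetical names introduced here so the loop can be tried with a stand-in extractor before running it on real files.

```python
from pathlib import Path
from typing import Callable, Dict


def _openpil_extract(pdf_path: str) -> dict:
    # Imported lazily so the loop itself can be tested without OpenPIL installed.
    from OpenPIL import OpenPIL
    return OpenPIL.AI(pdf_path)


def build_database(folder: str,
                   extract: Callable[[str], dict] = _openpil_extract) -> Dict[str, dict]:
    """Run the extractor on every .pdf file in folder, keyed by file name."""
    database = {}
    for pdf in sorted(Path(folder).glob("*.pdf")):
        try:
            database[pdf.name] = extract(str(pdf))
        except Exception as err:
            # One unreadable document should not stop the whole batch.
            print(f"Skipping {pdf.name}: {err}")
    return database
```

Calling `build_database("/path/to/smpc_folder")` would then process each document in turn; at roughly 4 minutes per SmPC, a folder of 15 documents would take about an hour.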

Please note that the accuracy and reliability of the AI have not been fully tested yet, although OpenPIL is working on a research paper that will verify the current results. OpenPIL therefore makes no guarantees about the safety of the information extracted and does not recommend its use in clinical practice. The Apache License 2.0 applies.

(back to top)

Datasets

The datasets used for the OpenPIL AI were curated by Malik Ahmed and they are as follows:

(back to top)

Development

  • Add Active Substance Detection
  • Add Active Excipient Detection
  • Add Formulation Detection
  • Add Drug-Class Interaction Detection
  • Add Drug-Drug Interaction Detection
  • Replace the Python similarity algorithm with C to improve performance from ~40 minutes/SmPC to ~4 minutes/SmPC
  • Launch OpenPIL AI open source!
  • Add Side-Effects Detection
  • Add Use in Pregnancy and Breastfeeding Detection
  • Add Storage Conditions Detection
  • Publish peer-reviewed research to validate the accuracy and reliability of the AI

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/CoolFeature)
  3. Commit your Changes (git commit -m 'Add some CoolFeature')
  4. Push to the Branch (git push origin feature/CoolFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the Apache License 2.0. See LICENSE.txt for more information.

(back to top)

Contact

Malik Ahmed - [email protected]

Project Link: https://github.com/OpenPIL/OpenPIL

(back to top)

References

Below are the resources used to compile the OpenPIL AI datasets, with their respective licensing information as of January 27, 2022.

All project code other than that mentioned above was written by Malik Ahmed and is hereby placed under the Apache License 2.0.

(back to top)
