Skip to content

deardurham/ciprs-reader

Repository files navigation

CIPRS Reader

Build Status

Setup and Run:

Add pdf file to parse in /ignore folder then run:

docker build -t ciprs-reader .
docker run --rm -v /$(pwd):/usr/src/app ciprs-reader python ciprs-reader.py ignore/cypress-example.pdf

Example output:

[
    {
        "General": {
            "County": "DURHAM",
            "File No": "00GR000000"
        },
        "Case Information": {
            "Case Status": "DISPOSED",
            "Offense Date": "2018-01-01T20:00:00"
        },
        "Defendant": {
            "Date of Birth/Estimated Age": "1990-01-01",
            "Name": "DOE,JON,BOJACK",
            "Race": "WHITE",
            "Sex": "MALE"
        },
        "District Court Offense Information": [
            {
                "Records": [
                    {
                        "Action": "CHARGED",
                        "Description": "SPEEDING(70 mph in a 50 mph zone)",
                        "Severity": "TRAFFIC",
                        "Law": "20-141(J1)"
                    }
                ],
                "Disposed On": "2010-01-01",
                "Disposition Method": "DISMISSAL WITHOUT LEAVE BY DA"
            }
        ],
        "Superior Court Offense Information": [],
    }
]

Local Setup

Pre-requisites:

Mac

brew cask install pdftotext

Ubuntu

sudo apt-get install -y poppler-utils
wget --no-check-certificate https://dl.xpdfreader.com/xpdf-tools-linux-4.04.tar.gz \
    && tar -xvf xpdf-tools-linux-4.04.tar.gz \
    && cp xpdf-tools-linux-4.04/bin64/pdftotext /usr/local/bin/pdftotext-4

Setup:

pip install -r requirements.txt
pip install -e .

Read CIPRS PDF:

python ciprs_reader.py ./cypress-example.pdf

Run Jupyter:

jupyter-lab

Run tests:

pytest --pylint

Code for Durham