Easy Bulk export, no cap

This repository provides scripts and notebooks that make it easy to export data in bulk from CourtListener's freely available downloads.

Create a first version of the notebook suitable for data scientists
- Define appropriate dtypes to reduce pandas memory usage
- Select only the necessary columns with usecols; for example, the 'created_by' date field, which records a database insert, isn't needed
- Read opinions.csv (190+ GB) from disk one chunk at a time while converting it to JSON
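
The three steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `csv_to_jsonl`, the column names, and the dtypes in `DTYPES` are assumptions and should be checked against the real CSV header.

```python
import pandas as pd

# Assumed column names and dtypes; verify them against the real opinions.csv header.
DTYPES = {
    "id": "int64",
    "cluster_id": "int64",
    "type": "category",      # few distinct values, so category saves memory
    "plain_text": "string",
}


def csv_to_jsonl(csv_path, jsonl_path, chunksize=10_000):
    """Stream a large CSV into a JSON Lines file one chunk at a time."""
    rows = 0
    reader = pd.read_csv(
        csv_path,
        usecols=list(DTYPES),  # columns not listed here are never loaded
        dtype=DTYPES,
        chunksize=chunksize,   # yields DataFrames of at most this many rows
    )
    with open(jsonl_path, "w", encoding="utf-8") as out:
        for chunk in reader:
            text = chunk.to_json(orient="records", lines=True, force_ascii=False)
            out.write(text)
            # Some pandas versions omit the trailing newline; add it if missing.
            if not text.endswith("\n"):
                out.write("\n")
            rows += len(chunk)
    return rows
```

Only one chunk is held in memory at a time, so peak memory depends on `chunksize` and the selected columns rather than on the size of the file.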
Create a standalone script whose output can be piped to other tools
- Create a PyPI package using Poetry
- Output JSON Lines format from the script
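
A pipe-friendly script of this kind might look like the sketch below, which uses only the standard library. It is an assumption about the shape of such a tool, not the packaged implementation; the function name `rows_to_jsonl` and the `keep` parameter are hypothetical.

```python
import csv
import json
import sys


def rows_to_jsonl(src, dst, keep=None):
    """Write each CSV row from src to dst as one JSON object per line."""
    count = 0
    for row in csv.DictReader(src):
        if keep is not None:
            # Keep only the requested columns (hypothetical convenience option).
            row = {key: row[key] for key in keep}
        dst.write(json.dumps(row, ensure_ascii=False) + "\n")
        count += 1
    return count


if __name__ == "__main__" and len(sys.argv) > 1:
    # Opinion text can exceed the csv module's default field size limit.
    csv.field_size_limit(2**31 - 1)
    with open(sys.argv[1], newline="", encoding="utf-8") as f:
        rows_to_jsonl(f, sys.stdout)
```

Saved under a hypothetical name such as `export_jsonl.py`, it could be used as `python export_jsonl.py opinions.csv | head -n 5`, since each output line is a complete JSON document that downstream tools can consume one at a time.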
Improve speed by using Dask DataFrame
