0005 integrate extract as backend library
Bruno Thomas edited this page Nov 23, 2021 · 1 revision
Date: 2021-11-22
Status: Accepted
Extract was developed and used for previous leak projects (Panama Papers, Swiss Leaks, Luxembourg Leaks). It is based on:
- Apache Tika
- Tesseract OCR
Reuse Extract by splitting it into:
- a library that is used by Datashare, and
- the existing Extract command-line interface
A JAR library is published on Maven Central. Two projects depend on it:
- Datashare
- extract-cli
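
The split described in this record can be illustrated with a minimal Java sketch: one shared library type consumed by both a backend and a command-line front end. Every name below (`TextExtractor`, `ExtractLibrary`, `BackendService`, `CommandLineTool`) is hypothetical and chosen for illustration only. None of them is the actual Extract or Datashare API, and the extraction itself is stubbed out rather than delegated to Tika or Tesseract.

```java
// Hypothetical contract exposed by the shared library (not the real Extract API).
interface TextExtractor {
    String extract(String documentName);
}

// Hypothetical library implementation. The real library builds on Tika and
// Tesseract OCR; here the extraction is a stub so the sketch stays self-contained.
final class ExtractLibrary implements TextExtractor {
    @Override
    public String extract(String documentName) {
        return "text of " + documentName;
    }
}

// Hypothetical backend consumer, standing in for Datashare.
final class BackendService {
    private final TextExtractor extractor;

    BackendService(TextExtractor extractor) {
        this.extractor = extractor;
    }

    // Uses the library in-process instead of shelling out to a CLI.
    String index(String documentName) {
        return "indexed: " + extractor.extract(documentName);
    }
}

// Hypothetical command-line consumer, standing in for extract-cli.
final class CommandLineTool {
    private final TextExtractor extractor;

    CommandLineTool(TextExtractor extractor) {
        this.extractor = extractor;
    }

    // Extracts each path given on the command line, one result per line.
    String run(String... paths) {
        StringBuilder out = new StringBuilder();
        for (String path : paths) {
            out.append(extractor.extract(path)).append('\n');
        }
        return out.toString();
    }
}
```

Both consumers depend only on the library type, so the extraction logic lives in one published artifact while the CLI remains a thin wrapper around it.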