A lightweight console app that recursively scans a directory tree for two purposes:
- count the document files
- count the pages inside those document files
Currently only .docx and .pdf files are parsed; all other files are ignored.
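The file-counting half of the scan can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the class `DocumentScanSketch` and its methods are hypothetical names, and the case-insensitive extension matching is an assumption. Page counting is left out because it depends on the parsing libraries.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; the names here are not taken from the project.
class DocumentScanSketch {

    // Returns the lower-cased text after the last dot, or "" if there is no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Walks the whole tree under root and counts regular files per supported extension.
    static Map<String, Long> countDocuments(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .map(p -> extensionOf(p.getFileName().toString()))
                    // Assumption: only these two types are supported; everything else is skipped.
                    .filter(ext -> ext.equals("docx") || ext.equals("pdf"))
                    .collect(Collectors.groupingBy(ext -> ext, TreeMap::new, Collectors.counting()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Scans the directory given as the first argument, or the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        countDocuments(root).forEach((ext, n) -> System.out.println("." + ext + ": " + n));
    }
}
```

Extensions with no matching files are simply absent from the returned map, so callers should treat a missing key as zero.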
The application is written in Java and built with Maven. It uses the following libraries:
- Apache Commons CLI - for parsing command-line arguments
- Apache Commons IO - to extract file extensions
- Apache POI - to parse .docx files
- Apache PDFBox - to parse .pdf files
usage: derectoryparser -p path
-h,--help show this message
-p,--path <path> root parsing directory tree
Feel free to add modules to parse additional file types. To do so, you need to:
- add the file type to the enum in com/greenjack/FileTypes.java
- implement the com/greenjack/pagecounters/PageCounter.java interface using an appropriate library to access the desired files
- include the library dependency in pom.xml
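The steps above might look roughly like the sketch below. The real contents of FileTypes.java and PageCounter.java are not shown here, so the enum constants, the `countPages(Path)` signature, and the `TxtPageCounter` class are all assumptions made for illustration. The example uses plain text with form feed characters as page breaks, which needs no third-party library; a real format would also need its dependency added to pom.xml.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Stand-in for com/greenjack/FileTypes.java; the real constants may differ.
enum FileTypes { DOCX, PDF, TXT } // TXT is the hypothetical new type

// Stand-in for the PageCounter interface; the real method signature is assumed.
interface PageCounter {
    int countPages(Path file) throws IOException;
}

// Hypothetical counter for .txt files: a form feed ('\f') starts a new page.
class TxtPageCounter implements PageCounter {

    @Override
    public int countPages(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        if (text.isEmpty()) {
            return 0; // an empty file has no pages
        }
        int pages = 1; // non-empty text has at least one page
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\f') {
                pages++;
            }
        }
        return pages;
    }

    public static void main(String[] args) throws IOException {
        // Prints the page count of the file passed as the first argument.
        System.out.println(new TxtPageCounter().countPages(Paths.get(args[0])));
    }
}
```

How the application selects a counter for a given file type is not covered here; that wiring depends on the existing code.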