Utility script to convert MS Word doc(x) files to clean HTML/Markdown.

elrolito/docxconv

docxconv

Utility script to convert MS Word doc(x) files to clean HTML, using DOM cleanup and HTML Tidy.

Usage

docxconv [-fq] -o <path> <file> ... [--watch] [--watch-path=<path>]
         [--tidy] [--cleanup.tidy] [--cleanup.pandoc] [--stdout]

Options:
  -f, --format   conversion format <html|markdown>       [default: "html"]
  -o, --output   output destination <path>               [required]
  -q, --workers  queue worker concurrency <int>          [default: 4]
  --cleanup      Booleans: cleanup.tidy, cleanup.pandoc
  --stdout       Do not write file, use stdout instead
  --watch        watch for new documents
  --watch-path   <path> to watch
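
When docxconv is driven from another script, the options above can be assembled programmatically. The following is a minimal sketch, not part of docxconv itself: the helper name `build_docxconv_args` and its parameters are hypothetical, and it assumes that `docxconv` is on the `PATH` and accepts the flags exactly as listed in the usage text.

```python
from typing import List, Sequence


def build_docxconv_args(
    files: Sequence[str],
    output: str,
    fmt: str = "html",
    workers: int = 4,
    use_stdout: bool = False,
) -> List[str]:
    """Assemble an argument list for the docxconv CLI (hypothetical helper)."""
    # The usage text lists only these two conversion formats.
    if fmt not in ("html", "markdown"):
        raise ValueError("format must be 'html' or 'markdown'")
    if not files:
        raise ValueError("at least one input file is required")

    # -o is marked [required] in the usage text, so it is always passed.
    args = ["docxconv", "-f", fmt, "-o", output, "-q", str(workers)]
    args.extend(files)
    # Long flags follow the file list, mirroring the order of the synopsis.
    if use_stdout:
        args.append("--stdout")
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where docxconv is not installed.
    print(" ".join(build_docxconv_args(["report.docx"], "out", fmt="markdown")))
```

To actually run the conversion, the returned list can be passed to `subprocess.run(args, check=True)`. Building a list rather than a single shell string avoids quoting problems with file names that contain spaces.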

Requirements

See unoconv requirements.

Known Issues

Unoconv does not currently seem to work with LibreOffice version 4 and above. It has not been tried with OpenOffice.

Tested and working with LibreOffice v3.6.7.
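
A wrapper script could check the installed LibreOffice version before invoking the converter. The sketch below only parses a version banner; it assumes the banner looks like `LibreOffice 3.6.7.2 ...` (the kind of text `soffice --version` is expected to print, which is an assumption here), and the function names are hypothetical. Treating every release below 4 as acceptable is an interpretation of the note above, since only v3.6.7 is reported as tested.

```python
import re
from typing import Optional, Tuple


def parse_libreoffice_version(banner: str) -> Optional[Tuple[int, ...]]:
    """Extract the dotted version number from a LibreOffice banner string."""
    # Assumed banner shape: "LibreOffice <major>.<minor>[.<patch>...] ..."
    match = re.search(r"LibreOffice\s+(\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_below_version_4(banner: str) -> bool:
    """Return True when the banner reports a major version lower than 4."""
    version = parse_libreoffice_version(banner)
    # An unparseable banner is treated as "not confirmed", hence False.
    return version is not None and version[0] < 4


if __name__ == "__main__":
    sample = "LibreOffice 3.6.7.2"  # illustrative banner, not captured output
    print(parse_libreoffice_version(sample), is_below_version_4(sample))
```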
