Add support for importing .md, .yml, and .bib documents from .docx documents #20
As discussed in acceptance criterion 3 of <istqborg/istqb_shared_documents#104 (comment)>.
@danopolan, I am happy to quibble about the artifact names: Perhaps the following would be easier to read and more consistent?
@danopolan: Will we want this merged?
I am sorry for the late review. I am OK with the changes, but I am afraid they are going to interfere with #49
@danopolan I was wrong to assume that this PR was related to #80 in our earlier chat. Instead, this PR makes it possible to convert DOCX back to Markdown. This conversion is not perfect, since DOCX does not preserve all of the structure of Markdown, so manual review of the resulting Markdown files would be required regardless. Furthermore, as you said, this PR will conflict with the PR for ticket #49. Unless there is a current demand for converting Markdown files to DOCX, translating them, and converting them back to Markdown, let's postpone this PR and revisit it later?
This feature is not needed now, but it could be beneficial in the future, so I would like to have it merged. Of course, there is no pressure to deliver this, so we can come back to this later.
The implementation in this PR is provisional and conflicts with #49 and #80. Therefore, it seems best to come back and cherry-pick the implementation after #49 and #80 have been closed and when we have concrete acceptance criteria. The existence of this PR should make the future implementation easier.
This PR continues https://github.com/istqborg/istqb_shared_documents/pull/116 and closes https://github.com/istqborg/istqb_shared_documents/issues/104 by implementing the third acceptance criterion from https://github.com/istqborg/istqb_shared_documents/issues/104: Import .docx documents from product translators into .md, .yml, and .bib documents.
After this PR, all documents with the extensions .md.docx, .yml.docx, and .bib.docx in the repository are converted back to .md, .yml, and .bib documents and uploaded as the artifact DOCX-to-MD-YAML-and-BIB.zip. While the conversion of .yml and .bib documents is lossless, the conversion of .md documents is lossy and may require hand-editing, as discussed in https://github.com/istqborg/istqb_shared_documents/issues/104#issuecomment-1946386489.
For example, here is the round trip for the .md, .yml, and .bib documents from this repository:
MD-YAML-and-BIB-to-DOCX.zip contains the .docx documents exported from the .md, .yml, and .bib documents from the directory example-document/ in this repository. We would pass these .docx documents to product translators.
DOCX-to-MD-YAML-and-BIB.zip contains the .md, .yml, and .bib documents imported from the .docx documents. In this example, the .docx documents originate from the artifact MD-YAML-and-BIB-to-DOCX.zip. In reality, we would receive these .docx documents from product translators.
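The file selection described above can be sketched roughly as follows. This is only an illustration of the naming convention (.md.docx, .yml.docx, and .bib.docx map back to .md, .yml, and .bib), not the code in this PR: the names import_target and find_importable are hypothetical, and the actual DOCX conversion step is deliberately left out.

```python
from pathlib import Path, PurePosixPath

# Double extensions named in the PR description.
IMPORTABLE_SUFFIXES = (".md.docx", ".yml.docx", ".bib.docx")


def import_target(docx_path):
    """Return the path an imported document would get, or None if not importable.

    Hypothetical helper: it only maps file names and converts nothing.
    """
    path = PurePosixPath(docx_path)
    if not path.name.endswith(IMPORTABLE_SUFFIXES):
        return None
    # Dropping the final ".docx" leaves the original .md, .yml, or .bib name.
    return path.with_suffix("")


def find_importable(root):
    """Hypothetical helper: list (source, target) pairs below a repository root."""
    pairs = []
    for source in sorted(Path(root).rglob("*.docx")):
        target = import_target(source.relative_to(root).as_posix())
        if target is not None:
            pairs.append((source, target))
    return pairs
```

With this mapping, a file such as example-document/syllabus.md.docx (a made-up name) would be imported as example-document/syllabus.md, while a plain notes.docx would be skipped because it carries no .md, .yml, or .bib marker.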