Add support for importing .md, .yml, and .bib documents from .docx documents #20

Closed
wants to merge 1 commit into from

@Witiko Witiko commented Mar 26, 2024

This PR continues https://github.com/istqborg/istqb_shared_documents/pull/116 and closes https://github.com/istqborg/istqb_shared_documents/issues/104 by implementing the third acceptance criterion from https://github.com/istqborg/istqb_shared_documents/issues/104: Import .docx documents from product translators into .md, .yml, and .bib documents.

After this PR, all documents with the extensions .md.docx, .yml.docx, and .bib.docx in the repository are converted back to .md, .yml, and .bib documents and uploaded as the artifact DOCX-to-MD-YAML-and-BIB.zip. While the conversion of .yml and .bib documents is lossless, the conversion of .md documents is lossy and may require hand-editing, as discussed in https://github.com/istqborg/istqb_shared_documents/issues/104#issuecomment-1946386489.
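The file-discovery step described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation in this PR: the function names `import_target` and `find_import_candidates` are hypothetical, and the sketch only covers how a double extension such as `.md.docx` could map back to `.md`; the document conversion itself is left out.

```python
from pathlib import Path
from typing import Dict, Optional

# Double extensions named in the PR description. Mapping each one to a
# target file by dropping the trailing ".docx" is an assumption.
IMPORT_SUFFIXES = (".md.docx", ".yml.docx", ".bib.docx")


def import_target(docx_path: Path) -> Optional[Path]:
    """Return the path a .docx document would be imported to,
    or None if the document is not an import candidate."""
    name = docx_path.name
    for suffix in IMPORT_SUFFIXES:
        if name.endswith(suffix):
            # Strip only the trailing ".docx", keeping ".md", ".yml", or ".bib".
            return docx_path.with_name(name[: -len(".docx")])
    return None


def find_import_candidates(root: Path) -> Dict[Path, Path]:
    """Map every importable .docx document under root to its target path."""
    candidates = {}
    for path in sorted(root.rglob("*.docx")):
        target = import_target(path)
        if target is not None:
            candidates[path] = target
    return candidates
```

Plain `.docx` documents without one of the three double extensions are skipped, so unrelated Word files in the repository would not be picked up by this sketch.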

For example, here is the round trip for the .md, .yml, and .bib documents from this repository:

  • Artifact MD-YAML-and-BIB-to-DOCX.zip contains the .docx documents exported from the .md, .yml, and .bib documents from the directory example-document/ in this repository. We would pass these .docx documents to product translators.
  • Artifact DOCX-to-MD-YAML-and-BIB.zip contains the .md, .yml, and .bib documents imported from the .docx documents. In this example, the .docx documents originate from the artifact MD-YAML-and-BIB-to-DOCX.zip. In reality, we would receive these .docx documents from product translators.

@Witiko Witiko requested a review from danopolan March 26, 2024 14:45
@Witiko Witiko self-assigned this Mar 26, 2024

Witiko commented Mar 26, 2024

@danopolan, I am happy to quibble about the artifact names:

[image]

Perhaps the following would be easier to read and more consistent?

  • MD-YAML-and-BIB-to-DOCX.zip → DOCX (export).zip
  • DOCX-to-MD-YAML-and-BIB.zip → DOCX (import).zip
  • EPUB.zip → EPUB (export).zip
  • HTML.zip → HTML (export).zip
  • PDF.zip → PDF (export).zip


Witiko commented Apr 10, 2024

@danopolan: Will we want this merged?

@danopolan danopolan left a comment

I am sorry for the late review. I am OK with the changes, but I am afraid they are going to interfere with #49


Witiko commented Jul 11, 2024

> I am sorry for the late review. I am OK with the changes, but I am afraid they are going to interfere with #49

@danopolan I was wrong to assume that this PR was related to #80 in our earlier chat. Instead, this PR makes it possible to convert DOCX back to Markdown. This conversion is not perfect, since not all structure of Markdown is preserved in DOCX, so manual review of the resulting Markdown files would be required regardless. Furthermore, as you said, this PR will conflict with the PR for ticket #49.

Unless there is current demand for converting Markdown files to DOCX, translating them, and converting them back to Markdown, let's postpone this PR and revisit it later?

@danopolan

This feature is not needed now, but it could be beneficial in the future, so I would like to have it merged. Of course, there is no pressure to deliver this, so we can come back to this later.


Witiko commented Jul 12, 2024

> This feature is not needed now, but it could be beneficial in the future, so I would like to have it merged. Of course, there is no pressure to deliver this, so we can come back to this later.

The implementation in this PR is provisional and conflicts with #49 and #80. Therefore, it seems best to come back and cherry-pick the implementation after #49 and #80 have been closed and when we have concrete acceptance criteria. The existence of this PR should make the future implementation easier.
