[Bible/Data] Restructure the Input Paths and Formats #193

pishoyg · 2024-08-15T13:25:43Z

In #130 and #131, we declared our intention to modify the Bohairic Bible text.
In #123, we defined our data/ directory conventions.

Regarding directory names:

We should read all data under data/raw/, since it will be displayed as is.
We will probably read only the Bohairic from data/input/, because that is the only language we are interested in editing at the moment.
For languages other than Bohairic, the data under data/raw/ and data/input/ will likely remain identical, so those copies should be deleted from input/ and simply read from raw/. An input/ copy should be created only when we intend to edit the data; otherwise it causes confusion.
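
A minimal sketch of the lookup rule described above: prefer an edited copy under input/ when one exists, otherwise fall back to raw/. The per-language file names (`<language>.tsv`, `<language>.json`) and the `resolve_source` helper are hypothetical, chosen only for illustration, and are not the repository's actual layout.

```python
from pathlib import Path


def resolve_source(data_dir: Path, language: str) -> Path:
    """Return the file to read for `language`.

    Prefer the edited copy under input/ if it exists; otherwise fall
    back to the untouched copy under raw/. File names are assumptions.
    """
    edited = data_dir / "input" / f"{language}.tsv"
    if edited.exists():
        return edited
    return data_dir / "raw" / f"{language}.json"
```

With this rule, deleting a language's copy from input/ is enough to make the pipeline read it from raw/ again.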

Regarding format:

To facilitate this editing, we should probably take the input from a TSV that synchronizes with a Google Sheet. JSON is not particularly user-friendly. Editing the text on a website or in a more user-friendly UI would be even better, but since that's not feasible for the time being, let's just use a TSV and a Google Sheet, as we did with the Crum appendices ([Crum] Non-media Content (including Appendices) should Be Viewable in a Single GSheet #78).
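
As a rough illustration of why TSV keeps the pipeline simple, the input can be read with the standard library alone. The column names in the usage example (`book`, `chapter`, `verse`, `text`) and the `read_verses` helper are assumptions, not the actual sheet schema.

```python
import csv
import io
from typing import Iterator, TextIO


def read_verses(tsv: TextIO) -> Iterator[dict[str, str]]:
    """Yield one dict per data row, keyed by the TSV header row."""
    # QUOTE_NONE treats quotation marks in the text as literal characters.
    reader = csv.DictReader(tsv, delimiter="\t", quoting=csv.QUOTE_NONE)
    yield from reader


# Usage with an in-memory sample; a real run would pass an open file.
sample = io.StringIO("book\tchapter\tverse\ttext\nGenesis\t1\t1\tplaceholder\n")
rows = list(read_verses(sample))
```

Each row comes back as a plain dictionary, so the rest of the pipeline does not need to know whether the data was edited in a sheet or by hand.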


This essentially reverts 72c791f and 5714157. We no longer intend to use JSON as the input (as opposed to raw) format. To make the pipeline simpler, we will use TSV for input. We also don't intend to edit all languages. We will only edit Bohairic.

This will make it possible to keep the data private. The current setup, which uses `curl`, forces us to make the data public. Retrieve `JSON_KEYFILE_NAME` from the environment variables.
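
A minimal sketch of the environment lookup mentioned above, assuming `JSON_KEYFILE_NAME` holds the path to a credentials key file; the `keyfile_path` helper is hypothetical and not taken from the repository.

```python
import os


def keyfile_path() -> str:
    """Return the key file path stored in JSON_KEYFILE_NAME."""
    try:
        return os.environ["JSON_KEYFILE_NAME"]
    except KeyError:
        # Fail early with a clear message instead of a bare KeyError.
        raise RuntimeError("JSON_KEYFILE_NAME is not set") from None
```

Reading the path from the environment keeps the credentials out of the repository, which is what allows the sheet itself to stay private.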

pishoyg · 2025-03-02T11:38:37Z

Status:

Abandoned:
everything else

TODO:

Allow reading input data from a Google Docs spreadsheet.

pishoyg added the user Why: User convenience label Aug 15, 2024

pishoyg added this to the Bible Pipeline milestone Aug 15, 2024

pishoyg mentioned this issue Aug 15, 2024

[Bible] Rewrite the Bible with Correct Spacing, and Punctuation #131

Open

pishoyg self-assigned this Aug 16, 2024

pishoyg mentioned this issue Aug 17, 2024

[Bible/HTML] Use JavaScript to turn Dialects On and Off #179

Open

pishoyg modified the milestones: Pipeline: Bible, Data Collection v1.0, Bible v1.0 Aug 26, 2024

pishoyg added the data Why: Data label Aug 31, 2024

pishoyg removed their assignment Sep 2, 2024

pishoyg added this to coptic Sep 11, 2024

pishoyg modified the milestones: Bible v1.0, Pipeline: Bible Sep 22, 2024

pishoyg modified the milestones: Bible: Backlog, Bible: End-to-end Pipeline Mar 2, 2025
