haravijaya proofreading #38
Comments
Namaste! Very happy to know that you are proofreading the text. I've converted the text to markdown and separated chapters into sections for convenience. Please keep sending periodic pull requests. Of course, leaving out variant readings is an option. Otherwise you can do one of the following:
I personally prefer this modification of the convention followed at the sanskritdocuments website:
This renders as
It looks like the text was adapted to TEI and is available at https://github.com/ppasedach/ratnakara-tei.git
No, this is not what happened. I have not yet gotten around to working further on your OCRed text. What you see in the ratnakara-tei repository (or, properly displayed, in Charles Li's Upama engine) is for the most part an old e-text produced by Diwakar and Rabi Acharya. I converted it from Velthuis encoding to IAST and added TEI markup, but it lacks the commentary and a few cantos. Some other cantos have recently been typed in from various manuscript sources, which is an ongoing process. For the cantos missing in the old e-text in particular, I should sometime soon create something similar from your raw file, and I would like to do it in such a way that any corrections made can be reintegrated into your repository. That is one reason I have not done it so far: it is much easier to just perform some conversions and corrections on a piece of text and forget about the original source. If one wants to incorporate the changes into the original, one needs a more thought-out approach.
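For readers unfamiliar with the Velthuis-to-IAST step mentioned above, here is a minimal sketch of what such a conversion can look like. It is not the script actually used for the ratnakara-tei repository. The function name `velthuis_to_iast` is hypothetical, and the table covers only the common Velthuis digraphs. It does not handle hiatus markers such as `a{}i`, accents, or avagraha.

```python
import re

# Common Velthuis sequences and their IAST equivalents.
# Assumption: lowercase input, no hiatus markers or accents.
VELTHUIS_TO_IAST = {
    "aa": "ā", "ii": "ī", "uu": "ū",
    ".rr": "ṝ", ".r": "ṛ", ".l": "ḷ",
    ".m": "ṃ", ".h": "ḥ",
    '"n': "ṅ", "~n": "ñ",
    ".t": "ṭ", ".d": "ḍ", ".n": "ṇ",
    '"s': "ś", ".s": "ṣ",
}

# Longest keys first, so that ".rr" is tried before ".r".
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(VELTHUIS_TO_IAST, key=len, reverse=True))
)


def velthuis_to_iast(text):
    """Replace every known Velthuis sequence with its IAST character."""
    return _PATTERN.sub(lambda m: VELTHUIS_TO_IAST[m.group(0)], text)


if __name__ == "__main__":
    print(velthuis_to_iast("ratnaakara"))
    print(velthuis_to_iast("k.r.s.na"))
```

A one-way conversion like this is exactly the "convert and forget the source" approach described above. Feeding corrections back to the original file would additionally require keeping a stable mapping between lines or verses in both versions.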
Ah, I see - so I presume that you will add the missing cantos to your TEI repo, and we can then use our regular TEI-to-markdown scripts to update our text. Please update this thread to notify me once this can be done. Curious to know your name, BTW.
You can call me Peter: https://www.aai.uni-hamburg.de/indtib/personen/pasedach.html. Yes, that would probably be an easier approach, at least on my end. But my TEI will be encoded in IAST, if that's not a problem for you? In Upama you can switch the display to Devanāgarī, but I'm afraid that does not apply to export. Do you actually train your OCR with corrections?
Pleased to e-meet you!
No problem - my script will transliterate.
No - just whatever I get with Google Vision or Google Drive.
I would like to correct a few cantos of the Haravijaya OCR text, if that is welcome. I would start with canto 49, which is the last one missing in the incomplete e-text produced many years ago by Diwakar and Rabi Acharya. If it goes well, I would in due course correct a few more cantos. I am planning to then convert them to IAST and integrate them into my digital critical edition of the Haravijaya (a work in progress, of course). Anyway, I have started with the first few verses, but am not sure how I should encode the footnote markers, which, I am afraid, usually break in OCR. Should I maybe just leave them out, so that whoever is interested in the variant readings will have to consult the scan of the edition, or later the digital critical edition? Adding the numbers in the running text would make the affected words more or less ungreppable. Or do you have any convention for that?
Here the ru together with footnote marker 3 was OCRed as sai.
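To illustrate the greppability concern, here is a minimal sketch that assumes the markers are kept inline in a regular, machine-recognisable form. The `[^3]` syntax is only a hypothetical choice for the example, not a convention of this repository, and the function names are likewise made up. With a regular form, the markers can be stripped on the fly before searching:

```python
import re

# Assumption: footnote markers are written inline as [^N], e.g. "hara[^3]vijaya".
FOOTNOTE_MARKER = re.compile(r"\[\^\d+\]")


def strip_markers(line):
    """Return the line with all inline footnote markers removed."""
    return FOOTNOTE_MARKER.sub("", line)


def grep_ignoring_markers(lines, word):
    """Return the original lines whose marker-free text contains `word`."""
    return [line for line in lines if word in strip_markers(line)]


if __name__ == "__main__":
    sample = ["hara[^3]vijaya canto", "some other line"]
    print(grep_ignoring_markers(sample, "haravijaya"))
```

This only helps where the marker survived OCR or was restored by hand. A marker that was fused into a wrong character, as in the example above, still has to be corrected manually first.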