Some podcasts are multilingual, where each episode might use a different language, or even where a single episode may switch between multiple languages.
It is already possible to list multiple languages in the channel of the RSS feed (e.g. <language>en,es</language> on the channel), and perhaps there should also be a similar optional tag on each item that defaults to the channel language, because that may be helpful when each episode is in a different language.
But when a single episode contains multiple languages, we also need a way to tag which text belongs to which language within the transcript.
I am not sure if there is an obvious way to do it in every format, but for JSON, we can add an optional language property to each segment which defaults to the item's language in the RSS feed, as follows:
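A minimal sketch of the idea, assuming the JSON transcript is an object with a `segments` array whose entries carry `startTime`, `endTime` and `body`. The `language` key is the property proposed here, and the sample text and the helper name `segment_languages` are invented for illustration:

```python
import json

# Hypothetical transcript; only the optional "language" key is the proposal.
TRANSCRIPT = """
{
  "version": "1.0.0",
  "segments": [
    {"startTime": 0.0, "endTime": 2.5, "body": "Welcome to the show."},
    {"startTime": 2.5, "endTime": 5.0, "body": "Bienvenidos al programa.", "language": "es"}
  ]
}
"""


def segment_languages(transcript_json: str, item_language: str) -> list[tuple[str, str]]:
    """Return (language, body) pairs, falling back to the item's language."""
    data = json.loads(transcript_json)
    return [
        (segment.get("language", item_language), segment["body"])
        for segment in data["segments"]
    ]


if __name__ == "__main__":
    # "en" stands in for the language declared on the item in the RSS feed.
    for language, body in segment_languages(TRANSCRIPT, "en"):
        print(f"[{language}] {body}")
```

Segments without the property inherit the item's language, so existing monolingual transcripts would not need to change.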
For WebVTT, maybe this information could be placed in a comment.
For SRT, maybe this information could be encoded in parentheses or some other type of brackets.
For HTML, maybe this can use the lang attribute.
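To make the three suggestions above concrete, here is a rough sketch that renders the same segments to WebVTT, SRT and HTML. The `NOTE language: xx` comment and the `[xx]` prefix are made-up conventions for illustration only, not part of either format's specification; only the HTML `lang` attribute is an existing standard mechanism.

```python
from html import escape

# Hypothetical segments: (start seconds, end seconds, language, text).
SEGMENTS = [
    (0.0, 2.5, "en", "Welcome to the show."),
    (2.5, 5.0, "es", "Bienvenidos al programa."),
]


def timestamp(seconds: float, sep: str) -> str:
    """Format seconds as hh:mm:ss<sep>mmm ('.' for WebVTT, ',' for SRT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02}{sep}{ms:03}"


def to_webvtt(segments) -> str:
    """Put the language in a NOTE comment block before each cue."""
    blocks = ["WEBVTT"]
    for start, end, lang, text in segments:
        blocks.append(f"NOTE language: {lang}")
        blocks.append(f"{timestamp(start, '.')} --> {timestamp(end, '.')}\n{text}")
    return "\n\n".join(blocks) + "\n"


def to_srt(segments) -> str:
    """Prefix each cue's text with the language in square brackets."""
    blocks = []
    for index, (start, end, lang, text) in enumerate(segments, start=1):
        times = f"{timestamp(start, ',')} --> {timestamp(end, ',')}"
        blocks.append(f"{index}\n{times}\n[{lang}] {text}")
    return "\n\n".join(blocks) + "\n"


def to_html(segments) -> str:
    """Mark each paragraph with the standard lang attribute."""
    return "\n".join(
        f'<p lang="{escape(lang)}">{escape(text)}</p>'
        for _start, _end, lang, text in segments
    )


if __name__ == "__main__":
    print(to_webvtt(SEGMENTS))
    print(to_srt(SEGMENTS))
    print(to_html(SEGMENTS))
```

One trade-off worth noting: the SRT variant changes the visible caption text, whereas the WebVTT comment and the HTML attribute stay invisible to viewers.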
The RSS spec defines a language tag, and the HTML spec defines a lang attribute. I have followed the existing style in each format. The JSON format uses unabbreviated words (otherwise we could certainly shorten most of the other JSON names, but it makes no difference after compression).
perhaps there should also be a similar optional tag on each item that defaults to the channel language, because that may be helpful when each episode is in a different language.
For this part of the problem, we can actually use the existing xml:lang attribute:
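A minimal sketch of how a reader of the feed might resolve this, assuming the attribute is placed directly on the `item` element and that items without it inherit the channel's `language`. The feed content and the helper name `item_languages` are hypothetical:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical feed: one item overrides the channel language via xml:lang.
FEED = """<rss version="2.0">
  <channel>
    <title>Example multilingual show</title>
    <language>en</language>
    <item xml:lang="es"><title>Episodio uno</title></item>
    <item><title>Episode two</title></item>
  </channel>
</rss>"""


def item_languages(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, language) per item, defaulting to the channel language."""
    channel = ET.fromstring(feed_xml).find("channel")
    default = channel.findtext("language", default="")
    return [
        (item.findtext("title"), item.get(XML_LANG, default))
        for item in channel.findall("item")
    ]


if __name__ == "__main__":
    for title, language in item_languages(FEED):
        print(f"{language}: {title}")
```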
Technically, these existing language/lang tags and attributes are specified to hold only one language, although in practice creators of multilingual podcasts do use comma-delimited lists in the language tag to hold multiple languages.
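A consumer that wants to tolerate this practice could split the value on commas. This is only a sketch of one lenient approach, with a hypothetical helper name, and not something the RSS spec prescribes:

```python
def parse_language_list(value: str) -> list[str]:
    """Split a comma-delimited language value, ignoring blanks and stray spaces."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


if __name__ == "__main__":
    print(parse_language_list("en,es"))
```

A single-language value such as `en` passes through unchanged as a one-element list, so the same code path handles both cases.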