Harvesting client fails : Zenodo "Couperin" community / error with "language" field mapping. #7638
Comments
Thanks @tjouneau - we'll take a look and see what we can do.
OK, so the good news is that we don't have to touch any code to address this (as I initially expected we would).
which translates into
So I believe the solution should be to modify the Citation metadata block, and add the ISO 639 codes as valid alternatives, either for all the 186 supported language values, or some sensible subset thereof.
with
etc. @tjouneau: if you are feeling adventurous, you can fix it for this particular set by creating 2 entries in the database:
where K is the id of the |
This makes sense to me as the best option (though I didn't do it for country codes in the metrics PR - that use case isn't directly related to the citation block/metadata fields, but we do list countries in the citation block and I thought we could store the codes there.)
@qqmyers I can see how that use case was a bit different. But if we ever come across a piece of metadata that we need to import, and it fails because they are using a 2-letter code in a field where we expect the name of a country, I would not hesitate to add those to the geospatial block as well.
…titute values in the citation metadata block, where an exact match was available (140 languages total; #7638)
…mote servers offering long lists of OAI sets. (#7638)
Hi
Hello, As for the other formats, I don't really know what "oai_datacite4" is. It is safe to say that of all the (10?) formats they are offering (https://www.zenodo.org/oai2d?verb=ListMetadataFormats) Looking at an example of oai_datacite4 (https://www.zenodo.org/oai2d?verb=GetRecord&identifier=oai:zenodo.org:204063&metadataPrefix=oai_datacite4), it appears to be simple enough. So it should be very doable to add support for it. But yes, that would definitely need to be handled in a separate issue. (I'll be mostly offline until next week; but will be back on Monday)
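For anyone who wants to inspect the formats and records mentioned above, the two request URLs follow the standard OAI-PMH pattern of a `verb` plus its arguments. Here is a minimal Python sketch that only builds such URLs (it makes no network call); the helper name `oai_url` is hypothetical, and the base URL, identifier, and metadata prefix are the ones quoted in this comment.

```python
from urllib.parse import urlencode


def oai_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a verb and its arguments.

    Hypothetical helper for illustration; it does not contact the server.
    """
    query = {"verb": verb, **params}
    # Keep ':' unescaped so identifiers such as oai:zenodo.org:204063 stay readable.
    return f"{base_url}?{urlencode(query, safe=':')}"


BASE = "https://www.zenodo.org/oai2d"  # endpoint quoted in the comment above

list_formats = oai_url(BASE, "ListMetadataFormats")
get_record = oai_url(
    BASE,
    "GetRecord",
    identifier="oai:zenodo.org:204063",
    metadataPrefix="oai_datacite4",
)

print(list_formats)
print(get_record)
```

Opening the printed URLs in a browser (or fetching them with any HTTP client) should return the XML responses discussed here, assuming the endpoint is still available.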
Dear community
I'm taking the liberty of creating an issue based on some previous discussions in the Dataverse-Users group. I'm trying to harvest some Zenodo communities. My client is set up as follows:
Only 5 datasets are harvested out of the 20 present here: https://zenodo.org/communities/couperin/?page=1&size=20
Exception processing getRecord(), oaiUrl=https://zenodo.org/oai2d, identifier=oai:zenodo.org:3773762, edu.harvard.iq.dataverse.api.imports.ImportException, Failed to import harvested dataset: class edu.harvard.iq.dataverse.util.json.ControlledVocabularyException (Value 'eng' does not exist in type 'language')
(also: "Value 'fra' does not exist...")
Harvested for example: https://zenodo.org/record/4266132
Not harvested: https://zenodo.org/record/3948266
An exchange with @qqmyers (thanks!) linked that problem to another one regarding a SWORD Atom file import:
"With a quick look, it appears that that element is getting mapped to the Language field in the citation metadata block, whose values are controlled by the list of languages available. Those entries are all complete words (e.g. “English”) rather than the 2 or 3 letter ISO language codes. I haven’t checked to see if there’s an open issue to add support for language codes – might be something that could be addressed via an external service as being discussed in the CVV MDWG."
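To make the failure mode concrete, here is a minimal Python sketch of an exact-match lookup against a controlled vocabulary, and of how a table of alternate values would let ISO 639 codes resolve to the full language names. This is not Dataverse code: `CONTROLLED_LANGUAGES`, `ISO_ALTERNATES`, and `resolve_language` are hypothetical names, the vocabulary is a two-entry subset, and a plain `ValueError` stands in for the exception seen in the log.

```python
# Hypothetical stand-ins for illustration only; not the Dataverse implementation.
CONTROLLED_LANGUAGES = {"English", "French"}  # tiny subset of the real list

# ISO 639-1 and 639-2 codes mapped to the controlled value they should resolve to.
ISO_ALTERNATES = {
    "en": "English",
    "eng": "English",
    "fr": "French",
    "fra": "French",
    "fre": "French",
}


def resolve_language(value, alternates=None):
    """Return the controlled vocabulary entry for `value`.

    Exact matches win; otherwise the optional alternates table is consulted.
    Raises ValueError when nothing matches, mirroring the message in the log.
    """
    if value in CONTROLLED_LANGUAGES:
        return value
    if alternates and value in alternates:
        return alternates[value]
    raise ValueError(f"Value '{value}' does not exist in type 'language'")


# Without alternates, a harvested 'eng' is rejected.
try:
    resolve_language("eng")
except ValueError as err:
    print(err)

# With alternates, the same code resolves to the full name.
print(resolve_language("eng", ISO_ALTERNATES))
print(resolve_language("fra", ISO_ALTERNATES))
```

The point of the sketch is that the records themselves are fine; only the lookup is too strict, so a data-level mapping is enough and the importer logic does not need to change.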
So it seems the problem is almost trivial, but it's still blocking the harvesting. Until further development solves this problem, is there a way around this that could be tried? Maybe a trick to ignore the language field altogether?
Best
Thomas