
[APIS import] references are sometimes missing or the old versions are used while the text is from the new online version #248

Open
sennierer opened this issue Apr 24, 2024 · 4 comments

Comments

@sennierer
Collaborator

No description provided.

@b1rger
Contributor

b1rger commented Apr 24, 2024

"The text"? Which text?

@sennierer
Collaborator Author

In the APIS version, the main text of the Bio (aka "Haupttext") is sometimes from the online version, while the references are still from the print edition.

@b1rger
Contributor

b1rger commented Apr 24, 2024

> In the APIS version, the main text of the Bio (aka "Haupttext") is sometimes from the online version, while the references are still from the print edition.

Okay, I'll try to parse this:
In the HistoricalPerson that's based on the import from apis.acdh... (the APIS version), the Haupttext is sometimes correctly imported from apis.acdh.... (the online version), but the references are based on the import from the XMLs (the print edition)?

@sennierer
Collaborator Author

I think the parsing of the data from the APIS API is just fine. I suspect that when we imported the data (in 2017, into the vanilla APIS instance), we for some reason got XMLs that contained the recent (online version) biographical text but the old (print edition) references.
I can't think of a way the script that imports from the apis.acdh... endpoint could have mixed up the data. The error is in the apis.acdh... dataset/database itself. We just need a way to correct these errors now, when importing the additional/new data.


2 participants