Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better oai #509

Merged
merged 4 commits into from
Jun 17, 2020
Merged

Better oai #509

merged 4 commits into from
Jun 17, 2020

Conversation

bondjimbond
Copy link
Collaborator

Github issue: (#508 )

What does this Pull Request do?

In OAI-PMH fetcher, replaces : with _ in the $identifier so that, when identifiers have colons in their names, they are converted to underscores when writing the temporary .metadata files instead of encoded nonsense. This makes the .metadata filenames match with the $record_key filenames used later on, and prevent errors (trying to look up my_record_identifier.metadata when the actual file is my%3Arecord%3Aidentifier.metadata).

Interested parties

@mjordan

@mjordan
Copy link
Collaborator

mjordan commented Mar 13, 2020

@bondjimbond this works. Files in my temp directory are no longer encoded: oai_digital.lib.sfu.ca_hiv_28.metadata.

I'm getting an error which I think is unrelated:

Commencing MIK.
PHP Fatal error:  Uncaught mik\exceptions\MikErrorException in /home/mark/Documents/hacking/mik/mik:105
Stack trace:
#0 [internal function]: {closure}(2, 'count(): Parame...', '/home/mark/Docu...', 98, Array)
#1 /home/mark/Documents/hacking/mik/src/fetchers/Oaipmh.php(98): count(NULL)
#2 /home/mark/Documents/hacking/mik/mik(182): mik\fetchers\Oaipmh->getRecords(NULL)
#3 {main}
  thrown in /home/mark/Documents/hacking/mik/mik on line 105

I'm willing to merge this as is.

@mjordan
Copy link
Collaborator

mjordan commented Mar 13, 2020

Found problem I saw when I tested this PR. Do you have any fetcher manipulators in your .ini file? I didn't, so https://github.com/MarcusBarnes/mik/blob/master/src/fetchers/Oaipmh.php#L95 failed because https://github.com/MarcusBarnes/mik/blob/master/src/fetchers/Oaipmh.php#L53 sets the default to null. By setting the default to array() line 95 doesn't fail.

If you didn't have any fetchermanipulators registered in your .ini file and it worked, there's something strange going on.

@bondjimbond
Copy link
Collaborator Author

No manipulators.

; MIK configuration file for an OAI-PMH toolchain.

[CONFIG]
config_id = KORA migration
last_updated_on = "2016-11-23"
last_update_by = "bw"

[SYSTEM]
date_default_timezone = 'America/Vancouver'
verify_ca = 0

[FETCHER]
class = Oaipmh
oai_endpoint = "http://thisvancouver.vpl.ca/oai2"
metadata_prefix = mods
;set_spec = islandora_artray_collection
;set_spec = islandora_capturing_vancouver_collection
;set_spec = islandora_carnegie_stories
;set_spec = islandora_chinatown_stories
;set_spec = islandora_cpr_collection
;set_spec = islandora_expo_retro_collection
;set_spec = islandora_hellenic_stories
;set_spec = islandora_hollywood_north_collection
;set_spec = islandora_leonardfrank_collection
;set_spec = islandora_madelinegunterman_collection
;set_spec = islandora_morph_collection
;set_spec = islandora_philip_timms
;set_spec = islandora_remembering_our_dtes_women_collection
;set_spec = islandora_sp_basic_image_collection
;set_spec = islandora_sp_large_image_collection
;set_spec = islandora_story_city_collection
;set_spec = islandora_vancouver_special_collection
;set_spec = islandora_west_end_stories
;set_spec = islandora_what_changes_you_stories

temp_directory = "/Volumes/Arca/tmp/oaitest_temp"

[METADATA_PARSER]
class = mods\OaiToMods

[FILE_GETTER]
class = OaipmhModsXpath
xpath_expression = "//mods:location/mods:url"
temp_directory = "/Volumes/Arca/tmp/oaitest_temp"

[WRITER]
class = Oaipmh
output_directory = "/Volumes/Arca/vpl/root"


[LOGGING]
path_to_log = "/Volumes/Arca/tmp/oaitest_output/mik.log"
path_to_manipulator_log = "/Volumes/Arca/tmp/oaitest_output/manipulator.log"

@bondjimbond
Copy link
Collaborator Author

Oh, I did comment out the subdirectory lines in src/writers/Oaipmh.php... It was creating empty subdirectories, which is clearly undesirable. (I'm thinking that the subdirectory functionality needs refining or rethinking...) But it wasn't causing errors, just an annoyance.

@bondjimbond
Copy link
Collaborator Author

@mjordan @MarcusBarnes Any further tests of this?

@bondjimbond
Copy link
Collaborator Author

@mjordan @MarcusBarnes Just a reminder about this PR. I have been using it successfully to process records from VPL. Any issues with it?

@mjordan
Copy link
Collaborator

mjordan commented Apr 20, 2020

I haven't looked at it since, but have we figured out what I was experiencing in #509 (comment) ?

@bondjimbond
Copy link
Collaborator Author

@mjordan Sorry, I missed the meat of your comment:

If you didn't have any fetchermanipulators registered in your .ini file and it worked, there's something strange going on.

That's the [MANIPULATORS] section?

If so, I do have them defined; will check to see if changing default to array() still works. I imagine it should.

Will report back in a bit.

@bondjimbond
Copy link
Collaborator Author

Any reason still not to merge this?

@MarcusBarnes
Copy link
Owner

As @bondjimbond addressed the concern in #509 (comment) in the last commit, I'll merge now.

@MarcusBarnes MarcusBarnes merged commit 12cdadd into MarcusBarnes:master Jun 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants