Add backend: ProQuest Federated Search Gateway #3991
base: dev
Other than the TODOs, I think this is far enough along to be worth a review. Also worth noting: there is no authentication config; access is simply IP-based.
I've spent some time reviewing the test branch and discussed some of my findings with Demian. Here's a first round of comments on items that are probably quick fixes. Thanks, @maccabeelevine!
Checklist:
- Change text for results page after open search
- Rename Source facet to Database
- Sort values in Source facet by number of results
- Remove HTML code from some titles in results browse (getTitle vs. getShortTitle)
- Publication information in record view could be fleshed out
- Searches from clicking Publisher links in item records don't work
- Fix Similar Items tab error
- Staff View tab needs formatting
Change text for results page after open search
An open search just says "No results! Your search - - did not match any resources." Could we change that to custom text asking the user to enter a search term? Demian mentioned that the WorldCat2 backend displays a meaningful message in this situation.
This is the search: http://localhost/vufind_test/ProQuestFSG/Results?lookfor=&type=cql.serverChoice&limit=20&sort=date%2Fascending
Improvements needed to the Source facet
First, can we change the name of this facet to Database instead of Source?
Second, the Source facet is sorted in a mysterious order: it is neither alphabetical nor ordered by number of hits per database. Can the Source (or Database) list be sorted by number of results, descending?
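For illustration, here is a minimal sketch of the requested ordering (result count descending, with the label as a tie-breaker). The function name and the `['value' => ..., 'count' => ...]` array shape are assumptions made for the example, not the structure this PR actually uses.

```php
<?php
/**
 * Sort facet values by hit count, highest first.
 * NOTE: the array shape ('value' and 'count' keys) is hypothetical.
 */
function sortFacetsByCount(array $facets): array
{
    usort($facets, function (array $a, array $b): int {
        // Descending by count; fall back to the label so ties are deterministic.
        return ($b['count'] <=> $a['count']) ?: strcmp($a['value'], $b['value']);
    });
    return $facets;
}
```

The explicit tie-breaker keeps the order predictable when two databases return the same number of hits.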
HTML code appearing in titles in results browse
For some records, HTML code is visible in the item title in results browse but not in record view. Demian thinks that this is due to the difference between getTitle and getShortTitle in the record driver. Can you fix it? [Additional info: it seems to me that the problematic articles come from the database/source Publicly Available Content Database.]
Example 1:
Results browse:
See also this record.
Publication information display in item records could be improved
Currently, the "Published" field in item record view displays information from the 260 field only. This is usually the publisher and a brief version of the date.
This might be okay for books, but almost everything in PQ is a newspaper or journal article, so the full citation information including publication title, volume, issue, pages, and date should be displayed.
That information is stored in the 773 field. Can we display the 773 field instead, and maybe only show the 260 if the 773 is not present?
Example from the test branch:
Same item as displayed in EDS:
The "Source" information displayed in the EDS record is accurate and is what the patron will want to see. This information is present in the 773 in the ProQuest record on the test branch:
<datafield tag="773" ind1="0" ind2=" "> <subfield code="t">The Kenyon Review</subfield> <subfield code="g">vol. 14, no. 1 (Winter 1992), p. 26-27</subfield>
This is the 260 in the same record:
<datafield tag="260" ind1=" " ind2=" "> <subfield code="b">Kenyon College</subfield> <subfield code="c">Winter 1992</subfield>
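To make the suggestion concrete, here is a rough sketch of "show the 773, fall back to the 260" using plain SimpleXML. It assumes a MARCXML record without a default namespace, and `getPublicationLine` is a hypothetical helper, not part of the record driver in this PR. (Later in this thread the consensus is to display both fields, so treat this only as an illustration of the fallback idea.)

```php
<?php
/**
 * Build a one-line publication statement from a MARCXML record:
 * prefer 773 $t/$g (host item), fall back to 260 $b/$c.
 * NOTE: hypothetical helper; assumes un-namespaced MARCXML.
 */
function getPublicationLine(string $marcXml): string
{
    $record = simplexml_load_string($marcXml);
    if ($record === false) {
        return '';
    }
    // Collect the requested subfields from every datafield with the given tag.
    $subfields = function (string $tag, array $codes) use ($record): array {
        $values = [];
        foreach ($record->datafield as $field) {
            if ((string)$field['tag'] !== $tag) {
                continue;
            }
            foreach ($field->subfield as $sub) {
                if (in_array((string)$sub['code'], $codes, true)) {
                    $values[] = trim((string)$sub);
                }
            }
        }
        return $values;
    };
    $hostItem = $subfields('773', ['t', 'g']);
    return implode(', ', $hostItem ?: $subfields('260', ['b', 'c']));
}
```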
Searches from clicking Publisher links in item records don't work
When viewing an item record, you can successfully click a hyperlinked author name or subject term to retrieve other records sharing that term.
When you click a Publisher hyperlink, you get "No results found." This is the URL that fails: http://localhost/vufind_test/ProQuestFSG/Results?type=Publisher&lookfor=Rabbinical%20Council%20of%20America
An advanced search for those terms produces results: http://localhost/vufind_test/ProQuestFSG/Results?join=AND&lookfor0%5B%5D=Rabbinical+Council+of+America&type0%5B%5D=cql.serverChoice&bool0%5B%5D=AND
Demian suggests that you need a custom record driver-specific Publisher link template to fix this. He sent this to help: https://github.com/vufind-org/vufind/blob/dev/themes/bootstrap3/templates/RecordDriver/WorldCat2/link-publisher.phtml
Similar Items tab error
The Similar Items tab in item record view doesn't work; it displays a red box error.
Demian says that you need to create a section for the record driver in RecordTabs.ini and disable 'similar items.'
Staff View tab is not formatted
The contents of the Staff View tab are displayed as a giant scary blob. Demian says that to fix it, you just need to change StaffViewArray to StaffViewMARC in RecordTabs.ini.
@@ -0,0 +1,14 @@
<?php foreach ($data as $field): ?>
I've implemented @sturkel89's feedback to add the MARC 773 info:
Publication information display in item records could be improved
I did not yet do what was suggested to display either the 773 info OR the 260 as a backup -- they are both being displayed. The publisher field ("Kenyon College" here) is unique to the 260.
The publication date (Winter 1992 here) does seem to be duplicated between the two, so in theory I could skip that on the Published line. But I'm not sure how opinionated to be about that.
While it's a little ugly to repeat the information, I think the 260 and 773 fields are semantically different and won't necessarily always include redundant information. I'd be inclined to continue displaying both in the interest of completeness.
I think I agree. Could we change the heading "Published" though to "Published by" to distinguish from "Published in"? I know the date part is not technically "by" but I think it could still work.
Not a bad idea, just a question of how to implement it in the most translation-friendly way (i.e. do we revise existing keys or create a new key? Is the existing key used in multiple contexts that might have different semantics?)
I think it would have to be a new key; I see 'Published' used for two different meanings in RecordDataFormatterFactory alone, referencing getPublicationDetails and getDateSpan.
All that said, I think this would be a separate PR as it affects other backends and others might want to weigh in.
I definitely agree, it's out of scope for this PR but worth looking at separately.
This all makes sense and I appreciate the need to consider language strings/translation. I am going to post follow-up comments with other suggestions for tweaking results browse and record view after I do some more analysis of PQ content and fields.
@@ -62,6 +62,7 @@
}

// Comma-separate formatting
.record .format { display: inline-flex; }
This eliminates the whitespace before the comma, which comes from the HTML simply having line breaks between the spans. I didn't see any negative side effects, but I couldn't find another backend that uses it to test against, so I'm obviously missing something.
Generally speaking, we go through contortions to avoid having line breaks before commas in our templates because of this problem. If using inline-flex is a viable solution, then we might be able to make some of our templates less ugly by wrapping lines, etc. :-)
module/VuFind/src/VuFind/View/Helper/Root/RecordDataFormatterFactory.php
themes/bootstrap5/templates/RecordDriver/ProQuestFSG/data-publicationDetails.phtml
The API offers two "expanders". I've decided (for now) not to implement either.
There seems to be no highlighting support in the API, so I'm dropping that TODO.
Decode HTML entities
You made a change above to strip HTML tags from titles in results browse. That eliminated the HTML tags within angle brackets, as in the example I gave above. However, other HTML entities are still visible in results browse and in item records (examples: first record, second record). @demiankatz suggests that wherever you use "strip tags," you should also decode HTML entities. Thanks!
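A minimal sketch of the "strip tags, then decode entities" combination, using only PHP built-ins. The helper name `cleanTitle` is hypothetical and is not the method used in this PR.

```php
<?php
/**
 * Remove HTML markup from a title and decode any remaining entities.
 * NOTE: hypothetical helper for illustration only.
 */
function cleanTitle(string $title): string
{
    // Strip tags first so that an encoded "&lt;i&gt;" is not turned into a
    // real tag and then silently removed along with the genuine markup.
    $stripped = strip_tags($title);
    // Decode named and numeric entities (including quotes) as UTF-8 text.
    return html_entity_decode($stripped, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}
```

The decoded string is plain text, so it should still be escaped as usual when it is written into a template.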
Source lightbox?
I'd like to be able to bring up a lightbox so I could sort and filter the list of database sources, as we can sort and filter some facet groups in "regular" VuFind. This would make it easier to include or exclude the groups of historical newspaper sources, for example, and would be extra-great when we have the ability to apply multi-filter selection and deselection. Is it possible? (NB: This functionality doesn't seem to be available for any facet group in EDS, so maybe it's not possible when there are huge lists of facet values.)
Record display suggestions
- Display Database in record
- Display DOI in record
- Hide date in Published field in record
In every case, the 260 field (usually publisher name and publication date) provides identical or less specific date information than the date information that's present in the 773. (260 populates the Published area, and 773 populates the Published In area.) I suggest we try displaying ONLY 260, subfield b for the "Published" field. We'll still get the completeness from including the publisher name as well as the journal/publication date, and eliminate the redundancy and clutter of repeating the date.
The ProQuest Federated Search Gateway is an SRU API for searching across research databases licensed via ProQuest. It's old (docs last updated 2016) but still works and per ProQuest customer service:
On the plus side, it has a fairly rich CQL syntax for search. On the minus side, it offers no facets other than the constituent databases.
Implementation is patterned after the soon-to-be-deleted WorldCat backend as they both use SRU.
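For readers unfamiliar with SRU, here is a rough sketch of how a searchRetrieve request URL is put together. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) come from the SRU standard; the base URL, the `version` value, and the function name are assumptions for illustration and are not taken from the ProQuest documentation or from this PR's connector.

```php
<?php
/**
 * Build an SRU searchRetrieve URL for a CQL query.
 * NOTE: hypothetical helper; the SRU version used here is an assumption.
 */
function buildSruSearchUrl(
    string $baseUrl,
    string $cql,
    int $start = 1,
    int $limit = 20
): string {
    $params = [
        'operation' => 'searchRetrieve',
        'version' => '1.2',
        'query' => $cql,
        'startRecord' => $start,     // SRU record positions are 1-based
        'maximumRecords' => $limit,
    ];
    // RFC 3986 encoding turns spaces into %20 rather than "+".
    return $baseUrl . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

// Example with a placeholder endpoint:
// buildSruSearchUrl('https://sru.example.edu/sru', 'cql.serverChoice="kenyon review"');
```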
TODO