Creating identifiers #375
Replies: 14 comments 12 replies
-
Some RDA options for manifestation identifiers related to ordering:
OPTION (https://access.rdatoolkit.org/en-US_ala-95f6a60f-3d2b-32d8-9486-cf810708d4ba/div_gq5_w5s_dfb)
OPTION (https://access.rdatoolkit.org/en-US_ala-95f6a60f-3d2b-32d8-9486-cf810708d4ba/div_trv_s1t_dfb)
OPTION (https://access.rdatoolkit.org/en-US_ala-95f6a60f-3d2b-32d8-9486-cf810708d4ba/div_b4f_dxs_dfb)
No similar instructions for work/expression (and other RDA entities) identifiers.
-
At the March 15, 2023 meeting, RDF datatypes were proposed as a way to record the specific identifier system (ISBN, DOI, etc.). Similar approaches were brought up in a Sinopia MAP meeting. We did not head in this direction because Sinopia did not seem to support displaying or choosing datatypes for literals. (This might need further confirmation from @briesenberg07.) An alternative we talked about was to define a new property for each identifier system as a subproperty of the more general RDA identifier property, and to publish these as UW refinements/extensions for RDA. An example would be: Reasons for this refinement approach:
Questions: @GordonDunsire mentioned "ISBN-10"/"ISBN-13". Could these potentially be covered by more granular refinements? The same question applies to the different editions of DDC (082 $2). Would love feedback on this @CECSpecialistI @gerontakos @AdamSchiff @SitaKB @JianPLee @junghaelee @szapoun @lake44me
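To make the two options concrete, here is a minimal Python sketch that writes the same ISBN once as a typed literal and once under a refined property. Only `rdamd:P30004` comes from this thread; the `uwdt:` and `uwp:` prefixes and the names `isbn` and `hasIsbn` are hypothetical placeholders, not published UW terms.

```python
# rdamd:P30004 is the generic "has identifier for manifestation" property
# cited in this thread. uwdt:, uwp: and their local names are hypothetical.
GENERIC_PROPERTY = "rdamd:P30004"


def typed_literal_statement(subject, value, datatype):
    """Option 1: keep the generic RDA property; the datatype names the system."""
    return f'{subject} {GENERIC_PROPERTY} "{value}"^^{datatype} .'


def refined_property_statements(subject, value, refined_property):
    """Option 2: a plain literal under a refined property, plus the
    rdfs:subPropertyOf declaration tying it back to the RDA property."""
    return [
        f"{refined_property} rdfs:subPropertyOf {GENERIC_PROPERTY} .",
        f'{subject} {refined_property} "{value}" .',
    ]


if __name__ == "__main__":
    print(typed_literal_statement("ex:m1", "9783110263794", "uwdt:isbn"))
    for line in refined_property_statements("ex:m1", "9783110263794", "uwp:hasIsbn"):
        print(line)
```

Under option 2, a consumer that only knows RDA can still recover the generic statement by following the subproperty declaration. Under option 1 the generic property is already there and only the datatype is custom.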
-
Thanks @gerontakos and @GordonDunsire for weighing in.
A problem is that there are qualifications you can add to an identifier.
"0123467446"
Surely these cannot be the same datatype. Or can they? And wouldn't it be better to record an identifier and its qualifications separately?
Reification
For identifiers (NOT subjects!), are we overlooking the 'built-in' reification in RDA, i.e. Nomens?
<ex:Man> rdamo:P30004[has identifier for manifestation] <ex:Nomen> .
I am unsure whether these are correct uses of Nomen properties, but to me they look better than having everything clumped together! They definitely cannot be used with subjects, since those are not RDA entities!
UW extensions
I only recently noticed the UW RDA extensions, since I started working on Sinopia MAPs, and I know they are pre-3R, but we already have properties like these:
https://doi.org/10.6069/uwlib.55.d.4#hasLcClassificationPartA
I don't know about the design choices back then, but it seems that we went for properties rather than datatypes. Of course, custom properties are unsustainable and we may have changed our minds since then...
Sinopia
I have not seen an option to list datatypes for a literal as a dropdown/checkbox in Sinopia, only language tags. (Correct me if I'm wrong @briesenberg07.) I agree that we shouldn't create a solution for Source Vocabularies based on the limitations of Sinopia, but still, I would prefer a uniform approach.
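As a rough illustration of "clumped together" versus "recorded separately", the sketch below prints both shapes for one identifier. `rdamd:P30004`, `rdamo:P30004` and `rdand:P80068` (has nomen string) are the properties cited in this thread; `ex:qualifier` is a hypothetical stand-in, since which Nomen property should carry the qualification is exactly what is unsettled here.

```python
def clumped(manifestation, identifier, qualifier):
    """Identifier and qualification packed into a single literal."""
    return [f'{manifestation} rdamd:P30004 "{identifier} ({qualifier})" .']


def reified(manifestation, nomen, identifier, qualifier,
            qualifier_property="ex:qualifier"):
    """Identifier as a Nomen, with the qualification as its own statement.

    qualifier_property defaults to a hypothetical placeholder property.
    """
    return [
        f"{manifestation} rdamo:P30004 {nomen} .",
        f'{nomen} rdand:P80068 "{identifier}" .',
        f'{nomen} {qualifier_property} "{qualifier}" .',
    ]


if __name__ == "__main__":
    for line in clumped("ex:Man", "0123467446", "hardcover"):
        print(line)
    for line in reified("ex:Man", "ex:Nomen", "0123467446", "hardcover"):
        print(line)
```

In the reified shape the nomen string stays a bare identifier, so it can be matched or validated without first stripping the qualification.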
-
Crystal wrote: “I often add such canceled/invalid identifiers when I think they will aid in discovery/selection.”
This isn’t a choice. If an ISBN appears on a resource it is required to record it.
Adam L. Schiff
Principal Cataloger
University of Washington Libraries
Box 352900
Seattle, WA 98195-2900
aschiff @ uw.edu
On Tuesday, April 18, 2023, 8:13 AM, Crystal Yragui (Clements) wrote:
Possibly unpopular opinion: I think we should keep identifiers marked canceled/invalid. They are often qualified by something that makes it clear to human readers why it is included (identifier is for a different manifestation, made clear by $q (print), for instance) and are also often added because they appear on the manifestation as a result of a publisher error, as Laura points out. In my cataloging work, I often add such canceled/invalid identifiers when I think they will aid in discovery/selection. If we can retain these and express clearly that they are canceled/invalid, I vote we keep them.
-
It doesn't appear that recording all ISBNs is required, but it is standard cataloging practice to record all that appear on a manifestation being cataloged, even those that are for a different format (print vs. ebook).
2.15.1.7: If the manifestation has more than one identifier of the same type, record a brief qualification after the identifier, if considered important for identification.
LC policy (note there isn't a PCC policy) says in LC-PCC PS 2.15.1.7:
LC practice: When transcribing multiple ISBNs, transcribe first the number that is applicable to the manifestation being described; transcribe other numbers in the order presented, with appropriate qualification to distinguish.
Record ISBNs in $z (Canceled/invalid) of MARC field 020 <https://desktop.loc.gov/saved/Mabibl_020__z> if they clearly represent a different manifestation from the resource being cataloged and would require a separate record (e.g., an ISBN for the large print version, e-book, or teacher’s manual on the record for a regular trade publication). If separate records would not be made (e.g., most cases where ISBNs are given for both the hardback and paperback simultaneously), or in cases of doubt, record the ISBNs in $a (International Standard Book Number) of MARC field 020 <https://desktop.loc.gov/saved/Maauth_020__a>.
Adam L. Schiff
On Tuesday, April 18, 2023, 2:45 PM, Crystal Yragui (Clements) wrote:
Are ISBNs for other manifestations required as well? I always record them if I know about them, but I always understood that as optional. For example, if a physical book has an incorrect ISBN and the eBook version has an ISBN, I add them both, but I thought the eBook version was optional to add while the one that appears on the resource is not optional.
-
The self-reification of a nomen provides advantages in simplifying data provenance, but must be used with caution.
Suppose we have a manifestation that has two ISBNs and an ISSN printed on the verso of the title page (e.g. the current consolidated ISBD: "ISBN 978-3-11-026379-4 [...]").
Official RDA records this data in two places: in a manifestation statement that reflects how the manifestation describes itself, and as an identifier for the manifestation:
ex:m1 rdamd:P30286 "ISBN 978-3-11-026379-4, e-ISBN 978-3-11-026380-0, ISSN 1868-8438". // This uses the normalized transcription option for adding punctuation (commas) for clarity.
ex:m1 rdamd:P30004 "9783110263794" . // For the printed volume (in a hardback binding)
The e-ISBN is not an identifier for the manifestation of the printed volume; the ISSN is not an identifier for the manifestation of the issue of the series (printed or e-book). The distinction is already apparent in the manifestation statement, so there is no need to add a note, etc.
Note that the identifier is normalized by removing the hyphens; "ISBN" is not part of the identifier. (On the other hand, "ISSN" is part of the identifier when it is recorded.)
An IRI for the ISBN nomen string is the URN, so we can also state:
ex:m1 rdamo:P30004 URN:ISBN:978-3-11-026379-4 .
There is no guarantee that this IRI will de-reference, or what it will de-reference to (e.g. metadata about the manifestation (what we want), a purchase order form, or a list of forthcoming titles, etc.). But that is not strictly necessary; we can add our own statements:
URN:ISBN:978-3-11-026379-4 rdand:P80068 "9783110263794" . // has nomen string
If we try to use the LC MARC 21 approach while conforming to RDA/LRM and treating the hardback and e-book as distinct manifestations, we want to say something like:
URN:ISBN:978-3-11-026380-0 rdand:P80168 "Invalid". // has status of identification
Meanwhile, another agency creates metadata for the e-book, so they say:
URN:ISBN:978-3-11-026379-4 rdand:P80168 "Invalid". // has status of identification
The problem is that the nomen IRIs refer to two distinct manifestations; a distinction that is acknowledged by the publisher, but not by LC. According to LC:
URN:ISBN:978-3-11-026379-4 rdano:P80048 ex:m1 . // hardback
=> URN:ISBN:978-3-11-026379-4 owl:sameAs URN:ISBN:978-3-11-026380-0 .
Further complications ensue depending on how the two URNs de-reference.
To avoid this, do not map subfield z for invalid ISBNs. We cannot distinguish the reason for invalidity:
See also: ISBN as URI |
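The normalization described above (drop the printed "ISBN" label, remove hyphens for the nomen string, keep the hyphenated form in the URN) can be sketched in Python as follows. This is only an illustration under those stated rules; the function names are made up here, and the output uses the same informal triple notation as the comment above, not strict Turtle.

```python
import re

# Matches a leading printed label such as "ISBN " or "e-ISBN ".
_LABEL = re.compile(r"^\s*(?:e-)?ISBN[:\s]*", re.IGNORECASE)


def strip_label(printed):
    """Remove the printed label, which is not part of the identifier."""
    return _LABEL.sub("", printed).strip()


def nomen_string(printed):
    """Identifier as recorded: label dropped, hyphens and spaces removed."""
    return re.sub(r"[\s-]", "", strip_label(printed))


def isbn_urn(printed):
    """URN form, keeping the hyphens as in the example above."""
    return "URN:ISBN:" + strip_label(printed)


def nomen_string_statement(printed):
    """'has nomen string' statement about the URN-identified nomen."""
    return f'{isbn_urn(printed)} rdand:P80068 "{nomen_string(printed)}" .'


if __name__ == "__main__":
    print(nomen_string_statement("ISBN 978-3-11-026379-4"))
    print(nomen_string_statement("e-ISBN 978-3-11-026380-0"))
```

The sketch does not check the ISBN check digit and does not handle ISSNs, where the label is treated as part of the identifier.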
-
Here's what I hope is a readable table of all the "Identifier" tags I could find (there may be more outside of 01X-09X).
-
Thank you for this work, Laura! It looks pretty ready for discussion. Can I put this on the agenda for tomorrow?
On Tuesday, May 2, 2023, 7:29 AM, Laura Akerman wrote:
Here's what I hope is a readable table of all the "Identifier" tags I could find (there may be more outside of 01X-09X).
My conclusion from doing this is that there may be a few tags that could be fully mapped using a custom datatype, maybe more if we ignore $z invalid identifiers, but most have too many data elements and would be better served by minting a Nomen and using nomen properties to describe it. Your opinion may differ.
The question would be: is it better/more consistent to just treat all of them with the same (nomen) structure?
Is there any advantage to using a custom datatype with the nomen structure?
| MARC identifier tag | Single custom datatype could capture all data? | Reified identifier nomen could capture all data? |
| --- | --- | --- |
| 010 LCCN | Yes; values of subfields as labeled | Yes |
| 013 Patent Control Information | No | Not sure - complex - come back to later |
| 015 National Bibliography Number | No; multiple values to capture | Yes, $q "qualifying information" and $z Canceled/invalid |
| 016 National Bibliographic Agency Control Number | Yes; only $a and $z for invalid + source | Yes |
| 017 Copyright or Legal Deposit Number | No; multiple values to capture | Think so - complex - come back to later |
| 018 Copyright article-fee code | Probably, if we determine how to map | Not sure - complex (indicates aggregation) come back |
| 020 International Standard Book Number | No; multiple values to capture | Think so - qualifying info and price |
| 022 International Standard Serial Number | Probably not; multiple values to capture | Probably - odd indicator values - come back to later |
| 024 Other Standard Identifier | No; multiple values to capture | Yes |
| 025 Overseas Acquisition Number | Yes; datatype would be the tag label | Yes |
| 026 Fingerprint Identifier | No; multiple values to capture | Probably - complex - come back to later |
| 027 Standard Technical Report Number | No; multiple values to capture | Yes - type of number, qualifying info, cancelled/invalid |
| 028 Publisher or Distributor Number | No; multiple values to capture | Yes - type, source, qualifying info |
| 030 CODEN Designation | Maybe; datatype CODEN, $a | Yes - only $a and $z if we map cancelled/invalid |
| 031 Musical Incipits Information | No; multiple values to capture | Not sure - complex - come back later |
| 032 Postal Registration Number | No; multiple values (type and source agency) | Not sure - how is this related to an RDA entity? |
| 035 System Control Number | Probably not; type, plus prefix plus number | Not sure - this is MARC data provenance, come back |
| 036 Original Study Number for Computer Data Files | Probably not; type, plus source agency | Yes |
| 074 GPO Item Number | Maybe; datatype GPO Item Number | Yes - only $a and $z if we map cancelled/invalid |
| 088 Report Number | Maybe; datatype Report Number | Yes - only $a and $z if we map cancelled/invalid |
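The "multiple values to capture" test used in the table can be made a little more concrete with a small Python sketch for field 020. The list-of-pairs input is just an assumed in-memory representation of the subfields (not any particular MARC library's API), and the function names are invented for illustration. The subfield meanings follow the MARC 21 definition of 020.

```python
from collections import defaultdict

# Subfields of MARC 020 that carry data we might want to keep.
FIELD_020_SUBFIELDS = {
    "a": "isbn",
    "c": "terms_of_availability",
    "q": "qualifying_information",
    "z": "canceled_or_invalid_isbn",
}


def describe_020(subfields):
    """Group (code, value) pairs from one 020 field by what they mean."""
    parts = defaultdict(list)
    for code, value in subfields:
        label = FIELD_020_SUBFIELDS.get(code)
        if label is not None:
            parts[label].append(value)
    return dict(parts)


def fits_single_literal(parts):
    """True only when the field carries exactly one value to record."""
    return sum(len(values) for values in parts.values()) == 1


if __name__ == "__main__":
    field = [("a", "9783110263794"), ("q", "hardback"), ("z", "9783110263800")]
    parts = describe_020(field)
    print(parts)
    print(fits_single_literal(parts))
```

A field with only $a passes the single-literal check. As soon as $q, $c or $z is present, there is more than one value, which is the situation where the table leans toward a nomen.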
-
@gerontakos @pan-zhuo Sorry to take so long, I should learn my lesson. Was writing this comment yesterday, power went out, lost it. Always draft somewhere that saves your keystrokes first, like Notepad++. Among the identifier mappings awaiting review - from pan-zhuo - all the mappings would need to be redone because he was using the approach used in note fields to add a prefix indicating the "type" of identifier, which we agreed wasn't appropriate for identifiers. So, whether we mint a nomen URI or not, a custom datatype would be in order to distinguish type of identifier - at least at tag level, or more granular if we can get that info.
I'm skipping 001 and 003 because they work together and this needs to be better accounted for in Sita's mapping. 003 is the source organization for 001. We could decide to make this parenthetical before the identifier string (for the manifestation) in 001, or use a nomen/nomen string and use the "Related entity of nomen" property for the institution code. The 001 field name is Control Number; I'd suggest "MARC record control number" as the datatype. Also skipping Sophia's 024 Other standard identifier; it seems to be missing a $a mapping. Also skipping Sita's 028 Publisher Number or Distributor Number, which has a lot of question marks and includes ind.1 values for type of publisher number, $b for Source (agent), as well as $q and $z ... I haven't done a full review of these mappings, just glanced over them, but it'd be better to wait until we resolve the BIG QUESTIONS about how to map identifiers, right? HTH
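The first 001/003 option mentioned above (source organization in parentheses before the control number) could look something like this in Python. The function name and sample values are made up for illustration; the parenthetical pattern itself is the same one MARC 035 $a uses.

```python
def control_number_string(field_001, field_003=None):
    """Combine 001 with its 003 source code as '(003)001'.

    Falls back to the bare 001 value when no 003 is present.
    """
    number = field_001.strip()
    source = (field_003 or "").strip()
    if source:
        return f"({source}){number}"
    return number


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real record.
    print(control_number_string("12345", "OCoLC"))
    print(control_number_string("12345"))
```

The nomen alternative would keep the two values apart instead, with 001 as the nomen string and 003 attached through a separate property.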
-
The other one that I thought of that isn't in this list is 758 https://www.loc.gov/marc/bibliographic/bd758.html
Adam
-
A very small test dataset implementing different options for ISBNs. Simple literal without qualifiers
-
Our decision on $6 when minting nomens is to use [Nomen1] isEquivalentTo ["literal value of 880"].
-
I'm not aware of 0XX using $6, but $6 is defined for most of those fields. I cannot recall ever seeing a paired 0XX field, though there must be some out there.
Adam
On Tuesday, June 11, 2024, 9:37 AM, Cypress wrote:
Our decision on $6 when minting nomens is to use [Nomen1] isEquivalentTo ["literal value of 880"].
Is this something that should be coded for identifiers? Will there be linked 880s for the 0XX fields?
-
All identifiers in RDA are nomens with the category 'identifier'. The category is hard-wired into the appellation elements with a top-level property 'has identifier for {entity}'. Properties for specific kinds of identifier are subtypes of the top-level property and are based on legacy elements. The preferred RDA approach is to use nomen data provenance elements such as 'has scheme of nomen' (rdan:P80069).
The default entity for an identifier in a MARC 21 record is Manifestation. Some identifier fields may indicate that the identifier is assigned to a different entity, but in the absence of such indications it can be assumed that the identifier is for the instance of Manifestation that is being described. It can be assumed that an identifier that is assigned from a 'bibliography' scheme is associated with a manifestation (the output of publication processes).
The same identifier may be assigned to more than one instance of an entity. For example, the hardback and paperback ISBNs are often treated as two identifiers for the same manifestation. It is not a problem if a manifestation description set includes both ISBNs, or just the 'correct' ISBN for its binding. The result is false drops in information retrieval, with a paperback manifestation being included in the hits when the hardback is specified in the search, but this is no worse than the current level of retrievability in MARC 21 systems.
One instance of an entity may be assigned more than one identifier. This may occur within a single scheme (as noted above for the ISBN scheme) or between multiple schemes.
If the identifier is not associated with the manifestation that is described in a MARC 21 record, the usual problem occurs of determining which of the manifestation's related works or expressions to associate with the identifier.
In the past year, we have decided on the use of datatypes, minted nomens, nomen properties for data provenance, IRIs for schemes, etc. for 6XX and other fields. This supersedes the discussion above, through October 2023. Note that invalid identifiers can be transformed as nomens with a category or status of 'invalid' or similar. This is noted in the decisions index.
I think the remaining work to be done is to decide which kind of entity is associated with the identifiers in specific tags, with the default being Manifestation. Subfields can be mapped to nomen provenance properties, as with 6xx.
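A minimal Python sketch of that pattern for field 020, assuming Manifestation as the default entity: each $a or $z becomes a minted nomen with a nomen string and a scheme, and $z additionally gets a status of 'invalid'. The property names are the ones cited in this thread (rdamo:P30004, rdand:P80068, rdan:P80069, rdand:P80168). The nomen IRIs, the scheme IRI `ex:isbnScheme`, the literal "invalid", and the choice to link $z nomens to the manifestation at all are assumptions made for illustration, not the project's actual transform.

```python
def nomen_statements(manifestation, subfields, scheme="ex:isbnScheme"):
    """Mint one nomen per $a/$z subfield and describe it.

    subfields is a list of (code, value) pairs. Nomen IRIs are minted
    as ex:nomen1, ex:nomen2, ... purely as placeholders.
    """
    statements = []
    counter = 0
    for code, value in subfields:
        if code not in ("a", "z"):
            continue
        counter += 1
        nomen = f"ex:nomen{counter}"
        statements.append(f"{manifestation} rdamo:P30004 {nomen} .")
        statements.append(f'{nomen} rdand:P80068 "{value}" .')   # nomen string
        statements.append(f"{nomen} rdan:P80069 {scheme} .")     # scheme of nomen
        if code == "z":
            # Status of identification for canceled/invalid identifiers.
            statements.append(f'{nomen} rdand:P80168 "invalid" .')
    return statements


if __name__ == "__main__":
    field_020 = [("a", "9783110263794"), ("q", "hardback"), ("z", "9783110263800")]
    for line in nomen_statements("ex:m1", field_020):
        print(line)
```

Other subfields such as $q are skipped here. Following the comment above, they would be mapped to further nomen provenance properties once those are chosen.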
-
When creating values for identifiers, sometimes we add a qualifier to the identifier string. For example:
(CODEN) 123456
0-123-46744-6 (hardcover)
First question:
Should there always be a space between the qualifier and the identifier string?
Second question:
Can we determine the order; should the qualifier always follow the identifier string?
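Until those two questions are settled, a transform could at least read both layouts. The Python sketch below splits a value into identifier and qualifier whether the parenthetical comes first or last, with or without a space. The trailing-qualifier output format in `format_qualified` is just one possible convention picked for the example, not a decision.

```python
import re

# "(QUALIFIER) identifier" and "identifier (QUALIFIER)"; the space is optional.
_LEADING = re.compile(r"^\((?P<qualifier>[^()]+)\)\s*(?P<identifier>\S.*)$")
_TRAILING = re.compile(r"^(?P<identifier>[^()]*[^()\s])\s*\((?P<qualifier>[^()]+)\)$")


def split_qualifier(value):
    """Return (identifier, qualifier); qualifier is None when absent."""
    text = value.strip()
    for pattern in (_LEADING, _TRAILING):
        match = pattern.match(text)
        if match:
            return match.group("identifier").strip(), match.group("qualifier").strip()
    return text, None


def format_qualified(identifier, qualifier=None):
    """One possible output convention: identifier, space, (qualifier)."""
    if qualifier is None:
        return identifier
    return f"{identifier} ({qualifier})"


if __name__ == "__main__":
    for sample in ["(CODEN) 123456", "0-123-46744-6 (hardcover)", "9783110263794"]:
        identifier, qualifier = split_qualifier(sample)
        print(format_qualified(identifier, qualifier))
```

Values with more than one parenthetical are not handled specially and would need their own rule.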