This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Do we need directly-attached stable part identifiers? #3630
Comments
Thanks @Jegelewicz
Yay: every part could have a way of being uniquely identified, I could avoid some semi-expensive joins, having multiple parts in a "base" container wouldn't necessarily mean they can't still be individually identified.
Maybe not-so-yay: Barcodes are used for lots of things in addition to part IDs, this could be confusing when those are separated (or unstable - so useful only at limited scale - if that's somehow synchronized/maintained).
The "workaround" is containers which exist only for the purposes of serving as part identifiers (for which I'd recommend a dedicated container type). That's nice because it fits into all existing workflows and requires no development, but it also requires some setup (getting the parts into the containers). Perhaps some of that could somehow be automated. |
So we could auto assign all parts to a virtual container with an
autogenerated stable identifier?
|
Yes.
No, but maybe we could fake it by semi-automating containering uncontainerized parts in your collection(s) or something.
That's up to you, but nothing in Arctos would change your containers (same as any other containers). |
I think we should lean toward autogeneration and stability. Isn't this a path to the ultimate "material sample" that GGBN seeks? |
That's probably an argument for something more specialized than containers. I think my concerns all center around usability, as above. Nothing good documentation can't bridge.... If we're going there, some sort of resolvable ID - URLs, ARKs, some short Arctos alternate URL that we could buy (and which I'd use for things like JSON), or whatever - would be cool.
The "DWC community" seems to remain at least partially convinced that institution_acronym + collection_cde can do something (it can't) so I'm not really holding my breath, but there is a materialSampleID (https://dwc.tdwg.org/terms/#materialSampleID) with a sane definition in the "core" (extension?? IDK, and IDK how to know!).
GGBN apparently has their own thing (https://terms.tdwg.org/wiki/GGBN_Material_Sample_Vocabulary), it does NOT carry an ID (that I can find).
In either case, I believe there's at least the presumption of dependence - I don't think it could ever be "correct" to show DWC:MaterialSample data without also showing DWC:Occurrence data (but I'm not a DWCologist, maybe I'm not understanding something).
Arctos has no such inherent limitations, and it's common (at least in entomology) to just "cite" whatever's scribbled on the tube/pin/part no matter what else has been specified or agreed upon. This could be an opportunity for us to make "whatever's scribbled on the tube" something that browsers can use to get to the catalog record (or a subset of it).
That comes back to the usability question - are CMs going to be able to use barcodes up to some point and then switch to "part IDs," or can we find a way to sync those so they don't have to (and what does that do for the possibility of buying pre-printed containers if so), or ??? |
We have an incoming collection that wants to assign guids and separate part
identifiers in the field at time of collection. They want to know if they
can use their tissue identifiers for part barcodes. It would be ideal if
we could somehow incorporate this, giving a stable material sample ID at
collection, associated with a GUID/URL organism ID and occurrence ID . . .
Right now the closest thing we have for this is barcodes, and they mostly
work. But they are not / cannot be universally applied due to cost and
resources. If Arctos could provide a list of stable part identifiers that
could be downloaded and made into labels in advance and applied in the
field, and linked to an organism ID, maybe we could bypass NK numbers and
externally supplied barcodes?
|
I haven't seen that work yet, but as long as they can keep their numbers straight it's not a problem for Arctos.
Sure, that's always been possible/recommended, and it (along with good procedures) might be an actual way to "keep their numbers straight."
You've lost me and that seems to conflict with above - explain, please.
I'm still missing a big piece of the puzzle, but ARKs might be an easy way to get those. If you'll settle for a bit less stable, grab a series of barcodes of whatever format you want and do WHATEVER with them - get them printed, print them yourself, attempt to transcribe them, .... |
OK, this is probably crazy-talk because it occurred to me in the middle of the night, but Arctos is cataloging occurrences (identification at a place and time) NOT collections (I have this thing from this occurrence). @dustymc is always hammering home that we are not cataloging the "item of interest" and this is absolutely true.
Almost every prospective institution asks about catalog numbers like 12345.1 so that they can track the various parts associated with some thing (usually a plant or animal, but other stuff too with catalog number 12345) that they are managing. Because we are so focused on the event, the parts are secondary in the system. The problem comes from the fact that for the majority of our collection managers, the parts are really the focus, but we don't number/track them well. We have now created the "part identifier" attribute to get around this, but it only creates more work for collections.
Barcodes are great - but they apply to containers, not parts, and I think we need to keep that distinction. I think we need to look at MaterialSampleID:
Would it be possible to construct such an ID with the object url + Arctos part number? Or should parts be assigned a "GUID" equal to the Catalog record "GUID" + part number? (Perhaps we need both, one for humans, the other for machines) If Arctos could do this for us in a way that makes it easy for us, that would be GREAT. I think the thing we need to figure out is what this "part number" should be. While the part number assigned in the parts code table is nice, it isn't known until the part is entered. How can we best accomplish this? |
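To make the two options in that comment concrete, here is a minimal sketch of building a part identifier from a catalog record GUID plus a part number, once for humans and once for machines. The `BASE_URL` pattern, the dotted `.N` suffix, and both function names are assumptions made for illustration; none of this is existing Arctos behavior.

```python
from urllib.parse import quote

# Assumed base for resolvable record URLs; illustration only.
BASE_URL = "https://arctos.database.museum/guid/"


def human_part_id(catalog_guid: str, part_number: int) -> str:
    """Human-readable form: the catalog GUID plus a dotted part suffix."""
    if part_number < 1:
        raise ValueError("part_number must be a positive integer")
    return f"{catalog_guid}.{part_number}"


def machine_part_id(catalog_guid: str, part_number: int) -> str:
    """Machine-actionable form: a URL built from the human-readable ID."""
    return BASE_URL + quote(human_part_id(catalog_guid, part_number), safe=":")


print(human_part_id("MSB:Mamm:5000", 1))
print(machine_part_id("MSB:Mamm:5000", 1))
```

Either form still depends on a part number existing, so the sketch does not answer the question of what that number should be or when it becomes known.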
That's mostly "just UI" - Arctos is truly normalized, seeing it as a part management system (with catalog records as metadata) is a valid viewpoint. (So is seeing it as an event system, if you want to go there.)
Again, parts are 100% containers. The current level of container that can have an exposed identifier isn't in a 1:1 relationship with parts so I'm not suggesting what we have fully does what we should be doing, but there is always and inevitably a container that is in the correct relationship with parts, and it might serve this purpose (depending on what precisely that turns out to be).
The origins of that are a case study in how to not do science. Strongly suggest just avoiding that situation in exploring how to move forward.
I don't think there's anything exactly wrong with that, but it will inevitably get used in the wrong context so I'd rather avoid it.
We need to figure out what it DOES before we think about what it looks like. Eg deleting catalog records (==destroying GUIDs) is fairly difficult (it would be impossible if I had my way) because those are "citable" - minting them comes with some implicit (it would be explicit in my little fantasy world) promise that they'll be suitable for certain purposes, and that demands certain behavior from the creators. "Minting" UUIDs (or internal keys, etc.) is an act of convenience - once they've served whatever purpose they've been created to serve they can be deleted and nobody cares. I think the first question is, which of those situations is more analogous to what should be done here? If that answer turns out to be what I think, the second question involves our ability to live without (or with limited access to) 'delete part' buttons. |
@Jegelewicz Not crazy talk!
YES!!!!!
It helps keep track of how parts are used! I want to see parts tied to the outside identifiers. Liver part --> Loan--> Project --> Publication --> GenBank etc. It wasn't the skull or postcranial or the kidney that led to all that extra data about that occurrence. |
I think we are both pretty sure what the answer is and given @campmlc's comment I think she does too. This also ties in with the Mexican Wolf scenarios and having events tied to the parts they came from. In my mind right now the answer is that we are cataloging the wrong way. A basic catalog record only requires an identification and a locality but NO PART. How does that make sense when we are managing PARTS? It should be the other way around - I should be able to catalog a part with absolutely no other information because the most important thing in that moment is that I can find the part and match it up with all of the other information. OK, before anyone jumps on me, I realize that I can put unknown everywhere (even for part name) but it feels wrong. Not saying we can't train people to do it though. Anyway, I think our problems mostly stem from putting too many parts in a single catalog record. If a part is important enough to have an associated GenBank sequence, maybe it needs its own catalog number. Because all the parts from an event can share an event, we should not be afraid to do this. And yes, it will require a new pricing strategy.... As @dustymc says - catalog the item of interest and apparently that is not Andalgalomys pearsoni dorbignyi but one of these - And by the way, which of these ended up as these? In case anyone is interested - this ties in with tdwg/dwc#314 (comment) |
FWIW - our new entity module could help here...all the "organism" type attributes could go there and would not need to be re-created in every catalog record. |
@Jegelewicz Another example! In a paper, the Arctos interns found two UAM no data bison cited. Yeah, they had no data but data has now been generated about those parts. Unfortunately, they were not cataloged and we don't know which is which. The part has the data but also continues to generate MORE data.
Why is this a problem?
That is going to go over like a lead balloon. GGBN has a very similar model where all parts are separated out. |
I think there are two components of that:
That's but one use case. Catalog records are and always have been "whatever someone felt like cataloging." I don't see any realistic possibility of that changing, and I don't see much reason to attempt to change it. There are usability implications to cataloging a bucket of guppies or each of the 47 slices of liver, but sometimes reality (or tradition) ends up in strange places anyway. Mostly I'm just not sure why you'd want to juggle more data than you have to - this just doesn't make any sense to me.
Catalog numbers are special only because Curators have decided to treat them that way. A part identifier (assuming some decent design and curatorial commitment and all that jazz) can do ~everything a catalog number can do (and some other stuff), just issue them and change your loan agreements.
I think that's almost always the biological individual (where that's easy to define, anyway), and I don't think this one's any different - the focus is population-level stuff, the individual is representative (everyone hopes!) of that, the sample is just a way to get at characteristics of the individual. If nothing else, it's a lot easier to see that 27 methods all fail to reject that critter being a member of Andalgalomys pearsoni when those data are attached to a single data object.
....doesn't make any sense as a replacement for catalog records; it just doesn't have the structure to stand like that. It's not too late to ditch the thing and just let some new value in https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcataloged_item_type define entities. (#1966 (comment)) |
You've heard of these guys, right? https://en.wikipedia.org/wiki/Led_Zeppelin |
Well I guess my answer to the question is YES |
Some Plant guy, right? Must be botanists....
I think I want to revise what I said earlier - there are two social issues:
It looks like we can't get enough ARKs after all, so maybe instead of preemptively giving all parts an ID it's some sort of on-demand mechanism, and getting the IDs prevents deletion. That might also set up a path to various kinds of IDs, which could either
|
This is exactly why I said what I said. As long as a bunch of parts are floating around with a common "catalog number" we will never get stuff lined up (part used to create Genbank ID). Maybe nobody cares, but if two researchers borrow parts from the same "catalog number" and get different identifications, I would think it would start to matter. |
I mean, there is always the old standby: MSB:Mamm:5000.1 for the first "request" and so on....but will it get cited correctly? |
There's plenty of that now, it's just tracked internally - given semi-sane loan policies/procedures I don't think it's a huge deal, but there's definitely room to improve.
#1257 - that might be this plus decent procedures.
"MSB:Mamm:5000" is the first problem - if we're starting with nonresolvable identifiers then the rest of this can only go so far; we're mostly not going to change anything of note. If you want unambiguous citable parts today,
and tomorrow we can worry about a more refined approach to (2) (then using them to add parts to loans &etc.). |
And they're useful for some purposes - I probably won't change them without some notice, but someone else with access to your collection might, for example - is that enough to do whatever you want to do? (They might be good enough to bulkload loan items, just don't expect them to be there in a week.)
That's up to you - there are plenty of values of barcodes which can do all kinds of things. They're just identifiers.
That's not how place names work - there's a user-supplied (with the option to autogenerate) value which permanentify-s. ANYWAY - an on-demand dedicated ID is about where I ended up as well (#3630 (comment)), but I can't see any reason not to add the extra (small, I think) bit of effort to make it citable if there's an actual use case. The use case part of this would make a useful AWG topic, lest we build something that nobody will use. The "someone gave this an ID so now you can't delete it" part is a necessary AWG topic. |
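The "on-demand ID that then blocks deletion" idea can be sketched in a few lines. Everything here is hypothetical: the `Part` and `PartStore` classes are an in-memory stand-in rather than the Arctos schema, and a random UUID stands in for whatever resolvable identifier (an ARK or similar) would actually be issued.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


class PartHasPersistentId(Exception):
    """Raised on an attempt to delete a part that already has a minted ID."""


@dataclass
class Part:
    name: str
    persistent_id: Optional[str] = None  # stays None until an ID is requested

    def mint_id(self) -> str:
        """Mint an ID on first request; later calls return the same value."""
        if self.persistent_id is None:
            # A random UUID stands in for an ARK or other resolvable ID.
            self.persistent_id = f"urn:uuid:{uuid.uuid4()}"
        return self.persistent_id


class PartStore:
    """Hypothetical in-memory store; not the Arctos schema."""

    def __init__(self) -> None:
        self._parts: Dict[int, Part] = {}
        self._next_key = 1

    def add(self, part: Part) -> int:
        key = self._next_key
        self._parts[key] = part
        self._next_key += 1
        return key

    def get(self, key: int) -> Part:
        return self._parts[key]

    def delete(self, key: int) -> None:
        """Delete a part unless someone has already been given its ID."""
        if self._parts[key].persistent_id is not None:
            raise PartHasPersistentId(
                f"part {key} has a minted ID and cannot be deleted"
            )
        del self._parts[key]


store = PartStore()
liver = store.add(Part("liver"))
skull = store.add(Part("skull"))

store.get(liver).mint_id()  # someone requested a citable ID for the liver
store.delete(skull)         # fine: no ID was ever issued for the skull
try:
    store.delete(liver)
except PartHasPersistentId as err:
    print(err)
```

Parts that never receive an ID stay as disposable as they are today; only the act of handing out an identifier creates the commitment, which is the part that needs curatorial agreement.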
I think this (and similar) deserves some further exploration. This is probably another discussion, but I'm not sure enough of that to move it. Minimally it's relevant to this discussion. Short version: https://handbook.arctosdb.org/documentation/catalog.html#understanding-cataloged-items, you can catalog whatever you want. But... Much of that's reactionary (we have to accommodate whatever's been cataloged for whatever reason), and much of the discussion centers around our (Gordon's, maybe) "catalog the item of interest" mantra which presumes there is a "THE item," or one correct answer. Maybe there's not. Entities ultimately come back to choosing what to catalog, and that is again at least partially based on the idea that cataloging should somehow be limited to one correct THING. Existing (and discussed) Entities are all just things that might get cataloged in other circumstances; we're just making the data less accessible by introducing a new arbitrarily-used data object. Cataloged items can do everything Entities can do, and #3630 (comment) suggests they can't do some stuff that might be desirable. I can't quite wrap my head around how cataloging a part could be Research Grade, but it probably happens and it's ultimately the same situation as Entities, just in a different direction - some arbitrary thing gets a catalog number because of some reason that may or may not make much sense in various contexts. #1966 (comment) (almost certainly including more, or more refined, values in https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcataloged_item_type) unifies all of that, resulting in one kind of data object discoverable in one place through one set of authorities. 
Researchers don't have to guess at what we thought the item of scientific interest might be (or what was traditional in the discipline at the time the item was cataloged, or any other arbitrary thing), and "we" don't have to guess at what those researchers will want in order to choose what to catalog - we can just catalog what we have, or what someone asks for. In the most basic use case, a wolf sampled twice at two different Events will result in two cataloged items. If that wolf is known to be a member of a pack, an additional cataloged item (representing the pack) can be created and linked. (As always, a lack of this would indicate a lack of resources or knowledge, not an assertion that there is no pack/colony/hive/population/"super-individual"). If there's a reason to do so (and the resources to act are available), another cataloged item representing the wolf as a DWC:Organism could be created and linked. (This is basically the core of what we've done with Entities.) If there's some reason to catalog a sample of one of the original items, it's just more cataloged items. Etc. It's likely that something like all of those situations exists, so I'm simply suggesting we build on that rather than adding another way of doing the same thing. Doing more (UI styling, perhaps more search options, DWC mapping) with cataloged_item_type is probably necessary, but we should probably be doing that now - I think that's more "improvement" than "change." Adding more metadata to Other IDs (which are also relationships) would be necessary for some situations - eg an individual wolf might be a member of multiple packs at various times - but that's also an improvement that's come up a few times. Any tools (eg to create an Organism from multiple Occurrences) would be broadly applicable rather than limited to one way of doing things; I think this unified approach would ultimately result in a more usable system. 
I don't think any of that provides compelling reason not to add "citable" part identifiers, but perhaps it provides a citable alternative (eg just catalog the part) that allows this Issue to focus on more practical shorter-term usage (such as adding items to loans). |
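The wolf example above can be pictured as one kind of object with a type and a list of links. This is only a sketch of that idea: the `CatalogedItem` class, the GUIDs, the type values, and the relationship terms are all made up for illustration and are not actual Arctos code table values.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CatalogedItem:
    guid: str
    item_type: str  # stands in for a ctcataloged_item_type value
    # (relationship term, guid of the related item); terms here are made up
    related: List[Tuple[str, str]] = field(default_factory=list)

    def relate(self, term: str, other: "CatalogedItem") -> None:
        self.related.append((term, other.guid))


# One wolf sampled at two Events: two cataloged items.
sample_a = CatalogedItem("EX:Mamm:100", "specimen")
sample_b = CatalogedItem("EX:Mamm:101", "specimen")

# Optional extra items, created only when there is a reason and the resources.
wolf = CatalogedItem("EX:Mamm:102", "organism")
pack = CatalogedItem("EX:Mamm:103", "pack")

for sample in (sample_a, sample_b):
    sample.relate("sample of", wolf)
wolf.relate("member of", pack)

print(sample_a.related)
print(wolf.related)
```

Because `related` is a list, the same wolf could be linked to more than one pack, although recording when each membership applied would need the extra relationship metadata mentioned above.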
We just discussed this today in planning the parasite webinar. There is a part in a mammal record = ectoparasite which is a vial of mixed ectos. These get split into a bunch of actual parasite records (fleas, ticks, etc). One way to relate the individual catalog records to the original part as well as to each other is to use a stable part ID for the part in the mammal record as the lot number for all of the parasite records. |
I could also see using this in the meteorite collection (see #4638): I have a catalog record with a meteorite part. Somebody prepares a thin section from it.
|
@DerekSikes has a defensible approach for this: Send out the vial of random junk along with clear instructions, catalog whatever gets sorted out, and cite that. That completely avoids needing to care what might have been in the jar of bug-like bits; "something from some jar of random junk" never gets published, the fact that it exists (or used to) is entirely an internal issue. If you're cataloging parts then you have no need for this - it's just redundant with the record's GUID. (That's its own flavor of mess, but not relevant to this.) |
We need a way to keep track of the parent-child relationships of containers
prior to cataloging - a lot ID, ideally part ID url, that will link all
derivative parts/containers and allow tracking and discovery for the
cataloging process. When a lot of ectoparasites get split into ticks,
mites, lice, and fleas and sent on multiple loans to different researchers
who return items over many years, we need something to link all these
derivatives back to the original vial and host that is not subject to the
transcription error that occurs with something like an NK. A part ID url
would do that.
|
Or a collection event nickname?
--
+++++++++++++++++++++++++++++++++++
*Derek S. Sikes*, Curator of Insects, Professor of Entomology
University of Alaska Museum (UAM), University of Alaska Fairbanks
1962 Yukon Drive, Fairbanks, AK 99775-6960
***@***.*** phone: 907-474-6278 he/him/his
University of Alaska Museum <https://www.uaf.edu/museum/collections/ento/>
- search 357,704 digitized arthropod records
<http://arctos.database.museum/uam_ento>
+++++++++++++++++++++++++++++++++++
Interested in Alaskan Entomology? Join the Alaska Entomological
Society and / or sign up for the email listserv "Alaska Entomological
Network" at
http://www.akentsoc.org/contact_us
|
Arctos cannot help you there.
You can (and should!) catalog those things before they get cited; "returned" is not related to that in any way I can see. |
This should not be a controversial request. I'm happy to discuss workflows
of what goes on in actual museums, vs in theory. This tool will help solve
multiple real world problems, and it was previously proposed as a
solution to some of them. The reason it makes sense here is because IDs can
be minted as needed going forward, not retroactively applied to legacy
problem records. Yes, quality control checks are necessary prior to
assigning them, and yes, that should be integrated into the workflow. If no
permanent ID url is provided, we'll just have to use barcodes as IDs, which
will not work as well when we have the opportunity to use a URL.
|
In this case, it is the parts that need to be linked via a parent child
relationship across cataloged items. We can use a parent part ID from the
source part or container (part ID for "ectoparasite" vial in mammal record)
as the lot ID for all individual ectos taken out of that vial and placed in
other vials and cataloged as parts in other catalog records.
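The parent-child linkage described here can be sketched as a simple grouping on a lot ID. The `PartRecord` class, the field names, and every identifier below are hypothetical; the point is only that each derived part carries the stable ID of the vial it came from.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PartRecord:
    record_guid: str  # catalog record that holds the part
    part_name: str
    part_id: str      # stable identifier for this part (made-up values below)
    lot_id: str = ""  # part_id of the source part this one was taken from, if any


def derivatives_by_lot(parts: List[PartRecord]) -> Dict[str, List[PartRecord]]:
    """Group derived parts under the part_id of the vial they came from."""
    grouped: Dict[str, List[PartRecord]] = defaultdict(list)
    for part in parts:
        if part.lot_id:
            grouped[part.lot_id].append(part)
    return dict(grouped)


# Made-up identifiers for illustration only.
vial = PartRecord("EX:Mamm:1", "ectoparasite", "part-0001")
parts = [
    vial,
    PartRecord("EX:Para:10", "whole organism (flea)", "part-0002", lot_id=vial.part_id),
    PartRecord("EX:Para:11", "whole organism (tick)", "part-0003", lot_id=vial.part_id),
]

for child in derivatives_by_lot(parts)[vial.part_id]:
    print(child.record_guid, "derived from", vial.record_guid)
```

The lookup works across catalog records because it keys on the part ID alone, so it does not matter which collection or loan a derived vial ends up in.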
|
This does not do that. Anyway, at some level I suppose I don't have to understand how this might get used or even misused, I just need some sort of commitment that it will get used and some indication that whoever uses it understands the inherent implications (laid out above) of a truly persistent identifier at this level. FYI I read #3630 (comment) as a refusal to accept that commitment; please clarify if I'm not interpreting something correctly. (And, also as above, http://arctos-test.tacc.utexas.edu/info/ctDocumentation.cfm?table=ctspecpart_attribute_type#part_identifier provides a mechanism to introduce an identifier without the commitments this Issue would require.) |
On a global scale yes, but on a part-by-part basis most of that is acceptable. BUT - why couldn't you encumber a record with a stable part ID? How would that be different from encumbering any record in Arctos right now? It isn't any different than the resolvable (with a password) identifiers from the Zoo community, the record is there, you just don't have appropriate permissions.... |
Sure, but that is not actionable? I mean I add identifiers to parts a lot, but they don't give me the ability to link that part to anything. |
Also a wise person said this in the comments above:
|
I've never proposed that?!?
I don't think emulating our past mistakes is a great model.
"It might be there but we're not telling you" doesn't seem worthy of investment.
Depends on which identifier you use.
My opinions on that haven't changed! We can do something awesome here, but it will require a curatorial commitment ("pre-commitment" might be a better way to view it?), or we can use existing tools to do less-awesome stuff. (Which could still be pretty awesome, but it's not structurally constrained to awesomeness.)
There's no difference, minus the "structurally constrained" bits. Grab an ARK-or-whatever, stuff it in part attributes, demand your loan recipients use it, be careful not to hide it, and you've done exactly what we're proposing here. This would just make "be careful not to hide it" something you don't need to worry about (and maybe make the "grab..." step a bit easier, but we could do that without this). |
There are legitimate reasons to encumber information that we cannot ignore. |
I've just proposed prohibiting mask record (and I still don't think this is worth doing without that) - other current or future types of encumbrances would not be affected, as long as they leave SOMETHING behind. A "most everything but still there" encumbrance might even help atone for past sins, although that would of course ultimately be up to the collections.
|
Could we encumber identification, higher geog, locality, collector etc -
the whole shebang- but leave the record shell with URL?
On Tue, May 3, 2022, 3:00 PM dustymc wrote:
legitimate reasons to encumber information
I've just proposed prohibiting mask record (and I still don't think this
is worth doing without that) - other current or future types of
encumbrances would not be affected, as long as they leave SOMETHING behind.
A "most everything but still there" encumbrance might even help atone for
past sins, although that would of course ultimately be up to the
collections.
select count(*) from flat
inner join coll_object_encumbrance on flat.collection_object_id=coll_object_encumbrance.collection_object_id
inner join encumbrance on coll_object_encumbrance.encumbrance_id=encumbrance.encumbrance_id and encumbrance_action='mask record'
inner join citation on flat.collection_object_id=citation.collection_object_id;
count
-------
37462
|
Found an example of why we don't want to assign permanent PIDs to parts without validating legacy parts first - see all the false duplicate parts associated with this record (anything without a part location path, which may not be visible to those not logged in): https://arctos.database.museum/guid/MSB:Mamm:83457 |
Quite a few of these are real parts - and once I validate them, I'd like to be able to assign a permanent ID to confirm their validity. I'd rather not be forced to slap an actual barcode onto the vial to do this - that is the point of having the part ID. Possible? |
Technical: #3630 (comment) (very restricted, heavily documented bulkloader) still looks like the only plausible path to implementation; maybe you think I suggested something else?? Social: This is still pointless until someone commits to demanding citations by partID - there are lots of easier paths (for all of us, in all directions) to "confirm validity." |
I guess my question is: is the current part ID stable within Arctos? If it
is, I could envision gradually shifting over to using Arctos assigned PIDs
as barcodes, even minting url-based PIDs. SOMEONE needs to start doing this
before we can start asking users to cite them - we have to be the horse
before the cart.
If it is not, and the PID may randomly change for no reason . . . then that
won't work.
On Fri, Aug 12, 2022 at 5:55 PM dustymc wrote:
Technical: #3630 (comment)
(very restricted, heavily documented bulkloader) still looks like the only
plausible path to implementation; maybe you think I suggested something
else??
Social: This is still pointless until someone commits to demanding
citations by partID - there are lots of easier paths (for all of us, in all
directions) to "confirm validity."
|
@campmlc I believe it is completely up to you. You can MAKE the part ID stable. See #3630 (comment) Until someone experiments and DOES this, we are going to keep having circular conversations. I was going to try this in test, but I get this:
|
The "core" of that is functional in a few places, we've discussed the stability (lack thereof) of it many times.
We're having this conversation because y'all convinced me barcodes are not suitable. (And you're right - they wear out, hold lots of parts, hold nothing, hold things that aren't parts, aren't used at all for political reasons, etc.) (Some sort of resolvable PID would still make fabulous barcodes, would seamlessly deal with the 'someone cited the barcode' scenario, let anyone get to at least where parts used to be, etc., but as they're used now they're not quite interchangeable.)
It's going to be a lot of work - but not much innovation - to make them do what they need to do to be stable, and there's a huge curatorial commitment involved. This is maybe closer to buying a horse and cart today (except I'm going to wave my wand and the horse won't conveniently keel over about the time the kids leave for college, your great grandkids will still need to feed it) - if you're not SURE you're going to use it then it'll just hang around and take space and consume resources and maybe make a huge mess from time to time all without really giving anything back.
It is not. This is discussed above; maybe I could work up a summary or something if that's useful, but I think it would just end up being a bad representation of this whole thread, and this whole thread needs to be read, carefully, before making any decisions.
Yes but no. I'm _probably_ not going to mess with them, but you delete parts (even those that claim to be used) and 'mask record' encumber and such about every day. As is, part IDs are not suitable for citation. They are (usually) suitable for local timely things.
There's not much to experiment with. You say "hey borrower, use this OR ELSE", let me know about that and I make sure the identifier never changes, you go on to make sure you have policies and documentation so you don't toss it out and just not bother deleting it from the DB, or reuse the identifier for something else, etc., and now there is in a very real way a physical item attached to a publication. Then we all swoon because that's sciencey. I don't think there's a less-rigorous yet still defensible approach, and maybe that's simply more than you can commit to, maybe even if the current CM, Curator, and Director think it's a fabulous idea. If that's the case then https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecpart_attribute_type#part_identifier is still available, and I could mint PIDs or get ARKs or something to go there. Those identifiers would be stable (as in, I'm not going to delete them), but there'd be no technology keeping them attached to things that exist and such; you (or your successor) can break a million publication<-->part links with a handwave and I can't stop it. Maybe that's still a decent baby step - if we start seeing those things pop up in GenBank and that changes our view then they could be elevated to some more "forever" structure; if it turns out nobody's going to USE them they can be quietly "sent to the farm" without making the world (rightly) think that Arctos itself is broken. |
Picking a specific part out of the pile is a lot of work. Once that happens, you can
Barcodes, real or otherwise, serve nicely as unique identifiers. In this case, giving everything a barcode would mean you can add the attributes later, gives you a super-easy way to eventually add to the loan, and probably provides a pathway to whatever you mean by "loan subsamples in the same virtual container."
(I've been wondering if we need directly-attached stable part IDs for a while, and maybe we do - new Issue - but they're not available NOW and barcodes are.)
Originally posted by @dustymc in #3627 (comment)