Support recordedByOrcid #89

Closed
timrobertson100 opened this issue Mar 6, 2019 · 21 comments

@timrobertson100
Member

Users of iNaturalist want to use ORCID for recordedBy and a discussion on how to implement this is on this iNaturalist issue.

It is unlikely that DwC will have a new field for this imminently so I proposed an interim solution of a field in a GBIF namespace, similar to how other custom fields have been introduced (e.g. publisher country in eBird).

@MattBlissett
Member

This will need

  • New term decided upon and added to dwc-api
  • Interpretation support
  • Add to SOLR so Peter can search on it
    • Rebuild SOLR 😱
  • Add to full download format? (Do we have everything in that?)

@dhobern

dhobern commented Mar 20, 2019

On our side, a user can use ORCID (or Facebook or Github) to log in. When they are logged in, they only have the opportunity to connect with Facebook (not the others) within their profile. Presumably a next step would be to include a "Connect with ORCID" option to the profile page.

If an ORCID is connected, then the profile page could include a link to a search for Occurrence records with the ORCID in the DwC data.

Subsequent steps could include allowing people registering datasets to include their ORCIDs in the metadata for the dataset and to offer another link to search for Datasets with the ORCID in the metadata.

@timrobertson100
Member Author

timrobertson100 commented Mar 20, 2019

Yes - all of that is anticipated

@dhobern

dhobern commented Mar 20, 2019

Thanks

@MortenHofft
Member

MortenHofft commented Mar 21, 2019

On our side, a user can use ORCID (or Facebook or Github) to log in. When they are logged in, they only have the opportunity to connect with Facebook (not the others) within their profile. Presumably a next step would be to include a "Connect with ORCID" option to the profile page.

The interface should offer to connect with ORCID, Facebook, and GitHub.
If it doesn't show those options, then there is a bug. I can see them, but there might be a bug somewhere.

@dhobern Could the reason you only see Facebook be that you haven't connected with Facebook (but have done so for the others)?

Connecting with datasets and occurrences
I tried to capture that idea in this issue: gbif/portal16#342 (comment)

@dhobern

dhobern commented Mar 21, 2019 via email

@MortenHofft
Member

There is nothing to show whether I have connected them

If you click edit profile you should see the option to disconnect. But clearly the interface isn’t intuitive.

@debpaul

debpaul commented Aug 1, 2019

To @MattBlissett I would add to your list in #89 (comment) above

  • a plan to reach out (perhaps via SPNHC) to collections about storing IDs for AgentActions (we need to find out who can, who can't, and how to move them forward)
  • a plan to reach out to major software providers (for collections) about supporting this, as at least some will not have an elegant (1:many) way to do this. (or even many:many as one Agent may carry out more than one Action)

While this is not necessarily within GBIF's purview, it is part of the larger picture if we want to get the most value for everyone in these worldwide data mobilization efforts.

@kueda

kueda commented Mar 5, 2020

Hey folks, just had a discussion about citation tracking and ORCID linkages here at CAS and it reminded me of this conversation. Any update on whether there is an interim (or final) term we should be using in our DwC-A to specify the ORCID of the observer?

@timrobertson100
Member Author

timrobertson100 commented Mar 6, 2020

Thanks @kueda

I have tried to get identifiedById and recordedById into Darwin Core without success, and more recently made a pragmatic proposal to support the simple terms in a GBIF namespace, which got some pushback. I am concerned that this limits our ability to progress simple things.

Can I please ask:

  1. Do you imagine multiple observers being supported?
  2. Do you imagine also providing ORCID for the people making the identification?

@dshorthouse
Contributor

I would hope that the answer to 2 above is a resounding "yes" (it's a significant part of what makes a record research grade) which means we're in 1:many territory.

@kueda

kueda commented Mar 6, 2020 via email

@timrobertson100
Member Author

Thanks @kueda

I'll provide an answer on Monday, but at this point I anticipate recordedByID in a GBIF namespace, populated with a URI and being an interim (e.g. 1-2 year) term. I'll push to enable GBIF search in the coming ~2 weeks unless something comes up.

@timrobertson100
Member Author

timrobertson100 commented Mar 9, 2020

@kueda

Can you please change your current

<field index="13" term="http://rs.gbif.org/terms/1.0/recordedByOrcid"/>

to

<field index="13" term="http://rs.gbif.org/terms/1.0/recordedByID"/>

populated with full URI, e.g. https://orcid.org/0000-0001-6215-3617 ?

If you plan to offer ORCIDs for those making the identifications, then please use identifiedByID with a | (pipe) delimiter, though I understand the challenges there.
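
A minimal sketch of how an exporter might build these field values, assuming bare ORCID iDs as input. The helper names `orcid_uri` and `join_ids` are hypothetical and not part of any GBIF or iNaturalist code; the check-character routine follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers, and the bare `|` delimiter is an assumption taken from the wording above.

```python
import re
from typing import Iterable

# Four groups of four characters; the last character may be the check character X.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def orcid_uri(bare_id: str) -> str:
    """Return the full https://orcid.org/ URI for a bare ORCID iD (hypothetical helper)."""
    bare_id = bare_id.strip().upper()
    if not ORCID_PATTERN.match(bare_id):
        raise ValueError(f"not an ORCID iD: {bare_id!r}")
    digits = bare_id.replace("-", "")
    if orcid_check_digit(digits[:15]) != digits[15]:
        raise ValueError(f"bad ORCID check character: {bare_id!r}")
    return f"https://orcid.org/{bare_id}"


def join_ids(uris: Iterable[str]) -> str:
    """Join several agent URIs into one field value with a pipe delimiter (assumed)."""
    return "|".join(uris)


# Value for a recordedByID column, using the ORCID quoted in this thread.
print(orcid_uri("0000-0001-6215-3617"))
```

Validating the check character before export is optional, but it catches mistyped identifiers before they reach the archive.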

Thank you

@kueda

kueda commented Mar 9, 2020

Done: inaturalist/inaturalist@d5f5792. That should make it into next week's iNat DwC-A. I decided to include the ORCID of the first identifier to add an "improving" identification that exactly matches the taxon associated with the occurrence. That person seems to deserve credit without much ambiguity. Re: @dshorthouse's comment above, I agree that some people who provided "supporting" identifications deserve credit, but it's not clear to me which ones. We don't always know (or it is not performant to calculate) exactly which identifications were required to make an observation "Research Grade" and which ones shifted the Community Taxon, which seem like good candidates for attribution. For my favorite / least favorite example of why this is not obvious, see https://www.inaturalist.org/observations/5890862
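
The selection rule described above could be sketched roughly as follows. This is not the code from the linked commit: the `Identification` record and its field names are invented for illustration, and only the category labels ("improving", "supporting") come from this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Identification:
    # Hypothetical record layout, not iNaturalist's schema.
    user_orcid: Optional[str]  # full ORCID URI, or None if the user has not linked one
    taxon_id: int
    category: str              # e.g. "improving" or "supporting"
    created_at: str            # ISO 8601 in UTC, so string order is chronological


def credited_orcid(observation_taxon_id: int,
                   identifications: List[Identification]) -> Optional[str]:
    """Return the ORCID of the earliest 'improving' identification that
    exactly matches the observation taxon, or None if there is none."""
    for ident in sorted(identifications, key=lambda i: i.created_at):
        if ident.category == "improving" and ident.taxon_id == observation_taxon_id:
            return ident.user_orcid
    return None
```

Later "supporting" identifications of the same taxon are deliberately ignored here, which is the ambiguity the comment raises.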

@timrobertson100
Member Author

Now live on GBIF as in this example

Thank you @kueda
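
For anyone wanting to query this programmatically, here is a small sketch that only builds a search URL and makes no network call. It assumes the public GBIF occurrence search API accepts `recordedByID` and `identifiedByID` as query parameters; this thread does not confirm those parameter names, and the function name `occurrence_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public GBIF occurrence search API.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def occurrence_search_url(recorded_by_id=None, identified_by_id=None, limit=20):
    """Build an occurrence search URL filtered by agent identifier."""
    params = {}
    if recorded_by_id:
        params["recordedByID"] = recorded_by_id      # assumed parameter name
    if identified_by_id:
        params["identifiedByID"] = identified_by_id  # assumed parameter name
    params["limit"] = limit
    # urlencode percent-encodes the ':' and '/' characters in the ORCID URI.
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


print(occurrence_search_url(recorded_by_id="https://orcid.org/0000-0001-6215-3617"))
```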

@kueda

kueda commented Mar 26, 2020

So cool, and identifiedByID is working too! Thanks!

@kueda

kueda commented Jun 18, 2020

The issue of who to list as an identifier just came up in our Forum. To summarize, someone wants to be able to search GBIF for occurrences that have been identified by a particular person. Currently, iNat is only populating identifiedByID with the person who added the first improving identification that matches the observation taxon, i.e. the first person to add the "right" identification. This comports with what I thought would be the primary use of this field: to demonstrate which records a person has improved in a collection, i.e. a way to show that a person did valuable work.

However, the person in our Forum has a different use in mind. They want to retrieve records that have been identified by a particular person, regardless of whether that person was the 1st or the 20th person to add an identification to a particular observation. That seems reasonable, and we could facilitate this by listing all people who have added identifications to a record. However, this works against the first use because I could claim to have done a lot of identification work on iNat when all I really did was add matching identifications to observations that already had multiple identifications.
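
To make the tension concrete, here is a toy comparison of the two policies. The observations and names are invented, and the first name in each list stands in for the person who added the first improving identification.

```python
from collections import Counter

# Invented data: observation id -> identifiers in chronological order.
observations = {
    "obs1": ["alice", "bob", "carol"],
    "obs2": ["alice", "bob"],
    "obs3": ["bob", "carol"],
}


def credit_first_only(obs):
    """Current behavior: credit only the first identifier on each record."""
    return Counter(ids[0] for ids in obs.values() if ids)


def credit_everyone(obs):
    """Alternative: credit every distinct person who identified a record."""
    return Counter(person for ids in obs.values() for person in set(ids))


print(credit_first_only(observations))
print(credit_everyone(observations))
```

In this toy data, bob is credited once under the first policy and three times under the second, although two of those three identifications only agreed with an earlier one.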

So my question to you all: should one use take priority over the other? Was there a primary use for this field in mind when you introduced it? I'm guessing the answers are "???" and "no" because most sources of such data aren't crowdsourced. It's not like every single person who visits a museum can stick their label on every specimen, but that's how iNat works.

@dhobern

dhobern commented Jun 18, 2020

These days, this is not really my business, but I do think the main use should be to highlight who has provided the intellectual basis for an identification and that the current implementation is therefore correct. However, I can see value in expressing (via yet another property like identifiedByOtherIDs) the extra information. There would be some interesting graphing applications that could be possible using this information. Not sure though that the effort required will be repaid with increased value.

@timrobertson100
Member Author

Thanks @kueda

Was there a primary use for this field in mind when you introduced it?

The intention was to capture those responsible for making the determination, rather than - say - those who indicate they agree/confirm it. While it is an interesting use case, I'd agree with you that the current implementation is correct.

There is work underway by @dshorthouse to create an Agent Actions extension, which would allow modeling multiple people and the various roles they had on a record. Perhaps we could define roles that clearly distinguish those providing the original identification and those agreeing/confirming it? It would mean a more complex export from iNat (another file in the DwC-A as it is a many-to-one) and for GBIF to implement the support for it.

@dshorthouse
Contributor

dshorthouse commented Jun 19, 2020

Confirmations of dets are common on museum specimens, especially in botany, but certainly not at the scale that @kueda points out as possible (and expected!) on iNat. What is ordinarily used in DwC-A land is the Identification History extension to Darwin Core, with perhaps some rather clunky bits of text in identificationRemarks, identificationQualifier, or identificationVerificationStatus to express divergence from or concordance with a previous determination. As we all well know, however, the truthiness of these assertions is subject to the march of taxonomy. So a timestamp is an important bit of information to share, perhaps more so than our own notions of what truthiness means or what credit to ascribe to a "Me too!" statement, however minor or seemingly trivial. How the dets stack up as a temporal sequence allows anyone else to verify who said what and when, and whether any of them are weightier than others.

@timrobertson100 points out this other DwC-A extension under development called Agent Actions. This one differs from the Identification History extension in that it attempts to separate the action from the agent. So, instead of identifiedBy on an observation, we'd explode this out into 1:many for all agents at play plus the actions they took, expressed as verbs tied to URIs in the VIVO ontology – in this case "identified" (and perhaps also something like "verified"). So, @timrobertson100, it's not a role we'd use but these two actions. A role in this extension has no temporal meaning; it's meant to differentiate one agent from another engaged in precisely the same action for the single occurrence in question. For example, "primary collector" vs. "participant". Not sure if this is comparable to what happens in iNat or if data are stored in that way. This would be like "primary observer" vs. "co-observer" for a sighting.
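
As a rough illustration of that 1:many shape, here is a sketch that splits a delimited identifiedByID value into one row per agent. The `AgentAction` column names are placeholders, since the draft extension's actual terms are not given in this thread, and the second ORCID in the usage line is ORCID's documented fictional test record.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AgentAction:
    # Placeholder column names; not the draft extension's real terms.
    occurrence_id: str
    agent_id: str                # e.g. an ORCID URI
    action: str                  # a verb such as "identified" or "verified"
    role: Optional[str] = None   # e.g. "primary collector" vs. "participant"
    acted_at: Optional[str] = None  # ISO 8601 timestamp of the action, if known


def explode(occurrence_id: str, field_value: str, action: str,
            delimiter: str = "|") -> List[AgentAction]:
    """Turn one delimited agent-ID field into one row per agent."""
    return [
        AgentAction(occurrence_id, part.strip(), action)
        for part in field_value.split(delimiter)
        if part.strip()
    ]


rows = explode(
    "occ-1",
    "https://orcid.org/0000-0001-6215-3617 | https://orcid.org/0000-0002-1825-0097",
    "identified",
)
for row in rows:
    print(row)
```

Because each row carries its own action and optional timestamp, a later "verified" row for the same occurrence can sit beside the original "identified" row without overloading a single field.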

The hang-up now is the inability to declare any relationship between items in the Identification History extension and our draft Agent Actions extension. This is true of all extensions in a DwC-A. But we push on.

A small team of us will meet virtually again next week for a second round of ironing out terms for inclusion in the first draft of the actions vocabulary and their definitions. These are meant to cover specimen- and observation-based occurrences, and we of course have iNat in mind. On that note, I'll add verified to the vocabulary; we stuck mostly to the concepts already expressed in DwC that need a rethink from this perspective – identifiedBy, recordedBy, georeferencedBy. There is no comparable concept for verified in DwC, but this is a good one and is already on our radar. There are several dozen more action verbs for specimen data on our back-burner, not immediately compelling until we have use-cases for each.


7 participants