Crowd curation #3390

Open
chaoran-chen opened this issue Dec 6, 2024 · 1 comment
Labels
discussion Open questions

Comments

@chaoran-chen
Member

chaoran-chen commented Dec 6, 2024

We have often talked about crowd curation but haven't discussed in much detail how we envision it. This issue collects ideas for features that we might want. I started this list from our very old requirements collection and extended it further.

As an authenticated user, I want to…

  • propose changes to metadata (e.g., date, location)
  • propose changes to sequence data
  • propose flagging a sequence
  • add a comment about the proposal
  • curate an entry from the website
  • use external software (incl. self-written scripts) for curation (i.e. load sequences into the external software and submit curation results from the software) – in other words, I want to be able to curate through an API
  • see proposals by other users
  • support/up-vote curation changes proposed by others
  • have a web page that displays my curation contribution (e.g., number of curated sequences, identified issues, etc.)
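A minimal sketch of how a curation proposal from the list above could be represented as data. It covers metadata changes, sequence changes, flags, a comment, and up-votes. Every name here (`CurationProposal`, `ProposalKind`, the field names, the example accession) is hypothetical and not an existing Loculus API.

```python
from dataclasses import dataclass, field
from enum import Enum


class ProposalKind(Enum):
    """Hypothetical proposal types, one per wish in the list above."""
    METADATA_CHANGE = "metadata_change"
    SEQUENCE_CHANGE = "sequence_change"
    FLAG = "flag"


@dataclass
class CurationProposal:
    """Illustrative record only; not an existing Loculus data structure."""
    accession: str
    version: int
    kind: ProposalKind
    proposed_by: str
    changes: dict[str, str] = field(default_factory=dict)  # metadata field -> proposed value
    comment: str = ""
    upvoters: set[str] = field(default_factory=set)

    def upvote(self, user: str) -> None:
        # A set keeps each user's support counted at most once.
        self.upvoters.add(user)

    @property
    def upvotes(self) -> int:
        return len(self.upvoters)
```

A record like this could back both the website and an API, since external software would only need to create and submit the same structure.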

As a data submitter, I want to…

  • see proposals for my sequences
  • accept or reject a proposal
  • get an email notification if there are new proposals for my sequences
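The accept/reject step could be sketched as a small state transition. This assumes that only the original submitter may decide and that a decision is final. Neither rule is settled in this discussion, and all names are hypothetical.

```python
from enum import Enum


class ProposalStatus(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


def review_proposal(
    status: ProposalStatus, reviewer: str, submitter: str, accept: bool
) -> ProposalStatus:
    """Return the new status after the submitter's decision.

    Assumption: only the data submitter decides, and only pending
    proposals can be decided.
    """
    if reviewer != submitter:
        raise PermissionError("only the data submitter can decide on a proposal")
    if status is not ProposalStatus.PENDING:
        raise ValueError(f"proposal already decided: {status.value}")
    return ProposalStatus.ACCEPTED if accept else ProposalStatus.REJECTED
```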

As a maintainer, I want to…

  • enable or disable the crowd curation feature for the whole instance or only for some organisms
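One way the maintainer switch could work is an instance-wide default that individual organisms may override. The sketch below assumes a plain dictionary config; the keys `crowdCuration`, `enabled`, and `organisms` are made up for illustration and are not taken from the actual Loculus configuration.

```python
def crowd_curation_enabled(config: dict, organism: str) -> bool:
    """Resolve the feature flag: the organism setting wins, else the instance default."""
    # Hypothetical keys; the feature is off unless explicitly enabled.
    instance_default = config.get("crowdCuration", {}).get("enabled", False)
    organism_config = config.get("organisms", {}).get(organism, {})
    return organism_config.get("crowdCuration", {}).get("enabled", instance_default)
```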
@chaoran-chen chaoran-chen added the discussion Open questions label Dec 6, 2024
@anna-parker
Contributor

anna-parker commented Dec 6, 2024

Thanks for starting this! I wanted to add some more points on the interaction with INSDC.

Curation and the INSDC

As a curator

I want my curations to also appear in INSDC.

As a data owner

I want to see when a curator has proposed changes to my sequence in Pathoplexus, even if I only submitted to INSDC, and to be able to accept or reject these changes.

As INSDC

Within INSDC, only ENA accepts curations that were not made by the sequence owner; these are visible as an overlay.

As a maintainer (ingest)

I want to know which ingested metadata fields have been curated and which have been revised by a sequence owner.

Since ENA solves this issue by keeping curations in an overlay rather than as a separate version, it would make sense to do the same in Loculus.

Proposal

Store curations in a separate DB table as a change log. Each curation is an overlay for a specific version and is carried over to the subsequent versions of a sequence, unless the metadata field changes overlap, in which case manual review is required. In the meantime, we can add a note on the website to inform users of the debate. (Note: in our current state, ingest will be forced to do that and create a changelog for each sequence that has been curated, which is very tedious.)
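A minimal sketch of the overlay idea. It assumes that a curation needs manual review when a later version changed the same metadata field the curation touches, which is one possible reading of the conflict case above. The `Curation` record and `resolve_overlays` function are hypothetical names, and in-memory dictionaries stand in for the proposed DB table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Curation:
    """One change-log row: an overlay on a single field of a specific version."""
    accession: str
    version: int      # the version the curation was written against
    field_name: str
    value: str


def resolve_overlays(
    versions: dict[int, dict[str, str]],
    latest_version: int,
    curations: list[Curation],
) -> tuple[dict[str, str], list[Curation]]:
    """Apply overlays on top of the latest version's metadata.

    A curation is carried over only if the latest version still has the
    value the curation was written against; otherwise it is returned for
    manual review (assumed conflict rule).
    """
    latest = versions[latest_version]
    merged = dict(latest)
    needs_review: list[Curation] = []
    for curation in curations:
        curated_base = versions[curation.version].get(curation.field_name)
        if latest.get(curation.field_name) == curated_base:
            merged[curation.field_name] = curation.value
        else:
            needs_review.append(curation)
    return merged, needs_review
```

Because the overlays never modify the stored versions, the submitter's original data stays intact and a curation can be dropped without creating a new version.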

TO DO
