Identify potential issues with gene references and platforms #1234

Open
arteymix opened this issue Sep 26, 2024 · 1 comment
Labels: single cell (Issues related to single-cell data support)

Comments

@arteymix
Member

arteymix commented Sep 26, 2024

The strategy I've implemented maps single-cell data vectors to elements of a platform by trying several identifier types in a predefined order of precedence:

  • composite sequence name
  • NCBI IDs
  • Ensembl IDs
  • official symbols (likely ambiguous, so only as a last resort)

This is necessary because we do not reprocess the data and thus cannot have a predictable set of design elements.

Some issues remain open, such as how to deal with ambiguous identifiers and with vectors that do not map to any known design element.
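To make the precedence and the two problem cases concrete, here is a rough Python sketch, not the actual implementation. The dictionary keys (`name`, `ncbi_id`, `ensembl_id`, `symbol`) and the function names are hypothetical, and stopping at the first identifier type that yields any hit (even an ambiguous one) is an assumed policy, not a decision made in this issue.

```python
from collections import defaultdict

# Identifier types in decreasing order of precedence (hypothetical key names).
PRECEDENCE = ("name", "ncbi_id", "ensembl_id", "symbol")


def build_indexes(design_elements):
    """Index design elements by each identifier type."""
    indexes = {id_type: defaultdict(list) for id_type in PRECEDENCE}
    for element in design_elements:
        for id_type in PRECEDENCE:
            value = element.get(id_type)
            if value:
                indexes[id_type][value].append(element)
    return indexes


def map_vectors(vector_ids, design_elements):
    """Return (mapped, ambiguous, unmapped) for the given vector identifiers.

    Assumed policy: the first identifier type with at least one hit decides
    the outcome; several hits for that type are reported as ambiguous.
    """
    indexes = build_indexes(design_elements)
    mapped, ambiguous, unmapped = {}, {}, []
    for vid in vector_ids:
        for id_type in PRECEDENCE:
            hits = indexes[id_type].get(vid, [])
            if len(hits) == 1:
                mapped[vid] = (id_type, hits[0]["name"])
                break
            if len(hits) > 1:
                ambiguous[vid] = (id_type, [h["name"] for h in hits])
                break
        else:
            # No identifier type matched this vector.
            unmapped.append(vid)
    return mapped, ambiguous, unmapped
```

Reporting ambiguous and unmapped vectors separately, rather than dropping them silently, would at least make it possible to count how often each case occurs per dataset.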

@arteymix arteymix added the single cell Issues related to single-cell data support label Sep 26, 2024
@arteymix
Member Author

For Ensembl IDs, we will have to trim version numbers. I need to find a test case for that.
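A minimal Python sketch of the trimming, assuming versioned IDs follow the usual `<stable ID>.<version>` form (for example `ENSG00000139618.2`). The function name is hypothetical and the IDs below are illustrative, not a test case taken from real single-cell data.

```python
import re

# Stable Ensembl IDs: "ENS", an optional species/feature prefix, 11 digits,
# optionally followed by ".<version>".
_ENSEMBL_VERSIONED = re.compile(r"^(ENS[A-Z]*\d{11})\.\d+$")


def trim_ensembl_version(identifier):
    """Strip a trailing version suffix from an Ensembl ID.

    Identifiers that do not look like versioned Ensembl IDs are returned
    unchanged, so other ID types containing a dot are left alone.
    """
    match = _ENSEMBL_VERSIONED.match(identifier)
    return match.group(1) if match else identifier
```

Anchoring the pattern on the `ENS` prefix avoids truncating non-Ensembl identifiers that happen to contain a dot.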

@arteymix arteymix self-assigned this Oct 18, 2024