test data upload with different files #38

esmason · 2016-10-24T00:23:06Z

Right now I'm testing only by re-uploading the same files, and it seems to work. We should also test with different data, or at least with truncated .tsv files, to make sure upload really works.
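A minimal sketch of the truncation idea, in case it helps with building test files. It is written in Python purely for illustration and assumes the first line of the .tsv is a header; the name `truncate_tsv` and its parameters are hypothetical, not part of the project.

```python
import io


def truncate_tsv(src, dst, max_rows, max_cols=None):
    """Copy the header line plus the first `max_rows` data rows from src to dst.

    Assumption: line 0 of `src` is a header. If `max_cols` is given, only the
    first `max_cols` tab-separated fields of every line are kept.
    Returns the number of data rows written (the header is not counted).
    """
    written = 0
    for i, line in enumerate(src):
        if i > max_rows:  # line 0 is the header, lines 1..max_rows are data
            break
        fields = line.rstrip("\r\n").split("\t")
        if max_cols is not None:
            fields = fields[:max_cols]
        dst.write("\t".join(fields) + "\n")
        if i > 0:
            written += 1
    return written


if __name__ == "__main__":
    # Tiny in-memory example; real use would pass open file handles instead.
    source = io.StringIO("gene\tcell1\tcell2\nA\t0\t3\nB\t1\t0\nC\t0\t0\n")
    target = io.StringIO()
    truncate_tsv(source, target, max_rows=2)
    print(target.getvalue(), end="")
```

Truncating both rows and columns gives small files that still have the same layout as the originals, which should exercise the upload path without re-using identical data.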

oganm · 2016-10-25T00:01:45Z

We should probably decide what the exact input will be before implementing this. For instance, new users won't have t-SNE results, and many users may not bother with sparse matrices and may have different data structures. So data upload should probably be placed at the very front of the pipeline, once everything else is done. Until then, we can experiment with other datasets.

To start from zero, I have several single-cell datasets I'm working on that could be useful: GSE67835, GSE67835 and GSE71585.

Here's the code to download and process them if you want to play around, though you'll probably want to remove the devtools::use_data lines.
https://gist.github.com/oganm/c9a0fea369d6e6f9eed71371730062ac

This also gives a decent idea of what single-cell data sharing looks like right now. Character-delimited expression matrices are fairly common.

esmason · 2016-10-25T03:30:53Z

Is there a standardized data format (or several) for scRNA-seq yet? We will need to specify at least some formatting requirements for the user.

oganm · 2016-10-25T06:21:24Z

The most standard thing you can find downstream of assembly is an expression matrix, in every shape and form possible. I think it will be enough for us to take in the matrix, sparse or not; if it isn't sparse, we sparsify on intake. Gene name and cell ID (barcode) matrices might be problematic, but in reality we don't need the cell ID matrix: order is a good enough identifier. I believe sparse matrices do not have row or column names, so gene names will have to be supplied in a separate file or embedded somehow (for example, as a first row).
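A rough sketch of what "sparsify on intake" could look like, written in Python with only the standard library for illustration. It assumes a delimited matrix with genes as rows, a header line of cell IDs, and gene names in the first column; that layout, the function name `sparsify`, and the triplet output are all assumptions rather than a decided format.

```python
import csv
import io


def sparsify(handle, delimiter="\t"):
    """Read a dense delimited expression matrix into COO-style triplets.

    Assumed layout (hypothetical): the first line is a header of cell IDs,
    and every following line is a gene name followed by one value per cell.
    Cell IDs are discarded because column order identifies the cells.
    Returns (gene_names, shape, rows, cols, vals), non-zero entries only.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    n_cells = len(header) - 1  # first header field labels the gene column
    gene_names = []
    rows, cols, vals = [], [], []
    for record in reader:
        if not record:  # skip blank lines
            continue
        if len(record) != n_cells + 1:
            raise ValueError(
                "expected %d fields, got %d" % (n_cells + 1, len(record))
            )
        i = len(gene_names)  # row index of the gene being read
        gene_names.append(record[0])
        for j, field in enumerate(record[1:]):
            value = float(field)
            if value != 0:
                rows.append(i)
                cols.append(j)
                vals.append(value)
    return gene_names, (len(gene_names), n_cells), rows, cols, vals


if __name__ == "__main__":
    text = "gene\tc1\tc2\tc3\nA\t0\t3\t0\nB\t1\t0\t0\nC\t0\t0\t0\n"
    genes, shape, rows, cols, vals = sparsify(io.StringIO(text))
    print(genes, shape, list(zip(rows, cols, vals)))
```

Gene names come back as a separate list, so they can be written to their own file next to the sparse matrix, and the cell IDs are dropped because order alone identifies the columns.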
