Additional parameters in some functions - new functions #460
Comments
Thank you for the constructive wish list! All of it sounds doable. Would you feel comfortable with adding this functionality yourself? A pull request would be welcome! |
Thank you for the prompt reply, I am going to open a pull request. Thanks! |
@andrea-tango Because I found there is some difference in markers by scanpy's default(using However, when I tried scanpy's Have you ever come across similar results? |
Interesting, @MichaelPeibo! Would you share these results somewhere publicly? A notebook on GitHub? @andrea-tango. Yes, we should get the functionality of 2 and 3 into |
@MichaelPeibo @falexwolf I started working on points 2 and 3, but it is better if you will work on these points. Please, check the Many thanks. |
@falexwolf |
Great, thank you, @andrea-tango and @Koncopd! @andrea-tango, would you make a PR? We can then look at how you solved this. In principle, I'm very hesitant to add Regarding the discrepancy between We were just talking about @tcallies, any thoughts from your side? |
@falexwolf I agree with you about the
I did it, I pushed the code where I added the parameter
I wrote a function in which you can change the colour of the genes, you can add the names of the genes etc.
I tried them on data coming from the lab in which I am working. |
Oh, thanks! Sorry for the long downtime, the whole family was sick... I'm going through the PR now. The tests question was actually targeted towards @davidsebfischer, but thanks anyways! The comparison question was also targeted to @davidsebfischer, @tcallies. But if you do it, @andrea-tango, awesome! |
Very good suggestions. One small question:
Isn't this |
Hi @falexwolf Sorry for doing it in this way, I am not familiar with how to make a public notebook... and our data is too preliminary to be public. |
Hi all, thanks for the mention @andrea-tango! I have been making multiple changes in diffxpy and batchglm recently. The following refers to the diffxpy dev branch; I haven't merged all of this into master yet, as I am waiting for some last issues to be fixed.
|
Hi @davidsebfischer, I am writing a simple jupyter notebook where I am analysing the 10x_pbmc68k_reduced.h5ad data. I selected only clusters 0 and 1: Running Trying with
For this test, I used the version downloaded with pip. |
Just a note... I also have some code to make a volcano plot in line [111] here. Don't know if this is still needed, but I thought I would see if someone cares. @davidsebfischer do you allow labels in your volcano plot function? And can it take any object, or only some custom |
@andrea-tango @MichaelPeibo To address the filtering of rank_genes_groups (eg. @falexwolf I don't know why |
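does sc.tl.filter_rank_genes_groups filter the sc.tl.rank_genes_groups result? Or does it recompute? The former would not alleviate the multiple testing burden. |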
It does not recompute; it simply saves the filtered data under
adata.uns['rank_genes_groups_filtered']. Thus, different parameters can be
tested quickly. Of course, sc.tl.rank_genes_groups has to be called first.
…On Mon, Mar 11, 2019 at 3:47 PM MalteDLuecken ***@***.***> wrote:
does sc.tl.filter_rank_genes_groups filter the sc.tl.rank_genes_groups
result? Or does it recompute? The former would not alleviate the multiple
testing burden.
|
If p-values are regarded as a valuable output rather than just the ranks, it might be worth recomputing as thresholding would ease the multiple testing burden. I guess that's the idea behind the |
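To make the multiple testing point concrete, here is a minimal sketch in plain Python. It is not scanpy code: the p-values are invented, and the Benjamini-Hochberg helper is a hand-written illustration. It assumes the pre-filter is chosen independently of the test outcome. Filtering after testing leaves the adjusted p-values unchanged, whereas filtering before testing shrinks the number of tests and therefore the correction.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for six genes (made-up numbers).
raw = [0.001, 0.01, 0.03, 0.2, 0.5, 0.9]
# Assumption: only the first three genes would pass an expression pre-filter.
keep = [0, 1, 2]

adj_all = benjamini_hochberg(raw)
# Filtering after testing: the adjusted values are simply subset.
adj_posthoc = [adj_all[i] for i in keep]
# Filtering before testing: fewer tests, hence a milder correction.
adj_prefilter = benjamini_hochberg([raw[i] for i in keep])

for i, post, pre in zip(keep, adj_posthoc, adj_prefilter):
    print(f"gene {i}: post-hoc {post:.3f}, pre-filter {pre:.3f}")
```

In this toy case every retained gene gets a smaller adjusted p-value when the filter is applied before testing, which is the effect a recomputing filter would have and a post-hoc filter would not.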
@LuckyMD your tutorial is very interesting! |
@falexwolf @andrea-tango |
Sorry this is a little off topic, but it's something I've found useful: @LuckyMD and anyone else looking for an interactive volcano plot, I've been using

de_df.hvplot.scatter(
    "logfoldchanges", "pvals_adj",
    flip_yaxis=True, logy=True,
    hover_cols=["names"]
)

Complete example using scanpy

import pandas as pd
import numpy as np
import hvplot.pandas
import scanpy as sc

def rank_genes_groups_df(adata, group, pval_cutoff=None, logfc_cutoff=None):
    d = pd.DataFrame()
    for k in ['scores', 'names', 'logfoldchanges', 'pvals', 'pvals_adj']:
        d[k] = adata.uns["rank_genes_groups"][k][group]
    if pval_cutoff is not None:
        d = d[d["pvals_adj"] < pval_cutoff]
    if logfc_cutoff is not None:
        d = d[d["logfoldchanges"].abs() > logfc_cutoff]
    return d

pbmcs = sc.datasets.pbmc68k_reduced()
sc.tl.rank_genes_groups(pbmcs, "bulk_labels", n_genes=pbmcs.var_names.size)
de_df = rank_genes_groups_df(pbmcs, "CD34+")
de_df.hvplot.scatter(
    "logfoldchanges", "pvals_adj",
    flip_yaxis=True, logy=True,
    hover_cols=["names"]
) |
@ivirshup, this is cool and very useful. Would you make a notebook "Interactive plotting" for http://scanpy-tutorials.readthedocs.io? And we'd link to it from https://scanpy.readthedocs.io/en/latest/tutorials.html as done for the other notebooks. You could also add the example from #510 (comment). As time passes, this could grow. But these two little examples are a very useful start, I think. |
@Koncopd, updating |
Dear all,
I am writing to ask for some additional functionality.
I have just moved from Seurat to Scanpy, and I am finding Scanpy a very nice and well-made Python package.
I wrote a function to show 3D plots of the UMAP, tSNE and PCA spaces.
1. In the scanpy.tl.tsne function it is not possible to change the number of components: it calculates only the first two components, even though the scanpy.pl.tsne function has a parameter component. Could you add a parameter like the n_components of the scanpy.tl.umap function?
2. In the rank_genes_groups function the log2FC values are provided only for ‘t-test’ based methods. Could you return the log2FC values (maybe named log2FC) for all the implemented statistical methods?
3. I think that two parameters should be added to the rank_genes_groups function:
   - min_pCells, to test only the genes that are detected in a minimum fraction of cells of either of the two populations (e.g., cluster 0 vs rest). For instance, min_pCells=0.3 means that at least 30% of the cells must express that gene.
   - positive: if it is True, the function should return only positive marker genes for each population.
4. A function showing volcano plots (based on the log2FC) would help (I can write it if the log2FC values are provided).
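As a rough illustration of what points 2 and 3 ask for, the sketch below computes a log2 fold change and applies the two proposed filters in plain Python. It is not scanpy's implementation: the function names, the dict-of-lists data layout, the pseudocount, and the toy values are all assumptions made for the example. Only the parameter names min_pCells and positive come from the request above, and the expression values are assumed to be untransformed (not log-scaled).

```python
import math


def fraction_expressing(values):
    """Fraction of cells with non-zero expression for one gene."""
    return sum(v > 0 for v in values) / len(values)


def log2_fold_change(group, rest, pseudocount=1e-9):
    """log2 ratio of mean expression in the group versus the rest.

    The pseudocount (an assumed value) avoids division by zero.
    """
    mean_group = sum(group) / len(group)
    mean_rest = sum(rest) / len(rest)
    return math.log2((mean_group + pseudocount) / (mean_rest + pseudocount))


def candidate_markers(expr_group, expr_rest, min_pCells=0.3, positive=True):
    """Return {gene: log2FC} for genes passing both proposed filters.

    expr_group / expr_rest map gene name -> per-cell expression values
    (a hypothetical layout chosen for this example).
    """
    result = {}
    for gene, group_values in expr_group.items():
        rest_values = expr_rest[gene]
        # min_pCells: the gene must be detected in at least this fraction
        # of cells in either of the two populations.
        detected = max(fraction_expressing(group_values),
                       fraction_expressing(rest_values))
        if detected < min_pCells:
            continue
        lfc = log2_fold_change(group_values, rest_values)
        # positive: keep only genes that are higher in the group.
        if positive and lfc <= 0:
            continue
        result[gene] = lfc
    return result


# Toy data: three genes, four cells per population (made-up values).
group = {"A": [4, 4, 0, 4], "B": [0, 0, 0, 1], "C": [1, 0, 1, 0]}
rest = {"A": [1, 1, 1, 1], "B": [0, 0, 0, 0], "C": [2, 2, 2, 2]}

markers = candidate_markers(group, rest)
print(markers)
```

Here gene B is dropped because it is detected in fewer than 30% of cells in both populations, and gene C is dropped only when positive=True because it is lower in the group than in the rest. Applying the min_pCells filter before any statistical test would also reduce the number of tests performed.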
Thank you in advance.
Best,
Andrea