Missing code for RecombinationNodeMrcas figures #120

jeromekelleher · 2023-05-18T11:15:37Z

I can't find the code where the fields (e.g. fwd_bck_parents_max_mut_dist) in the data frame used in the RecombinationNodeMrcas figure are defined, @hyanwong.

Do you think you could pull this out into a notebook that would do the calculations and export to a CSV? We don't really want to be doing computation in the plots file (which shouldn't need to read in the actual trees files at all).

hyanwong · 2023-05-18T11:46:38Z

I started dumping the CSV creation code into make_csv_files.py - probably better than a notebook?

hyanwong · 2023-05-18T11:47:39Z

I think the plots file might need to read in the trees file, if only to be able to plot the cophylogenies, right? So either we have a file which produces a smaller set of tree sequences for plotting the trees (so that the computation of simplifying the trees down is done there), or we do that simplification in the plots file.

The same goes for the subgraphs, which make up most of the code in plots.py, I think. How much of the computation of the subgraph plotting should happen in plots.py? If we do the computation somewhere else, we'd need to work out an export format for the subgraphs, which could be a real pain: not an easy thing to use a CSV file for.

jeromekelleher · 2023-05-18T12:59:28Z

We don't need to read the trees for this particular plot, and there's a lot of analysis in this that I (for one) don't fully follow. A notebook would be helpful.

hyanwong · 2023-05-18T13:01:12Z

Happy to make a notebook for the MRCAs plot(s). In fact, I think I have one anyway, that I used to get it all working. I thought you meant that you didn't want any of the code in plots.py to read the trees, which would be tricky, IMO.

jeromekelleher · 2023-05-18T13:08:25Z

We want a notebook that has analysis to produce (a) the CSV used and (b) all the numbers quoted in the text.

hyanwong · 2023-05-18T13:11:46Z

I'll sort that. FWIW producing the CSV (in make_csv_files.py) is simply:

df = treeinfo.export_recombinant_breakpoints()
df.to_csv(f"data/breakpoints_{prefix}.csv")

and the numbers are output as extra info when running the plot-producing script with -v. It should be easy to add them to the notebook output too (personally I prefer them as text output when creating the plots, as I can easily get lost when looking through a notebook).
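
To make the split discussed above concrete, here is a minimal sketch of an analysis step that writes a CSV and a plotting step that reads only that CSV. It assumes pandas; the toy data frame stands in for the output of treeinfo.export_recombinant_breakpoints(), and the column names other than fwd_bck_parents_max_mut_dist, the values, and the helper names are all hypothetical.

```python
import io

import pandas as pd


def export_breakpoints(df, path_or_buffer):
    """Analysis step: write the precomputed table to CSV."""
    # index=False keeps the pandas row index out of the file
    # (a choice made for this sketch, not taken from the thread).
    df.to_csv(path_or_buffer, index=False)


def load_for_plotting(path_or_buffer):
    """Plotting step: read the CSV only, never the trees file."""
    return pd.read_csv(path_or_buffer)


def summary_numbers(df):
    """Numbers that might be quoted in the text, derived from the CSV alone."""
    return {
        "num_breakpoints": len(df),
        "max_parent_dist": int(df["fwd_bck_parents_max_mut_dist"].max()),
    }


# Made-up illustrative rows, not real results.
toy = pd.DataFrame(
    {
        "node": [10, 11, 12],
        "breakpoint": [1500, 22000, 27000],
        "fwd_bck_parents_max_mut_dist": [3, 7, 2],
    }
)

buffer = io.StringIO()  # stands in for a file such as data/breakpoints_{prefix}.csv
export_breakpoints(toy, buffer)
buffer.seek(0)
plot_df = load_for_plotting(buffer)
numbers = summary_numbers(plot_df)
```

Because summary_numbers works on the CSV contents alone, the same function could be called from a notebook or from the plotting script when run with -v.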

jeromekelleher · 2023-05-18T13:31:34Z

I don't think that produces fwd_bck_parents_max_mut_dist? I couldn't find code for it anyway.

hyanwong · 2023-05-18T13:32:44Z

It was a PR to the sc2ts utils file that was merged a while ago, I think? I'll check.

jeromekelleher · 2023-05-18T13:34:47Z

> and the numbers are output as extra info when running the plot-producing script with -v

Not all of them - there's some more numbers in the text with no source

hyanwong · 2023-05-18T14:12:53Z

> It was a PR to the sc2ts utils file that was merged a while ago, I think? I'll check.

Not merged yet, that's why. Sorry! jeromekelleher/sc2ts#141

> there's some more numbers in the text with no source

Ah, good spotting then. I'll check.

hyanwong · 2023-05-19T15:25:01Z

> We don't need to read the trees for this particular plot

Just looking at this again. I think it's helpful to have the trees in the plotting code, because we need to find the number of descendants of different types (e.g. BA.1) to label the top 4 MRCA nodes. This isn't something you can easily store in the CSV (since you don't want it for all nodes).

But I agree that we should be using a CSV for the point locations etc.

jeromekelleher · 2023-05-25T14:25:07Z

I think this is sorted now, @hyanwong? Can we use data/wide_arg_recombinants.csv and delete the breakpoints... files, since they contain the same data? It looks like the only extra field depends on nodes_time, which we could compute in the plotting code, since that needs to load the trees anyway (see above)?

We may as well delete the make_csvs file then, because all the rest are produced by exporting from notebooks (and I want to spend the time systematising this now).
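
A minimal sketch of deriving a time-based column in the plotting code rather than storing it in the CSV. It assumes pandas and numpy; the nodes_time array here stands in for ts.nodes_time from a loaded tree sequence, and the column names and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Stand-in for ts.nodes_time (one time value per node ID) from the trees file.
nodes_time = np.array([0.0, 0.0, 12.0, 30.0, 45.0])

# Stand-in for pd.read_csv("data/wide_arg_recombinants.csv");
# the "recombinant" and "mrca" column names are assumptions.
df = pd.DataFrame({"recombinant": [2, 3], "mrca": [4, 4]})

# Index the per-node time array with the node IDs held in the CSV columns.
df["recombinant_time"] = nodes_time[df["recombinant"].to_numpy()]
df["mrca_time"] = nodes_time[df["mrca"].to_numpy()]
df["time_to_mrca"] = df["mrca_time"] - df["recombinant_time"]
```

This keeps the CSV limited to what the analysis produces, with anything that is a direct lookup into the trees added at plot time.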

hyanwong · 2023-05-25T15:37:57Z

Yep, fine.

hyanwong mentioned this issue May 19, 2023

Add the max genetic distance between parents at a breakpoint jeromekelleher/sc2ts#163

Open

hyanwong mentioned this issue May 26, 2023

Makefile #9

Closed

jeromekelleher closed this as completed in 7b2285d May 26, 2023
