purpose of the -group and -target flags #4

derekcg · 2023-02-02T20:43:14Z

Hello Dr. Qian Feng,

I'm investigating recombination in a large gene family and I think detREC is a good approach, but I'm still struggling to implement the parallelization pipeline you and Dr. Gerry Tonkin-Hill used in your two papers. Could you tell me how the -group and -target flags work in mosaic? I can't find any documentation on them. I'm trying to understand the "-group 2 db target -target target" part of the mosaic command used both in 1nd_mosaic_est_par.sh and in Gerry's supplemental scripts. When I include this in a mosaic command I just get the error "Could not assign sequence to group". It seems to run without that part, but I don't want to exclude that part of the command without understanding what it's doing.

Also, to replicate Dr. Gerry Tonkin-Hill's pipeline, when estimating the recombination rate parameter I'll ultimately want to perform some 1000-vs-all runs like he did, where one set of sequences (a subset of 1000 sequences) is aligned to another set (all of the sequences). However, I don't know how to get mosaic to use two sets of sequences like that. I suspect it involves the -group and -target flags, but that's just a guess.

I'd appreciate any insight you can offer.

Best,
Derek Conkle-Gutierrez

qianfeng2 · 2024-05-19T10:25:50Z

Hi Derek,

Very sorry for the late reply. I didn't notice this comment until today.

We ran mosaic by first defining target and source sequences. The target sequences are searched against all source sequences to get the mosaic representations, so there are two groups of sequences (target and source). All target sequences need to be labeled with the prefix "target_", and all source sequences with the prefix "db_". See the examples in https://github.com/qianfeng2/Ghana_data_analysis/blob/main/mosaic_processed_data/results_final_alignment/Protein_translateable_pilot_upper_centroids_run100.fasta_align.txt. You have to follow this rule in order to run the mosaic program (it is determined by its source code). The 1000-vs-all runs use the same rule.
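
As a rough illustration of the labeling rule, here is a minimal Python sketch that writes one combined FASTA with the "db_" and "target_" prefixes added to the sequence names. The helper names (`parse_fasta`, `label_for_mosaic`) and the example sequences are hypothetical and not part of mosaic or detREC. The record order in the output (sources first, then targets) is an arbitrary choice here, not something confirmed for mosaic.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def label_for_mosaic(targets, sources):
    """Return FASTA text with 'db_' / 'target_' prefixes on the names.

    Hypothetical helper: it only renames headers so that every sequence
    carries one of the two prefixes described in the reply above.
    """
    lines = []
    for prefix, records in (("db_", sources), ("target_", targets)):
        for name, seq in records:
            lines.append(f">{prefix}{name}")
            lines.append(seq)
    return "\n".join(lines) + "\n"


# Made-up example input: two short protein sequences.
example = """>seqA
MKV
LLT
>seqB
MRT
"""

records = parse_fasta(example)
# Here seqA is treated as the target and seqB as the source.
combined = label_for_mosaic(targets=records[:1], sources=records[1:])
print(combined)
```

For a 1000-vs-all run, the same helper could be given the 1000-sequence subset as `targets` and the full set as `sources`. Whether a target sequence should also be included as a "db_" copy is not stated in this thread, so it is worth checking against the linked example file.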

Please let me know if you run into any problems.

Kind regards,
Qian
