%Chapter 2
\renewcommand{\thechapter}{7}
\chapter{Conclusion}
%The research in this dissertation enables comparison and evaluation of single and metagenomic assemblies.
The genome is the blueprint for building an organism and helps researchers better understand the organism's function and evolution.
Initially published in 2001, the human genome has undergone dozens of revisions over the years~\cite{monya2012}.
Researchers fill in gaps and correct mistakes in the sequence.
It is not easy to determine which parts of the genome are missing, which are mistakes, and which are due to experimental artifacts from the sequencing machine.
This biological problem can be formulated as reconstructing a text (genome) from a collection of randomly sampled word fragments with errors (sequence reads).
The focus of this dissertation was to develop the theory and computational methods to compare and evaluate the reconstructed texts (assemblies).
I have developed computational tools that use characteristics of the sequence data generation process to reproduce evaluations conducted by assembly experts without the use of reference genomes.
% compare and evaluate single genome and metagenomic assemblies.
%Our LAP framework is able to reproduce gold-standard evaluations of assemblies without the use of a reference genome.
I extended our likelihood-based framework and showed that, by taking into account the abundances of assembled sequences, I can accurately compare different metagenomic assemblies.
Lastly, I introduced VALET, the first \emph{de novo} pipeline that flags regions in metagenomic assemblies that are statistically inconsistent with the data generation process.
VALET has detected mis-assemblies in publicly available datasets, highlighting shortcomings in currently available metagenomic assemblers.
%By providing the computational methods for researchers to accurately evaluate their assemblies before they publicly disseminate them, we reduce the chance of other researchers making incorrect biological conclusions and misguided future studies.
By providing the computational tools for researchers to accurately evaluate their assemblies, I decrease the chance of incorrect biological conclusions and misguided future studies.
%I have contributed novel computational methods both to utilize emerging sequencing for genome assembly and expand its efficiency on novel targets. Together, these advances form a comprehensive genome assembly and analysis toolset and enable new avenues of biological discovery.