ABSOLUTE
RunAbsolute is the main function. Its first parameter is the name of the segments file, which is either a data table with a specific format (see below), or a .Rdata file produced by Hapseg. When using copy_num_type="total", you must supply the segments table; when using copy_num_type="allelic", you need the Hapseg .Rdata file.
The segments file used as input to RunAbsolute must be a tab-delimited file with the following columns.
- "Chromosome": the chromosome that the segment is on (eg. 1)
- "Start": start of the segment (eg. 742429)
- "End": end of the segment (eg. 6398273)
- "Num_Probes": ???
- "Segment_Mean": ???
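As a minimal sketch of the format above, the following base R snippet writes a small tab-delimited segments table and checks that a file has the required column names. The helper name check_seg_columns, the temporary file, and all numeric values are made up for illustration; none of them come from ABSOLUTE itself.

```r
# Column names required for a RunAbsolute segments file (from the list above).
required_cols <- c("Chromosome", "Start", "End", "Num_Probes", "Segment_Mean")

# Hypothetical helper: read a tab-delimited file and return the names of any
# required columns that are missing (character(0) means the header is complete).
check_seg_columns <- function(path) {
  seg <- read.delim(path, stringsAsFactors = FALSE)
  setdiff(required_cols, colnames(seg))
}

# Toy table; every value here is an arbitrary placeholder.
# Integer literals (L suffix) keep positions out of scientific notation on write.
seg <- data.frame(
  Chromosome   = c(1L, 1L, 2L),
  Start        = c(742429L, 6398274L, 10523L),
  End          = c(6398273L, 9123456L, 512345L),
  Num_Probes   = c(120L, 45L, 300L),
  Segment_Mean = c(0.05, -0.40, 0.30)
)

# Write the table as a tab-delimited text file with a header row.
seg_path <- tempfile(fileext = ".txt")
write.table(seg, seg_path, sep = "\t", quote = FALSE, row.names = FALSE)

missing_cols <- check_seg_columns(seg_path)
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

Running a check like this before calling RunAbsolute catches header typos early, but it only looks at column names, not at the values.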
Some parameters are required by RunAbsolute. Most of the descriptions and example values are taken from the official documentation.
- min.ploidy: minimum ploidy value to consider (eg. 0.95)
- max.ploidy: maximum ploidy value to consider (eg. 10)
- max.sigma.h: ??? "Maximum value of excess sample level variance (Eq. 6)" (eg. 0.02). Eq. 6 refers to this paper.
- sigma.p: ??? "Provisional value of excess sample level variance used for mode search" (eg. 0)
- platform: one of "SNP_250K_STY", "SNP_6.0", or "Illumina_WES".
- copy_num_type: one of "total" or "allelic" (use "total" if you have a seg file, "allelic" if you have a Hapseg .Rdata file).
- results.dir: where to put results
- primary.disease: a string describing the disease being studied (eg. "cancer"). Seems to do nothing.
- sample.name: name of the sample (eg. "foo").
- max.as.seg.count: maximum number of allelic segments. Samples with a higher segment count will be flagged as 'failed'. (eg. 1500).
- If you are supplying a seg file, you must also set copy_num_type="total". You must also set max.as.seg.count to a sufficiently large value (but it doesn't seem to need to be bigger than the total number of segments in the file).
- Make sure max.as.seg.count is large enough. If RunAbsolute finishes very quickly, load the RData and check seg.dat[["mode.res"]][["mode.flag"]]. If it is "OVERSEG", it means there were more segments than max.as.seg.count, so try increasing that parameter until it runs.
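The OVERSEG check described above could be wrapped in a small helper, sketched here. The function name get_mode_flag is hypothetical, and the saved object is a hand-built stand-in for a real result file so that the snippet runs without ABSOLUTE installed; a real seg.dat contains much more.

```r
# Hypothetical helper: load an ABSOLUTE result file into a private environment
# (so nothing in the workspace is overwritten) and return
# seg.dat[["mode.res"]][["mode.flag"]].
get_mode_flag <- function(rdata_path) {
  env <- new.env()
  load(rdata_path, envir = env)
  env$seg.dat[["mode.res"]][["mode.flag"]]
}

# Stand-in result file for illustration only: a list with just the one field
# that the check reads.
seg.dat <- list(mode.res = list(mode.flag = "OVERSEG"))
fake_path <- tempfile(fileext = ".RData")
save(seg.dat, file = fake_path)

flag <- get_mode_flag(fake_path)
if (identical(flag, "OVERSEG")) {
  message("More segments than max.as.seg.count: increase it and rerun.")
}
```

With a real run, pass the path of the .ABSOLUTE.RData file written to results.dir instead of fake_path.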
library(ABSOLUTE)  # provides RunAbsolute

RunAbsolute("mix250K_seg_out.txt",
            min.ploidy=0.95,
            max.ploidy=10,
            max.sigma.h=0.02,
            sigma.p=0,
            platform="Illumina_WES",
            copy_num_type="total",
            results.dir="test",
            primary.disease="cancer",
            sample.name="foo",
            max.as.seg.count=1500)
Calling RunAbsolute will produce a file results.dir/sample.name.ABSOLUTE.RData (so in the above example, it would be test/foo.ABSOLUTE.RData). If you load this file within R, you will have access to an object called seg.dat which contains the output. This object has the following attributes.
- segtab: a data.frame with the following columns.
  - Chromosome
  - Start.bp
  - End.bp
  - n_probes
  - length
  - copy_num
  - seg_sigma
  - W
- error_model:
- primary.disease:
- group:
- platform:
- sample.name:
- array.name:
- obs.scna:
- mode.res:
- version:
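To look at the output in practice, something like the following sketch may help. It builds the result path from results.dir and sample.name using the naming pattern described above and, if the file exists, prints the attribute names and the first rows of segtab. The helper name absolute_result_path is hypothetical, and "test" and "foo" are the values from the example call.

```r
# Hypothetical helper: build results.dir/sample.name.ABSOLUTE.RData.
absolute_result_path <- function(results.dir, sample.name) {
  file.path(results.dir, paste0(sample.name, ".ABSOLUTE.RData"))
}

result_path <- absolute_result_path("test", "foo")

# Only inspect the file if a previous RunAbsolute call has produced it.
if (file.exists(result_path)) {
  env <- new.env()
  load(result_path, envir = env)   # defines seg.dat inside env
  print(names(env$seg.dat))        # attribute names of the result object
  print(head(env$seg.dat$segtab))  # first rows of the per-segment table
}
```

Loading into a separate environment is a design choice made here to avoid overwriting an existing seg.dat in the workspace; a plain load(result_path) also works.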