Skip to content

Latest commit

 

History

History
125 lines (96 loc) · 4.31 KB

README.md

File metadata and controls

125 lines (96 loc) · 4.31 KB

Targets

user can pass in either paths or regex, but functions always accept two lists of paths. if a regex is a subset of another regex, then those two are considered "equal" when it comes to the topological sort. i.e. a regex is a subset if its diff matches anything. Use union type on Path, Regex/string. if regex, then ls-grep the string; if empty, find the rule that yeilds the superset of the regex; if the ls-grep succeeds, the rule is fulfilled. If the type is path, just check if the path(s) exist. If not, the yielding rule is equal to the path OR matches the path as a regex.

#Regexes Regexes built via dregex

Regexes mimic bash-style: p"*_R[12]_*.fastq", p!"*_R[12]_*.fastq" the later is a not regex

if r2.diff(r1).doIntersect(".*") and r1.doIntersect(r2), then r1 is a subset of r2 the opposite of a regex can be found by diffing Regex(".*").diff(myRegex)

import dregex.Regex
("*.filtered.fastq" -> "*.fastq",
"whatever" -> "*_R1_*.filtered.fastq") // this rule requires "*.filtered.fastq"
val Seq(r1, r2) = Regex.compile(Seq("a", "[ab]"))
r2._2.diff(r1._2).matches("b")

// Regexes that are compared must be built in the same "universe"

import dregex.Regex
val (r1, r2) = 
scala> r1.diff(r2).matches("fR1f.fastq")
res19: Boolean = false

scala> r1.diff(r2).matches("fR2f.fastq")
res20: Boolean = true

scala> val Seq(r1, r2) = Regex.compile(Seq(".*\\.fastq", ".*_R[12]_.*\\.fastq")).unzip._2 

#Python inter-op

Two options:

###Jep not work java 8 requires pip install jep No module named jep -> pip install jep example undefined symbol: PyUnicodeUCS4_AsEncodedString -> $ LD_LIBRARY_PATH=$HOME/anaconda/lib:$LD_LIBRARY_PATH ../bin/sbt run allows arbitrary imports as long as that environment is activated & LD_LIBRARY_PATH is correct also allows CPython stuff (unlike jython)

###Create command scripts The py command will write the command to a temporary file and execute it. (syntax highilight ala` drake)

py f"""
from Bio import SeqIO
SeqIO.convert($in, 'sff', $out, 'fasta')"""" 

#Dynamic Changes Using: scala.rx The JobGraph gets wrapped in an Rx object, and all its components may be wrapped in Vars, if the user wishes. So if the user alters the Var the graph gets updated appropriately. The graph itself can be updated as well such that edges can be removed or added, or changed. Using the empty list, an edges requirements could be removed. Rx usage:

import rx._
val a = Var(1); val b = Var(2)
val jobMap = Rx{ Map( 'a -> a, 'b -> b) }
// functions get run with the Vars dereference
a() // derference a
a() = 4 // change a, now jobMap has changed

Additionally, it's easy to make a rule depend in some way on user options:

val opts = Map('mapper -> 'bwa
val Mapper:JobGraph = Map(
   ("paired.bam", List("*.fastq", opts.get('mapper)) -> 
      match { 
        case (out, (in :: 'bwa)) => runBwa 
        }// etc.
        )

###Command line and Settings integration see config.scala. Jackson can conver yaml to json.

###Union types

import com.gensler.scalavro.util.Union
import com.gensler.scalavro.util.Union.{ union, prove }

type ISB = union[Int]#or[String]#or[Boolean]

def f[T: prove[ISB]#containsType](t: T) {
  t match {
    case x:Int => println(x)
    case x:Boolean => println(1)
    case x:String => println(x.length) } }
}

#Resources Union types: http://stackoverflow.com/questions/3508077/how-to-define-type-disjunction-union-types#comment26966453_6312508

regex inclusion problem:

http://stackoverflow.com/questions/6363397/how-to-tell-if-one-regular-expression-matches-a-subset-of-another-regular-expres http://stackoverflow.com/questions/18729015/determining-whether-a-regex-is-a-subset-of-another/19013279#19013279 http://repository.upenn.edu/cgi/viewcontent.cgi?article=1089&context=cis_papers http://www.ii.uib.no/~dagh/presInclusionWroclaw.pdf http://qerub.se/improving-scalas-regex-support

pimp my library http://www.artima.com/weblogs/viewpost.jsp?thread=179766

http://twitter.github.io/effectivescala/

#Notes