update the readme and some help messages
hasindu2008 committed Sep 20, 2024
1 parent 9a1b827 commit b4e6141
Showing 3 changed files with 32 additions and 33 deletions.
39 changes: 19 additions & 20 deletions README.md
@@ -1,6 +1,6 @@
# sigtk

A simple toolkit written for performing various operations on nanopore raw signal data. This is still in a very premature development stage and thus anticipate changes. Currently, *sigtk* is single threaded and has not been optimised for performance. The intended use is to perform operations on relatively smaller datasets for learning purposes and eyeballing.
A simple toolkit written for performing various operations on nanopore raw signal data. *sigtk* is single threaded and is not optimised for performance. The intended use is to perform operations on relatively smaller datasets for learning purposes and eyeballing.

## Building

@@ -105,14 +105,9 @@ Prints signal statistics.
|7 |int |raw_median |Median of raw signal values |
|8 |float |pa_median |Median of pico-amperes scaled signal |


## subtools under development

Note that these are not much tested and the interface and output may change at anytime.

### prefix

Under construction. Will change anytime. Only for direct RNA at the moment.
Under development. Only for direct RNA at the moment.
Finds prefix segments in a raw signal such as adaptor and polyA.

|Col|Type |Name |Description |
@@ -137,7 +132,7 @@ If `--print-stat` is printed, following additional columns will be printed.

### jnn

Under construction. Will change anytime. Print segments found using JNN segmenter.
Under development. Print segments found using JNN segmenter.

|Col|Type |Name |Description |
|--:|:----:|:------: |:----------------------------------------- |
@@ -163,9 +158,20 @@ If `-c` is specified, output will be in the following short notation by using re

`100H10,91H11,`


### ss

Under development. Operations to convert to/from signal alignment string (ss). See https://hasindu2008.github.io/f5c/docs/output#resquiggle-paf-output-format for explanation of ss.

To convert a PAF file with ss tags to TSV, you can use:
```
sigtk ss paf2tsv in.paf
```


### ent

Under construction. Will change anytime. Calculates shannon entropy for reads in a given S/BLOW5 file.
Calculates Shannon entropy for reads in a given S/BLOW5 file.

|Col|Type |Name |Description |
|--:|:----:|:------: |:----------------------------------------- |
@@ -174,25 +180,18 @@ Under construction. Will change anytime. Calculates shannon entropy for reads in
|3 |float |delta_ent |entropy after zig-zag delta |
|4 |float |byte_ent |entropy after splitting and storing least significant byte and most significant byte of the zig-zag delta values separately: ent(LSB)+ent(MSB) |

### ss

Under construction. Will change anytime. Operations to convert to/from signal alignment string (ss). See https://hasindu2008.github.io/f5c/docs/output#resquiggle-paf-output-format for explanation of ss.

To convert a PAF file with ss tags to TSV, you can use:
```
sigtk ss paf2tsv in.paf
```

### qts
Under construction. Will change anytime. Quantise the raw signal in a S/BLOW5 files. Takes a S/BLOW5 file as the input and writes the quantised output to a S/BLOW5 file.

Quantise the raw signal in a S/BLOW5 file. Takes a S/BLOW5 file as the input and writes the quantised output to a S/BLOW5 file.

Usage:
```
sigtk qts original.blow5 -o quantised.blow5
```

Options:
`-q INT` : Number of LSB bits to trucate (set to 0). Default is 1.
`-b INT` : Number of lower significant bits to eliminate. Default is 1.
`-m [floor|round|fill-ones]`: quantisation method. Default is round.


## Acknowledgement
2 changes: 1 addition & 1 deletion src/main.c
@@ -58,7 +58,7 @@ int print_usage(FILE *fp_help){
fprintf(fp_help," jnn print segments found using JNN segmenter\n");
fprintf(fp_help," ss ss string conversion\n");
fprintf(fp_help," ent calculate entropies\n");
fprintf(fp_help," qts quantise\n");
fprintf(fp_help," qts quantise the raw signal in a S/BLOW5 file\n");

if(fp_help==stderr){
exit(EXIT_FAILURE);
24 changes: 12 additions & 12 deletions src/qts.c
@@ -27,13 +27,13 @@ static struct option long_options[] = {
int round_to_power_of_2(int number, int number_of_bits) {
//create a binary mask with the specified number of bits
int bit_mask = (1 << number_of_bits) - 1;

//extract out the value of the LSBs considered
int lsb_bits = number & bit_mask;


int round_threshold = (1 << (number_of_bits - 1));

//check if the least significant bits are closer to 0 or 2^n
if (lsb_bits < round_threshold) {
return (number & ~bit_mask) + 0; //round down to the nearest power of 2
@@ -46,7 +46,7 @@ int round_to_power_of_2(int number, int number_of_bits) {

int qtsmain(int argc, char* argv[]) {

const char* optstring = "hVv:o:b:";
const char* optstring = "hVv:o:b:m:";

int longindex = 0;
int32_t c = -1;
@@ -56,7 +56,7 @@ int qtsmain(int argc, char* argv[]) {

int8_t b = 1; //number of LSB bits to truncate

char *method = "floor"; //quantization method
char *method = "round"; //quantization method

//parse the user args
while ((c = getopt_long(argc, argv, optstring, long_options, &longindex)) >= 0) {
@@ -67,10 +67,10 @@ int qtsmain(int argc, char* argv[]) {
fp_help = stdout;
} else if (c=='o'){
out_fn = optarg;
} else if (c=='b'){
} else if (c=='b'){
b = atoi(optarg);
if (b < 0 || b > 16) {
fprintf(stderr, "Error: number of bits to truncate must be between 0 and 8\n");
if (b < 1 || b > 8) {
fprintf(stderr, "Error: number of bits to truncate must be between 1 and 8\n");
exit(EXIT_FAILURE);
}
} else if (c=='m'){
@@ -84,8 +84,8 @@ int qtsmain(int argc, char* argv[]) {
fprintf(fp_help," -h help\n");
fprintf(fp_help," -o FILE output file\n");
fprintf(fp_help," --version print version\n");
fprintf(fp_help," -b INT number of LSB bits to truncate [%d]\n", b);
fprintf(fp_help," --method=[floor|round|fill-ones] quantization method\n");
fprintf(fp_help," -b INT number of lower significant bits to eliminate [%d]\n", b);
fprintf(fp_help," -m [floor|round|fill-ones] quantisation method [round]\n");
if(fp_help == stdout){
exit(EXIT_SUCCESS);
}
@@ -140,7 +140,7 @@ int qtsmain(int argc, char* argv[]) {
fprintf(stderr,"Unknown method for -m. Available options are floor,round,fill-ones.\n");
exit(EXIT_FAILURE);
}

//write to file
if(slow5_write(rec, sp_w) < 0){
fprintf(stderr,"Error writing record!\n");
