issues with jplace output #15

Open
lpipes opened this issue Oct 26, 2023 · 9 comments

@lpipes

lpipes commented Oct 26, 2023

Hi, I'm trying to convert the jplace table but it seems like my jplace file has no entries for multiclass. Can you explain why this is? I was assuming that the jplace files are the same as the output for pplacer. I attached the test.jplace which I had to rename to test.txt. Thanks

> library(BoSSA)
> sql<-read_sqlite("test.db",jplace_file=gsub("sqlite","jplace","test.jplace"))
> sql
pplace object
run: 1
call run 1: guppy classify -c test.refpkg/ test.jplace --sqlite test.db
Placement on a phylogenetic tree with 1456 tips and 871 internal nodes.
sequence nb: 1805
placement nb: 1805
> table<-pplace_to_table(sql,type="best")
> table
NULL

test.txt

@balabanmetin
Owner

Thanks for using APPLES-2.

APPLES-2 outputs a jplace file in the format described here. Many tools, including guppy and gappa, can read the APPLES-2 jplace file. guppy's classify module must have extra requirements that are not mentioned in the original jplace format. My suspicion is that inconsistencies between the taxon names (underscores, quotation marks, etc.) in test.jplace and in the refpkg you use for classification are the reason. I would place a single sequence using both pplacer and APPLES-2 and compare the two jplace files to find the error.
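One way to look for the naming inconsistencies mentioned above is to pull the leaf labels out of the jplace "tree" string and diff them against the reference names. The sketch below is only an illustration: `leaf_labels` and `name_mismatches` are hypothetical helper names, the regular expression assumes a plain Newick string with jplace `{n}` edge numbers and no bracketed comments, and collecting the reference names from the refpkg is left to the caller.

```python
import re

# A leaf label is a quoted or bare token that directly follows "(" or ",".
_LEAF = re.compile(r"[(,]\s*('[^']*'|[^():,;{}\[\]\s']+)")


def leaf_labels(jplace_tree):
    """Return the raw leaf labels (quotes kept) from a jplace tree string."""
    return _LEAF.findall(jplace_tree)


def name_mismatches(jplace_tree, reference_names):
    """Return (labels only in the tree, names only in the reference set)."""
    labels = set(leaf_labels(jplace_tree))
    reference = set(reference_names)
    return sorted(labels - reference), sorted(reference - labels)
```

Quotes are kept on purpose, so a label written as `'B 2'` in the tree and `B 2` in the reference shows up on both sides of the diff.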

@lpipes
Author

lpipes commented Oct 26, 2023

The 2 jplace files look very different. From APPLES:

    "placements": [
        {
            "n": [
                "AY666199.1_151"
            ],
            "p": [
                [
                    615,
                    23.17112997861102,
                    1,
                    0.0236617022326274,
                    0.048165539617111265
                ]
            ]
        }
    ]

and then from pplacer:

  "placements":
  [
    {"p":
      [
        ["161681", 0.0337533286028, 1591, 0.438682310596, -7990.27227968,
          0.0505814804718
        ],
        ["8917", 0.0163040309998, 952, 0.166632640315, -7991.24026354,
          0.0523737200202
        ],
        ["161681", 0.000589067307439, 1593, 0.0940030574849, -7991.81272786,
          0.0614246190462
        ],
        ["161677", 5e-07, 953, 0.093953452743, -7991.81325569,
          0.0616765683895
        ],
        ["8917", 0.00712812213867, 1592, 0.0939127701877, -7991.81368879,
          0.0616753354433
        ],
        ["161677", 5e-07, 1396, 0.0609206694948, -7992.24648265,
          0.0652258111253
        ],
        ["190658", 0.0212321684278, 1137, 0.051895099178, -7992.40683081,
          0.0534156384798
        ]
      ], "nm": [["AY666199.1_151", 1]]
    }
  ]

I attached all of the files used to run both commands
files.tar.gz
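Because the two files order their placement columns differently, it can help to key every row by the file's own "fields" list before comparing. This is a minimal sketch, not part of either tool: `placement_rows` is a hypothetical helper, and the "fields" order in the example is an assumption, since the thread only shows the "placements" part of each file.

```python
def placement_rows(jplace):
    """Yield (names, row) pairs, with each row keyed by the file's "fields"."""
    fields = jplace["fields"]
    for placement in jplace["placements"]:
        if "nm" in placement:
            # "nm" holds [name, multiplicity] pairs.
            names = [name for name, _multiplicity in placement["nm"]]
        else:
            n = placement["n"]
            names = [n] if isinstance(n, str) else list(n)
        for row in placement["p"]:
            yield names, dict(zip(fields, row))


# Values copied from the APPLES-2 snippet above. The "fields" order is an
# assumption for illustration; read it from the real file instead.
apples_like = {
    "fields": ["edge_num", "likelihood", "like_weight_ratio",
               "distal_length", "pendant_length"],
    "placements": [{"n": ["AY666199.1_151"],
                    "p": [[615, 23.17112997861102, 1,
                           0.0236617022326274, 0.048165539617111265]]}],
}
rows = list(placement_rows(apples_like))
```

With rows in this shape, a missing column such as a classification field shows up as a missing dictionary key rather than as a shifted position.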

@balabanmetin
Owner

balabanmetin commented Oct 26, 2023

The problem may stem from the fact that APPLES-2 outputs a single name list "n" while pplacer outputs an "nm" name list. The other difference is that the pplacer output has a classification ("161681") and the APPLES-2 output doesn't. But that is what you were trying to do with the snippet you shared at the beginning of the thread (guppy classify -c test.refpkg/ test.jplace --sqlite test.db). I would try changing "n" to "nm" and then retry classifying with guppy.

I just had a thought: another problem is that the "tree" in the APPLES-2 jplace file is not identical to the input tree, since APPLES-2 re-estimates the input tree's branch lengths. That might conflict with the tree file inside test.refpkg.
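If the "n" to "nm" change is tried, note that the pplacer snippet earlier in the thread stores "nm" as [name, multiplicity] pairs, so renaming the key alone would not reproduce that shape. The sketch below rewrites each placement accordingly. `n_to_nm` is a hypothetical helper, the default multiplicity of 1 is an assumption, and whether guppy classify then accepts the file is not established here.

```python
import copy


def n_to_nm(jplace, multiplicity=1):
    """Return a copy of the jplace dict with every "n" rewritten as "nm"."""
    converted = copy.deepcopy(jplace)
    for placement in converted["placements"]:
        if "n" in placement and "nm" not in placement:
            n = placement.pop("n")
            names = [n] if isinstance(n, str) else n
            # Each entry becomes a [name, multiplicity] pair.
            placement["nm"] = [[name, multiplicity] for name in names]
    return converted
```

The input is left untouched, so the original and converted files can be written side by side, for example with `json.load` and `json.dump`.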

@lpipes
Author

lpipes commented Oct 26, 2023

If I change n to nm in the test2.jplace file, I get this error:

guppy classify -c test.refpkg/ test2.jplace --sqlite test.db 
guppy: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed.
Aborted (core dumped)

If I change the nm back to n and run the same command, I get no error.

@lpipes
Author

lpipes commented Oct 26, 2023

If it is the case that the tree in the APPLES-2 place file is not identical to the one in test.refpkg, how do I extract the tree easily from the jplace file?
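A jplace file is JSON, and its "tree" value is a Newick string with an edge number in curly braces after each branch. A minimal sketch for pulling out a plain Newick tree follows. `jplace_tree_to_newick` is a hypothetical helper, and it assumes no leaf label itself contains a `{digits}` pattern.

```python
import json
import re


def jplace_tree_to_newick(jplace):
    """Return the jplace tree as plain Newick, without {n} edge numbers."""
    return re.sub(r"\{\d+\}", "", jplace["tree"])


# Example use (file names taken from the thread):
#
#     with open("test.jplace") as handle:
#         newick = jplace_tree_to_newick(json.load(handle))
#     with open("apples.tre", "w") as handle:
#         handle.write(newick + "\n")
```

The resulting file can then be compared with the tree stored in the refpkg, for example on tip labels and on the number of internal nodes.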

@lpipes
Author

lpipes commented Oct 27, 2023

I tried to build the refpkg with taxit using the tree that was output by APPLES, and now I am getting this error (it runs when I use the RAxML tree):

taxit create -P apples.refpkg -l COI --aln-fasta 79_MSA.fasta --taxonomy 79_taxonomyfromtaxids.csv --seq-info 79_seqInfo.csv --tree-file apples.tre
rppr: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed.
Traceback (most recent call last):
  File "/space/s1/lenore/software/taxtastic_2/taxtastic/taxit.py", line 22, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/space/s1/lenore/software/taxtastic_2/taxtastic/taxtastic/scripts/taxit.py", line 51, in main
    return action(arguments)
  File "/space/s1/lenore/software/taxtastic_2/taxtastic/taxtastic/subcommands/create.py", line 168, in action
    r.reroot(rppr=args.rppr)
  File "/home/lenore/.local/lib/python3.10/site-packages/decorator-5.1.1-py3.10.egg/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/space/s1/lenore/software/taxtastic_2/taxtastic/taxtastic/refpkg.py", line 132, in transaction
    return f(self, *args, **kwargs)
  File "/space/s1/lenore/software/taxtastic_2/taxtastic/taxtastic/refpkg.py", line 497, in reroot
    subprocess.check_call([rppr or 'rppr', 'reroot',
  File "/home/lenore/Python-3.10.3/Lib/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['rppr', 'reroot', '-c', '/space/s1/lenore/tronko_revisions/APPLES/apples.refpkg', '-o', '/tmp/treet9rp6nni.tre']' died with <Signals.SIGABRT: 6>

Ultimately, I'm just trying to get the taxonomy associated with the apples placement in the tree. I don't see any way to do this in the tutorial and the way I do this for pplacer output doesn't work.

@lpipes
Author

lpipes commented Oct 27, 2023

I think the reason why I can't make the refpkg is because the APPLES-2 tree is not binary:

> apples <- read.tree("apples.tre")
> apples

Phylogenetic tree with 1456 tips and 871 internal nodes.

Tip labels:
  JX160001.1, KM001306.1, KM001305.1, KM001304.1, DQ433124.1, AY666202.1, ...

Rooted; includes branch lengths.
> is.binary(apples)
[1] FALSE

I made the tree binary and ran taxit, which successfully built the refpkg, but guppy classify then failed again.
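A rooted binary tree with 1456 tips has 1455 internal nodes, so the 871 reported above points to polytomies. To see how many nodes are affected without loading the tree in R, the child count of every internal node can be read straight off the Newick string. This is a rough sketch: `child_counts` is a hypothetical helper that handles single-quoted labels but assumes there are no bracketed comments in the tree.

```python
def child_counts(newick):
    """Return the number of children of each internal node, in closing order."""
    counts = []
    stack = []          # one running child count per currently open "("
    in_quote = False
    for ch in newick:
        if ch == "'":
            in_quote = not in_quote
        elif in_quote:
            continue    # ignore structural characters inside quoted labels
        elif ch == "(":
            stack.append(1)
        elif ch == "," and stack:
            stack[-1] += 1
        elif ch == ")":
            counts.append(stack.pop())
    return counts


def polytomy_count(newick):
    """Count internal nodes that have more than two children."""
    return sum(1 for count in child_counts(newick) if count > 2)
```

The last entry of `child_counts` is the outermost node. Unrooted trees are usually written with three children there, so a caller may want to treat that node separately.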

@lpipes
Author

lpipes commented Oct 27, 2023

I disabled the tree re-estimation and I still cannot get the taxids associated with the placement:

> sql<-read_sqlite("test.db",jplace_file=gsub("sqlite","jplace","test3.jplace"))
> sql
pplace object
run: 3
call run 1: guppy classify -c test.refpkg/ test3.jplace --sqlite test2.db
Placement on a phylogenetic tree with 1456 tips and 1455 internal nodes.
sequence nb: 1807
placement nb: 1807
> table<-pplace_to_table(sql,type="best")
> table
NULL
> sql$multiclass
[1] placement_id name         want_rank    rank         tax_id      
[6] likelihood  
<0 rows> (or 0-length row.names)

The problem is that there are no multiclass entries associated with the placements.
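To check whether the multiclass table is the only empty one, or whether guppy classify wrote nothing at all, the row count of every table in the SQLite file can be listed. The sketch below uses only Python's standard `sqlite3` module. `table_row_counts` is a hypothetical helper, and no assumption is made about which tables guppy creates beyond the multiclass table shown above.

```python
import sqlite3


def table_row_counts(connection):
    """Return {table name: row count} for every table in the database."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {
        table: connection.execute(
            f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        for table in tables
    }


# Example use (database name taken from the thread):
#
#     with sqlite3.connect("test2.db") as connection:
#         for table, rows in table_row_counts(connection).items():
#             print(table, rows)
```

If other tables hold rows while multiclass holds none, the placements were read and only the classification step produced no output.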

@lpipes
Author

lpipes commented Oct 27, 2023

I also tried to use gappa and it is complaining about the tree not being binary:

Found 1 jplace file
Error: Supplied tree is not bifurcating.

terminate called after throwing an instance of 'std::runtime_error'
  what():  Supplied tree is not bifurcating.
Aborted (core dumped)
