Interest in Krakatau 2? #185
Comments
Really, I'm not sure about the future of this project... I have preferred CFR in recent years. What would be interesting is a Kotlin-specific decompiler (one that uses metadata, deals with lambdas, etc.). Thank you anyway for your awesome project! |
I have a bunch of obscure samples for potential testing. I don't know about Krakatau internals, but I liked the peephole analysis/optimizations it did, as they are unique to Krakatau. What is your goal for rewriting Krakatau? Better code? Python 3? Fun side project? |
I would help if you rewrite it in Java.
If it stays in Python, I could do next to nothing.
XenoAmess
Quoted from Janmm14 (Sunday, April 24, 2022):
I think that, besides support for basic invokedynamic and better display of custom/complex invokedynamic, an option for unambiguous imports would help a lot with user-friendliness.
|
If I'm going to write anything new, it will be in Rust. But I don't need help with the coding anyway. What I need help with is testing, and in particular identifying interesting samples of obfuscated applications, figuring out where the decompiler works well or not, highlighting features that would be useful to add, etc. |
@Storyyeller, Krakatau is the only Java decompiler I know of that has reliably and more or less correctly decompiled code that was not originally written in Java. I use it on a project written in Scala called Kaitai Struct: I have written Python bindings of a sort to the compiler using the JVM, but I cannot share them because KS is under the GPL. Since Scala method calls map to Java method calls in obscure ways, I had to decompile the binary to learn exactly which methods to call and how. Other decompilers usually either throw exceptions on such code or emit incorrect code. So yes, there is interest in Krakatau if one is interacting with Scala code from other languages. CFR is cool, but it often doesn't decompile Scala code correctly.
Why not use the existing test suites? |
Status update and some questions for you all:
The new disassembler is mostly finished and in the testing and polish stage. I haven't started the assembler or decompiler yet. I expect the assembler to take a bit longer than the disassembler, and the decompiler to take longer than both put together.
Question 1: Currently, the plan is to support three output modes:
Directory output is the currently recommended way to use Krakatau, but it is problematic because there is no guarantee that the class names will result in valid filenames. There may be errors when trying to create files with the corresponding names or, even worse, output from one class might silently overwrite another on case-insensitive filesystems, e.g. Windows. Currently, Krakatau has complicated name mangling logic to try to work around this, but this has the downsides that a) it adds a lot of complexity, and b) there is no way for users to predict where the output for a given class will actually be written anyway. Therefore, my plan is to remove all the name mangling logic in v2, say that directory output is "use at your own risk", and recommend that you use zipfile output instead. Is this ok with everyone?
Question 2: Deployment - I never bothered to set up anything fancy packaging-wise for v1, but now that I'm rewriting it in Rust, I figure it's a good time to find out what people think the best distribution strategy would be.
Tagging @anthraxx, @MartinThoma, @toddATavail as well, since you asked about packaging issues before. |
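To make the overwrite risk concrete, here is a minimal sketch (not Krakatau's code) that groups class names which would land on the same path on a case-insensitive filesystem. The function name `find_case_collisions` and the sample class names are hypothetical, and `casefold()` is only an approximation of how a real filesystem compares names.

```python
from collections import defaultdict


def find_case_collisions(class_names):
    """Return groups of class names that differ only by letter case.

    Assumption: each class is written to a path derived directly from its
    name, so names that compare equal ignoring case would overwrite each
    other on a case-insensitive filesystem.
    """
    groups = defaultdict(list)
    for name in class_names:
        # casefold() approximates a case-insensitive filename comparison
        groups[name.casefold()].append(name)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Obfuscators commonly emit classes named "a" and "A" in the same package.
names = ["com/example/a", "com/example/A", "com/example/Main"]
print(find_case_collisions(names))
# [['com/example/A', 'com/example/a']]
```

Zipfile output sidesteps this because entry names inside an archive are plain byte strings and never touch the host filesystem's naming rules.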
As long as this complexity is isolated, it is OK.
One can just create a tsv file with the pairs
I think it is extremely inconvenient. To analyse the source, I usually unpack the zip archives produced by the decompilers that emit them, and it has always felt weird that they emit an archive rather than a directory. Thank you for clarifying that aspect. While it is likely that none of the binaries I have analysed had such paths, I think the potential inability to get decompilation results in the form of a file tree is not good.
I guess one can try to create a GitHub Actions pipeline emitting GitHub Pages, which would generate a repo that can be consumed by native repository managers like |
Krakatau is currently used in some GUIs for decompilation. They usually ask for decompilation/disassembly of a single class file. To make this easy for class files with potentially bad names (there have been multiple issues with that in the past), I'd suggest allowing a single class file as input together with an explicitly specified output file name, so that the name of the input file is ignored. |
I'm already planning to do that. The question is whether directory output is also necessary. |
I just heard that Krakatau is being rewritten in Rust. Would it be possible to add JNI bindings for easier use? That would be very handy. |
@xxDark, you may want to try GraalVM with the GraalPython module. It lets you use not only Python from Java and Java from Python, but also other languages, like JS. |
I never actually looked into how Graal itself works, I might give it a shot just to try and see how it goes |
I briefly looked at the Recaf repository, and it looks like it is mostly focused on disassembly and reassembly rather than decompilation, correct? I expect it to take much longer to rewrite the Krakatau decompiler than the assembler and disassembler, so I was wondering if you would be interested in trying it out once I finish the assembler, even without the decompiler being rewritten. It seems like just having access to the Krakatau assembler and disassembler would already be very useful for you. As for JNI support, that's something I might consider later, but it's not an immediate priority. I think calling it as a subprocess would be easiest for now. |
By the way, one other question for you all - I've been thinking about removing the ACC_SUPER and strictfp flags when disassembling in non-roundtrip mode since those flags don't actually do anything in modern Java and just add visual noise to the disassembly which might make it harder to understand. What do you think? |
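As an illustration of the proposal, here is a minimal sketch of flag printing with a non-roundtrip mode that hides the two flags in question. The bit values are the ones defined by the class file format (ACC_SUPER is 0x0020 on classes, ACC_STRICT is 0x0800 on methods); the function name, the `roundtrip` parameter, and the keyword spellings are hypothetical and are not taken from Krakatau.

```python
# Subsets of the access flag tables from the class file format.
CLASS_FLAGS = [
    (0x0001, "public"),
    (0x0010, "final"),
    (0x0020, "super"),      # ACC_SUPER
    (0x0200, "interface"),
    (0x0400, "abstract"),
]
METHOD_FLAGS = [
    (0x0001, "public"),
    (0x0008, "static"),
    (0x0800, "strict"),     # ACC_STRICT (strictfp)
]

# Flags the comment above proposes to hide when not round-tripping.
NOISE_FLAGS = {"super", "strict"}


def format_flags(access_flags, table, roundtrip=True):
    """Return the flag keywords set in access_flags, in table order."""
    names = [name for bit, name in table if access_flags & bit]
    if not roundtrip:
        # Lossy on purpose: the output no longer reassembles to the same bits.
        names = [name for name in names if name not in NOISE_FLAGS]
    return names


print(format_flags(0x0021, CLASS_FLAGS))                    # ['public', 'super']
print(format_flags(0x0021, CLASS_FLAGS, roundtrip=False))   # ['public']
print(format_flags(0x0809, METHOD_FLAGS, roundtrip=False))  # ['public', 'static']
```

The trade-off is visible in the last two calls: the shorter output is easier to read, but the original flag word can no longer be reconstructed from it, which is why this would only apply outside roundtrip mode.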
I don't have any strong opinion on that since I'm not familiar with those implementation details.
I guess if the code is intended to be executed on versions of Java where they do something (does the bytecode format have any mechanism to indicate that?), they should be kept. |
The current master branch contains the current 2X release source, which has a really crappy assembler in it. We're focusing our efforts on getting 3X ready for release, and on that branch we recently invested in making a new assembler. In addition to not being crap, it offers some quality-of-life features like in-line expressions and name-based variable access. Stuff to make the bytecode a bit more accessible to new users. With that in mind, I don't think we would get a lot of value from a new assembler at the moment, especially if it were to require a layer of interop and not support the features our current model operates off of. As everyone else has said thus far, we still look forward to whatever progress gets made on Krakatau 2 :) |
BTW for parsing java bytecode you can try to utilize Kaitai Struct |
The main advantage of Krakatau is full support for every part of the classfile format, as well as support for bytecode that makes use of a number of bugs and undocumented features in older versions of the JVM. Admittedly, that's not so relevant nowadays. It definitely prioritizes control over low level details over beginner-friendliness though - it's more aimed at bytecode hackers who are already familiar with how Java bytecode works. |
Status update: I started work on the assembler today. |
I use Recaf as a GUI for decompilers when the very old Helios 0.0.7 fails me. Recaf aims to be able to read zip files like the JVM does and uses CAFED00D to normalize bytecode. |
Do you have any samples handy that I can use to test the zipfile reading issue you mentioned? |
Actually, I was wrong about that part: Recaf doesn't include such a feature, and I'm not sure whether I have come across such a jar so far, or whether the current JVM opens the initial jar any differently from its Java zip implementation at all. |
What do you mean? The 3X branch does read zip files as the JVM does. Both 2X and 3X include bytecode normalization (removing attributes that are not used at runtime and are intentionally malformed in order to crash reverse engineering tools/libraries).
The library we made to read zip files as the JVM does has some samples in the test directory. See the
The major point is that most zip parsers scan forward for section header signatures. The JVM instead looks for the "end of central directory" record by scanning backwards, because that is optimal: the record is found at the end of the file. Now consider what happens if you use a hex editor to put two zip files together. Most tools will read/display the one at the beginning, but the JVM will read the one at the end. You can add some extra tricks to make for a confusing archive, but this is the gist. |
@Col-E I was just using GitHub's search, so it didn't check branches. Also, he didn't ask for abnormal bytecode, so my answer was entirely about zip files. |
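The forward-versus-backward difference described above can be reproduced with a minimal sketch. This is not the library's code: the function names are made up for illustration, it handles only small non-Zip64 archives with no trailing comment, and it assumes the central directory sits immediately before the end-of-central-directory record (which holds for archives written by Python's `zipfile`).

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"    # local file header
CENTRAL_SIG = b"PK\x01\x02"  # central directory file header
EOCD_SIG = b"PK\x05\x06"     # end of central directory record


def first_name_scanning_forward(data):
    """What a signature-scanning tool sees: the first local file header."""
    pos = data.find(LOCAL_SIG)
    (name_len,) = struct.unpack_from("<H", data, pos + 26)
    return data[pos + 30:pos + 30 + name_len].decode("utf-8")


def first_name_scanning_backward(data):
    """What a reader that starts from the end sees: the last archive."""
    eocd = data.rfind(EOCD_SIG)
    (cd_size,) = struct.unpack_from("<I", data, eocd + 12)
    cd_start = eocd - cd_size  # assumption: directory directly precedes EOCD
    if data[cd_start:cd_start + 4] != CENTRAL_SIG:
        raise ValueError("central directory not where this sketch expects it")
    (name_len,) = struct.unpack_from("<H", data, cd_start + 28)
    return data[cd_start + 46:cd_start + 46 + name_len].decode("utf-8")


def make_zip(name, payload):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


# Two archives glued together, as with the hex editor trick.
combined = make_zip("Decoy.class", b"decoy") + make_zip("Real.class", b"real")
print(first_name_scanning_forward(combined))   # Decoy.class
print(first_name_scanning_backward(combined))  # Real.class
```

The same bytes yield two different answers depending on the scan direction, which is the gist of the trick.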
The I was writing my own rust library for parsing class files and hit this. There was an unpaired surrogate codepoint in that string which is not valid UTF-8 when decoded. |
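For context, class file strings use a modified UTF-8 in which each surrogate code unit is encoded on its own as three bytes, so a lone surrogate can appear in a class file even though strict UTF-8 forbids it. Here is a minimal sketch of the failure and one lenient workaround, using Python's `surrogatepass` error handler. The helper names are hypothetical, and this is not a full modified UTF-8 decoder (it does not handle the two-byte encoding of NUL, for example).

```python
# The 3-byte encoding of the lone high surrogate U+D800.
RAW = b"\xed\xa0\x80"


def decode_strict(data):
    """Strict UTF-8: returns None when the bytes are rejected."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_lenient(data):
    """Lets encoded surrogates through as surrogate code points."""
    return data.decode("utf-8", errors="surrogatepass")


print(decode_strict(RAW))         # None
print(repr(decode_lenient(RAW)))  # '\ud800'
```

In Rust, the equivalent constraint is that `String` and `str` must be valid UTF-8, so a parser has to keep such constants as raw bytes or use a lossy or custom representation.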
Status update: I finished the initial version of the new assembler and started testing it today. |
I have finished testing the assembler and disassembler and think they are ready for public testing now. Anyone interested in trying them out? New features:
Backwards incompatible changes:
|
Hi. |
There is. But for C libraries. openssl-sys links to openssl dynamically. |
Is python still going to be maintained? Also you should try incremental compilation as that would improve compile times after small changes. |
Thanks for the info. But as I have said, in order to debloat projects written in Rust, it has to work in practice for Rust libs. If it works in the tools, but the community is stuck in a tragedy-of-the-commons situation, cannot make it work, and prefers to bloat the software, then implementing the needed features in the tools becomes useless, because no one will use them.
Thanks for the info, I should read about it. |
Thanks for the feedback! I'll see if I can optimize it a bit after the holidays. Could you provide the jar you used for benchmarking? I would expect a much larger speedup for optimized builds than that. |
# Download the Kaitai Struct compiler .deb into a scratch directory
mkdir -p ./destdir
wget -O ./destdir/kait.deb https://dl.cloudsmith.io/public/kaitai/debian-unstable/deb/any-distro/pool/any-version/main/k/ka/kaitai-struct-compiler_0.10-SNAPSHOT20220813.105458.a4435936/kaitai-struct-compiler_0.10-SNAPSHOT20220813.105458.a4435936_all.deb
# Extract the data tarball from the .deb, then pull the compiler jar out of it
ar x --output=./destdir ./destdir/kait.deb ./data.tar.gz
tar -zxv --directory ./destdir -f ./destdir/data.tar.gz ./usr/share/kaitai-struct-compiler/lib/io.kaitai.kaitai-struct-compiler-0.10-SNAPSHOT20220813.105458.a4435936.jar
# Keep only the jar as ./kait.jar and clean up the scratch directory
mv ./destdir/usr/share/kaitai-struct-compiler/lib/io.kaitai.kaitai-struct-compiler-0.10-SNAPSHOT20220813.105458.a4435936.jar ./kait.jar
rm -rf ./destdir |
I'm sorry, it was my fault, I mistakenly called |
@KOLANICH I updated it to remove unnecessary zip dependencies, reducing the binary size from 8.5 MB to 7.3 MB. I also tried to optimize the disassembler. However, it is already so fast that it was difficult to even benchmark or profile, and it looks like a lot of the remaining time is just spent on IO, which is unavoidable, so there doesn't seem to be much potential for further speedups here. |
One other note - I did do extensive testing before release making sure that Py disassembler -> Py assembler, Py disassembler -> Rust assembler, Rust disassembler -> Py assembler, Rust disassembler -> Rust assembler, etc. all give compatible results where expected. In fact, the main reason I backported most of the new features to the Python version was to make this comparison easier. |
@Storyyeller just wanted to drop a comment here to say the v2 Rust assembler / disassembler is great! Noticed a large speed improvement. Have not run into any issues thus far. Are there any plans to port the decompiler as well? |
Thanks! I hadn't attempted to rewrite the decompiler yet because it would be a lot of work, and I worried that no one would use it anyway given the lack of response on the assembler/disassembler. |
I usually use the decompiler component of Krakatau. |
The Krakatau decompiler is still the one that resists the most obfuscation techniques and provides accurate results. Quiltflower and the others cannot compete when it comes to obfuscated bytecode. It is your decision whether you want to invest the time in this. |
Yes, you can try Eclipse as a decompiler instead.
Decompiling is not the same problem as disassembling...
Quoted from Janmm14 (Wednesday, February 1, 2023):
Decompilation has never been a popular topic.
Other decompilers focus on readability and good display of regular invokedynamic; Krakatau focuses on correct and mostly runnable decompilation of even highly obfuscated code, at the cost of some readability.
Given its need for all libraries and for Python, other decompilers (mostly written in Java) had it a lot easier.
|
@KOLANICH +1 to this. It's far superior to anything else out there in my opinion. I will always favour correctness over readability.
@Storyyeller I tend to use all three equally and often! |
Thanks for the responses everyone. It might be a while before I have the time, but I'll look into working on the decompiler. |
FYI, I started working on the decompiler last week. However, it will take a long time to get anywhere, so don't get your hopes up. |
Amazing news 😀 let me know if I can help in any way (my Rust experience is limited, but happy to do some testing / QA) |
Hi. now it can only do but when 1 we can put out first, like well is this by means to do so? really don't think there should be such order limit for |
Like I said before I'm really interested in krak2 |
Update: My previous comment on Feb 12 was way too optimistic. I've been busy and haven't gotten the chance to work on the decompiler at all lately. Today, I finally found the time to work on Krakatau again, but not on the decompiler. I improved the error messages for the assembler and disassembler (#194). |
Updated Krakatau v2 to handle the fake directory attack (https://github.com/x4e/fakedirectory). v1 is still affected. |
Nice! (although it won't affect my workflow) |
Updated Krakatau v2 to ignore CRC checksums in jar files. |
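To show why this matters, here is a minimal sketch of a jar whose stored CRC has been tampered with: Python's `zipfile` refuses to read the entry, while a reader that skips the check still recovers the bytes. Both helper names are hypothetical, the parser only handles a single stored or deflated entry at offset 0 with sizes in the local header, and none of this is Krakatau's actual implementation.

```python
import io
import struct
import zipfile
import zlib


def make_jar_with_bad_crc(name, payload):
    """Build a one-entry archive, then corrupt both stored CRC fields."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    data = bytearray(buf.getvalue())
    central = data.find(b"PK\x01\x02")
    # CRC-32 sits at offset 14 of the local header and 16 of the central one.
    for crc_offset in (14, central + 16):
        data[crc_offset] ^= 0xFF
    return bytes(data)


def read_first_entry_ignoring_crc(data):
    """Decompress the first entry without comparing its CRC-32."""
    (sig, _ver, _flags, method, _time, _date, _crc,
     csize, _usize, name_len, extra_len) = struct.unpack_from("<4sHHHHHIIIHH", data, 0)
    if sig != b"PK\x03\x04":
        raise ValueError("no local file header at offset 0")
    start = 30 + name_len + extra_len
    raw = data[start:start + csize]
    if method == zipfile.ZIP_STORED:
        return raw
    if method == zipfile.ZIP_DEFLATED:
        return zlib.decompress(raw, wbits=-15)  # raw deflate stream, no header
    raise ValueError("unsupported compression method")


bad = make_jar_with_bad_crc("A.class", b"\xca\xfe\xba\xbe" + b"\x00" * 32)
try:
    zipfile.ZipFile(io.BytesIO(bad)).read("A.class")
except zipfile.BadZipFile as exc:
    print("zipfile refused:", exc)
print(read_first_entry_ignoring_crc(bad)[:4].hex())  # cafebabe
```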
@Storyyeller |
I'm not sure how to do that. |
Update: I haven't worked on the decompiler at all since early February. I still intend to rewrite it eventually, but I don't know if or when I'll be able to start making progress on it. |
Definitely interested in the v2 decompiler. Happy to jump in and help where I can, too |
The basic problem is that I didn't want the v2 decompiler to just be a straight rewrite of v1. I wanted to come up with a better structuring algorithm in order to handle try blocks with multiple catch arms, which the v1 algorithm can't handle. But then I got stuck trying to come up with a new algorithm and eventually gave up. |
rip |
@KOLANICH @Janmm14 @QwertyYtPl @samczsun @lab313ru @Dmunch04
I've been thinking about doing a complete ground-up redesign and modernization of Krakatau, but I'm not sure if there is enough interest to justify the effort, so I was curious if anyone would be interested in such a project. One particular problem is that I haven't been active in Java reverse engineering myself since 2015 or so, so I would be reliant on users to do all the testing. What do you think?