Inquiry about Method Caller Extraction in Java Library project Using Tai-e #113

Closed
vbifg opened this issue Jul 13, 2024 · 6 comments

vbifg commented Jul 13, 2024

Overall Description

Hello,
I am analyzing a Java library project named P_library using Tai-e. My goal is to extract the signatures of all callers of the getData method defined in example.java, across the whole P_library project. Because the library has no main method and getData contains a bug, I modified the onStart() method in EntryPointHandler.java within Tai-e so that the entry point is the bug-triggering test method from P_library's JUnit tests, which I know reaches getData. I hoped this would capture all callers of getData throughout the project. However, I've encountered some issues:

  1. The extracted callers are incomplete. While debugging, I realized that call graph construction also starts from the entry point, so the missing callers may be due to incomplete test coverage.
  2. The extraction is inaccurate. The extracted callers invoke a getData method of the same name declared in another file, not the one in example.java.

Could you please advise if it is possible to accurately extract the function signatures of all callers for the getData method defined in example.java within the P_library project using Tai-e?
I look forward to your response and am willing to provide any additional information needed. Thank you very much.

Expected Behavior

Extract the function signatures of all callers of the getData method defined in the example.java file across the P_library project.
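The expected behavior can be illustrated with a minimal sketch that selects callers by full method signature rather than by simple name. This is not Tai-e code: the `Edge` class, the `callersOf` helper, and all class names in the sample data (`org.example.Example`, `org.example.Other`, and so on) are hypothetical, and the edges are assumed to have already been extracted from a call graph. Only the `<Class: ReturnType name(params)>` signature style follows the notation used in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CallerFilter {
    /** One call edge; caller and callee are full method signatures (hypothetical data model). */
    static final class Edge {
        final String caller;
        final String callee;

        Edge(String caller, String callee) {
            this.caller = caller;
            this.callee = callee;
        }
    }

    /** Returns the distinct callers whose callee signature equals the target exactly. */
    static List<String> callersOf(List<Edge> edges, String target) {
        List<String> result = new ArrayList<>();
        for (Edge e : edges) {
            // Compare the whole signature, so a getData() declared in another class never matches.
            if (e.callee.equals(target) && !result.contains(e.caller)) {
                result.add(e.caller);
            }
        }
        return result;
    }

    /** Hypothetical edges: two classes declare a method named getData. */
    static List<Edge> sampleEdges() {
        return Arrays.asList(
            new Edge("<org.example.Service: void run()>",
                     "<org.example.Example: java.lang.String getData()>"),
            new Edge("<org.example.Report: void print()>",
                     "<org.example.Other: java.lang.String getData()>"),
            new Edge("<org.example.ExampleTest: void testGetData()>",
                     "<org.example.Example: java.lang.String getData()>"));
    }

    public static void main(String[] args) {
        String target = "<org.example.Example: java.lang.String getData()>";
        System.out.println(callersOf(sampleEdges(), target));
    }
}
```

In this sample, the caller of `Other.getData()` is excluded even though the method name is identical, because the declaring class is part of the signature being compared.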

Current Behavior

  1. The extracted callers are incomplete. While debugging, I realized that call graph construction also starts from the entry point, so the missing callers may be due to incomplete test coverage.
  2. The extraction is inaccurate. The extracted callers invoke a getData method of the same name declared in another file, not the one in example.java.

Tai-e Arguments

Click here to see Tai-e Options
{{optionsFile: null
printHelp: false
classPath:
- ../docker_code/P_Library/lib/junit.jar
- ../docker_code/P_Library/target/P_Library_tests.jar
- ../docker_code/P_Library/target/P_Library_source.jar
appClassPath: []
mainClass: null
inputClasses:
- org.example
javaVersion: 6
prependJVM: false
allowPhantom: false
worldBuilderClass: pascal.taie.frontend.soot.SootWorldBuilder
outputDir: output
preBuildIR: false
worldCacheMode: true
scope: APP
nativeModel: true
planFile: null
analyses:
  pta: cs:2-obj;only-app:true;implicit-entries:true;handle-invokedynamic:true
  cg: algorithm:pta;dump:true;dump-methods:true;dump-call-edges:true
  process-result: "analyses:[cg];action:dump;action-file:caller_main.txt"
onlyGenPlan: false
keepResult:
- $KEEP-ALL
}}
Click here to see Tai-e Analysis Plan
{{- id: pta
  options:
    cs: 2-obj
    only-app: true
    implicit-entries: true
    distinguish-string-constants: reflection
    merge-string-objects: true
    merge-string-builders: true
    merge-exception-objects: true
    handle-invokedynamic: true
    propagate-types:
    - reference
    advanced: null
    dump: false
    dump-ci: false
    dump-yaml: false
    expected-file: null
    reflection-inference: string-constant
    reflection-log: null
    taint-config: null
    taint-interactive-mode: false
    plugins: []
    time-limit: -1
- id: cg
  options:
    algorithm: pta
    dump: true
    dump-methods: true
    dump-call-edges: true
- id: process-result
  options:
    analyses:
    - cg
    only-app: true
    action: dump
    action-file: caller_main.txt
    log-mismatches: false
}}

Tai-e Log

Click here to see Tai-e Log
{{Writing log to /Users/kkk/Tai-e/output/tai-e.log
java.version: 17.0.11
java.version.date: 2024-04-16
java.runtime.version: 17.0.11+9-LTS
java.vendor: Amazon.com Inc.
java.vendor.version: Corretto-17.0.11.9.1
os.name: Mac OS X
os.version: 14.5
os.arch: x86_64
Tai-e Version: 0.5.1-SNAPSHOT
Tai-e Commit: f3e1891c84a3b493636683583267aff5ef698aa2
Writing analysis plan to /Users/kkk/Tai-e/output/tai-e-plan.yml
WorldBuilder starts ...
The world cache mode is enabled.
Loading the world cache from /Users/kkk/Tai-e/cache/world-cache-283076132.bin
[Load the world cache] elapsed time: 1426.66s
5879 classes with 57910 methods in the world
WorldBuilder finishes, elapsed time: 1426.86s
pta starts ...
}}

Additional Information

No response

@zhangt2333
Member

Any examples? I'm not sure if I understand your question clearly enough.

@vbifg
Author

vbifg commented Jul 13, 2024

I sincerely apologize for the inconvenience. After carefully tracking the actual types of object instances created during runtime, I have found that the caller information extracted by Tai-e is correct.

Attached is the relevant code content, and the Readme file contains information about the code structure and related issues. I would greatly appreciate your assistance in resolving these issues. Thank you very much for your time.

issue-113.zip

@zhangt2333
Member

zhangt2333 commented Jul 13, 2024

Thank you for the detailed and reproducible material. It is one of the most thorough submissions I have seen in all Tai-e issues. I truly appreciate your effort.

However, after reviewing the materials, it feels like you've moved an elephant (your entire job) into our discussion room. As a Tai-e developer, this overwhelms me, despite my enthusiasm for solving your problems and those of other Tai-e users. I'm concerned that I don't have enough time to fully understand the entire context of your job.

To assist you more effectively, could you please break down your issue into smaller, more specific questions? This will allow us to focus on the particular problems you're encountering with Tai-e, rather than the entirety of your job.

Let me make an attempt to understand the problem. Are you asking how to efficiently find all the possible caller methods of a method <C: void m()>, e.g.,

public class C {
  public void m() {...}
}

while considering that you do not need to handle all entry methods by hand?

If so, call graph construction via CHA (Class Hierarchy Analysis), with every method in the program treated as an entry method, might be what you need, although its precision may be low. If this is not the case, please correct my understanding.
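To make the CHA suggestion concrete, here is a minimal, self-contained sketch of how class hierarchy analysis resolves virtual calls and how callers of `<C: void m()>` fall out of that. It is a toy model and not Tai-e's implementation: the `ChaSketch` class, its methods, and the classes `Base`, `D`, and `App` are all made up for illustration. Every call site in the program is examined, which corresponds to treating all methods as entry methods.

```java
import java.util.*;

class ChaSketch {
    final Map<String, String> superOf = new HashMap<>();       // class -> superclass (null for roots)
    final Map<String, Set<String>> declared = new HashMap<>();  // class -> declared subsignatures
    final List<String[]> callSites = new ArrayList<>();         // {containerMethod, receiverType, subsignature}

    void addClass(String name, String superName, String... subsigs) {
        superOf.put(name, superName);
        declared.put(name, new HashSet<>(Arrays.asList(subsigs)));
    }

    void addCall(String container, String receiverType, String subsig) {
        callSites.add(new String[] {container, receiverType, subsig});
    }

    /** Walks up from cls and returns the first class that declares subsig, or null. */
    String dispatch(String cls, String subsig) {
        for (String c = cls; c != null; c = superOf.get(c)) {
            Set<String> methods = declared.get(c);
            if (methods != null && methods.contains(subsig)) return c;
        }
        return null;
    }

    boolean isSubclassOf(String cls, String ancestor) {
        for (String c = cls; c != null; c = superOf.get(c)) {
            if (c.equals(ancestor)) return true;
        }
        return false;
    }

    /** CHA: a call on declared type T may reach dispatch(S, subsig) for every subclass S of T. */
    Set<String> resolve(String receiverType, String subsig) {
        Set<String> targets = new TreeSet<>();
        for (String c : superOf.keySet()) {
            if (isSubclassOf(c, receiverType)) {
                String owner = dispatch(c, subsig);
                if (owner != null) targets.add("<" + owner + ": " + subsig + ">");
            }
        }
        return targets;
    }

    /** Scans every call site in the program, i.e. all methods act as entries. */
    Set<String> callersOf(String targetSignature) {
        Set<String> callers = new TreeSet<>();
        for (String[] cs : callSites) {
            if (resolve(cs[1], cs[2]).contains(targetSignature)) callers.add(cs[0]);
        }
        return callers;
    }

    /** Hypothetical program: C overrides Base.m(), D inherits it. */
    static ChaSketch sample() {
        ChaSketch cha = new ChaSketch();
        cha.addClass("Base", null, "void m()");
        cha.addClass("C", "Base", "void m()");
        cha.addClass("D", "Base");
        cha.addCall("<App: void viaBase(Base)>", "Base", "void m()");
        cha.addCall("<App: void viaC(C)>", "C", "void m()");
        cha.addCall("<App: void viaD(D)>", "D", "void m()");
        return cha;
    }
}
```

On the sample data, `callersOf("<C: void m()>")` reports `viaBase(Base)` and `viaC(C)` but not `viaD(D)`. The `viaBase` result shows where the low precision comes from: CHA reports it because the declared type `Base` has `C` as a subclass, even if no `C` object ever flows to that call at run time.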

@vbifg
Author

vbifg commented Jul 13, 2024

Thank you for your patience. Indeed, my goal is to accurately find all the exact caller methods for a specific method. I apologize for not clearly describing the specific problem I encountered while using Tai-e, which caused you confusion.

I have just started learning static analysis and using Tai-e. My understanding is that pointer analysis (PTA) can construct a call graph from a specific entry point with relatively high precision, but it misses calls that are not reachable from that entry point; Class Hierarchy Analysis (CHA), by contrast, can build a more comprehensive call graph but may be less precise.

Could you please confirm if my understanding is accurate?

@zhangt2333
Member

zhangt2333 commented Jul 14, 2024

Yes, you are right. As we know, static analysis involves balancing soundness, precision, scalability, and automation. If you choose PTA for call graph construction, it offers better soundness and precision but has lower scalability and automation in your scenario. Tai-e's PTA is a top-down analysis that depends on specific entry points.

If you care a lot about precision, CHA might not be suitable; in that case, you need to work out how to find all entry points.
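The dependence on entry points can be sketched with a small reachability example. This is a deliberate simplification and not how Tai-e's PTA works internally: here the call edges are given up front, whereas a real pointer analysis discovers them during the analysis. The `EntryReachability` class and the method names `testGetData`, `load`, and `export` are hypothetical.

```java
import java.util.*;

class EntryReachability {
    /** Methods reachable from the given entries via call edges (worklist traversal). */
    static Set<String> reachable(Map<String, List<String>> calls, Collection<String> entries) {
        Set<String> seen = new TreeSet<>();
        Deque<String> worklist = new ArrayDeque<>(entries);
        while (!worklist.isEmpty()) {
            String method = worklist.poll();
            if (seen.add(method)) {
                worklist.addAll(calls.getOrDefault(method, Collections.<String>emptyList()));
            }
        }
        return seen;
    }

    /** Callers of target, restricted to methods reachable from the entries. */
    static Set<String> callersOf(Map<String, List<String>> calls,
                                 Collection<String> entries, String target) {
        Set<String> callers = new TreeSet<>();
        for (String method : reachable(calls, entries)) {
            if (calls.getOrDefault(method, Collections.<String>emptyList()).contains(target)) {
                callers.add(method);
            }
        }
        return callers;
    }

    /** Hypothetical library: export() calls getData() but no test exercises it. */
    static Map<String, List<String>> sample() {
        Map<String, List<String>> calls = new HashMap<>();
        calls.put("testGetData", Arrays.asList("load"));
        calls.put("load", Arrays.asList("getData"));
        calls.put("export", Arrays.asList("getData"));
        return calls;
    }
}
```

With `testGetData` as the only entry, the callers of `getData` are just `load`; `export` is missed because nothing reachable from the test calls it. Passing every method as an entry (for example `sample().keySet()`) yields both `export` and `load`, which is why a library without a main method needs a complete set of entry points for an entry-driven analysis to find all callers.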

@vbifg
Author

vbifg commented Jul 14, 2024

Understood, I appreciate your response immensely. Thank you!!!

vbifg closed this as completed Jul 14, 2024