How to get object names #126

Open
aascorreia opened this issue Nov 3, 2024 · 4 comments

@aascorreia

📝 Overall Description

Hello!

I'm currently performing Tai-e's PTA over a test class Counter with the following options:

-pp -cp bin -m classes.Counter -a pta=cs:1-call;implicit-entries:false;only-app:true;distinguish-string-constants:reflection;time-limit:-1;

The class itself looks like this:

public class Counter {
    public static void main(String... args) {
        Counter c1 = new Counter();
        Counter c2 = new Counter();
        increment(c1);
        increment(c2);
    }
    private int counter;
    static void increment(Counter c) { c.counter++; }
}

The goal in mind is to gather information regarding the objects invoking the various class methods, identifying the fields involved in read and/or write operations.

By iterating through LoadField and StoreField statements for each variable provided by PointerAnalysisResultImpl.getVars(), I am able to access the field references that are subject to read and write operations, respectively. This information is then used to populate a map whose keys correspond to said field references' names, and values store objects representing the variable's access information (method name and access type).

When running Tai-e's PTA over Counter, I get the following information:

{counter=[increment{READ}, increment{WRITE}, increment{READ}, increment{WRITE}]}

Since both c1 and c2 call increment, it makes sense that two instances of read and write operations are captured. The issue is that it would be ideal to separate these two instances into distinct map keys such that:

{c1.counter=[increment{READ}, increment{WRITE}], c2.counter=[increment{READ}, increment{WRITE}]}

While it is possible to obtain the points-to set of a given variable using PointerAnalysisResultImpl.getPointsToSet(Var), I cannot seem to find a way to "resolve" the retrieved Obj objects (example shown below) to get the corresponding object names that are present in the code (c1 and c2 in this case).

NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}
NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}

Is it feasible to obtain this information, or does the framework not allow it? Should I be using other analysis options or plugins?

Additionally, if Counter's field was instead a reference to another class which contains the counter that is modified in increment, would I have to call getPointsToSet recursively to get to c1.ref.counter, for example?

Thank you for your time.

🎯 Expected Behavior

Printing my map should output to the console:

{c1.counter=[increment{READ}, increment{WRITE}], c2.counter=[increment{READ}, increment{WRITE}]}

🐛 Current Behavior

Because I cannot obtain the actual object names currently, the console displays:

{counter=[increment{READ}, increment{WRITE}, increment{READ}, increment{WRITE}]}

🔄 Reproducible Example

No response

⚙️ Tai-e Arguments

No response

📜 Tai-e Log

No response

ℹ️ Additional Information

No response

@jjppp
Collaborator

jjppp commented Nov 4, 2024

Hi @aascorreia.

Conceptually, method increment performs READ and WRITE operations on objects rather than variables.
Here c1 and c2 are variables, while NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter} and NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter} are objects.
Since objects can be passed around and assigned to variables with different names, the two Counter objects' names aren't really c1 and c2.
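
To make the variable/object distinction concrete, here is a small illustrative program. It is not part of the original Counter test: the nested Counter class mirrors the one in the issue, and the variable name alias is made up for the example. It shows one object reachable through two variable names, which is why an abstract object cannot carry a single variable name.

```java
public class AliasSketch {
    // Mirrors the Counter class from the issue, nested here to keep the sketch self-contained.
    static class Counter { int counter; }

    static void increment(Counter c) { c.counter++; }

    public static void main(String[] args) {
        Counter c1 = new Counter();   // one allocation site, hence one object
        Counter alias = c1;           // a second variable pointing to the same object
        increment(alias);             // the write happens on the object, not on "alias"
        System.out.println(c1.counter);   // prints 1
        System.out.println(c1 == alias);  // prints true
    }
}
```

Inside increment, the same object is additionally reachable through the parameter c, so it has three names at that point.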

Tai-e labels abstract objects with their allocation site (i.e., the method and statement where the allocation happens).
As you can see, those two Counter objects are allocated in the method void main(String[]).

A possible solution to your problem would be roughly like this:

  1. Perform pointer analysis on the program to be analyzed.
  2. For each Load instruction x = y.f in method increment, record a READ operation on the objects in pts(y) (for example, in a map with Obj as its key), where pts(y) is the points-to set of the variable y.
  3. For each Store instruction x.f = y, record a WRITE operation on the objects in pts(x).
  4. For each variable of interest (e.g., c1 and c2 in method void main(String[])), scan the points-to set of the variable, look up which operations are performed on each object in it, and store those operations in a map whose keys are variables.
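
The four steps can be sketched in plain Java without any Tai-e classes. This is only an illustration under assumptions: the points-to sets are written out by hand as a map from variable names to object labels (o1 and o2 stand for the two allocation sites in main), and FieldAccess, accessesByObject and accessesByVariable are hypothetical names, not Tai-e API. In a real analysis, the points-to sets and the load/store statements would come from the pointer analysis result.

```java
import java.util.*;

public class AccessSketch {
    enum AccessType { READ, WRITE }

    // Hypothetical stand-in for a load/store statement: the method it occurs in,
    // its base variable, the accessed field, and the access kind.
    record FieldAccess(String method, String baseVar, String field, AccessType type) {}

    // Steps 2 and 3: attribute each access to every object in pts(base variable).
    static Map<String, List<FieldAccess>> accessesByObject(
            Map<String, Set<String>> pts, List<FieldAccess> accesses) {
        Map<String, List<FieldAccess>> byObj = new HashMap<>();
        for (FieldAccess a : accesses) {
            for (String obj : pts.getOrDefault(a.baseVar(), Set.of())) {
                byObj.computeIfAbsent(obj, k -> new ArrayList<>()).add(a);
            }
        }
        return byObj;
    }

    // Step 4: for each variable of interest, look up the accesses recorded on
    // the objects it may point to and key them as "<variable>.<field>".
    static Map<String, Set<String>> accessesByVariable(
            Map<String, Set<String>> pts,
            Map<String, List<FieldAccess>> byObj,
            List<String> varsOfInterest) {
        Map<String, Set<String>> byVar = new TreeMap<>();
        for (String v : varsOfInterest) {
            for (String obj : pts.getOrDefault(v, Set.of())) {
                for (FieldAccess a : byObj.getOrDefault(obj, List.of())) {
                    byVar.computeIfAbsent(v + "." + a.field(), k -> new TreeSet<>())
                         .add(a.method() + "{" + a.type() + "}");
                }
            }
        }
        return byVar;
    }

    public static void main(String[] args) {
        // Step 1 is assumed done: hand-written points-to sets for the Counter example.
        Map<String, Set<String>> pts = Map.of(
                "c1", Set.of("o1"),
                "c2", Set.of("o2"),
                "c", Set.of("o1", "o2")); // parameter of increment
        List<FieldAccess> accesses = List.of(
                new FieldAccess("increment", "c", "counter", AccessType.READ),
                new FieldAccess("increment", "c", "counter", AccessType.WRITE));

        Map<String, List<FieldAccess>> byObj = accessesByObject(pts, accesses);
        System.out.println(accessesByVariable(pts, byObj, List.of("c1", "c2")));
        // {c1.counter=[increment{READ}, increment{WRITE}], c2.counter=[increment{READ}, increment{WRITE}]}
    }
}
```

The sketch stores operations in a sorted set, so repeated entries for the same method and access type collapse into one. Switching the value type to a list would keep every occurrence instead.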

@aascorreia
Author

aascorreia commented Nov 4, 2024

Thank you for shedding light on the difference between objects and variables. I did not consider that aspect of objects at first and can now understand why Obj does not store the name of either c1 or c2.

However, I am a bit confused about Step 4. For reference, here is the code that runs after the analysis is done; I believe it already covers Steps 2 and 3 to some extent.

            PointerAnalysisResultImpl result = World.get().getResult("pta");
            Collection<CSVar> csVars = result.getCSVars();
            FieldAccessMap ptaInfo = new FieldAccessMap();
            if (!csVars.isEmpty())
                for (CSVar var : csVars) {
                    for (Obj obj : result.getPointsToSet(var.getVar()))
                        System.out.println(var.getVar().getName() + "=> " + obj);
                    System.out.println("-".repeat(100));

                    if (!var.getVar().getLoadFields().isEmpty())
                        for (LoadField lField : var.getVar().getLoadFields())
                            ptaInfo.recordAccess(
                                    lField.getFieldAccess().getFieldRef().getName(),
                                    var.getVar().getMethod().getName(),
                                    AccessType.READ
                            );

                    if (!var.getVar().getStoreFields().isEmpty())
                        for (StoreField sField : var.getVar().getStoreFields())
                            if (!sField.getRValue().getMethod().getSignature().contains("<init>"))
                                ptaInfo.recordAccess(
                                        sField.getFieldAccess().getFieldRef().getName(),
                                        var.getVar().getMethod().getName(),
                                        AccessType.WRITE
                                );
                }
            ptaInfo.printAccessMap();

FieldAccessMap is what holds the map I initially mentioned.

@jjppp
Collaborator

jjppp commented Nov 5, 2024

I believe this is what you want.

if (!var.getVar().getLoadFields().isEmpty()) {
    for (LoadField lField : var.getVar().getLoadFields()) {
        for (Obj obj : result.getPointsToSet(var.getVar())) {
            ptaInfo.recordAccess(
                    obj + lField.getFieldAccess().getFieldRef().getName(),
                    var.getVar().getMethod().getName(),
                    AccessType.READ
            );
        }
    }
}

if (!var.getVar().getStoreFields().isEmpty()) {
    for (StoreField sField : var.getVar().getStoreFields()) {
        if (!sField.getRValue().getMethod().getSignature().contains("<init>")) {
            for (Obj obj : result.getPointsToSet(var.getVar())) {
                ptaInfo.recordAccess(
                        obj + sField.getFieldAccess().getFieldRef().getName(),
                        var.getVar().getMethod().getName(),
                        AccessType.WRITE
                );
            }
        }
    }
}

The output will be something like this:

NewObj{<Counter: void main(java.lang.String[])>[2@L4] new Counter}counter: [<increment, READ>, <increment, WRITE>, <increment, READ>, <increment, WRITE>]
NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}counter: [<increment, READ>, <increment, WRITE>, <increment, READ>, <increment, WRITE>]

@aascorreia
Author

I see. Given your answer though, I'm assuming there really is no way to know the name of a variable since, from what I understood, var.getVar().getName() retrieves a reference to memory rather than an actual name, and NewObj is referencing the object itself.

I was hoping that, even if Tai-e cannot provide that bit of information (c1 and c2 as names), it would be possible to implement a new plugin, or use an existing one, for that effect.
