How to get object names #126

aascorreia · 2024-11-03T13:25:43Z

📝 Overall Description

Hello!

I'm currently performing Tai-e's PTA over a test class Counter with the following options:

-pp -cp bin -m classes.Counter -a pta=cs:1-call;implicit-entries:false;only-app:true;distinguish-string-constants:reflection;time-limit:-1;

The class itself looks like this:

public class Counter {
    public static void main(String... args) {
        Counter c1 = new Counter();
        Counter c2 = new Counter();
        increment(c1);
        increment(c2);
    }
    private int counter;
    static void increment(Counter c) { c.counter++; }
}

The goal in mind is to gather information regarding the objects invoking the various class methods, identifying the fields involved in read and/or write operations.

By iterating through LoadField and StoreField statements for each variable provided by PointerAnalysisResultImpl.getVars(), I am able to access the field references that are subject to read and write operations, respectively. This information is then used to populate a map whose keys correspond to said field references' names, and values store objects representing the variable's access information (method name and access type).

When running Tai-e's PTA over Counter, I get the following information:

{counter=[increment{READ}, increment{WRITE}, increment{READ}, increment{WRITE}]}

Since both c1 and c2 call increment, it makes sense that two instances of read and write operations are captured. The issue is that it would be ideal to separate these two instances into distinct map keys such that:

{c1.counter=[increment{READ}, increment{WRITE}], c2.counter=[increment{READ}, increment{WRITE}]}

While it is possible to obtain the points-to set of a given variable using PointerAnalysisResultImpl.getPointsToSet(Var), I cannot seem to find a way to "resolve" the retrieved Obj objects (example shown below) to get the corresponding object names that are present in the code (c1 and c2 in this case).

NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}
NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}

Is it feasible to obtain this information, or does the framework not allow it? Should I be using other analysis options or plugins?

Additionally, if Counter's field was instead a reference to another class which contains the counter that is modified in increment, would I have to call getPointsToSet recursively to get to c1.ref.counter, for example?

Thank you for your time.

🎯 Expected Behavior

Printing my map should output to the console:

{c1.counter=[increment{READ}, increment{WRITE}], c2.counter=[increment{READ}, increment{WRITE}]}

🐛 Current Behavior

Because I cannot obtain the actual object names currently, the console displays:

{counter=[increment{READ}, increment{WRITE}, increment{READ}, increment{WRITE}]}

🔄 Reproducible Example

No response

⚙️ Tai-e Arguments

No response

📜 Tai-e Log

No response

ℹ️ Additional Information

No response

jjppp · 2024-11-04T10:46:53Z

Hi @aascorreia.

Conceptually, method increment performs READ and WRITE operations on objects rather than variables.
Here c1 and c2 are variables, while NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter} and NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter} are objects.
Since objects can be passed around and assigned to variables with different names, the two Counter objects' names aren't really c1 and c2.
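To make the distinction concrete, here is a minimal plain-Java sketch (no Tai-e involved; the class AliasDemo and the variable alias are made up for illustration). It shows one object being reached through two different variable names, which is why an object cannot be said to have a single name:

```java
// Illustrative only: one object, two variable names.
final class AliasDemo {
    static final class Counter {
        int counter;
    }

    static void increment(Counter c) {
        c.counter++;
    }

    /** Increments the same object once through each of two variables and returns its counter. */
    static int incrementViaAliases() {
        Counter c1 = new Counter(); // a single allocation, hence a single object
        Counter alias = c1;         // a second variable that refers to the same object
        increment(c1);
        increment(alias);
        return c1.counter;
    }

    public static void main(String[] args) {
        // Both increments hit the same object, whichever name was used at the call site.
        System.out.println(incrementViaAliases());
    }
}
```

A pointer analysis would report the same abstract object in the points-to sets of both c1 and alias, so neither name identifies the object on its own.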

Tai-e prefixes abstract objects with their allocation site (i.e., the method where the allocation happens).
As you can see, those two Counter objects are allocated in the method void main(String[]).

A possible solution to your problem would be roughly like this:

1. Perform pointer analysis for the program to be analyzed.
2. For each Load instruction x = y.f in method increment, record a READ operation on the objects in pts(y) (possibly using a map with Obj as its key), where pts(y) refers to the points-to set of the variable y.
3. For each Store instruction x.f = y, record a WRITE operation on the objects in pts(x).
4. For each variable of interest (e.g., c1 and c2 in method void main(String[])), scan the points-to set of the variable, look up what operations are performed on each object in it, and store those operations back to a map whose keys are variables.
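The steps above can be sketched with plain Java collections. This is only a toy model under stated assumptions: strings stand in for Tai-e's Var and Obj, the points-to sets are hard-coded rather than taken from a real analysis result, and the names AccessByVariableSketch, Access, o1 and o2 are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: plain strings stand in for Tai-e's Var and Obj, and the
// points-to sets are hard-coded instead of coming from a real analysis.
final class AccessByVariableSketch {

    /** One field access through a base variable, e.g. a READ of c.counter in increment. */
    static final class Access {
        final String baseVar, field, method, type;

        Access(String baseVar, String field, String method, String type) {
            this.baseVar = baseVar;
            this.field = field;
            this.method = method;
            this.type = type;
        }
    }

    static Map<String, List<String>> accessesPerVariable(
            Map<String, Set<String>> pts, List<Access> accesses, List<String> varsOfInterest) {
        // Steps 2 and 3: attribute every access to each object in pts(base variable).
        Map<String, Map<String, List<String>>> byObject = new LinkedHashMap<>();
        for (Access a : accesses) {
            for (String obj : pts.getOrDefault(a.baseVar, Set.of())) {
                byObject.computeIfAbsent(obj, k -> new LinkedHashMap<>())
                        .computeIfAbsent(a.field, k -> new ArrayList<>())
                        .add(a.method + "{" + a.type + "}");
            }
        }
        // Step 4: for each variable of interest, collect the accesses recorded
        // on the objects it may point to, keyed by "variable.field".
        Map<String, List<String>> byVariable = new LinkedHashMap<>();
        for (String v : varsOfInterest) {
            for (String obj : pts.getOrDefault(v, Set.of())) {
                byObject.getOrDefault(obj, Map.of()).forEach((field, ops) ->
                        byVariable.computeIfAbsent(v + "." + field, k -> new ArrayList<>())
                                .addAll(ops));
            }
        }
        return byVariable;
    }

    /** Hard-coded stand-in for the Counter example: c is the parameter of increment. */
    static Map<String, List<String>> demo() {
        Map<String, Set<String>> pts = Map.of(
                "c1", Set.of("o1"),
                "c2", Set.of("o2"),
                "c", Set.of("o1", "o2"));
        List<Access> accesses = List.of(
                new Access("c", "counter", "increment", "READ"),
                new Access("c", "counter", "increment", "WRITE"));
        return accessesPerVariable(pts, accesses, List.of("c1", "c2"));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With this toy input, main prints {c1.counter=[increment{READ}, increment{WRITE}], c2.counter=[increment{READ}, increment{WRITE}]}. Note that if two variables of interest may point to the same object, the object's accesses are attributed to both of them, which is expected given that the analysis reasons about objects rather than names.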

aascorreia · 2024-11-04T19:55:17Z

Thank you for shedding light on the difference between objects and variables. I did not consider that aspect of objects at first and can now understand why Obj does not store the name of either c1 or c2.

However, I am a bit confused with Step 4. For reference, here is the code that is being executed after analysis is done, as I believe Steps 2 and 3 have already been accomplished to some extent.

PointerAnalysisResultImpl result = World.get().getResult("pta");
Collection<CSVar> csVars = result.getCSVars();
FieldAccessMap ptaInfo = new FieldAccessMap();
if (!csVars.isEmpty()) {
    for (CSVar var : csVars) {
        for (Obj obj : result.getPointsToSet(var.getVar())) {
            System.out.println(var.getVar().getName() + "=> " + obj);
        }
        System.out.println("-".repeat(100));

        if (!var.getVar().getLoadFields().isEmpty()) {
            for (LoadField lField : var.getVar().getLoadFields()) {
                ptaInfo.recordAccess(
                        lField.getFieldAccess().getFieldRef().getName(),
                        var.getVar().getMethod().getName(),
                        AccessType.READ
                );
            }
        }

        if (!var.getVar().getStoreFields().isEmpty()) {
            for (StoreField sField : var.getVar().getStoreFields()) {
                if (!sField.getRValue().getMethod().getSignature().contains("<init>")) {
                    ptaInfo.recordAccess(
                            sField.getFieldAccess().getFieldRef().getName(),
                            var.getVar().getMethod().getName(),
                            AccessType.WRITE
                    );
                }
            }
        }
    }
}
ptaInfo.printAccessMap();

FieldAccessMap is what holds the map I initially mentioned.
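FieldAccessMap and AccessType are not shown in this thread. For readers who want to run the snippets, here is a minimal sketch of what they might look like, assuming only the recordAccess(key, methodName, accessType) and printAccessMap() calls used above and the printed format from the issue description. Everything else in it is a guess, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reconstruction: the real FieldAccessMap is not part of this thread.
final class FieldAccessMap {
    // Key: field name (or object + field name); value: accesses in recording order.
    private final Map<String, List<String>> accesses = new LinkedHashMap<>();

    /** Appends an access such as increment{READ} to the list stored under the given key. */
    void recordAccess(String key, String methodName, AccessType type) {
        accesses.computeIfAbsent(key, k -> new ArrayList<>())
                .add(methodName + "{" + type + "}");
    }

    /** Prints the whole map in the {key=[method{TYPE}, ...]} form used in this issue. */
    void printAccessMap() {
        System.out.println(this);
    }

    @Override
    public String toString() {
        return accesses.toString();
    }

    public static void main(String[] args) {
        FieldAccessMap map = new FieldAccessMap();
        map.recordAccess("counter", "increment", AccessType.READ);
        map.recordAccess("counter", "increment", AccessType.WRITE);
        map.printAccessMap();
    }
}

// Assumed to be a simple two-value enum, matching the READ/WRITE uses above.
enum AccessType { READ, WRITE }
```

Keeping the key a plain string means the same class works whether the key is a field name, as in the code above, or an object description concatenated with a field name.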

jjppp · 2024-11-05T07:24:18Z

I believe this is what you want.

if (!var.getVar().getLoadFields().isEmpty()) {
    for (LoadField lField : var.getVar().getLoadFields()) {
        for (Obj obj : result.getPointsToSet(var.getVar())) {
            ptaInfo.recordAccess(
                    obj + lField.getFieldAccess().getFieldRef().getName(),
                    var.getVar().getMethod().getName(),
                    AccessType.READ
            );
        }
    }
}

if (!var.getVar().getStoreFields().isEmpty()) {
    for (StoreField sField : var.getVar().getStoreFields()) {
        if (!sField.getRValue().getMethod().getSignature().contains("<init>")) {
            for (Obj obj : result.getPointsToSet(var.getVar())) {
                ptaInfo.recordAccess(
                        obj + sField.getFieldAccess().getFieldRef().getName(),
                        var.getVar().getMethod().getName(),
                        AccessType.WRITE
                );
            }
        }
    }
}

The output will be something like this:

NewObj{<Counter: void main(java.lang.String[])>[2@L4] new Counter}counter: [<increment, READ>, <increment, WRITE>, <increment, READ>, <increment, WRITE>]
NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}counter: [<increment, READ>, <increment, WRITE>, <increment, READ>, <increment, WRITE>]

aascorreia · 2024-11-05T08:32:24Z

I see. Given your answer though, I'm assuming there really is no way to know the name of a variable since, from what I understood, var.getVar().getName() retrieves a reference to memory rather than an actual name, and NewObj is referencing the object itself.

I was hoping that, even if Tai-e cannot provide that bit of information (c1 and c2 as names), it would be possible to implement a new plugin, or use an existing one, for that effect.

aascorreia added the type: question label Nov 3, 2024

zhangt2333 assigned jjppp Nov 4, 2024
