Unsoundness in SIF due to encoding of object allocation #152

tobycmurray · 2022-09-07T01:59:21Z

The modular product program encoding assumes that any object allocation reached in both executions produces an object reference that is low. However, real allocators don’t behave that way: the address an allocation returns depends on which allocations have occurred before it. The following example demonstrates this potential unsoundness in current Nagini:

from nagini_contracts.contracts import *

class MyObject(object):
    pass

def sif_print_str(x: str) -> None:
    Requires(Low(x))
    Requires(LowVal(x))
    pass

def test(x: int) -> None:
    if x > 0:
        m1 = MyObject()
    m2 = MyObject()
    sif_print_str(str(m2))

Nagini verifies this program successfully (using the command line nagini --sif true). Yet on my machine, the first allocation always yields an object for which the least significant 8 bits of the address exposed in its str are “c0”, while for the second allocation these bits are reliably “60” instead.

$ python
Python 3.8.13 (default, Apr 19 2022, 00:53:22) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyObject(object):
...     pass
... 
>>> m1 = MyObject()
>>> m2 = MyObject()
>>> str(m1)
'<__main__.MyObject object at 0x7f214e94d4c0>'
>>> str(m2)
'<__main__.MyObject object at 0x7f214e8a5d60>'
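
A minimal sketch of how an observer could recover the address from the printed string, assuming CPython, where the default str of an object embeds its address in hexadecimal. The helper names address_from_str and observe are hypothetical and not part of Nagini; which low-order bits appear depends on the allocator and the machine.

```python
import re


class MyObject(object):
    pass


def address_from_str(obj: object) -> int:
    """Parse the hexadecimal address out of the default str() output."""
    match = re.search(r"at 0x([0-9a-fA-F]+)>", str(obj))
    if match is None:
        raise ValueError("str() output contains no address")
    return int(match.group(1), 16)


def observe(x: int) -> int:
    """Mirror test() from the example, returning what the printed string reveals."""
    if x > 0:
        m1 = MyObject()  # extra allocation that happens only when x > 0
    m2 = MyObject()
    # Whether m2 is the first or second allocation depends on x, so the
    # address in str(m2) can differ between the two cases.
    return address_from_str(m2)


if __name__ == "__main__":
    # Print only the least significant 8 bits, as in the report above.
    print(hex(observe(1) & 0xFF), hex(observe(0) & 0xFF))
```

Within a single process the two calls may or may not print different values, since freed memory can be reused; the session above compares allocation positions in a fresh interpreter.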


marcoeilers · 2022-09-07T09:07:08Z

Thanks for filing the issue!

I'm discussing this problem in the final version of the thesis, of course. For the implementation, I'm not yet sure what's the best solution. We could, as you suggest, either treat new allocations as low if all previous low allocations were low events, but that would require new specifications to be modular, or just enforce that all allocations are low events, which would be quite restrictive.

An alternative that I'm currently leaning toward is this: The encoding we use is sound only if the actual addresses of objects are never visible, i.e., cannot be inspected except for equality with other references. This is the case throughout most of Python (which is why we're using that encoding), but as you show, the default implementation of __str__ is an exception and leaks the actual address value. So instead of changing the encoding of allocation, we could (in general) treat the output of object.__str__ as high, since it gives access to values that we don't want to be visible, and do the same with any other built-in methods that leak address information.

tobycmurray · 2022-09-08T00:21:32Z

That's a really nice solution (treating the output of "str" as high by default since it exposes address information). I'm not sure if there are other places in Python that would expose address information that would also need to be treated as "high".
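
On the question of other places: in CPython, at least id() and the default repr() expose the same address as str(), and the default hash() is derived from it. The following is a small sketch under that CPython assumption, not a complete list; the helper name address_views is hypothetical.

```python
import re


class MyObject(object):
    pass


def address_views(obj: object) -> dict:
    """Collect values that reveal (or are derived from) obj's address in CPython."""
    match = re.search(r"at 0x([0-9a-fA-F]+)>", repr(obj))
    return {
        # id() returns the object's memory address in CPython.
        "id": id(obj),
        # The default repr() (which the default str() delegates to) embeds it in hex.
        "repr": int(match.group(1), 16) if match else None,
        # The default hash() is computed from the address, so it is also suspect.
        "hash": hash(obj),
    }


if __name__ == "__main__":
    print(address_views(MyObject()))
```

Under the proposal above, the results of each of these built-ins would presumably need to be treated as high as well.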

Thanks for looking into this, and well done again on your thesis. I'm looking forward to pointing my students toward it once the final version is published.

marcoeilers · 2023-08-22T21:55:06Z

Finally fixed in PR #155

marcoeilers added a commit that referenced this issue Aug 22, 2023

Fixing issue #152

d032b72

marcoeilers closed this as completed Aug 22, 2023
