Unsoundness in SIF due to encoding of object allocation #152

Closed
tobycmurray opened this issue Sep 7, 2022 · 3 comments

@tobycmurray

The modular product program encoding assumes that any object allocation reached in both executions produces an object reference that is low. However, real allocators don't behave that way: where a new object is placed depends on which allocations have occurred before it. The following example demonstrates this potential unsoundness in current Nagini:

from nagini_contracts.contracts import *

class MyObject(object):
    pass

def sif_print_str(x: str) -> None:
    Requires(Low(x))
    Requires(LowVal(x))
    pass

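def test(x: int) -> None:
    if x > 0:
        m1 = MyObject()  # allocated only when x > 0, so the allocation history depends on x
    m2 = MyObject()
    sif_print_str(str(m2))  # the default str of m2 exposes its address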

Nagini verifies this program successfully (using the command line nagini --sif true). Yet on my machine, the first allocation always yields an object whose address, as exposed in its str, has “c0” as its least significant 8 bits, while for the second allocation these bits are reliably “60” instead.

$ python
Python 3.8.13 (default, Apr 19 2022, 00:53:22) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyObject(object):
...     pass
... 
>>> m1 = MyObject()
>>> m2 = MyObject()
>>> str(m1)
'<__main__.MyObject object at 0x7f214e94d4c0>'
>>> str(m2)
'<__main__.MyObject object at 0x7f214e8a5d60>'
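
The dependence on allocation history can also be observed from within a program. The following is a minimal sketch, assuming CPython, where the default str of an object embeds its address and id() returns that same address. The helper names low_address_byte and observe are hypothetical and not part of Nagini. The actual byte values vary by machine and run, so none are shown here.

```python
class MyObject(object):
    pass


def low_address_byte(obj: object) -> int:
    """Return the least significant 8 bits of obj's address.

    Assumption: CPython, where id() returns the memory address.
    """
    return id(obj) & 0xFF


def observe(x: int) -> int:
    """Mirror of test(): whether m1 is allocated depends on the input x."""
    if x > 0:
        m1 = MyObject()  # extra allocation, performed only when x > 0
    m2 = MyObject()
    return low_address_byte(m2)


if __name__ == "__main__":
    # If the two printed values differ, an observer of str(m2)
    # can tell whether x > 0 held in that run.
    print(hex(observe(1)), hex(observe(0)))
```

Whether the two values actually differ depends on the allocator and on what else has been allocated, so this is an illustration of the channel rather than a guaranteed reproduction.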
@marcoeilers
Owner

Thanks for filing the issue!

I'm discussing this problem in the final version of the thesis, of course. For the implementation, I'm not yet sure what the best solution is. We could, as you suggest, treat new allocations as low if all previous low allocations were low events, but that would require new specifications to be modular. Alternatively, we could just enforce that all allocations are low events, which would be quite restrictive.

An alternative that I'm currently leaning toward is this: the encoding we use is sound only if the actual addresses of objects are never visible, i.e., references cannot be inspected except for equality with other references. This is the case throughout most of Python (which is why we're using that encoding), but as you show, the default implementation of __str__ is an exception and leaks the actual address value. So instead of changing the encoding of allocation, we could (in general) treat the output of object.__str__ as high, since it gives access to values that we don't want to be visible, and do the same with any other built-in methods that leak address information.
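
To illustrate the distinction this proposal relies on, here is a small sketch in plain Python. It is not Nagini's implementation, which is not shown in this thread. It contrasts the default string conversion, which in CPython embeds the object's address, with a user-defined __str__ that depends only on field values. The names Opaque, Point and leaks_address are hypothetical.

```python
class Opaque(object):
    """No __str__ override: str() falls back to the default, address-bearing form."""
    pass


class Point(object):
    """Overrides __str__ so the result depends only on field values."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __str__(self) -> str:
        return "Point({}, {})".format(self.x, self.y)


def leaks_address(obj: object) -> bool:
    """Heuristic check (CPython assumption): does str(obj) contain obj's address in hex?"""
    return format(id(obj), "x") in str(obj).lower()


if __name__ == "__main__":
    print(leaks_address(Opaque()))
    print(leaks_address(Point(1, 2)))
```

Under the treatment described above, only the inherited default conversion would need to be considered high. How user-defined overrides are handled is a separate question that this sketch does not settle.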

@tobycmurray
Author

That's a really nice solution (treating the output of __str__ as high by default, since it exposes address information). I'm not sure whether there are other places in Python that expose address information and would also need to be treated as high.
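
As a starting point for that question, the sketch below collects a few built-in operations whose results are, in CPython, derived from an object's address when the class does not override them. This is an illustrative and likely incomplete list based on CPython behaviour, not a statement about what Nagini treats as high. The function name address_observations is hypothetical.

```python
class MyObject(object):
    pass


def address_observations(obj: object) -> dict:
    """Collect values that, in CPython, are derived from obj's memory address.

    Assumption: obj's class does not override __repr__, __str__ or __hash__.
    """
    return {
        "id": id(obj),      # CPython: the address itself
        "repr": repr(obj),  # default repr embeds the address in hex
        "str": str(obj),    # default str delegates to repr
        "hash": hash(obj),  # default hash is computed from the address
    }


if __name__ == "__main__":
    for name, value in address_observations(MyObject()).items():
        print(name, value)
```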

Thanks for looking into this, and well done again on your thesis. I'm looking forward to pointing my students toward it once the final version is published.

marcoeilers added a commit that referenced this issue Aug 22, 2023
@marcoeilers
Owner

Finally fixed in PR #155
