Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

False positive because the value domain doesn't split/merge cell variables #117

Open
ceesb opened this issue Apr 19, 2019 · 6 comments
Open
Labels
C-false-positive Category: False Positive L-c Language: C P-medium Priority: Medium

Comments

@ceesb
Copy link

ceesb commented Apr 19, 2019

Hi,

I am playing with some code that uses memset (or initializers like below) to set defaults in structures. I'm running into an issue there. This ikos_assert can not be proven:

#include <ikos/analyzer/intrinsic.h>
#include <stdint.h>

struct a {
	uint8_t b;
};


int main() {
	struct a i = {0};
	__ikos_assert(i.b == 0);
	return 0;
}

But this one can, the only difference being the value of the constant:

#include <ikos/analyzer/intrinsic.h>
#include <stdint.h>

struct a {
	uint8_t b;
};


int main() {
	struct a i = {1};
	__ikos_assert(i.b == 1);
	return 0;
}

Out of curiosity, I added the --display-ar flag to ikos to print the abstract representation. For the first example, I see the initializer is translated into ar.memset:

define si32 @main() {
#entry !entry successors={#1, #2} {
  {0: ui8}* $i = allocate {0: ui8}, 1, align 1
  si8* %1 = bitcast $i
  call @ar.memset(%1, 0, 1, 1, 0)
....

For the second example, I see ar.memcpy instead:

define si32 @main() {
#entry !entry successors={#1, #2} {
  {0: ui8}* $i = allocate {0: ui8}, 1, align 1
  si8* %1 = bitcast $i
  si8* %2 = ptrshift @__const.main.i, 1 * 0, 1 * 0
  call @ar.memcpy(%1, %2, 1, 1, 1, 0)
...

Any ideas as to what's happening? I may be able to fix if you point me into the right direction.

thx!
Cees

@arthaud arthaud added L-c Language: C C-false-positive Category: False Positive labels Apr 20, 2019
@arthaud
Copy link
Member

arthaud commented Apr 20, 2019

Hi @ceesb,

Thanks for trying ikos.

This is an imprecision of the memory analysis. The current implementation abstracts the memory using cell variables. A cell variable is a symbolic variable that represents the value at a given memory location, offset, and size.

In the second example, there is a constant structure __const.main.i that has a global initializer. This creates the cell variable Cell{__const.main.i, offset=0, size=1} with a value of 1. The memcpy then copies this cell into the local variable. This works fine.

In the first example, the local variable is initialized with memset. The current implementation of the abstract memset (see here) only updates existing cell variables. It should also create one big cell variable equal to zero.

The reason why it is not currently implemented is because the current implementation does not know how to split cell variables, thus creating a big cell for the whole structure is pointless. If you have a structure with multiple fields, and you create a big cell of the size of the structure that is equal to zero, when you will read a specific field, the big cell will just be ignored, and the result will be unknown. This could be improved too, but we should be careful about endianness (see here and here).

In this case, fixing memset to create a big cell should be enough to fix the problem. Would you like to try to implement it?

@ceesb
Copy link
Author

ceesb commented Apr 22, 2019

I see now that the memcpy/memset difference is introduced by LLVM.

I tricked myself by incorrectly assuming that the "{1}" initializer would write 1's in the entire structure, but that's not true, and ikos rightly points out that the assertion below never holds ;-) Got to love C ..

#include <ikos/analyzer/intrinsic.h>
#include <stdint.h>

struct a {
	uint8_t b;
	uint8_t c;
};


int main() {
	struct a i = {1};
	__ikos_assert(i.c == 1);
	return 0;
}

Just to understand the underlying issue with the memory cells you dscribe, I constructed another example using a struct with multiple elements of different sizes. The last assertion cannot be proven (the platform is little endian). This is caused by ikos not being able to split cells?

#include <ikos/analyzer/intrinsic.h>
#include <stdint.h>

struct __attribute__((__packed__)) a {
	uint8_t a;
	uint8_t b;
	uint32_t c;
};

int main() {
	uint8_t buf[6] = {0xce,0xe5,0x01,0x00,0x00,0x00};

	struct a *i = (struct a*)&buf;
	__ikos_assert(i->b == 0xe5);
	__ikos_assert(i->c == 1);
	return 0;
}

@arthaud
Copy link
Member

arthaud commented Apr 22, 2019

Yes, but in that case the problem is the opposite: it doesn't merge the cells.

@ceesb
Copy link
Author

ceesb commented Apr 22, 2019

OK, thanks. I understand that the missing cell splitting and merging in ikos affects its capabilities on code that uses the case-to-struct trick on arrays of primitives types, and code that heavily depends on static initializers.

I have another question. The target code I'm looking at is mixing pointers and primitive types. Ikos doesn't seem to appreciate this. The following example causes ikos to report it cannot bound the offset of a->b.

#include <stdint.h>
#include <stdio.h>

struct a {
	uint8_t b;
};

struct b {
	struct a a;
	intptr_t p;
	uint8_t c;
};

void test(struct b *b) {
	struct a *a = NULL;

	a = (struct a*)(b->p);

	if(a->b) {
		printf("if branch\n");
	} else {
		printf("else branch\n");
	}
}

int main() {
	struct a a;
	a.b = 0;
	struct b b;
	b.p = (uintptr_t)&a;
	
	test(&b);
	
	return 0;
}

Is this because once you determine a variable to be a primitive type it may never be used as a pointer? A limitation of the memory model?

I hope you don't mind me peeking and poking a little bit, I'm just trying to understand how I can make the most use of ikos on some real-world code I'm reviewing.

@arthaud
Copy link
Member

arthaud commented Apr 22, 2019

OK, thanks. I understand that the missing cell splitting and merging in ikos affects its capabilities on code that uses the case-to-struct trick on arrays of primitives types, and code that heavily depends on static initializers.

Yes. So far, I didn't have any problem with it when analyzing safety critical code, but I imagine this can be a problem for some users.

I have another question. The target code I'm looking at is mixing pointers and primitive types. Ikos doesn't seem to appreciate this. The following example causes ikos to report it cannot bound the offset of a->b.

The problem here is the cast from pointer to integer. When converting to an integer, we are losing all the relevant information on the pointer.

You can fix this by using a void* instead of uintptr_t in the structure.
Another trick would be to cast b.p to struct a* when using it:

// In test
a = *((struct a**)(&b->p));
// In main
*((struct a**)(&b.p)) = &a;

But this is not easy to read..

I hope you don't mind me peeking and poking a little bit, I'm just trying to understand how I can make the most use of ikos on some real-world code I'm reviewing.

I don't mind at all. I hope this can help other users as well.

@arthaud arthaud changed the title Memset issue False positive because the value domain doesn't split/merge cell variables Jul 9, 2019
@arthaud arthaud added the P-medium Priority: Medium label Jul 10, 2019
@ivanperez-keera
Copy link
Collaborator

@ceesb I know it's been a few years, but I'm following up on this to see if you'd like to contribute this improvement for a future version of IKOS. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-false-positive Category: False Positive L-c Language: C P-medium Priority: Medium
Projects
None yet
Development

No branches or pull requests

3 participants