Help compiling cgc programs #7

Open
man9ourah opened this issue Sep 10, 2024 · 9 comments

@man9ourah

Hello!

I am trying to run Beacon against the CGC benchmarks. For the roughly 100 binaries I have tested, Beacon seems to just insert an exit call right at the beginning of the main function, rendering the binary useless. I only run Beacon on linked LLVM bitcode, so everything should be there, and I think the binaries are small enough for static analysis. Am I doing something wrong? Here is an example using the simplest CGC binary, CADET_00001. Its source code (with all needed library code, in one unit):

service.c:

//CADET's first C program

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

#define HI "\nWelcome to Palindrome Finder\n\n"
#define ASK "\tPlease enter a possible palindrome: "
#define YES "\t\tYes, that's a palindrome!\n\n"
#define NO "\t\tNope, that's not a palindrome\n\n"
#define EASTEREGG "\n\nEASTER EGG!\n\n"

int cgc_check();

int main(int cgc_argc, char *cgc_argv[]) {
    int r;

    printf("%s", HI);

    while(1){
        printf("%s", ASK);
        r = cgc_check();
        if (r == -1){
            break;
        }
        else if (r == 0){
          printf("%s", NO);
        }
        else{
          printf("%s", YES);
        }
    }
    return 0;
}

/* Receives data from another CGC process. */
int cgc_receive(int fd, void *buf, unsigned long count, unsigned long *rx_bytes) {
    const long ret = read(fd, buf, count);

    if (ret < 0) {
        return errno;
    } else if (rx_bytes != NULL) {
        *rx_bytes = ret;
    }

    return 0;
}


int cgc_receive_delim(int fd, char *buf, const unsigned long size, char delim) {
    unsigned long rx = 0;
    unsigned long rx_now = 0;
    int ret;

    if (!buf)
        return 1;

    if (!size)
        return 2;

    while (rx < size) {
        ret = cgc_receive(fd, buf + rx, 1, &rx_now);
        if (rx_now == 0) {
            //should never return until at least something was received
            //so consider this an error too
            return 3;
        }
        if (ret != 0) {
            return 3;
        }
        if (buf[rx] == delim) {
           break;
        }
        rx += rx_now;
    }

    return 0;
}

int cgc_check(){
    int len = -1;
    int i;
    int pal = 1;
    char string[64];
    for (i = 0; i < sizeof(string); i++)
        string[i] = '\0';
    if (cgc_receive_delim(0, string, 128, '\n') != 0)
        return -1;
    for(i = 0; string[i] != '\0'; i++){
        len++;
    }
    int steps = len;
    if(len % 2 == 1){
        steps--;
    }
    for(i = 0; i <= steps/2; i++){
        if(string[i] != string[len-1-i]){
            pal = 0;
        }
    }
    if(string[0] == '^'){
      printf("%s", EASTEREGG);
    }
    return pal;
}
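
For orientation, the overflow in this listing comes from the call `cgc_receive_delim(0, string, 128, '\n')` in `cgc_check`, which lets the receive loop write up to 128 bytes into the 64-byte `string` buffer. The sketch below only does the arithmetic; the names `STRING_CAPACITY`, `REQUESTED_SIZE` and `max_overrun` are hypothetical and do not appear in the CGC sources.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes taken from the listing: char string[64] and the size argument 128. */
enum { STRING_CAPACITY = 64, REQUESTED_SIZE = 128 };

/* Hypothetical helper: how many bytes past the end of a buffer a
 * receive loop bounded by `requested` could write in the worst case. */
static size_t max_overrun(size_t capacity, size_t requested) {
    assert(capacity > 0);
    return requested > capacity ? requested - capacity : 0;
}
```

Calling `max_overrun(STRING_CAPACITY, REQUESTED_SIZE)` yields 64, i.e. the loop can run a full buffer length past the end of `string` on the stack.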

The target here is service.c:88, a simple buffer overflow. Here are the exact commands I ran afterwards:

/home/mansour/cgc-exp/fuzzers/Beacon/llvm4/bin/clang -m32 -flto -fuse-ld=gold -Wl,-plugin-opt=save-temps service.c -o service

This gives out a working service binary. Now for Beacon:

# compile bc:
/home/mansour/cgc-exp/fuzzers/Beacon/llvm4/bin/clang -g -m32 -flto -fuse-ld=gold -Wl,-plugin-opt=save-temps service.c -c -emit-llvm -o service.bc
# Precond:
/home/mansour/cgc-exp/fuzzers/Beacon/precondInfer ./service.bc --target-file=target.txt --join-bound=5
# Ins
/home/mansour/cgc-exp/fuzzers/Beacon/Ins --output=fuzz-Beacon-CADET_00001-instrumented.bc -byte -blocks=bbreaches__target.txt -afl -log=log.txt -load=./range_res.txt ./transed.bc
# compiling
/home/mansour/cgc-exp/fuzzers/Beacon/llvm4/bin/clang -m32 -flto -fuse-ld=gold -Wl,-plugin-opt=save-temps fuzz-Beacon-CADET_00001-instrumented.bc -o fuzz-Beacon-CADET_00001  /home/mansour/cgc-exp/fuzzers/Beacon/afl-llvm-rt-32.o

None of those commands fail; all complete successfully as far as return codes go. However, as I said in my intro, the exit call inserted by Beacon is visible in fuzz-Beacon-CADET_00001-instrumented.bc, the output of Beacon's Ins.

Any chance you can give me a clue about what causes this? I am open to suggestions, even minor modifications of the code under test, for example removing problematic library calls, if those are what affect the static analysis this severely even on a very basic binary.

Thank you so much for your help!

@5hadowblad3
Owner

Hi, thanks for your test.

Can you remove the -blocks=bbreaches__target.txt in the following command and test whether the problem still exists?

/home/mansour/cgc-exp/fuzzers/Beacon/Ins --output=fuzz-Beacon-CADET_00001-instrumented.bc -byte -blocks=bbreaches__target.txt -afl -log=log.txt -load=./range_res.txt ./transed.bc

@man9ourah
Author

Thank you for the response!

I am not sure if it solves the problem, but we don't have the call to exit at main's head anymore. Can you give me some insight into what is happening and how this change contributes to solving the issue? Does Beacon still prune paths?

@5hadowblad3
Owner

The cause is incompleteness: the bc files provided do not include all the code, so the reachability analysis cannot find any reachable path to the target. In addition, the reachability analysis comes in different granularities, since different alternatives may perform better on different projects. As a result, the main function can be pruned when our backward propagation does not reach it. You can refer to the answer in the FAQ and use our script, which provides a different version of the reachability analysis (also the one we used for the paper evaluation).

PS: Getting a correct compilation process that includes everything in one bc file seems to be a crucial obstacle to using static analysis on real-world applications.
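
The pruning behaviour described above can be illustrated with a minimal sketch of backward reachability over a small graph. This is not Beacon's implementation; the `Graph` type, `backward_reach`, and the node numbering are all hypothetical. It only shows why a node such as `main` ends up marked as unable to reach the target when an edge on the path is missing from the analysed code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical graph: edge[u][v] is true when control can flow from u to v. */
typedef struct {
    size_t n;
    bool edge[MAX_NODES][MAX_NODES];
} Graph;

/* Mark every node that can reach `target` by walking edges backwards.
 * Nodes left unmarked have no known path to the target and would be
 * candidates for pruning. */
static void backward_reach(const Graph *g, size_t target,
                           bool reaches[MAX_NODES]) {
    size_t stack[MAX_NODES];
    size_t top = 0;

    assert(g->n <= MAX_NODES && target < g->n);
    for (size_t i = 0; i < g->n; i++)
        reaches[i] = false;

    reaches[target] = true;
    stack[top++] = target;

    while (top > 0) {
        size_t v = stack[--top];
        /* Visit every predecessor u of v that is not marked yet. */
        for (size_t u = 0; u < g->n; u++) {
            if (g->edge[u][v] && !reaches[u]) {
                reaches[u] = true;
                stack[top++] = u; /* each node is pushed at most once */
            }
        }
    }
}
```

With nodes 0 = `main`, 1 = `cgc_check`, 2 = target, an analysis that sees only the edge 1 -> 2 leaves `reaches[0]` false, so `main` would be treated as unable to reach the target. Adding the edge 0 -> 1 marks it reachable.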

@man9ourah
Author

Thank you for your response. Respectfully, I don't think this is the problem. The example above is a single-unit file, so I don't think the bitcode is incomplete in this particular case. Unless you mean the stdlib (which is the only thing missing here), but then all programs would be missing that, no?

I think this may be a bug, or I am misusing something.
Here is the simplest possible example (the target is line 10):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int cgc_argc, char *cgc_argv[]) {
    char string[64];
    const long ret = read(0, string, 62);

    if(string[0] == 0x41){
      printf("You caught me");
    }else{
      printf("Cool");
    }
}

All of the commands above succeed, yet Beacon still terminates early at the head of the main function. There are no indirect calls, no inter-procedural dependencies, and no pointer dereferences. Can you reproduce this, or is it something in my toolchain or execution?

Thanks for the help!

@5hadowblad3
Owner

5hadowblad3 commented Sep 25, 2024

Everything is working on my side. Here are the related files generated for your reference.

Make sure you have the correct target location and the bc file with the debug info.

demo.tar.zip
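
One way to double-check a target location is to split the `file:line` string and print the source line it points at. The sketch below assumes the target is written in the `file:line` form used earlier in this thread; the exact format expected in target.txt is not confirmed here, and `parse_target` and `read_line_at` are hypothetical helpers, not part of Beacon.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Split "file:line" into its parts. Rejects a missing colon, an empty
 * file name, and a non-numeric or non-positive line number. */
static bool parse_target(const char *spec, char *file, size_t file_cap,
                         long *line) {
    assert(spec && file && line && file_cap > 0);
    const char *colon = strrchr(spec, ':');
    if (colon == NULL || colon == spec)
        return false;
    size_t len = (size_t)(colon - spec);
    if (len >= file_cap)
        return false;
    char *end = NULL;
    long n = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || n <= 0)
        return false;
    memcpy(file, spec, len);
    file[len] = '\0';
    *line = n;
    return true;
}

/* Copy line `n` (1-based) of `fp` into `out` without the trailing
 * newline. Lines longer than cap - 1 characters are truncated. */
static bool read_line_at(FILE *fp, long n, char *out, size_t cap) {
    assert(fp && out && cap > 1 && n > 0);
    rewind(fp);
    long current = 1;
    while (fgets(out, (int)cap, fp) != NULL) {
        size_t len = strlen(out);
        bool complete = len > 0 && out[len - 1] == '\n';
        if (current == n) {
            out[strcspn(out, "\n")] = '\0';
            return true;
        }
        if (complete)
            current++; /* only count a line once its newline is seen */
    }
    return false;
}
```

For the first example in this thread, parsing `service.c:88` and reading line 88 of the posted listing should show the `cgc_receive_delim(0, string, 128, '\n')` call, provided the file on disk matches the listing line for line.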

@man9ourah
Author

Hmm. Something must be off in my steps. Do my commands above look correct? Could you please also upload log.txt, range_res.txt and the output bitcode of Beacon's Ins? I would like to compare them with my output to see where I went wrong. If you used different commands than mine above, please let me know as well. I really appreciate your help and response! Thanks.

@man9ourah
Author

My clang command does include the debug info, and Beacon seems to find the target location. I also seem to generate the same bbreaches.txt file, which is why I want to compare the rest of the output files. Thanks again!

@man9ourah
Author

As for my environment, I am using the Beacon docker container and running the test above inside it.

@5hadowblad3
Owner

You should compile the tool from scratch and use that build. The version in the container has some bugs that have since been fixed. If you are still having issues, please let me know.

I have been really busy recently and will come back to this issue afterward. Sorry.
