stability problem #2

Open
zhouxucs opened this issue May 21, 2018 · 8 comments
Labels
enhancement New feature or request

Comments

@zhouxucs
Contributor

The stability reported by AFL is far from 100% (it is below 20%) when TNT packets are decoded. If we decode only TIP packets, stability is better but still not 100%.

@zhanggenex
Member

zhanggenex commented May 22, 2018

What stability means and the possible causes of low stability in AFL (quoted from the AFL documentation):

That last bit is actually fairly interesting: it measures the consistency of
observed traces. If a program always behaves the same for the same input data,
it will earn a score of 100%. When the value is lower but still shown in purple,
the fuzzing process is unlikely to be negatively affected. If it goes into red,
you may be in trouble, since AFL will have difficulty discerning between
meaningful and "phantom" effects of tweaking the input file.

Now, most targets will just get a 100% score, but when you see lower figures,
there are several things to look at:

  • The use of uninitialized memory in conjunction with some intrinsic sources
    of entropy in the tested binary. Harmless to AFL, but could be indicative
    of a security bug.

  • Attempts to manipulate persistent resources, such as left over temporary
    files or shared memory objects. This is usually harmless, but you may want
    to double-check to make sure the program isn't bailing out prematurely.
    Running out of disk space, SHM handles, or other global resources can
    trigger this, too.

  • Hitting some functionality that is actually designed to behave randomly.
    Generally harmless. For example, when fuzzing sqlite, an input like
    'select random();' will trigger a variable execution path.

  • Multiple threads executing at once in semi-random order. This is harmless
    when the 'stability' metric stays over 90% or so, but can become an issue
    if not. Here's what to try:

    • Use afl-clang-fast from llvm_mode/ - it uses a thread-local tracking
      model that is less prone to concurrency issues,

    • See if the target can be compiled or run without threads. Common
      ./configure options include --without-threads, --disable-pthreads, or
      --disable-openmp.

    • Replace pthreads with GNU Pth (https://www.gnu.org/software/pth/), which
      allows you to use a deterministic scheduler.

  • In persistent mode, minor drops in the "stability" metric can be normal,
    because not all the code behaves identically when re-entered; but major
    dips may signify that the code within __AFL_LOOP() is not behaving
    correctly on subsequent iterations (e.g., due to incomplete clean-up or
    reinitialization of the state) and that most of the fuzzing effort goes
    to waste.

The paths where variable behavior is detected are marked with a matching entry
in the <out_dir>/queue/.state/variable_behavior/ directory, so you can look
them up easily.

@zhanggenex
Member

How is stability calculated in the code?

  1. In calibrate_case(), run_target() is called.
  2. first_trace[] and trace_bits[] are compared to update var_bytes[] and var_detected.
  3. var_byte_count is computed as count_bytes(var_bytes), which counts the non-zero bytes in var_bytes.
  4. t_bytes = count_non_255_bytes(virgin_bits), which counts the bytes in virgin_bits that are not 255.
  5. If t_bytes is not zero, stab_ratio = 100 - ((double)var_byte_count) * 100 / t_bytes; otherwise stab_ratio is set to 100%.
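
The steps above can be sketched in a few lines of Python. This is only an illustrative model of the arithmetic, not AFL's C code: the function names mirror the identifiers listed above, but the list-based maps and the helper mark_variable_bytes() are assumptions made for the example.

```python
def count_bytes(buf):
    """Count the non-zero entries (mirrors count_bytes(var_bytes))."""
    return sum(1 for b in buf if b != 0)


def count_non_255_bytes(buf):
    """Count the entries that are not 255 (mirrors count_non_255_bytes(virgin_bits))."""
    return sum(1 for b in buf if b != 255)


def mark_variable_bytes(first_trace, trace_bits, var_bytes):
    """Hypothetical helper: flag every map position whose value differs
    between the first calibration run and the current run.
    Returns True if any difference was seen (the var_detected flag)."""
    var_detected = False
    for i, (a, b) in enumerate(zip(first_trace, trace_bits)):
        if a != b:
            var_bytes[i] = 1
            var_detected = True
    return var_detected


def stability_ratio(var_bytes, virgin_bits):
    """stab_ratio = 100 - var_byte_count * 100 / t_bytes, or 100 if t_bytes is 0."""
    t_bytes = count_non_255_bytes(virgin_bits)
    if t_bytes == 0:
        return 100.0
    return 100 - count_bytes(var_bytes) * 100 / t_bytes


# Two runs of the same input that disagree at one map position.
first_trace = [1, 0, 2, 0]
trace_bits = [1, 0, 3, 0]
var_bytes = [0, 0, 0, 0]
virgin_bits = [254, 255, 253, 255]  # two positions touched so far

detected = mark_variable_bytes(first_trace, trace_bits, var_bytes)
ratio = stability_ratio(var_bytes, virgin_bits)
```

In this toy case one of the two touched map positions is variable, so the ratio is 50. The point is that a single unstable byte weighs heavily while few map bytes have been touched.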

@zhanggenex
Member

The first known cause of low stability in ptfuzzer.

The packets recorded by PT include tip_fup packets. A tip_fup packet, together with several packets that follow it, makes the trace differ between runs of the same input and so lowers stability.

We found two tip_fup patterns that can be deleted from the raw PT packets:

  1. tip_fup: xxx
    a single tip_fup on its own
  2. tip_fup: xxx
    tip_pgd: 0
    tip_pge: xxx
    3 packets in total

Deleting these two patterns may raise stability in ptfuzzer.
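
A minimal sketch of such a filter, written in Python for readability. It assumes the packets have already been parsed into (kind, value) tuples, which is not how ptfuzzer stores them; filter_fup() is a hypothetical name, and "a single tip_fup on its own" is interpreted here as any tip_fup that is not followed by the tip_pgd: 0 / tip_pge pair.

```python
def filter_fup(packets):
    """Drop the two tip_fup patterns from a list of (kind, value) packets.

    Pattern 2: tip_fup, tip_pgd with value 0, tip_pge -> drop all three.
    Pattern 1: any other tip_fup                      -> drop it alone.
    """
    out = []
    i = 0
    while i < len(packets):
        kind, _value = packets[i]
        if kind == "tip_fup":
            following = packets[i + 1:i + 3]
            if (len(following) == 2
                    and following[0] == ("tip_pgd", 0)
                    and following[1][0] == "tip_pge"):
                i += 3  # pattern 2: skip the whole triple
            else:
                i += 1  # pattern 1: skip only the tip_fup
            continue
        out.append(packets[i])
        i += 1
    return out


raw = [
    ("tip_pge", 0x11111),
    ("tnt", "NN"),
    ("tip_fup", 0x33333),   # pattern 1
    ("tnt", "TN"),
    ("tip_fup", 0x44444),   # pattern 2 starts here
    ("tip_pgd", 0),
    ("tip_pge", 0x55555),
    ("tip_pgd", 0x22222),
]
filtered = filter_fup(raw)
```

The addresses in the example are made up. Checking pattern 2 before pattern 1 matters, otherwise the tip_pgd: 0 and tip_pge that belong to the triple would be left in the trace.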

@zhouxucs
Contributor Author

After applying the patterns above to filter fup packets, the traces still differ between runs for two command lines in my tests.

The first one is:

./ptest/readelf -a ./ptest/readelf

Here the traces differ in file length only, so we assumed that the buffer we use for storing PT packets was full.

This is confirmed: after enlarging the perf aux buffer (_HF_PERF_AUX_SZ in pt.h) from 1M to 16M, the problem is gone.

The second one is:

/bin/ls

Here, however, the traces differ in packets other than fup packets.

We suspect the cause is that we write log files to the current directory while ls is running. This still needs to be confirmed.
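
To tell the two cases apart quickly, one can check whether one trace is simply a prefix of the other (a truncated buffer) or whether the contents diverge. The following Python sketch assumes each trace is available as a list of packet strings or as bytes; classify_difference() is a hypothetical helper and not part of ptfuzzer.

```python
def classify_difference(a, b):
    """Compare two traces (lists or bytes).

    Returns "identical", "length-only" (one trace is a prefix of the
    other, as with a full buffer), or "content" (they diverge).
    """
    if a == b:
        return "identical"
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    if longer[:len(shorter)] == shorter:
        return "length-only"
    return "content"


run1 = ["tip: 1", "tip: 2", "tip: 3"]
run2 = ["tip: 1", "tip: 2"]              # truncated copy of run1
run3 = ["tip: 1", "tip_pgd: 0", "tip: 3"]

kind_truncated = classify_difference(run1, run2)
kind_diverged = classify_difference(run1, run3)
```

The packet values are invented for the example. A "length-only" result points at buffering, as in the readelf case, while a "content" result points at a real difference in the recorded execution, as in the ls case.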

@zhanggenex
Member

zhanggenex commented May 29, 2018

Stability after filtering the two fup packet patterns in ptfuzzer
We filtered the two fup patterns above in ptfuzzer's decoding process and got these two results:

  1. When tnt packets are excluded from decoding, stability still goes down, but much more slowly than before, by around 1-2% per minute.
  2. When tnt packets are included in decoding, stability drops sharply in the first few minutes, and afterwards also decreases by around 1-2% per minute.

In either case, stability still goes down after filtering the two fup patterns.

One result is interesting, though: in tip mode, when tip_pge and tip_pgd packets are excluded from decoding, stability is always 100% and ptfuzzer found several crashes.
The cause of this result still needs to be discussed.

@zhanggenex
Member

zhanggenex commented May 30, 2018

Decoding the psb packet
By examining the debug output, we found that some tnt packets are still undecoded after tip_pgd has been decoded, which should not happen if the decoding is correct.
Usually the packets look like this:

tip_pge: 11111
tnt NN
tnt TN
tip_pgd: 22222

But when tip_fup packets are generated, they look like this:

tip_pge: 11111
tnt NN
psb
tip_fup: 33333
tnt TN
tip_pgd: 22222

A psb packet resets some decoder values to 0, so the tnt packets between tip_pge and psb are never processed, which causes low stability in ptfuzzer.
After fixing this problem, stability stays at around 95%. According to AFL's documentation, stability at this level, although not 100%, should not have much impact on fuzzing.
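
The effect can be illustrated with a toy model in Python. This is not ptfuzzer's decoder: the pending-TNT buffer, the decode() function and the flush_on_psb switch are all assumptions made to show why resetting state at psb loses the earlier tnt packets, and what handling them before the reset changes.

```python
def decode(packets, flush_on_psb):
    """Toy decoder: collect TNT bits and hand them on at tip_pgd.

    flush_on_psb=False models the problem: psb resets the state and the
    TNT bits gathered since tip_pge are lost.
    flush_on_psb=True models a fix: pending TNT bits are processed
    before the reset.
    """
    pending = []    # TNT bits seen but not yet processed
    processed = []  # TNT bits that reached the coverage update
    for kind, value in packets:
        if kind == "tnt":
            pending.extend(value)
        elif kind == "psb":
            if flush_on_psb:
                processed.extend(pending)
            pending = []  # psb resets decoder state
        elif kind == "tip_pgd":
            processed.extend(pending)
            pending = []
    return "".join(processed)


# The packet sequence from the second listing above.
stream = [
    ("tip_pge", 0x11111),
    ("tnt", "NN"),
    ("psb", None),
    ("tip_fup", 0x33333),
    ("tnt", "TN"),
    ("tip_pgd", 0x22222),
]
lost = decode(stream, flush_on_psb=False)
kept = decode(stream, flush_on_psb=True)
```

Without the flush only the bits after psb ("TN") are processed. With the flush all four bits ("NNTN") are, which matches what the first listing, without psb, would produce.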

@zhouxucs
Contributor Author

zhouxucs commented Jun 27, 2018

While testing pandoc we found a new perturbation pattern:

last equal:  tip: 40910c
tip: 40910c				tip: 40910c
tip: 2b15354				tip_pgd: 0
tip: 40910c				tip_pge: 2afd270
tip: 2b15354				tip: 2afd279
tip: 40910c				tip: 2afd33b
tip: 2b15354				tip_pgd: 7ffff6a6e150
tip: 40910c				tip_pge: 407c41
tip: 2b15429				tip: 2b15354
tip: 4090c5				tip: 40910c
tip: 2b15429				tip: 2b15354
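
A side-by-side comparison like the one above starts from the point where two traces stop agreeing. A minimal Python sketch of that step follows. last_equal_index() is a hypothetical helper, and the two lists only reuse the first rows of the listing as sample data.

```python
def last_equal_index(a, b):
    """Return the index of the last position at which both traces still
    agree when read from the start, or -1 if they differ immediately."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n - 1


left = ["tip: 40910c", "tip: 2b15354", "tip: 40910c", "tip: 2b15354"]
right = ["tip: 40910c", "tip_pgd: 0", "tip_pge: 2afd270", "tip: 2afd279"]

idx = last_equal_index(left, right)
```

Printing both traces from idx onwards gives the two columns to inspect by hand.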

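This new pattern is also caused by fup. The fup packet has already been removed from the listing above; the original fup packet would have been just before tip_pgd. Unlike the earlier fup cases, here the fup packet leads to a valid pge, and within that pge there are two tip packets, tip: 2afd279 and tip: 2afd33b. Under the filtering rules used so far these two tip packets cannot be ignored, which causes the instability.

The addresses show that the two extra tip packets are still within the program's valid address range. We therefore suspect that this is the program's handling code for some signal, so that control passes from the system's interrupt handling into the program's own signal handler.

By setting the upper limit of the address filter so that this signal handling code is not recorded, stability reached 100% at the start of fuzzing, but dropped to 69% after half a minute, probably due to some other cause.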

@zhouxucs
Contributor Author

Pandoc is written in Haskell.

@zhanggenex zhanggenex added the enhancement New feature or request label Apr 13, 2019