Introduce a tier-1 JIT compiler based on x86-64 architecture #289

Merged (2 commits) on Dec 16, 2023

Commits on Dec 15, 2023

  1. Introduce a tier-1 JIT compiler based on x86-64 architecture

    When the execution count of a block exceeds a predetermined threshold,
    the tier-1 JIT compiler traces the chained blocks and generates
    corresponding low-quality machine code. The resulting machine code is
    stored in the code cache for later reuse.
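
The flow described above can be modeled in a few lines. This is a minimal sketch, not the code from this commit: `HOT_THRESHOLD`, `Block`, `CodeCache`, `compile_block`, and `interpret` are hypothetical names, the threshold value is made up, and "machine code" is stood in for by a Python callable. Tracing of chained blocks is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

HOT_THRESHOLD = 3  # hypothetical value; the commit does not state the real one


@dataclass
class Block:
    pc: int             # guest address of the block's first instruction
    use_count: int = 0  # how many times the block has been entered


class CodeCache:
    """Maps a block's start address to its compiled code."""

    def __init__(self) -> None:
        self._entries: Dict[int, Callable[[], str]] = {}

    def lookup(self, pc: int) -> Optional[Callable[[], str]]:
        return self._entries.get(pc)

    def insert(self, pc: int, code: Callable[[], str]) -> None:
        self._entries[pc] = code


def compile_block(block: Block) -> Callable[[], str]:
    # Stand-in for the tier-1 machine code generator.
    pc = block.pc
    return lambda: f"native@{pc:#x}"


def interpret(block: Block) -> str:
    # Stand-in for the interpreter path.
    return f"interp@{block.pc:#x}"


def execute(block: Block, cache: CodeCache) -> str:
    compiled = cache.lookup(block.pc)
    if compiled is None:
        block.use_count += 1
        # Compile only once the count exceeds the threshold.
        if block.use_count > HOT_THRESHOLD:
            compiled = compile_block(block)
            cache.insert(block.pc, compiled)
    return compiled() if compiled is not None else interpret(block)


cache = CodeCache()
hot = Block(pc=0x1000)
history = [execute(hot, cache) for _ in range(5)]
print(history)
```

In this model the block is interpreted until its count passes the threshold; from then on every entry is served from the cache without recompiling.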
    
    The primary objective of introducing the tier-1 JIT compiler is to
    enhance the execution speed of RISC-V instructions. This implementation
    requires two additional components: a tier-1 machine code generator
    and a code cache. Furthermore, this tier-1 JIT compiler serves as the
    foundation for future improvements.
    
    In addition, we have developed a Python script that effectively traces
    code templates and automatically generates JIT code templates. This
    approach eliminates the need for manually writing duplicated code.
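
To illustrate the general idea of generating repetitive emitter code from a template, here is a small sketch. It is not the script from this commit: the template text, `emit_binop`, `jit_state`, `insn`, and the `do_*` names are hypothetical. The opcode bytes are the standard x86 `op r/m32, r32` encodings for ADD, SUB, and XOR.

```python
from string import Template

# Hypothetical C handler template; only $-placeholders are substituted.
HANDLER = Template(
    "static void do_${name}(struct jit_state *state, const struct insn *ir)\n"
    "{\n"
    "    emit_binop(state, ${opcode}, ir->rs2, ir->rd);\n"
    "}\n"
)

# RISC-V operation name -> x86 opcode byte for `op r/m32, r32`.
OPS = {"add": 0x01, "sub": 0x29, "xor": 0x31}


def generate(ops: dict) -> str:
    """Render one handler per operation instead of writing each by hand."""
    return "\n".join(
        HANDLER.substitute(name=name, opcode=f"0x{opcode:02x}")
        for name, opcode in ops.items()
    )


generated = generate(OPS)
print(generated)
```

Adding another operation then means adding one table entry rather than copying a function body.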
    
    As shown in the performance analysis below, the tier-1 JIT compiler's
    performance closely parallels that of QEMU in benchmarks with a
    constrained dynamic instruction count. However, for benchmarks
    featuring a substantial dynamic instruction count or lacking specific
    hotspots—examples include pi and STRINGSORT—the tier-1 JIT compiler
    demonstrates noticeably slower execution compared to QEMU.
    
    Hence, a robust tier-2 JIT compiler is essential to generate optimized
    machine code across diverse execution paths, coupled with a runtime
    profiler for detecting hotspots.
    
    * Performance
    | Benchmark  | rv32emu-T1C |   qemu |
    |------------+-------------+--------|
    | aes        |        0.02 |  0.031 |
    | mandelbrot |       0.029 | 0.0115 |
    | puzzle     |      0.0115 |  0.009 |
    | pi         |      0.0413 | 0.0177 |
    | dhrystone  |       0.331 |  0.393 |
    | Nqueens    |       0.854 |  0.749 |
    | qsort-O2   |       2.384 |   2.16 |
    | miniz-O2   |        1.33 |   1.01 |
    | primes-O2  |        2.93 |  1.069 |
    | sha512-O2  |       2.057 |  0.939 |
    | stream     |      12.747 |  10.36 |
    | STRINGSORT |      89.012 | 11.496 |
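
The gap to QEMU is easier to read as a ratio. The sketch below copies the numbers from the table above and divides the rv32emu-T1C value by the qemu value for each benchmark. It assumes the values are elapsed times where lower is better (the table does not state a unit); `RESULTS` and `relative_to_qemu` are names made up for this example.

```python
# (rv32emu-T1C, qemu) pairs copied from the performance table.
# Assumption: these are elapsed times, so lower is better.
RESULTS = {
    "aes": (0.02, 0.031),
    "mandelbrot": (0.029, 0.0115),
    "puzzle": (0.0115, 0.009),
    "pi": (0.0413, 0.0177),
    "dhrystone": (0.331, 0.393),
    "Nqueens": (0.854, 0.749),
    "qsort-O2": (2.384, 2.16),
    "miniz-O2": (1.33, 1.01),
    "primes-O2": (2.93, 1.069),
    "sha512-O2": (2.057, 0.939),
    "stream": (12.747, 10.36),
    "STRINGSORT": (89.012, 11.496),
}


def relative_to_qemu(results: dict) -> dict:
    """Return rv32emu-T1C / qemu per benchmark; above 1.0 means slower than QEMU."""
    return {name: t1c / qemu for name, (t1c, qemu) in results.items()}


ratios = relative_to_qemu(RESULTS)
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:<11}{ratio:6.2f}x")
```

Under that assumption, aes and dhrystone are the two benchmarks where the ratio falls below 1.0, and STRINGSORT has the largest ratio.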
    
    As demonstrated in the memory usage analysis below, the tier-1 JIT
    compiler utilizes less memory than QEMU across all benchmarks.
    
    * Memory usage
    | Benchmark  | rv32emu-T1C |      qemu |
    |------------+-------------+-----------|
    | aes        |     186,228 | 1,343,012 |
    | mandelbrot |     152,203 |   841,841 |
    | puzzle     |     153,423 |   890,225 |
    | pi         |     152,923 |   879,957 |
    | dhrystone  |     154,466 |   856,404 |
    | Nqueens    |     154,880 |   858,618 |
    | qsort-O2   |     155,091 |   933,506 |
    | miniz-O2   |     165,627 | 1,076,682 |
    | primes-O2  |     150,540 |   928,446 |
    | sha512-O2  |     153,553 |   978,177 |
    | stream     |     165,911 |   957,845 |
    | STRINGSORT |     167,871 | 1,104,702 |
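
The "less memory across all benchmarks" claim can be checked directly against the table. The sketch below copies the figures as plain integers (the unit is not stated in the table) and reports how many times larger the qemu figure is; `MEMORY` and `qemu_over_t1c` are names made up for this example.

```python
# (rv32emu-T1C, qemu) memory figures copied from the memory usage table.
MEMORY = {
    "aes": (186_228, 1_343_012),
    "mandelbrot": (152_203, 841_841),
    "puzzle": (153_423, 890_225),
    "pi": (152_923, 879_957),
    "dhrystone": (154_466, 856_404),
    "Nqueens": (154_880, 858_618),
    "qsort-O2": (155_091, 933_506),
    "miniz-O2": (165_627, 1_076_682),
    "primes-O2": (150_540, 928_446),
    "sha512-O2": (153_553, 978_177),
    "stream": (165_911, 957_845),
    "STRINGSORT": (167_871, 1_104_702),
}


def qemu_over_t1c(memory: dict) -> dict:
    """Return qemu / rv32emu-T1C per benchmark; above 1.0 means QEMU uses more."""
    return {name: qemu / t1c for name, (t1c, qemu) in memory.items()}


factors = qemu_over_t1c(MEMORY)
always_smaller = all(t1c < qemu for t1c, qemu in MEMORY.values())
print("rv32emu-T1C smaller on every benchmark:", always_smaller)
print(f"factor range: {min(factors.values()):.2f}x to {max(factors.values()):.2f}x")
```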
    
    Related: sysprog21#238
    qwe661234 committed Dec 15, 2023
    3aa197b
  2. 03cdb98