sbc-bench v0.9.9 Khadas VIM4 (Sun, 09 Oct 2022 18:04:05 +0100)
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy
/usr/bin/gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Uptime: 18:04:06 up 4 days, 2:07, 2 users, load average: 2.51, 2.34, 2.43, 52.4°C
Linux 5.4.125 (Khadas) 10/09/22 _aarch64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.19 0.28 0.21 0.18 0.00 99.14
Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
mmcblk0 2.33 193.74 12.34 0.00 68436430 4359968 0
sda 0.08 0.03 21.02 0.00 9136 7426864 0
zram1 0.01 0.01 0.05 0.00 2488 17252 0
zram2 0.01 0.01 0.05 0.00 2416 16960 0
zram3 0.01 0.01 0.05 0.00 2700 17136 0
zram4 0.01 0.01 0.05 0.00 2420 17028 0
total used free shared buff/cache available
Mem: 7.8Gi 748Mi 6.7Gi 115Mi 308Mi 6.8Gi
Swap: 3.9Gi 41Mi 3.8Gi
Filename Type Size Used Priority
/dev/zram1 partition 1017792 10512 5
/dev/zram2 partition 1017792 10612 5
/dev/zram3 partition 1017792 10492 5
/dev/zram4 partition 1017792 10624 5
##########################################################################
Checking cpufreq OPP for cpu0-cpu3 (Cortex-A73):
Cpufreq OPP: 2208 Measured: 2200 (2200.933/2200.886/2199.902)
Cpufreq OPP: 2016 Measured: 2012 (2012.674/2012.625/2011.988)
Cpufreq OPP: 1896 Measured: 1892 (1892.820/1892.603/1892.386)
Cpufreq OPP: 1800 Measured: 1796 (1796.925/1796.847/1796.769)
Cpufreq OPP: 1704 Measured: 1700 (1700.811/1700.741/1700.741)
Cpufreq OPP: 1608 Measured: 1604 (1604.748/1604.631/1604.397)
Cpufreq OPP: 1512 Measured: 1463 (1532.901/1451.428/1405.604) (-3.2%)
Cpufreq OPP: 1392 Measured: 1388 (1388.833/1388.483/1388.425)
Cpufreq OPP: 1200 Measured: 1196 (1196.688/1196.633/1196.552)
Cpufreq OPP: 1000 Measured: 996 (996.843/996.796/996.069)
Cpufreq OPP: 666 Measured: 663 (663.703/663.573/663.264)
Cpufreq OPP: 500 Measured: 496 (496.833/496.582/496.400)
Checking cpufreq OPP for cpu4-cpu7 (Cortex-A53):
Cpufreq OPP: 2016 Measured: 2013 (2013.606/2013.409/2012.919)
Cpufreq OPP: 1896 Measured: 1893 (1893.514/1893.340/1893.297)
Cpufreq OPP: 1800 Measured: 1797 (1797.551/1797.551/1797.551)
Cpufreq OPP: 1704 Measured: 1701 (1701.511/1701.441/1701.371)
Cpufreq OPP: 1608 Measured: 1605 (1605.644/1605.527/1605.371)
Cpufreq OPP: 1512 Measured: 1509 (1509.418/1509.383/1509.383)
Cpufreq OPP: 1392 Measured: 1389 (1389.650/1389.533/1389.387)
Cpufreq OPP: 1200 Measured: 1197 (1197.663/1197.500/1197.094)
Cpufreq OPP: 1000 Measured: 997 (997.548/997.548/997.172)
Cpufreq OPP: 666 Measured: 664 (664.468/664.191/664.093)
Cpufreq OPP: 500 Measured: 497 (497.838/497.826/497.552)
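
The "Measured" values above are real clockspeeds derived from a timing loop
and compared against the advertised cpufreq OPPs (sbc-bench presumably uses
a mhz-style utility for this). As an illustration of the principle only, and
not sbc-bench's actual code: time a long chain of dependent ALU operations
whose per-iteration cycle cost is known, then divide.

/* Rough clockspeed estimate: time a long dependency chain of ALU ops.
 * The loop body is an OR feeding an ADD, i.e. a two-instruction serial
 * dependency chain, so one iteration costs roughly 2 cycles on a typical
 * ARM core while the loop overhead mostly hides behind it. Sketch only;
 * expect a few percent of error. */
#include <stdio.h>
#include <stdint.h>
#include <time.h>

int main(void)
{
    const uint64_t iters = 200000000ULL;
    struct timespec t0, t1;
    volatile uint64_t sink;
    uint64_t x = 1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (uint64_t i = 0; i < iters; i++)
        x += x | 1;                /* serial: each op needs the last result */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    sink = x;                      /* keep the loop from being optimized out */
    (void)sink;

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("~%.0f MHz\n", iters * 2.0 / secs / 1e6);
    return 0;
}

Here every OPP comes in about 3-4 MHz under its nominal value, with only the
big cores' 1512 MHz step showing a real deviation (-3.2%).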
##########################################################################
Hardware sensors:
hevc_thermal-virtual-0
temp1: +47.4°C (crit = +110.0°C)
gpu_thermal-virtual-0
temp1: +47.6°C (crit = +110.0°C)
soc_thermal-virtual-0
temp1: +49.2°C (crit = +110.0°C)
vpu_thermal-virtual-0
temp1: +47.8°C (crit = +110.0°C)
nna_thermal-virtual-0
temp1: +47.1°C (crit = +110.0°C)
a53_thermal-virtual-0
temp1: +47.7°C (crit = +110.0°C)
/dev/sda:
##########################################################################
Executing benchmark on cpu0 (Cortex-A73):
tinymembench v0.4.9 (simple benchmark for memory throughput and latency)
==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding read and written bytes ==
== together would have yielded numbers twice as high) ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================
C copy backwards : 8158.7 MB/s (2.0%)
C copy backwards (32 byte blocks) : 8155.5 MB/s (1.5%)
C copy backwards (64 byte blocks) : 8172.2 MB/s
C copy : 8188.0 MB/s
C copy prefetched (32 bytes step) : 7940.5 MB/s (0.5%)
C copy prefetched (64 bytes step) : 7958.6 MB/s (0.6%)
C 2-pass copy : 4362.5 MB/s
C 2-pass copy prefetched (32 bytes step) : 3952.6 MB/s (0.7%)
C 2-pass copy prefetched (64 bytes step) : 4011.0 MB/s (0.2%)
C fill : 11683.9 MB/s
C fill (shuffle within 16 byte blocks) : 11678.8 MB/s
C fill (shuffle within 32 byte blocks) : 11681.9 MB/s
C fill (shuffle within 64 byte blocks) : 11682.9 MB/s
---
standard memcpy : 8180.2 MB/s
standard memset : 11683.4 MB/s
---
NEON LDP/STP copy : 8180.9 MB/s
NEON LDP/STP copy pldl2strm (32 bytes step) : 8173.8 MB/s
NEON LDP/STP copy pldl2strm (64 bytes step) : 8173.5 MB/s
NEON LDP/STP copy pldl1keep (32 bytes step) : 7895.4 MB/s (2.6%)
NEON LDP/STP copy pldl1keep (64 bytes step) : 8073.4 MB/s (0.3%)
NEON LD1/ST1 copy : 8165.8 MB/s
NEON STP fill : 11680.9 MB/s
NEON STNP fill : 11677.8 MB/s
ARM LDP/STP copy : 8174.7 MB/s (1.0%)
ARM STP fill : 11681.5 MB/s (0.3%)
ARM STNP fill : 11681.8 MB/s
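
Note 3 above describes the "2-pass copy" variant: instead of streaming
directly from source to destination, each chunk is staged through a small
buffer that stays resident in L1. A minimal sketch of that idea in C
(illustration only, not tinymembench's actual code):

/* Sketch of a 2-pass copy: stage each chunk through a small bounce
 * buffer that stays in L1, so the data crosses the core twice
 * (source -> L1 buffer, L1 buffer -> destination). */
#include <string.h>

#define BOUNCE 8192                     /* small enough to stay L1-resident */

void copy_2pass(void *dst, const void *src, size_t len)
{
    unsigned char tmp[BOUNCE];
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len > 0) {
        size_t chunk = len < BOUNCE ? len : (size_t)BOUNCE;
        memcpy(tmp, s, chunk);          /* pass 1: fetch into L1 buffer */
        memcpy(d, tmp, chunk);          /* pass 2: write to destination */
        s += chunk;
        d += chunk;
        len -= chunk;
    }
}

Consistent with that, the 2-pass results above run at roughly half the plain
copy throughput (about 4.4 vs 8.2 GB/s on the A73).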
==========================================================================
== Framebuffer read tests. ==
== ==
== Many ARM devices use a part of the system memory as the framebuffer, ==
== typically mapped as uncached but with write-combining enabled. ==
== Writes to such framebuffers are quite fast, but reads are much ==
== slower and very sensitive to the alignment and the selection of ==
== CPU instructions which are used for accessing memory. ==
== ==
== Many x86 systems allocate the framebuffer in the GPU memory, ==
== accessible for the CPU via a relatively slow PCI-E bus. Moreover, ==
== PCI-E is asymmetric and handles reads a lot worse than writes. ==
== ==
== If uncached framebuffer reads are reasonably fast (at least 100 MB/s ==
== or preferably >300 MB/s), then using the shadow framebuffer layer ==
== is not necessary in Xorg DDX drivers, resulting in a nice overall ==
== performance improvement. For example, the xf86-video-fbturbo DDX ==
== uses this trick. ==
==========================================================================
NEON LDP/STP copy (from framebuffer) : 8263.6 MB/s (2.2%)
NEON LDP/STP 2-pass copy (from framebuffer) : 4532.6 MB/s
NEON LD1/ST1 copy (from framebuffer) : 8248.7 MB/s
NEON LD1/ST1 2-pass copy (from framebuffer) : 4577.3 MB/s
ARM LDP/STP copy (from framebuffer) : 8257.2 MB/s
ARM LDP/STP 2-pass copy (from framebuffer) : 4618.5 MB/s (0.1%)
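
To check whether a board falls into the "slow uncached reads" case described
above, one can mmap the framebuffer read-only and time a bulk read from it.
A hypothetical sketch (assumes /dev/fb0 exists and exposes at least 4 MiB;
illustration only):

/* Time a bulk read from a mmap'd framebuffer to see whether a shadow
 * framebuffer layer would be needed. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define LEN (4 * 1024 * 1024)

int main(void)
{
    static unsigned char dst[LEN];      /* ordinary cached memory */
    struct timespec t0, t1;

    int fd = open("/dev/fb0", O_RDONLY);
    if (fd < 0) { perror("open /dev/fb0"); return 1; }

    unsigned char *fb = mmap(NULL, LEN, PROT_READ, MAP_SHARED, fd, 0);
    if (fb == MAP_FAILED) { perror("mmap"); return 1; }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    memcpy(dst, fb, LEN);               /* read from uncached/WC memory */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("framebuffer read: %.1f MB/s\n", LEN / secs / 1e6);

    munmap(fb, LEN);
    close(fd);
    return 0;
}

On this run the "from framebuffer" numbers match normal DRAM throughput, so
the reads were evidently not hitting a slow uncached mapping.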
==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in buffers of ==
== different sizes. The larger the buffer, the more significant the ==
== relative contributions of TLB misses, L1/L2 cache misses and ==
== SDRAM accesses become. For extremely large buffer sizes we expect ==
== to see a page table walk with several requests to SDRAM for almost ==
== every memory access (though 64MiB is not nearly large enough to ==
== experience this effect to its fullest). ==
== ==
== Note 1: All the numbers represent extra time that needs to be added ==
== to the L1 cache latency; the cycle timings for L1 cache ==
== latency can usually be found in the processor documentation. ==
== Note 2: Dual random read means that two independent memory ==
== accesses are performed simultaneously. If the memory ==
== subsystem can't handle multiple outstanding requests, dual ==
== random read shows the same timings as two single reads ==
== performed one after another. ==
==========================================================================
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 5.0 ns / 8.4 ns
131072 : 7.5 ns / 11.7 ns
262144 : 9.3 ns / 13.2 ns
524288 : 10.9 ns / 13.7 ns
1048576 : 22.6 ns / 30.5 ns
2097152 : 73.9 ns / 107.1 ns
4194304 : 102.8 ns / 132.5 ns
8388608 : 120.7 ns / 145.3 ns
16777216 : 130.7 ns / 151.0 ns
33554432 : 137.0 ns / 154.3 ns
67108864 : 140.5 ns / 156.1 ns
Executing benchmark on cpu4 (Cortex-A53):
tinymembench v0.4.9 (simple benchmark for memory throughput and latency)
==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding read and written bytes ==
== together would have yielded numbers twice as high) ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================
C copy backwards : 2232.6 MB/s (0.1%)
C copy backwards (32 byte blocks) : 2252.5 MB/s (0.2%)
C copy backwards (64 byte blocks) : 2262.1 MB/s (0.3%)
C copy : 2240.1 MB/s (0.6%)
C copy prefetched (32 bytes step) : 1567.2 MB/s
C copy prefetched (64 bytes step) : 1799.6 MB/s
C 2-pass copy : 1869.5 MB/s (0.3%)
C 2-pass copy prefetched (32 bytes step) : 1331.3 MB/s
C 2-pass copy prefetched (64 bytes step) : 1138.3 MB/s
C fill : 10804.2 MB/s (0.2%)
C fill (shuffle within 16 byte blocks) : 10798.4 MB/s
C fill (shuffle within 32 byte blocks) : 10799.5 MB/s
C fill (shuffle within 64 byte blocks) : 10803.2 MB/s (0.2%)
---
standard memcpy : 2280.8 MB/s
standard memset : 10891.3 MB/s (0.3%)
---
NEON LDP/STP copy : 2239.0 MB/s
NEON LDP/STP copy pldl2strm (32 bytes step) : 1401.8 MB/s (0.8%)
NEON LDP/STP copy pldl2strm (64 bytes step) : 1815.7 MB/s
NEON LDP/STP copy pldl1keep (32 bytes step) : 2761.9 MB/s
NEON LDP/STP copy pldl1keep (64 bytes step) : 2765.1 MB/s
NEON LD1/ST1 copy : 2260.3 MB/s (0.1%)
NEON STP fill : 10876.7 MB/s (0.5%)
NEON STNP fill : 8342.9 MB/s (0.4%)
ARM LDP/STP copy : 2248.6 MB/s (0.2%)
ARM STP fill : 10875.3 MB/s (0.2%)
ARM STNP fill : 8321.5 MB/s
==========================================================================
== Framebuffer read tests. ==
== ==
== Many ARM devices use a part of the system memory as the framebuffer, ==
== typically mapped as uncached but with write-combining enabled. ==
== Writes to such framebuffers are quite fast, but reads are much ==
== slower and very sensitive to the alignment and the selection of ==
== CPU instructions which are used for accessing memory. ==
== ==
== Many x86 systems allocate the framebuffer in the GPU memory, ==
== accessible for the CPU via a relatively slow PCI-E bus. Moreover, ==
== PCI-E is asymmetric and handles reads a lot worse than writes. ==
== ==
== If uncached framebuffer reads are reasonably fast (at least 100 MB/s ==
== or preferably >300 MB/s), then using the shadow framebuffer layer ==
== is not necessary in Xorg DDX drivers, resulting in a nice overall ==
== performance improvement. For example, the xf86-video-fbturbo DDX ==
== uses this trick. ==
==========================================================================
NEON LDP/STP copy (from framebuffer) : 2308.1 MB/s (0.4%)
NEON LDP/STP 2-pass copy (from framebuffer) : 1893.5 MB/s
NEON LD1/ST1 copy (from framebuffer) : 2291.2 MB/s (0.3%)
NEON LD1/ST1 2-pass copy (from framebuffer) : 1855.6 MB/s (0.2%)
ARM LDP/STP copy (from framebuffer) : 2306.3 MB/s (0.3%)
ARM LDP/STP 2-pass copy (from framebuffer) : 1893.9 MB/s (0.4%)
==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in buffers of ==
== different sizes. The larger the buffer, the more significant the ==
== relative contributions of TLB misses, L1/L2 cache misses and ==
== SDRAM accesses become. For extremely large buffer sizes we expect ==
== to see a page table walk with several requests to SDRAM for almost ==
== every memory access (though 64MiB is not nearly large enough to ==
== experience this effect to its fullest). ==
== ==
== Note 1: All the numbers represent extra time that needs to be added ==
== to the L1 cache latency; the cycle timings for L1 cache ==
== latency can usually be found in the processor documentation. ==
== Note 2: Dual random read means that two independent memory ==
== accesses are performed simultaneously. If the memory ==
== subsystem can't handle multiple outstanding requests, dual ==
== random read shows the same timings as two single reads ==
== performed one after another. ==
==========================================================================
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 3.4 ns / 5.7 ns
131072 : 5.2 ns / 8.0 ns
262144 : 6.1 ns / 8.9 ns
524288 : 6.7 ns / 9.3 ns
1048576 : 15.2 ns / 24.9 ns
2097152 : 68.6 ns / 102.5 ns
4194304 : 103.2 ns / 134.1 ns
8388608 : 120.5 ns / 145.8 ns
16777216 : 130.7 ns / 151.8 ns
33554432 : 136.7 ns / 155.0 ns
67108864 : 140.1 ns / 156.8 ns
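
The single vs dual random read distinction above comes down to how many
independent pointer chains the test walks at once. A minimal sketch of the
single-chain case (illustration only, not tinymembench's code): build one
random cycle through a buffer, then chase it, so every load depends on the
previous one and total time divided by loads approximates the access latency.

/* Single random read latency via pointer chasing: Sattolo's algorithm
 * turns the buffer into one random cycle, so the chase visits every
 * slot once and each load depends on the previous one. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    const size_t n = 64 * 1024 * 1024 / sizeof(void *);  /* 64 MiB buffer */
    void **buf = malloc(n * sizeof(void *));
    struct timespec t0, t1;

    if (buf == NULL) return 1;
    for (size_t i = 0; i < n; i++)
        buf[i] = &buf[i];
    srand(1);
    for (size_t i = n - 1; i > 0; i--) {     /* Sattolo: one single cycle */
        size_t j = (size_t)rand() % i;
        void *t = buf[i]; buf[i] = buf[j]; buf[j] = t;
    }

    void **p = (void **)buf[0];
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < n; i++)
        p = (void **)*p;                     /* serial dependent loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = ((t1.tv_sec - t0.tv_sec) * 1e9 +
                 (t1.tv_nsec - t0.tv_nsec)) / (double)n;
    printf("single random read: %.1f ns (%p)\n", ns, (void *)p);
    free(buf);
    return 0;
}

The dual variant walks two such independent chains in the same loop; a memory
subsystem that supports multiple outstanding requests overlaps the two misses,
which is why the dual numbers above sit far below twice the single ones.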
##########################################################################
Executing ramlat on cpu0 (Cortex-A73), results in ns:
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 1.816 1.815 1.815 1.816 1.361 1.361 2.106 3.597
8k: 1.816 1.815 1.815 1.815 1.361 1.361 2.337 3.585
16k: 1.816 1.815 1.815 1.815 1.361 1.362 1.914 3.587
32k: 1.819 1.820 1.822 1.817 1.364 1.363 1.921 3.588
64k: 9.655 9.982 9.655 9.984 9.730 9.887 11.67 21.21
128k: 11.30 11.31 11.29 11.31 11.30 11.30 12.90 25.32
256k: 13.64 13.63 13.62 13.61 13.61 13.61 13.62 25.79
512k: 13.67 13.62 13.63 13.62 13.63 13.62 13.62 25.77
1024k: 49.30 40.82 50.39 40.91 50.45 40.99 46.92 62.63
2048k: 125.9 126.0 128.3 125.8 117.0 120.2 126.7 133.5
4096k: 129.4 130.9 130.8 131.0 128.1 130.4 131.4 138.7
8192k: 134.3 135.6 134.8 135.5 134.5 135.2 136.5 144.1
16384k: 136.4 137.6 136.2 137.6 136.8 137.8 140.1 150.2
Executing ramlat on cpu4 (Cortex-A53), results in ns:
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 1.987 1.989 1.492 1.491 1.491 1.491 2.050 4.162
8k: 1.987 1.988 1.491 1.491 1.491 1.491 2.050 4.162
16k: 1.988 1.988 1.491 1.491 1.491 1.491 2.050 4.165
32k: 4.250 5.593 3.950 5.290 3.951 5.389 8.078 14.55
64k: 11.07 12.39 11.48 12.16 11.48 12.21 17.29 33.37
128k: 13.15 13.55 12.86 13.33 12.84 13.38 18.89 37.93
256k: 14.13 14.52 14.26 14.56 14.26 14.52 19.47 38.40
512k: 13.74 14.15 13.91 14.29 13.94 14.30 19.24 38.38
1024k: 36.89 46.40 38.02 46.02 39.38 47.30 73.48 144.0
2048k: 117.1 121.1 119.0 119.9 118.1 121.2 173.8 343.4
4096k: 131.4 131.7 131.4 131.7 131.4 131.6 179.7 358.2
8192k: 131.2 131.5 131.3 131.5 131.3 131.5 180.1 358.4
16384k: 133.2 134.3 133.2 133.8 133.2 133.8 183.5 373.8
##########################################################################
Executing benchmark on each cluster individually
OpenSSL 3.0.2, built on 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-cbc 340470.15k 880448.70k 1409682.86k 1649495.04k 1747817.81k 1755436.37k (Cortex-A73)
aes-128-cbc 149779.50k 463344.13k 951279.62k 1329947.65k 1503420.42k 1510566.57k (Cortex-A53)
aes-192-cbc 318672.95k 788053.55k 1175147.52k 1384816.98k 1457468.76k 1463533.57k (Cortex-A73)
aes-192-cbc 141732.83k 412888.13k 784826.03k 1030200.66k 1132904.45k 1140752.38k (Cortex-A53)
aes-256-cbc 308400.61k 725074.01k 1063220.65k 1197337.60k 1251161.43k 1254539.26k (Cortex-A73)
aes-256-cbc 140012.80k 385086.25k 686041.51k 864842.41k 935567.36k 940900.35k (Cortex-A53)
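
The AES figures are OpenSSL's own speed test, run once per cluster (the units
are thousands of bytes processed per second); sbc-bench presumably pins
something along the lines of openssl speed -elapsed -evp aes-128-cbc to each
core type. For reference, a rough equivalent of the 16384-byte column via the
EVP API (a sketch assuming the OpenSSL development headers are installed,
built with gcc aes_speed.c -lcrypto; not sbc-bench's actual invocation):

/* Encrypt a 16 KiB buffer repeatedly with AES-128-CBC through the EVP
 * interface and report throughput in OpenSSL's "k" units. */
#include <openssl/evp.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    unsigned char key[16] = {0}, iv[16] = {0};
    static unsigned char in[16384], out[16384 + 16];
    struct timespec t0, t1;
    int outl, rounds = 100000;

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    if (ctx == NULL ||
        EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc(), NULL, key, iv) != 1)
        return 1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < rounds; i++)         /* encrypt 16 KiB per round */
        EVP_EncryptUpdate(ctx, out, &outl, in, sizeof in);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("aes-128-cbc, 16 KiB blocks: %.2fk\n",
           rounds * (double)sizeof in / secs / 1000.0);

    EVP_CIPHER_CTX_free(ctx);
    return 0;
}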
##########################################################################
Executing benchmark single-threaded on cpu0 (Cortex-A73)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: 21333333 32000000 32000000 32000000 64000000 128000000 256000000 1024000000 682666666
RAM size: 7951 MB, # CPU hardware threads: 8
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 1817 100 1775 1768 | 28706 100 2457 2451
23: 1670 100 1710 1702 | 28152 100 2443 2437
24: 1589 100 1717 1709 | 27499 100 2420 2414
25: 1476 100 1693 1686 | 26612 100 2375 2369
---------------------------------- | ------------------------------
Avr: 100 1724 1716 | 100 2424 2418
Tot: 100 2074 2067
Executing benchmark single-threaded on cpu4 (Cortex-A53)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: 32000000 - 64000000 64000000 128000000 256000000 512000000 - 2048000000
RAM size: 7951 MB, # CPU hardware threads: 8
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 1231 100 1202 1198 | 22308 100 1909 1905
23: 1164 100 1190 1186 | 21829 100 1894 1890
24: 1122 100 1212 1207 | 21338 100 1877 1873
25: 1068 100 1225 1220 | 20765 100 1852 1848
---------------------------------- | ------------------------------
Avr: 100 1207 1203 | 100 1883 1879
Tot: 100 1545 1541
##########################################################################
Executing benchmark 3 times multi-threaded on CPUs 0-7
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: 32000000 32000000 32000000 21333333 64000000 256000000 512000000 512000000 1024000000
RAM size: 7951 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 9513 746 1240 9255 | 180499 721 2137 15396
23: 8554 724 1203 8716 | 176220 721 2116 15250
24: 8190 723 1218 8806 | 171567 719 2094 15058
25: 8275 756 1249 9448 | 167089 723 2058 14870
---------------------------------- | ------------------------------
Avr: 737 1228 9056 | 721 2101 15143
Tot: 729 1665 12100
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: 32000000 32000000 32000000 32000000 128000000 256000000 512000000 1024000000 1024000000
RAM size: 7951 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 9775 768 1239 9510 | 180100 720 2135 15362
23: 8905 746 1216 9074 | 176180 720 2117 15246
24: 8286 731 1218 8909 | 172177 721 2097 15112
25: 8109 748 1238 9259 | 165920 718 2057 14766
---------------------------------- | ------------------------------
Avr: 748 1228 9188 | 720 2101 15121
Tot: 734 1665 12155
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: 64000000 64000000 64000000 32000000 64000000 256000000 512000000 1024000000 1024000000
RAM size: 7951 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 9337 730 1245 9083 | 180594 720 2139 15404
23: 8659 726 1216 8823 | 175980 720 2116 15229
24: 8447 742 1224 9083 | 171179 719 2090 15024
25: 8233 758 1240 9400 | 166752 721 2058 14840
---------------------------------- | ------------------------------
Avr: 739 1231 9097 | 720 2101 15124
Tot: 729 1666 12111
Compression: 9056,9188,9097
Decompression: 15143,15121,15124
Total: 12100,12155,12111
##########################################################################
Testing maximum cpufreq again, still under full load. System health now:
Time big.LITTLE load %cpu %sys %usr %nice %io %irq Temp
18:24:12: 2208/2016MHz 10.11 95% 1% 92% 0% 0% 0% 57.3°C
Checking cpufreq OPP for cpu0-cpu3 (Cortex-A73):
Cpufreq OPP: 2208 Measured: 2204 (2204.926/2204.550/2204.268)
Checking cpufreq OPP for cpu4-cpu7 (Cortex-A53):
Cpufreq OPP: 2016 Measured: 2013 (2013.409/2013.311/2013.115)
##########################################################################
Hardware sensors:
hevc_thermal-virtual-0
temp1: +43.1°C (crit = +110.0°C)
gpu_thermal-virtual-0
temp1: +43.1°C (crit = +110.0°C)
soc_thermal-virtual-0
temp1: +44.6°C (crit = +110.0°C)
vpu_thermal-virtual-0
temp1: +43.3°C (crit = +110.0°C)
nna_thermal-virtual-0
temp1: +42.6°C (crit = +110.0°C)
a53_thermal-virtual-0
temp1: +43.7°C (crit = +110.0°C)
/dev/sda:
##########################################################################
Thermal source: /sys/devices/virtual/thermal/thermal_zone0/ (soc_thermal)
System health while running tinymembench:
Time big.LITTLE load %cpu %sys %usr %nice %io %irq Temp
18:05:26: 2208/2016MHz 2.93 0% 0% 0% 0% 0% 0% 51.0°C
18:06:46: 2208/2016MHz 2.98 13% 0% 12% 0% 0% 0% 51.3°C
18:08:06: 2208/2016MHz 3.04 13% 0% 12% 0% 0% 0% 46.7°C
18:09:26: 2208/2016MHz 3.01 13% 0% 12% 0% 0% 0% 47.9°C
18:10:46: 2208/2016MHz 3.00 13% 0% 12% 0% 0% 0% 50.5°C
18:12:06: 2208/2016MHz 3.12 13% 0% 12% 0% 0% 0% 52.5°C
18:13:26: 2208/2016MHz 3.08 13% 0% 12% 0% 0% 0% 49.4°C
18:14:46: 2208/2016MHz 3.02 13% 0% 12% 0% 0% 0% 50.0°C
System health while running ramlat:
Time big.LITTLE load %cpu %sys %usr %nice %io %irq Temp
18:15:08: 2208/2016MHz 3.01 0% 0% 0% 0% 0% 0% 51.3°C
18:15:14: 2208/2016MHz 3.09 12% 0% 12% 0% 0% 0% 50.2°C
18:15:20: 2208/2016MHz 3.08 12% 0% 12% 0% 0% 0% 51.2°C
18:15:26: 2208/2016MHz 3.08 12% 0% 12% 0% 0% 0% 49.0°C
18:15:32: 2208/2016MHz 3.07 12% 0% 12% 0% 0% 0% 48.1°C
18:15:38: 2208/2016MHz 3.14 13% 0% 12% 0% 0% 0% 46.3°C
18:15:44: 2208/2016MHz 3.12 12% 0% 12% 0% 0% 0% 45.6°C
18:15:50: 2208/2016MHz 3.11 15% 0% 12% 1% 0% 0% 46.1°C
18:15:56: 2208/2016MHz 3.10 12% 0% 12% 0% 0% 0% 46.2°C
System health while running OpenSSL benchmark:
Time big.LITTLE load %cpu %sys %usr %nice %io %irq Temp
18:15:58: 2208/2016MHz 3.10 0% 0% 0% 0% 0% 0% 48.1°C
18:16:14: 2208/2016MHz 3.20 13% 0% 12% 0% 0% 0% 50.2°C
18:16:30: 2208/2016MHz 3.15 12% 0% 12% 0% 0% 0% 48.2°C
18:16:46: 2208/2016MHz 3.27 13% 0% 12% 0% 0% 0% 52.7°C
18:17:02: 2208/2016MHz 3.41 13% 0% 12% 0% 0% 0% 47.2°C
18:17:19: 2208/2016MHz 3.32 13% 0% 12% 0% 0% 0% 48.9°C
18:17:35: 2208/2016MHz 3.31 13% 0% 12% 0% 0% 0% 46.8°C
System health while running 7-zip single core benchmark:
Time big.LITTLE load %cpu %sys %usr %nice %io %irq Temp
18:17:47: 2208/2016MHz 3.58 0% 0% 0% 0% 0% 0% 49.0°C
18:17:58: 2208/2016MHz 3.89 13% 0% 12% 0% 0% 0% 50.4°C
18:18:09: 2208/2016MHz 3.76 13% 0% 12% 0% 0% 0% 51.2°C
18:18:20: 2208/2016MHz 3.64 12% 0% 12% 0% 0% 0% 51.5°C
18:18:31: 2208/2016MHz 3.55 12% 0% 12% 0% 0% 0% 50.2°C
18:18:42: 2208/2016MHz 3.46 13% 0% 12% 0% 0% 0% 48.6°C
18:18:53: 2208/2016MHz 3.55 14% 0% 12% 1% 0% 0% 47.7°C
18:19:04: 2208/2016MHz 3.67 13% 0% 12% 0% 0% 0% 48.3°C
18:19:15: 2208/2016MHz 3.81 13% 0% 12% 0% 0% 0% 47.1°C
18:19:26: 2208/2016MHz 4.00 12% 0% 12% 0% 0% 0% 47.9°C
18:19:37: 2208/2016MHz 3.84 13% 0% 12% 0% 0% 0% 48.2°C
18:19:48: 2208/2016MHz 3.71 13% 0% 12% 0% 0% 0% 49.3°C
18:19:59: 2208/2016MHz 3.55 13% 0% 12% 0% 0% 0% 49.3°C
18:20:10: 2208/2016MHz 3.62 13% 0% 12% 0% 0% 0% 50.0°C
18:20:21: 2208/2016MHz 3.52 12% 0% 12% 0% 0% 0% 50.4°C
18:20:32: 2208/2016MHz 3.44 12% 0% 12% 0% 0% 0% 49.0°C
System health while running 7-zip multi core benchmark:
Time big.LITTLE load %cpu %sys %usr %nice %io %irq Temp
18:20:40: 2208/2016MHz 3.45 0% 0% 0% 0% 0% 0% 49.0°C
18:20:51: 2208/2016MHz 4.83 80% 0% 78% 0% 0% 0% 59.9°C
18:21:02: 2208/2016MHz 5.94 88% 1% 86% 1% 0% 0% 58.7°C
18:21:12: 2208/2016MHz 6.32 87% 0% 85% 0% 0% 0% 56.4°C
18:21:23: 2208/2016MHz 6.96 93% 1% 92% 0% 0% 0% 57.8°C
18:21:33: 2208/2016MHz 7.59 80% 1% 78% 0% 0% 0% 54.9°C
18:21:43: 2208/2016MHz 8.26 95% 2% 91% 0% 0% 0% 51.1°C
18:21:53: 2208/2016MHz 8.61 75% 0% 73% 1% 0% 0% 49.2°C
18:22:04: 2208/2016MHz 9.05 98% 0% 96% 0% 0% 0% 57.9°C
18:22:15: 2208/2016MHz 9.62 89% 1% 87% 0% 0% 0% 57.7°C
18:22:26: 2208/2016MHz 9.67 87% 1% 85% 0% 0% 0% 55.2°C
18:22:37: 2208/2016MHz 9.36 92% 0% 91% 0% 0% 0% 57.3°C
18:22:47: 2208/2016MHz 9.10 82% 2% 78% 0% 0% 0% 55.6°C
18:22:57: 2208/2016MHz 9.47 93% 2% 89% 1% 0% 0% 56.8°C
18:23:08: 2208/2016MHz 8.99 76% 0% 75% 0% 0% 0% 55.4°C
18:23:19: 2208/2016MHz 9.37 94% 0% 92% 0% 0% 0% 57.5°C
18:23:29: 2208/2016MHz 9.17 89% 1% 87% 0% 0% 0% 57.8°C
18:23:40: 2208/2016MHz 9.66 86% 1% 84% 0% 0% 0% 54.6°C
18:23:50: 2208/2016MHz 9.54 84% 0% 82% 1% 0% 0% 51.0°C
18:24:01: 2208/2016MHz 9.77 91% 2% 87% 0% 0% 0% 54.6°C
18:24:12: 2208/2016MHz 10.11 95% 1% 92% 0% 0% 0% 57.3°C
##########################################################################
Linux 5.4.125 (Khadas) 10/09/22 _aarch64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.27 0.28 0.21 0.18 0.00 99.05
Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
mmcblk0 2.33 193.30 12.32 0.00 68517086 4366808 0
sda 0.08 0.03 20.95 0.00 9136 7426864 0
zram1 0.01 0.01 0.05 0.00 2488 17252 0
zram2 0.01 0.01 0.05 0.00 2416 16960 0
zram3 0.01 0.01 0.05 0.00 2700 17136 0
zram4 0.01 0.01 0.05 0.00 2420 17028 0
total used free shared buff/cache available
Mem: 7.8Gi 741Mi 6.7Gi 115Mi 392Mi 6.9Gi
Swap: 3.9Gi 41Mi 3.8Gi
Filename Type Size Used Priority
/dev/zram1 partition 1017792 10512 5
/dev/zram2 partition 1017792 10612 5
/dev/zram3 partition 1017792 10492 5
/dev/zram4 partition 1017792 10624 5
CPU sysfs topology (clusters, cpufreq members, clockspeeds)
cpufreq min max
CPU cluster policy speed speed core type
0 0 0 500 2208 Cortex-A73 / r0p2
1 0 0 500 2208 Cortex-A73 / r0p2
2 0 0 500 2208 Cortex-A73 / r0p2
3 0 0 500 2208 Cortex-A73 / r0p2
4 1 4 500 2016 Cortex-A53 / r0p4
5 1 4 500 2016 Cortex-A53 / r0p4
6 1 4 500 2016 Cortex-A53 / r0p4
7 1 4 500 2016 Cortex-A53 / r0p4
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: ARM
Model name: Cortex-A73
Model: 2
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: r0p2
CPU max MHz: 2208.0000
CPU min MHz: 500.0000
BogoMIPS: 48.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
Model name: Cortex-A53
Model: 4
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: r0p4
CPU max MHz: 2016.0000
CPU min MHz: 500.0000
BogoMIPS: 48.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Branch predictor hardening, BHB
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
SoC guess: Amlogic Meson T7 (A311D2) Revision 36:b (1:3)
DT compat: amlogic, t7
Compiler: /usr/bin/gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0 / aarch64-linux-gnu
Userland: arm64
Kernel: 5.4.125/aarch64
CONFIG_HZ=250
CONFIG_HZ_250=y
CONFIG_PREEMPTION=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPT_RCU=y
raid6: neonx8 gen() 3009 MB/s
raid6: neonx8 xor() 2821 MB/s
raid6: neonx4 gen() 2489 MB/s
raid6: neonx4 xor() 2990 MB/s
raid6: neonx2 gen() 2292 MB/s
raid6: neonx2 xor() 2518 MB/s
raid6: neonx1 gen() 1799 MB/s
raid6: neonx1 xor() 1959 MB/s
raid6: int64x8 gen() 1787 MB/s
raid6: int64x8 xor() 1105 MB/s
raid6: int64x4 gen() 1429 MB/s
raid6: int64x4 xor() 1054 MB/s
raid6: int64x2 gen() 1150 MB/s
raid6: int64x2 xor() 891 MB/s
raid6: int64x1 gen() 772 MB/s
raid6: int64x1 xor() 721 MB/s
raid6: using algorithm neonx8 gen() 3009 MB/s
raid6: .... xor() 2821 MB/s, rmw enabled
raid6: using neon recovery algorithm
xor: measuring software checksum speed
xor: using function: 32regs (5662.000 MB/sec)
| Khadas VIM4 | 2208/2016 MHz | 5.4 | Ubuntu 22.04.1 LTS arm64 | 12120 | 340470 | 1254540 | 8180 | 11680 |