CHANGES
lmbench3-alpha1
Added new benchmark line, which determines the cache line size.
Added new benchmark tlb, which determines the effective TLB size.
Note that this may differ from the hardware TLB size due to OS
TLB entries and super-pages.
Added new benchmark par_mem, which determines the possible
speedup due to multiple memory reads progressing in parallel.
This number usually depends strongly on the portion of the
memory hierarchy being probed, with higher caches generally
having greater parallelism.
Added new benchmark cache, which determines the number of caches,
their sizes, latency, and available parallelism. It also
reports the latency and available parallelism for main memory.
Added new benchmark lat_ops, which attempts to determine the
latency of basic operations, such as add, multiply and divide,
for a variety of data types, such as int, int64, float and
double.
Added new benchmark par_ops, which attempts to determine the
available scaling of the various basic operations for various
data types.
Added new benchmark stream, which reports memory bandwidth
numbers using benchmark kernels from John McCalpin's STREAM
and STREAM version 2 benchmarks.
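For orientation, the best-known STREAM kernel is "triad". The following
is a minimal sketch of that kernel only, not the code used by the stream
benchmark; the function names stream_triad and triad_bytes are
hypothetical, and the byte count assumes two reads and one write per
element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: STREAM-style triad, a[i] = b[i] + q * c[i]. */
static void stream_triad(double *a, const double *b, const double *c,
                         double q, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + q * c[i];
}

/* Bytes moved by one triad pass: two loads and one store per element.
 * Dividing this by the elapsed time gives a bandwidth figure. */
static size_t triad_bytes(size_t n)
{
    return 3 * n * sizeof(double);
}
```

A caller would time stream_triad() over arrays much larger than the
last-level cache and divide triad_bytes(n) by the elapsed time.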
Added new benchmark lat_sem, which reports SysV semaphore latency.
Added getopt() command line parsing to most benchmarks.
Added a new benchmark timing harness, benchmp(), which makes
it relatively easy to design and build benchmarks which
measure system performance under a fixed load. It takes
a few parameters:
- initialize: a function pointer. If this is non-NULL
the function is called in the child processes after
the fork but before any benchmark-related work is
done. The function is passed a cookie from the
benchmp() call. This can be a pointer to a
data structure which lets the function know what
it needs to do.
- benchmark: a function pointer. This function
takes two parameters, an iteration count "iters",
and a cookie. The benchmarked activity must be
run "iters" times (or some integer multiple of
"iters"). This function must be idempotent; i.e.,
the benchmark harness must be able to call it
as many times as necessary.
- cleanup: a function pointer. If this is non-NULL
the function is called after all benchmarking is
completed to clean up any resources that may have
been allocated.
- enough: If this is non-zero then it is the minimum
amount of time, in microseconds, that the benchmark
must be run to provide reliable results. In most
cases this is left to zero to allow the harness to
autoscale the timing intervals to the system clock's
resolution/accuracy.
- parallel: this is the number of child processes
running the benchmark that should be run in parallel.
This is really the load factor.
- warmup: a time period, in microseconds, that each
child process must run the benchmarked activity
before any timing intervals can begin. This optional
delay gives the system scheduler time to settle in
a parallel/distributed system before measurements
begin.
- repetitions: If non-zero, this is the number of
times we need to repeat each measurement. The
default is 11.
- cookie: An opaque value which can be used to
pass information to the initialize(), benchmark(),
and cleanup() routines.
This new harness is now used by: bw_file_rd, bw_mem, bw_mmap_rd,
bw_pipe, bw_tcp, bw_unix, lat_connect, lat_ctx, lat_fcntl,
lat_fifo, lat_mem_rd, lat_mmap, lat_ops, lat_pagefault, lat_pipe,
lat_proc, lat_rpc, lat_select, lat_sem, lat_sig, lat_syscall,
lat_tcp, lat_udp, lat_unix, lat_unix_connect, and stream.
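The callback contract described for benchmp() can be illustrated with a
single-process toy harness. This is a sketch under stated assumptions,
not lmbench's implementation: every name below (toy_benchmp, toy_state,
and so on) is hypothetical, the "parallel" parameter is omitted because
nothing is forked, the 5000-microsecond fallback for "enough" is an
arbitrary choice, and reporting the median is this sketch's choice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical names: this is a toy, not lmbench's benchmp(). */
typedef void (*toy_setup_f)(void *cookie);
typedef void (*toy_bench_f)(uint64_t iters, void *cookie);

static double now_usec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e6 + (double)ts.tv_nsec / 1e3;
}

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Returns the median time per iteration, in microseconds. */
static double toy_benchmp(toy_setup_f initialize, toy_bench_f benchmark,
                          toy_setup_f cleanup, int enough, int warmup,
                          int repetitions, void *cookie)
{
    double samples[64], start;
    double min_usec = enough > 0 ? (double)enough : 5000.0; /* assumed */
    int n = repetitions > 0 ? repetitions : 11; /* default from the text */
    uint64_t iters = 1;

    assert(benchmark != NULL);
    if (n > 64)
        n = 64;
    if (initialize)
        initialize(cookie);

    /* Warm up: run the activity untimed for at least `warmup` usec. */
    start = now_usec();
    while (now_usec() - start < (double)warmup)
        benchmark(1, cookie);

    /* Autoscale: double iters until one interval lasts min_usec. */
    while (iters < (UINT64_C(1) << 40)) {
        start = now_usec();
        benchmark(iters, cookie);
        if (now_usec() - start >= min_usec)
            break;
        iters *= 2;
    }

    /* Timed repetitions, each running the activity `iters` times. */
    for (int i = 0; i < n; i++) {
        start = now_usec();
        benchmark(iters, cookie);
        samples[i] = (now_usec() - start) / (double)iters;
    }
    if (cleanup)
        cleanup(cookie);

    qsort(samples, (size_t)n, sizeof samples[0], cmp_double);
    return samples[n / 2];
}

/* Example callbacks; the cookie carries all benchmark state. */
typedef struct {
    int inits;               /* times initialize ran */
    int cleanups;            /* times cleanup ran */
    volatile uint64_t count; /* total iterations performed */
} toy_state;

static void toy_init(void *cookie)    { ((toy_state *)cookie)->inits++; }
static void toy_cleanup(void *cookie) { ((toy_state *)cookie)->cleanups++; }

static void toy_work(uint64_t iters, void *cookie)
{
    toy_state *s = cookie;
    while (iters-- > 0)
        s->count++;  /* stand-in for the benchmarked activity */
}
```

Calling toy_benchmp(toy_init, toy_work, toy_cleanup, 0, 0, 0, &state)
leaves interval scaling and the repetition count to the harness, which
mirrors the "left to zero" usage described above. Because toy_work() is
called many times with varying "iters", it must tolerate repeated calls,
which is the idempotence requirement in practice.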