-
Notifications
You must be signed in to change notification settings - Fork 31
/
README
172 lines (147 loc) · 6.01 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
/***********************************************************************
*
* NAS Parallel Benchmarks 3.0
*
* Unofficial OpenMP C Version
*
* Copyright 2014 University of Versailles Saint Quentin en Yvlines
* Copyright 2000 Omni OpenMP Compiler Project
* Copyright 1991-2014 NASA Advanced Supercomputing Division
*
* November 08, 2014
*
***********************************************************************/
1. Introduction
This package contains an unofficial C version of the NAS Parallel Benchmarks
OpenMP 3.0. The benchmarks are derived from the Omni OpenMP Compiler Project
2.3 unofficial C version of NPB 2.3.
The benchmarks were modified to match the new official 3.0 Fortran NPB.
benchmarks. In particular, benchmarks in this release follow the same parallel
region structure than the official 3.0 Fortran NPB.
2. Change Log
This section tracks all the modifications that were made to update the
Omni OpenMP 2.3 unofficial C version to the 3.0 version.
Each modification includes an annotation of the form XXX:YYY,
where XXX is the line number in the 2.3 version and YYY is the
line number in the 3.0 version.
BT
* Transforms the initialization process into distinct parallel regions:
129:127 - initialize
131:128 - lhsinit
133:129 - exact_rhs
* Splits adi function into separated parallel regions:
211:206 - compute_rhs
213:209 - x_solve
215:212 - y_solve
217:215 - z_solve
219:218 - add
* TODO - Initialization part different
CG
* Updates sparse in makea
761:756 - add preload data pages parallel region loop
* Splits the first main parallel region
187:195 - initialization parallel region in main are reduced and single regions are replaced by serial parts
356:347 - conj grad (see below)
209:219 - three loops parallelization
* Decompose Conj grad function into parallel regions
402:395 - initialize conj grad algorithm
413:405 - conj grad iteration loop
551:551 - compute residual norm explicitly
* Split the second main parallel region
261:260 - conj grad
275:271 - post conj grad two loops parallelization
* 78:78 - Remove temporarry array w
* 375:367 - Remove static variables in conj grad since they are initialized at each iteration
* 500:494 - Add barrier after reduction because LLVM OpenMP does not support implicit synchronization after a reduction
* 504:498 - Remove single because all the variables are private due to parallel regions updates
EP
* Parallelizes x-array initialization
109:110 - main
FT
* Splits the first main parallel region
123:123 - compute_indexmap
131:130 - fft (see below)
* Splits the second main parallel region
176:168 - evolve
164:158 - fft
187:177 - fft
845:849 - checksum
* Decompose fft into parallel regions
514:501 - cffts1
562:556 - cffts2
607:606 - cffts3
* define y0 and y1 in cffts[123] on the stack because there is no pointer in fortran
* TODO - need to insert init_ui to touch the initial data
IS
* Already in C
LU
* Splits into parallel regions boundaries and initialization of dependent variables and also forcing term computing
125:123 - setbv
130:128 - setiv
135:133 - erhs
* Produces setparte parallel regions for ssor
3073:3092 - l2norm
3068:3087 - rhs
3064:3083 - SSOR initialization
3094:3112 - SSOR iteration
* TODO - Parallelize post computation part
- error
- pintgr
MG
* Turns omp parallel region into omp parallel loop for
1239:1211 - zero3
* Split first big parallel region
238:233 - norm2u3
257:249 - mg3P
258:243 - resid
* Splits the main iteration big parallel region
273:262- resid
274:263 - norm2u3
277:266 - mg3P
* Transforms the omp parallel region into omp parallel loops from main and mg3P
846:826 - norm2u3
527:516 - resid
608:595 - rprj3
1245:1217 - zero3
463:454 - psinv
684:669 - interep (produces two parallel regions)
* 820:835 - Remove static because of algorithmic changes
* 836:862 - Algorithmic changes
SP
* Splits adi function into separated parallel regions:
205:204 - compute_rhs
208:207 - txinvr
211:210 - x_solve and ninvr
214:213 - y_solve and tzetar
216:216 - z_solve and pinvr
220:219 - add
* TODO - Initialize has to be updated
- 654:659 parallelize the function
* TODO - Post computation verify has to be parallelized
221:226 - error_norm
3. Installation
THe package should contain the following files/directories:
README - this file
README.omc - Readme of the Omni OpenMP Compiler release
LOG.omc - Change log of the Omni OpenMP Compiler release
Makefile - makefile for the suite (not modified from NPB2.3-omp)
Doc - documentations (not modified from NPB2.3-serial)
BT, CG, EP, FT, IS, LU, MG, SP - directory for each program
bin - directory to put executable files
common - common routines (only change version display from NPB2.3-omp)
config - configuration files used by 'make' (not modified from NPB2.3-omp)
sys - utilities (only change version display from NPB2.3-omp)
To use the suite, edit file 'make.def' in directory 'config'.
You must specify the name of compiler and linker, and compiler options.
For more details, refer to file "README.install" in subdirectory "Doc" and to "README.omc".
4. Information
- Original benchmark suite
Information on the NAS Parallel Benchmarks is available at
http://www.nas.nasa.gov/NAS/NPB/.
- C translation
Information on the OpenMP C versions and the Omni OpenMP compiler is
available at http://pdplab.trc.rwcp.or.jp/pdperf/Omni/.
- 3.0 NAS update
Information on the NAS Parallel Benchmarks translation from 2.3 to 3.0 is available at http://benchmark-subsetting.github.io/cNPB/
Note that NAS does not support the OpenMP C versions nor the Omni OpenMP compiler supports the 3.0 structured NAS.
License informations about the NAS NPB benchmarks can be found at http://opensource.org/licenses/nasa1.3.php