first commit

sg0 · Jun 6, 2020 · ce3e69c · ce3e69c
commit ce3e69c
Showing 7 changed files with 3,452 additions and 0 deletions.
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,29 @@
+BSD 3-Clause License
+
+Copyright (c) 2019, Battelle Memorial Institute
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, this
+   list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+
+3. Neither the name of the copyright holder nor the names of its
+   contributors may be used to endorse or promote products derived from
+   this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/Makefile b/Makefile
@@ -0,0 +1,36 @@
+CXX = mpicxx
+# use -xmic-avx512 instead of -xHost for Intel Xeon Phi platforms
+OPTFLAGS = -O3 -xHost -DPRINT_DIST_STATS -DPRINT_EXTRA_NEDGES
+# -DPRINT_EXTRA_NEDGES prints extra edges when -p <> is passed to 
+#  add extra edges randomly on a generated graph
+# use export ASAN_OPTIONS=verbosity=1 to check ASAN output
+SNTFLAGS = -std=c++11 -fsanitize=address -O1 -fno-omit-frame-pointer
+CXXFLAGS = -std=c++11 -g $(OPTFLAGS)
+
+ENABLE_DUMPI_TRACE=0
+ENABLE_SCOREP_TRACE=0
+
+ifeq ($(ENABLE_DUMPI_TRACE),1)
+	TRACERPATH = $(HOME)/builds/sst-dumpi/lib 
+	LDFLAGS = -L$(TRACERPATH) -ldumpi
+else ifeq ($(ENABLE_SCOREP_TRACE),1)
+	SCOREP_INSTALL_PATH = /usr/common/software/scorep/6.0/intel
+	INCLUDE = -I$(SCOREP_INSTALL_PATH)/include -I$(SCOREP_INSTALL_PATH)/include/scorep -DSCOREP_USER_ENABLE
+	LDAPP = $(SCOREP_INSTALL_PATH)/bin/scorep --user --nocompiler --noopenmp --nopomp --nocuda --noopenacc --noopencl --nomemory
+endif
+
+OBJ = main.o
+TARGET = neve
+
+all: $(TARGET)
+
+%.o: %.cpp
+	$(CXX) $(INCLUDE) $(CXXFLAGS) -c -o $@ $^
+
+$(TARGET):  $(OBJ)
+	$(LDAPP) $(CXX) -o $@ $^ $(LDFLAGS)
+
+.PHONY: clean
+
+clean:
+	rm -rf *~ *.dSYM nc.vg.* $(OBJ) $(TARGET)
diff --git a/README b/README
@@ -0,0 +1,133 @@
+*-*-*-*-*-*-*-*-*
+| | | | | | | | |
+*-*-*-neve-*-*-*-
+| | | | | | | | |
+*-*-*-*-*-*-*-*-*
+
+*****
+About
+*****
+
+`neve` is a microbenchmark that distributes a graph (real-world or synthetic) across
+processes and then performs variable-sized message exchanges (equivalent to the number
+of ghost vertices shared between processes) between process neighbors and reports
+bandwidth/latency. Further, neve provides options to shrink a real-world graph (by a
+percent of ghost vertices shared across processes), study the performance of message
+exchanges for a specific process neighborhood, analyze the impact of an edge-balanced
+real-world graph distribution, and generate a graph in-memory. neve follows the best
+practices of existing MPI microbenchmarks.
+
+We require the total number of processes to be a power of 2 and the total number of vertices 
+to be perfectly divisible by the number of processes when parallel RGG generation options 
+are used. This constraint does not apply to real-world graphs passed to neve.
+
+neve has an in-memory random geometric graph generator (just like our other mini-application,
+miniVite[?]) that can be used for weak-scaling analysis. An n-D random geometric graph (RGG) 
+is generated by randomly placing N vertices in an n-D space and connecting pairs of vertices 
+whose Euclidean distance is less than or equal to d. We only consider 2D RGGs contained within 
+a unit square, [0,1]^2. We distribute the domain such that each process receives N/p vertices 
+(where p is the total number of processes). Each process owns (1 * 1/p) portion of the unit 
+square and d is computed as (please refer to Section 4 of miniVite paper[?] for details): 
+
+d = (dc + dt)/2;
+where, dc = sqrt(ln(N) / (pi*N)); dt = sqrt(2.0736 / (pi*N))
+
+Therefore, the number of vertices (N) passed during execution on p processes must satisfy 
+the condition -- 1/p > d. Unlike miniVite, the edge weights of the generated graph are always 
+1.0, and not the Euclidean distance between vertices (because we only perform communication in 
+this microbenchmark, and no computation, so edge weights are irrelevant).
+
+[?] Ghosh, Sayan, Mahantesh Halappanavar, Antonio Tumeo, Ananth Kalyanaraman, and 
+Assefaw H. Gebremedhin. "miniVite: A Graph Analytics Benchmarking Tool for Massively 
+Parallel Systems." In 2018 IEEE/ACM Performance Modeling, Benchmarking and Simulation of 
+High Performance Computer Systems (PMBS), pp. 51-56. IEEE, 2018.
+
+Please note, the default distribution of a graph generated by the built-in random 
+geometric graph generator causes a process to communicate only with its two 
+immediate neighbors. To increase the communication intensity for generated graphs, 
+use the "-p" option to specify an extra percentage of edges that will be generated, 
+linking random vertices. As a side effect, this option significantly increases the 
+time required to generate the graph, so low values are preferred. The maximum number 
+of edges that can be added randomly must be less than or equal to INT_MAX; at present, 
+we do not handle cases in which "-p <percent>" resolves to more than INT_MAX extra 
+edges.
+
+We also allow users to pass any real-world graph as input. However, we expect an input graph 
+to be in a certain binary format, which we have observed to be more efficient than reading 
+ASCII format files. The code for binary conversion (from a variety of common graph formats) 
+is packaged separately with another software package called Vite, which is our distributed-memory
+implementation of graph community detection. Please follow the instructions in the Vite README for 
+binary file conversion. Vite can be downloaded from the following GitHub link (please do not use 
+the older PNNL/PNL link): 
+https://github.com/Exa-Graph/vite
+
+Please contact the following for any queries or support:
+
+Sayan Ghosh, PNNL (sayan dot ghosh at pnnl dot gov)
+Mahantesh Halappanavar, PNNL (hala at pnnl dot gov)
+
+*******
+Compile
+*******
+
+neve is a C++ header-only library and requires an MPI implementation. It uses MPI Send/Recv and 
+collectives (for synthetic graph generation). Please update the Makefile with compiler flags and 
+use a C++11 compliant compiler of your choice. Invoke `make clean; make` after setting paths 
+to MPI for generating the binary. Use `mpirun` or `mpiexec` or `srun` to execute the code with 
+specific runtime arguments mentioned in the next section.
+
+Pass -DPRINT_DIST_STATS while building for printing distributed graph characteristics.
+
+*****************
+Execution options
+*****************
+E.g.: 
+mpiexec -n 2 ./neve -f karate.bin -w -t 0 -z 5
+mpiexec -n 4 ./neve -f karate.bin -w -t 0 -s 2 -g 5
+mpiexec -n 2 ./neve -l -n 100 -w 
+mpiexec -n 2 ./neve -n 100 -t 0
+mpiexec -n 2 ./neve -p 2 -n 100 -w 
+
+Possible options (can be combined):
+
+1.  -f <bin-file>   : Specify input binary file after this argument. 
+2.  -b              : Only valid for real-world inputs. Attempts to distribute an approximately 
+                      equal number of edges among processes, so the number of vertices owned by
+                      each process may be irregular. Increases the distributed graph creation
+                      time due to serial overheads, but may improve overall execution time.
+3.  -n <vertices>   : Only valid for synthetically generated inputs. Pass total number of 
+                      vertices of the generated graph.
+4.  -l              : Use distributed LCG for randomly choosing edges. If this option 
+                      is not used, we will use C++ random number generator (using 
+                      std::default_random_engine).
+5.  -p <percent>    : Only valid for synthetically generated inputs. Specify percent of overall 
+                      edges to be randomly generated between processes.
+6.  -r <nranks>     : This is used to control the number of aggregators in MPI I/O and is
+                      meaningful when an input binary graph file is passed with option "-f".
+                      naggr := (nranks > 1) ? (nprocs/nranks) : nranks;
+7.  -w              : Report Bandwidth in MB/s[*].
+8.  -t <0|1|2>      : Report Latency in microseconds[*]. Option '0' uses nonblocking Send/Recv, 
+                      Option '1' uses MPI_Neighbor_alltoall and Option '2' uses MPI_Neighbor_allgather.
+9.  -x <bytes>      : Maximum data exchange size (in bytes).
+10. -m <bytes>      : Minimum data exchange size (in bytes).
+11. -s <PE>         : Analyze a single process neighborhood. Performs bidirectional message 
+                      exchanges between the process neighbors of a particular PE passed by the 
+                      user. This transforms the neighbor subgraph of a particular PE as a fully
+                      connected graph.
+12. -g <count>      : Specify maximum number of ghosts shared between process neighbors of a 
+                      particular PE. This option is only valid when -s <PE> is passed (see #11).
+13. -z <percent>    : Select a percentage of ghost vertices of a real-world graph distributed
+                      across processes as actual ghosts. For example, let's assume that a graph is
+                      distributed across 4 processes, and PE#0 has two neighbors, PE#1 and PE#3,
+                      with each of which it shares 100 ghost vertices. In such a configuration, 
+                      the b/w test would perform variable-sized message exchanges 100 times 
+                      between PE#0 and {PE#1, PE#3}. The -z <percent> option can limit the 
+                      number of message exchanges by selecting a percentage of the actual number 
+                      of ghosts, while maintaining the overall structure of the original graph. 
+                      Hence, this option can help 'shrink' a graph. Only valid for the b/w test, because
+                      the latency test just performs message transfers between process neighborhoods.
+
+[*]Note: Unless -w or -t <...> is passed, the code will just load the graph and create the graph data 
+structure. This is deliberate, in case we want to measure the overhead of loading a graph and 
+time only the file I/O part (like measuring the impact of different #aggregators through the 
+-r <nranks> option).
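
The README's constraints on synthetic (RGG) inputs can be checked before launching a run. The following is a minimal sketch, not code from this commit: the helper names `rgg_distance`, `is_power_of_two`, and `rgg_inputs_ok` are hypothetical. It assumes the formula reads d = (dc + dt)/2 with dc = sqrt(ln(N) / (pi*N)) and dt = sqrt(2.0736 / (pi*N)), and that a valid configuration needs p to be a power of 2, N divisible by p, and 1/p > d.

```cpp
// Sketch only: these helpers are illustrative and are not part of neve.
#include <cassert>
#include <cmath>
#include <cstdint>

// Connection distance d = (dc + dt) / 2 for a 2D RGG with n vertices,
// following the formula quoted in the README (assumed parenthesization).
double rgg_distance(std::int64_t n)
{
    assert(n > 1); // ln(n) must be positive for dc to be meaningful
    const double pi = std::acos(-1.0);
    const double dc = std::sqrt(std::log(static_cast<double>(n)) / (pi * n));
    const double dt = std::sqrt(2.0736 / (pi * n));
    return (dc + dt) / 2.0;
}

// True when p is a positive power of two.
bool is_power_of_two(int p)
{
    return p > 0 && (p & (p - 1)) == 0;
}

// Checks the three conditions the README states for RGG generation:
// p is a power of 2, n is divisible by p, and 1/p > d.
bool rgg_inputs_ok(std::int64_t n, int p)
{
    if (n <= 1 || !is_power_of_two(p)) return false;
    if (n % p != 0) return false;
    return (1.0 / p) > rgg_distance(n);
}
```

Under these assumptions, `rgg_inputs_ok(100, 2)` holds (matching the README example with `-n 100` on 2 processes), while `rgg_inputs_ok(128, 16)` does not, because each process's 1/16 strip is narrower than d.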