- The following instructions assume that:
  - a CockroachDB secure cluster has been set up.
  - the client certificates to access the CockroachDB cluster are placed in a `certs` directory.
- Log into one of the CockroachDB cluster nodes.
- Download the provided project files into the CockroachDB cluster node.

```sh
wget http://www.comp.nus.edu.sg/~cs4224/project_files_4.zip
```
- For each of the provided data files in the `project_files` directory, run the following command to upload the file to the local CockroachDB cluster node.

```sh
cockroach nodelocal upload <source file> <destination file> --certs-dir=certs
# e.g.
cockroach nodelocal upload cockroachdb_data_setup/project_files_4/data_files/warehouse.csv warehouse.csv --certs-dir=certs
```
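If you would rather upload all of the provided CSV files in one go, a loop along the following lines should work; the source directory below follows the example path above and may need adjusting to match where you extracted the project files.

```sh
# Upload every provided CSV file to the local CockroachDB node.
# Adjust the source directory to wherever the data files were extracted.
for f in cockroachdb_data_setup/project_files_4/data_files/*.csv; do
  cockroach nodelocal upload "$f" "$(basename "$f")" --certs-dir=certs
done
```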
- The output shows the id of the CockroachDB node that the file was uploaded to; this should be the id of the current cluster node. Jot down the id.
- Upload the `.sql` files under the `scripts/load_data` directory of the project root to the cluster node.
- Run the command below to create the wholesale database. Enter the password of the CockroachDB user when prompted.

```sh
cockroach sql --user=<CockroachDB username> --certs-dir=certs --file create_table.sql
```
- Substitute the id of the node in `load_data.sql` with the id of the node where the data files were uploaded to, i.e.

```
# Replace 4 with the id of the node where the project files were uploaded to
... CSV DATA ('nodelocal://4/warehouse.csv') ...
```
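If you prefer to make the substitution from the command line, something like the following GNU `sed` one-liner can do it; the target id `3` here is only an example and should be the node id you jotted down earlier.

```sh
# Rewrite every nodelocal node id in load_data.sql in place (GNU sed).
# On macOS/BSD sed, use: sed -i '' 's|...|...|g' load_data.sql
sed -i 's|nodelocal://[0-9]*/|nodelocal://3/|g' load_data.sql
```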
- Run the command below to load the data into the wholesale database.

```sh
cockroach sql --user=<CockroachDB username> --certs-dir=certs --file load_data.sql
```
- Run the command below to create indices in the wholesale database.

```sh
cockroach sql --user=<CockroachDB username> --certs-dir=certs --file create_index.sql
```

- Note that this command should only be run after the data has been loaded into the database, because CockroachDB currently does not support the `IMPORT` statement on tables that already have secondary indices.
The following compilation steps have been tested on macOS.
- Install the following software on your local machine:
- Gradle (version 7.2)
- Java (version 11.0.12)
- Make sure that the `JAVA_HOME` environment variable points to the installed Java 11 directory.
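On macOS, one way to point `JAVA_HOME` at a Java 11 installation is shown below; this assumes a standard JDK 11 package is installed and is only a convenience, not a required step.

```sh
# Set JAVA_HOME using the macOS java_home helper (other platforms differ).
export JAVA_HOME=$(/usr/libexec/java_home -v 11)
echo "$JAVA_HOME"  # verify it points at the Java 11 directory
```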
- To compile, run the following command in the project root directory.

```sh
gradle shadowJar
```
- The compiled jar file can be found in the `build/libs` directory.
The following instructions assume that:
- a database has been created following the instructions in the set up database section.
- a user with a password has been created in the CockroachDB cluster.
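If the user does not exist yet, it can be created from a cluster node; the statements below are a minimal sketch, and the username, password, and exact privileges are placeholders that should be adapted to your setup.

```sh
# Create a SQL user with a password and grant it access to the wholesale database.
cockroach sql --certs-dir=certs -e "CREATE USER IF NOT EXISTS cs4224 WITH PASSWORD 'changeme';"
cockroach sql --certs-dir=certs -e "GRANT ALL ON DATABASE wholesaledb TO cs4224;"
```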
```
usage: CS4224_CDB-1.0-SNAPSHOT-all.jar
 -d,--Database <arg>      Database name
 -f,--fileName <arg>      Name of transaction query file
 -i,--ip <arg>            IP address of CockroachDB cluster
 -l,--logFileName <arg>   Name of log file
 -p,--port <arg>          Port of CockroachDB cluster
 -pw,--password <arg>     Password of CockroachDB user provided
 -t,--task <arg>          Type of task: transaction or dbstate
 -u,--user <arg>          Username of CockroachDB user
```
- Required arguments for all types of tasks: `-t`
- Required arguments for processing an input transaction file: `-d`, `-f`, `-u`, `-pw`
- Required arguments for computing the final state of the database: `-d`, `-u`, `-pw`
- Other arguments are optional.
- Default values of the optional arguments:
  - `-l`: out.log
  - `-i`: localhost
  - `-p`: 26257
- Example 1: Runs the jar file on the cluster node that runs the CockroachDB instance:

```sh
java -jar CS4224_CDB-1.0-SNAPSHOT-all.jar -t transaction -f xact_files_B/0.txt -d wholesaledb -u [username] -pw [password] -l 0-out.log 1> out/workload_B/0.out 2> out/workload_B/0.err
```

- Example 2: Runs the jar file on a remote machine (i.e. not on the cluster node that runs the CockroachDB instance):

```sh
java -jar CS4224_CDB-1.0-SNAPSHOT-all.jar -t transaction -f xact_files_B/0.txt -d wholesaledb -u [username] -pw [password] -l 0-out.log -ip [CockroachDB node IP address] 1> out/workload_B/0.out 2> out/workload_B/0.err
```
The final state of the database is saved to a file called `dbstate.csv`.

- Example 1: Runs the jar file on the cluster node that runs the CockroachDB instance:

```sh
java -jar CS4224_CDB-1.0-SNAPSHOT-all.jar -t dbstate -d wholesaledb -u [username] -pw [password]
```

- Example 2: Runs the jar file on a remote machine:

```sh
java -jar CS4224_CDB-1.0-SNAPSHOT-all.jar -t dbstate -d wholesaledb -u [username] -pw [password] -ip [CockroachDB node IP address]
```
A few Bash scripts have been created for running 40 clients simultaneously. The scripts are `prep.sh`, `launch.sh`, and `run-clients.sh`. They can be found under `scripts/profiling` of the project root directory.
The scripts assume that:
- there are 5 CockroachDB cluster nodes.
- `tmux` is installed on those nodes.
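A quick way to confirm the `tmux` assumption is to probe each node over SSH; the hostnames below follow the `xcnc4x` naming used later in this section and are placeholders for your actual node names.

```sh
# Check that tmux is available on every node (hostnames are placeholders).
for server in xcnc4{0..4}; do
  ssh "$server" "command -v tmux" || echo "tmux missing on $server"
done
```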
- Upload the scripts in `scripts/profiling` to one of the CockroachDB cluster nodes.
- Create a directory in the `/temp` directory of the cluster node, e.g. `mkdir -p /temp/cs4224o/profiling/cockroachdb`.
- In the created directory, create a directory called `profiling_files`.
- Upload the compiled jar file to the `profiling_files` directory.
- Copy the provided transaction file directories (`xact_files_A` and `xact_files_B`) into the `profiling_files` directory.
- Copy `run-clients.sh` into the `profiling_files` directory.
- `cd` to the parent directory of the `profiling_files` directory.
- Place `prep.sh`, `launch.sh`, and `gather_outputs.sh` in the current directory.
- In `prep.sh`, substitute the `servers` variable with the list of hostnames of the other nodes to run the clients on.
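The exact form of the `servers` variable depends on how `prep.sh` is written; as a purely illustrative sketch, the edit might look like the following, with placeholder hostnames.

```sh
# Inside prep.sh (illustrative only): the other nodes that will run clients.
servers="xcnc41 xcnc42 xcnc43 xcnc44"
```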
- Run `prep.sh` to send the `profiling_files` archive to the group of CockroachDB cluster nodes.
- In `launch.sh`, substitute the `servers` variable with the list of hostnames of the other nodes to run the clients on.
- Run `launch.sh` to launch 40 clients simultaneously.
  - The script launches 8 client instances at each node, assigning client i to server S(i mod 5). For example, clients 0, 5, 10, 15, 20, 25, 30, 35 execute on `xcnc40`, clients 1, 6, 11, 16, 21, 26, 31, 36 execute on `xcnc41`, and so on.
  - The script runs `run.sh` in the `profiling_files` subdirectory of the current directory on every node.
```
Usage: launch <database_name> <workload_type>
  database_name: Name of database for the workload
  workload_type: A or B
```

```sh
# e.g.
./launch.sh wholesaledb A
```
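For reference, the client-to-node assignment described above corresponds to logic along the following lines; this is only an illustration of the i mod 5 mapping, not the actual contents of `launch.sh`, and the hostnames are placeholders.

```sh
# Illustration only: map 40 clients onto 5 servers (client i -> server i mod 5).
servers=(xcnc40 xcnc41 xcnc42 xcnc43 xcnc44)
for i in $(seq 0 39); do
  echo "client $i -> ${servers[$((i % 5))]}"
done
```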
- Run `tmux ls` to check the status of the running clients on the current node.
- Add the following to your `~/.bash_profile` on the cluster node to check the status of the running clients on the other nodes. Run `source ~/.bash_profile` to reload the Bash profile.

```sh
# Replace the list of servers (xcnc4{1..4}) accordingly.
alias checkstatus='for server in xcnc4{1..4}; do ssh $server "tmux ls"; done'
```
- Run `checkstatus` to check the status of the running clients on the other nodes.
- Once the running clients finish, you can run `gather_outputs.sh` to gather all the output files back to the current node.
- Replace the list of nodes in `gather_outputs.sh` before running the script.
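The contents of `gather_outputs.sh` are not reproduced here; conceptually it does something along the lines of the sketch below, where the hostnames and the remote path are placeholders taken from the earlier examples.

```sh
# Hypothetical sketch: copy the output files back from each client node.
mkdir -p outputs
for server in xcnc4{1..4}; do
  scp "$server:/temp/cs4224o/profiling/cockroachdb/profiling_files/*.out" outputs/
  scp "$server:/temp/cs4224o/profiling/cockroachdb/profiling_files/*.err" outputs/
done
```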
- A Python script called `stats_cal.py` is provided in the Cassandra implementation repository to consolidate the statistics output of each client. Please refer to the `README.md` of the Cassandra implementation for where to find the script and how to use it. The same script can be used to consolidate the statistics output of the clients for the CockroachDB implementation.