Use touchstone-tools for stats and plots
Start using touchstone-tools to collect and plot OS and PostgreSQL stats.
Touchstone-tools is the evolution of the corresponding scripts from their
OSDL Database Test kit origins.

As usage of the kit has changed over time, these changes aim to make this
benchmarking kit easier to use by adopting plotting engines and statistical
tools that meet the kit's needs, reduce package dependencies, and process
data in a reasonable amount of time.

* Prefer gnuplot over R or Julia for plotting
* Prefer sqlite over R or Julia for manipulating data

Also update the AppImage accordingly.
markwkm committed Apr 25, 2023
1 parent a6de000 commit 9d1559e
Showing 11 changed files with 217 additions and 611 deletions.
17 changes: 15 additions & 2 deletions ChangeLog
@@ -2,14 +2,27 @@ vX.Y.Z (YYYY-MM-DD)

- Add "version" script to show kit version and version of software that may be
used by the kit
- Update AppImage to install dbttools v0.4.1 for its version information
reporting
- Update AppImage to install dbttools v0.5.0 for its version information
reporting, and updated reporting scripts
- Use .svg file for AppImage; remove .png file.
- Use ARGV0 instead of hard coding name in AppRun
- Remove Julia version of dbt2-post-process
- Add SQLite3 version of dbt2-post-process
- Calculate type 1 quantiles
- Rename container directory to tools
- Use touchstone-tools v0.4.2 for gnuplot based sar, pidstat and postgresql
stats plotting
- Remove dbt2-sysstats in favor of touchstone-tools ts-sysstat version.
- Update general and postgresql report generation based on touchstone-tools
differing chart filenames and directory layout
- Stop generating PDF reports with pandoc unless specified to save time
- Remove sar post-processing since ts-sysstat handles that
- Move profiling, stats and information collection into the db directory for
1-tier test execution with run-workload script
- Remove postgresql stats collection in favor of touchstone-tools ts-pgsql-stat
- Update appimage container file with dbttools v0.5.0 and touchstone-tools
v0.4.2
- Add OS memory paging charts to report; previously omitted

v0.50.1 (2023-03-03)
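
The "Calculate type 1 quantiles" entry above presumably refers to the type 1 definition in the Hyndman and Fan taxonomy (the inverse of the empirical distribution function, with no interpolation between samples). The following is only a minimal sketch of that definition using sort and awk; it is not the kit's SQLite implementation, and the function name quantile_type1 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of the kit: read one number per line on stdin
# and print the type 1 quantile for probability P, where 0 < P <= 1.
# Type 1 picks the sample at position ceil(n * P) in the sorted data.
quantile_type1() {
    sort -n | awk -v p="$1" '
        { x[NR] = $1 }
        END {
            if (NR == 0) exit 1
            h = NR * p
            j = int(h)
            if (j < h) j++        # ceiling of n * p
            if (j < 1) j = 1
            print x[j]
        }'
}

# Sorted input is 3 7 12 18 25; ceil(5 * 0.9) = 5, so this prints 25.
printf '%s\n' 12 7 3 25 18 | quantile_type1 0.9
```

Because the result is always one of the observed values, a reported 90th percentile response time corresponds to an actual transaction. Note that the sketch relies on floating point multiplication, so a product that lands just above an integer would round up one position.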

25 changes: 24 additions & 1 deletion Containerfile.appimage
@@ -9,13 +9,15 @@ ENV ARCH=x86_64
RUN yum -qy update && \
yum -qy install bison \
bzip2 \
cairo-devel \
cmake \
file \
flex \
gcc \
libev-devel \
libffi-devel \
make \
pango-devel \
patch \
perl \
sqlite-devel \
@@ -89,7 +91,7 @@ RUN ./configure --silent --without-ldap --without-readline --without-zlib \

# Install DBT Tools

ARG DBTTOOLSVER="0.4.1"
ARG DBTTOOLSVER="0.5.0"
RUN curl -o /tmp/v${DBTTOOLSVER}.tar.gz \
-sOL https://github.com/osdldbt/dbttools/archive/refs/tags/v${DBTTOOLSVER}.tar.gz && \
tar -C /usr/local/src -xf /tmp/v${DBTTOOLSVER}.tar.gz
@@ -116,6 +118,27 @@ RUN ./configure --quiet --silent --prefix=/usr/local/AppDir/usr && \
make -s -j $(nproc) install && \
sed -i -e 's#/usr#././#g' /usr/local/AppDir/usr/bin/sqlite3

ARG GNUPLOTVER="5.4.6"
RUN curl -o /tmp/gnuplot-${GNUPLOTVER}.tar.gz \
-sOL https://sourceforge.net/projects/gnuplot/files/gnuplot/${GNUPLOTVER}/gnuplot-${GNUPLOTVER}.tar.gz/download && \
tar -C /usr/local/src -xf /tmp/gnuplot-${GNUPLOTVER}.tar.gz
WORKDIR /usr/local/src/gnuplot-${GNUPLOTVER}
RUN ./configure -q --prefix=/usr/local/AppDir/usr && \
make -s -j$(nproc) install && \
sed -i -e 's#/usr#././#g' /usr/local/AppDir/usr/bin/gnuplot

ARG TSTOOLSVER="0.4.2"
RUN curl -o /tmp/touchstone-tools-v${TSTOOLSVER}.tar.gz \
-sOL https://gitlab.com/touchstone/touchstone-tools/-/archive/v${TSTOOLSVER}/touchstone-tools-v${TSTOOLSVER}.tar.gz && \
mkdir -p /usr/local/src/touchstone-tools-v${TSTOOLSVER} && \
tar -C /usr/local/src/touchstone-tools-v${TSTOOLSVER} \
--strip-components=1 \
-xf /tmp/touchstone-tools-v${TSTOOLSVER}.tar.gz
WORKDIR /usr/local/src/touchstone-tools-v${TSTOOLSVER}
RUN cmake -H. -Bbuilds/release -DCMAKE_INSTALL_PREFIX=/usr && \
(cd builds/release && make -s install DESTDIR=/usr/local/AppDir) && \
rm -f /tmp/touchstone-tools-v${TSTOOLSVER}.tar.gz

WORKDIR /usr/local
RUN curl -sOL https://github.com/AppImage/AppImageKit/releases/download/13/appimagetool-${ARCH}.AppImage && \
chmod +x appimagetool-${ARCH}.AppImage && \
16 changes: 14 additions & 2 deletions doc/user-guide.rst
@@ -53,8 +53,20 @@ Setup
External Software Dependencies
------------------------------

SQLite3 is required to do basic statistical calculations in order to report
test metrics.
`SQLite3 <https://www.sqlite.org/index.html>`_ is required to do basic
statistical calculations in order to report test metrics.

System and database statistics collection and post-processing are handled by
`Touchstone Tools <https://gitlab.com/touchstone/touchstone-tools>`_. Install
this package in order to characterize system performance. This package is
included in the DBT-2 AppImage.

A test report can be generated with the aid of `DBT Tools
<https://github.com/osdldbt/dbttools>`_. This package is included in the DBT-2
AppImage.

`gnuplot <https://www.gnuplot.info/>`_ is used when generating charts. This
package is also included in the DBT-2 AppImage.
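
When the AppImage is not used, the dependencies described above need to be on the PATH before statistics can be collected or a report generated. The following is only a minimal sketch of such a check; the helper name missing_commands is hypothetical, and the command names are taken from the scripts and ChangeLog entries in this commit rather than from a complete list published by either project.

```shell
#!/bin/sh
# Hypothetical helper, not part of the kit: print the name of each command
# given as an argument that cannot be found on the PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
    done
}

# Command names as they appear elsewhere in this commit.
MISSING=$(missing_commands sqlite3 gnuplot ts-sysstat ts-pgsql-stat \
        ts-process-pidstat ts-plot-pidstat ts-plot-sar)
if [ -n "$MISSING" ]; then
    # Left unquoted on purpose so the names are joined on one line.
    echo "WARNING: not found on PATH:" $MISSING
fi
```

The sketch uses `command -v` because it is specified by POSIX, whereas `which` is an external utility that may be absent on minimal systems.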

Linux AppImage
--------------
1 change: 0 additions & 1 deletion src/scripts/CMakeLists.txt
@@ -41,7 +41,6 @@ install (
${CMAKE_SOURCE_DIR}/src/scripts/dbt2-post-process.r
${CMAKE_SOURCE_DIR}/src/scripts/dbt2-run-workload
${CMAKE_SOURCE_DIR}/src/scripts/dbt2-run-workload-autoscale
${CMAKE_SOURCE_DIR}/src/scripts/dbt2-sysstats
DESTINATION "bin"
)

170 changes: 61 additions & 109 deletions src/scripts/dbt2-generate-report
@@ -48,26 +48,17 @@ create_pidstat_page()

mkdir -p $DIR

PIDSTATINDEX="$(dirname "${PIDSTATCSV}")/pidstat-index.txt"

cat > ${DIR}/index.rst << __EOF__
================================================================================
Database Test 2 pidstat $TAG Charts
================================================================================
$(show_images_pidstat $TAG $DIR)
$(show_images_pidstat "${TAG}" "${DIR}" "${PIDSTATINDEX}")
__EOF__
}

create_pidstat_plots()
{
MYPIDSTATCSV=$1
METRIC=$2
COMMAND=$3
MYTAG=$4

dbt-plot pidstat -i "$MYPIDSTATCSV" -m "$METRIC" -c "$COMMAND" \
-t "$MYTAG" -o "$(dirname "$MYPIDSTATCSV")/${MYTAG}"
}

list_multiple_systems_summary()
{
TITLE="$1"
@@ -87,30 +78,20 @@ list_multiple_systems_summary()
echo " - $(show_system_links ${MYTAG}/$SNAME)"
echo " - $(show_profile_links ${MYTAG}/$SNAME)"

if [ -f "${INDIR}/${MYTAG}/${SNAME}/sar_raw.out" ]; then
if [ -f "${INDIR}/${MYTAG}/${SNAME}/sysstat/sar.datafile" ]; then
THISDIR="${INDIR}/${MYTAG}/$SNAME"
create_stat_page $MYTAG cpu $THISDIR ${THISDIR}/cpu
create_stat_page $MYTAG mem $THISDIR ${THISDIR}/mem
create_stat_page $MYTAG blockdev $THISDIR ${THISDIR}/blockdev
create_stat_page $MYTAG net $THISDIR ${THISDIR}/net
create_stat_page $MYTAG paging $THISDIR ${THISDIR}/paging
create_stat_page $MYTAG swap $THISDIR ${THISDIR}/swap
fi
done
}

list_processes()
{
if [ "$DBTLANG" = "julia" ]; then
METRICS="%CPU %usr %system %wait %MEM kB_rd/s kB_wr/s cswch/s"
METRICS="$METRICS nvcswch/s fd-nr"
else
# R automatically replaces special characters with one of more other
# chars.
METRICS="X.CPU X.usr X.system X.wait X.MEM kB_rd.s kB_wr.s cswch.s"
METRICS="$METRICS nvcswch.s fd.nr"
fi
METRICS="$METRICS VSZ RSS iodelay threads"

list_processes2 "Driver System(s):" driver
list_processes2 "Client System(s):" client
list_processes2 "Database System(s):" db
@@ -125,28 +106,25 @@ list_processes2()
echo ""
find "${INDIR}/${MYPTAG}" -name pidstat.csv | while IFS= read -r PIDSTATCSV
do
PIDSTATDIR="$(dirname "${PIDSTATCSV}")"
ts-plot-pidstat -i "${PIDSTATCSV}" -o "${PIDSTATDIR}/pidstat"

# Try not to count execution of the driver/client binaries on remote
# systems.
COUNTDRIVER=$(grep dbt2-driver "$PIDSTATCSV" | grep -c -v ssh)
COUNTCLIENT=$(grep dbt2-client "$PIDSTATCSV" | grep -c -v ssh)
SNAME=$(dirname $PIDSTATCSV)
SNAME=$(dirname "$(dirname "${PIDSTATCSV}")")
SNAME=$(basename $SNAME)

echo "* $SNAME"

if [ $COUNTDRIVER -gt 0 ]; then
PTAG="driver"
for M in $METRICS; do
create_pidstat_plots "$PIDSTATCSV" "$M" dbt2-driver $PTAG
done
create_pidstat_page $PTAG ${INDIR}/${MYPTAG}/${SNAME}/${PTAG}
echo " * \`$PTAG <${MYPTAG}/${SNAME}/${PTAG}/>\`__"
fi
if [ $COUNTCLIENT -gt 0 ]; then
PTAG="client"
for M in $METRICS; do
create_pidstat_plots "$PIDSTATCSV" "$M" dbt2-client $PTAG
done
create_pidstat_page $PTAG ${INDIR}/${MYPTAG}/${SNAME}/${PTAG}
echo " * \`$PTAG <${MYPTAG}/${SNAME}/${PTAG}/>\`__"
fi
@@ -172,13 +150,19 @@ show_images_pidstat()
{
TAG=$1
DIR=$2

CHARTS=$(cd $DIR && ls -1v pidstat-${TAG}-*.png)
for CHART in $CHARTS; do
echo ".. image:: $CHART"
echo " :target: $CHART"
echo " :width: 100%"
echo ""
PINDEX=$3

TAGPIDS=$(grep "dbt2-${TAG}" "${PINDEX}" | cut -d " " -f 1)
for P in ${TAGPIDS}; do
CHARTS=$(find "${DIR}/../sysstat/pidstat" \
-name "pidstat-${P}-*.png" | sort)
for CHART in ${CHARTS}; do
BCHART=$(basename "${CHART}")
echo ".. image:: ../sysstat/pidstat/${BCHART}"
echo " :target: ../sysstat/pidstat/${BCHART}"
echo " :width: 100%"
echo ""
done
done
}

@@ -230,12 +214,14 @@ show_system_links()
{
SYSTEM=$1

if [ -f "${INDIR}/${SYSTEM}/sar_raw.out" ]; then
echo -n "\`CPU <${SYSTEM}/cpu/>\`__ "
echo -n "\`Memory <${SYSTEM}/mem/>\`__ "
echo -n "\`Blockdev <${SYSTEM}/blockdev/>\`__ "
echo -n "\`Network <${SYSTEM}/net/>\`__ "
echo -n "\`Swap <${SYSTEM}/swap/>\`__ "
if [ -f "${INDIR}/${SYSTEM}/sysstat/sar.datafile" ]; then
LINE="\`CPU <${SYSTEM}/cpu/>\`__"
LINE="${LINE} \`Memory <${SYSTEM}/mem/>\`__"
LINE="${LINE} \`Blockdev <${SYSTEM}/blockdev/>\`__"
LINE="${LINE} \`Network <${SYSTEM}/net/>\`__"
LINE="${LINE} \`Paging <${SYSTEM}/paging/>\`__"
LINE="${LINE} \`Swap <${SYSTEM}/swap/>\`__"
echo "${LINE}"
else
echo ".."
fi
@@ -273,9 +259,9 @@ if [ "x$DBNAME" = "x" ]; then
fi

echo "Processing pidstat files..."
for PIDSTATFILE in $(find $INDIR -name pidstat.txt); do
DIRNAME="$(dirname $PIDSTATFILE)"
(cd "$DIRNAME" && dbt-process-pidstat pidstat.txt)
find "${INDIR}" -type f -name pidstat.txt | while IFS= read -r PIDSTATFILE; do
PDIR=$(dirname "${PIDSTATFILE}")
ts-process-pidstat -i "${PIDSTATFILE}"
done

echo "Generating transaction distribution charts..."
@@ -296,46 +282,9 @@ find ${INDIR}/driver -name 'mix*.log' | \
xargs dbt-plot transaction-distribution "Stock Level" s "${OUTDIR}/txn" 5 \
|| warning "Could not create Stock Level response time distribution chart"

mkdir -p ${OUTDIR}/db/sar || exit 1

echo "Generating sar io charts..."
for SARBLOCKDEVFILE in `find $INDIR -name sar-blockdev.csv`; do
DIRNAME="$(dirname $SARBLOCKDEVFILE)/sar"
mkdir -p $DIRNAME
dbt-plot sar-blockdev "$SARBLOCKDEVFILE" "$DIRNAME" \
> /dev/null 2>&1 || warning "Could not create sar blockdev charts"
done

echo "Generating memory charts..."
for SARMEMFILE in `find $INDIR -name sar-mem.csv`; do
DIRNAME="$(dirname $SARMEMFILE)/sar"
mkdir -p $DIRNAME
dbt-plot sar-mem -i "$SARMEMFILE" -o "$DIRNAME" > /dev/null 2>&1 \
|| warning "Could not create sar mem charts from $SARMEMFILE"
done

echo "Generating processor utilization charts..."
for SARCPUFILE in `find $INDIR -name sar-cpu.csv`; do
DIRNAME="$(dirname $SARCPUFILE)/sar"
mkdir -p $DIRNAME
dbt-plot sar-cpu "$SARCPUFILE" "$DIRNAME" > /dev/null 2>&1 \
|| warning "Could not create sar cpu charts from $SARCPUFILE"
done

echo "Generating swap charts..."
for SARSWAPFILE in `find $INDIR -name sar-swap.csv`; do
DIRNAME="$(dirname $SARSWAPFILE)/sar"
mkdir -p $DIRNAME
dbt-plot sar-swap "$SARSWAPFILE" "$DIRNAME" > /dev/null 2>&1 \
|| warning "Could not create sar swap charts from $SARSWAPFILE"
done

echo "Generating network charts..."
for SARNETFILE in `find $INDIR -name sar-net.csv`; do
DIRNAME="$(dirname $SARNETFILE)/sar"
mkdir -p $DIRNAME
dbt-plot sar-net "$SARNETFILE" "$DIRNAME" > /dev/null 2>&1 \
|| warning "Could not create sar net charts from $SARNETFILE"
echo "Generating sar charts..."
find "${INDIR}" -type d -name sysstat | while IFS= read -r SARDIR; do
ts-plot-sar -i "${SARDIR}/sar"
done

echo "Generating transaction rate charts..."
@@ -368,7 +317,7 @@ Summary
* Date: $(head -n 1 "${INDIR}/readme.txt")
* Scale Factor: $(grep "Database Scale Factor:" "${INDIR}/readme.txt" | cut -d ":" -f 2 | xargs)
* DBMS: $(head -n 1 "$(find "${INDIR}" -name readme-dbms.txt)")
* DBMS: $(head -n 1 "$(find "${INDIR}" -name "readme-${DBMS}.txt")")
$(cat $SUMMARY)
@@ -429,6 +378,8 @@ __EOF__
# smarter such that we should actually do some combination of generating
# formats requested or throw codes depending on what actually failed.

GENERATE_PDF=0

RST2HTML5=""
which rst2html5.py > /dev/null 2>&1
if [ $? -eq 0 ]; then
@@ -455,28 +406,29 @@ if [ ! "x${RST2HTML5}" = "x" ]; then
find "${OUTDIR}" -maxdepth 1 -name '*.html'
echo ""

# A pdf could be produced other ways, but I believe pandoc produces the most
# minimally styled LaTeX looking document, as opposed to using rst2latex,
# with default settings.
which pandoc > /dev/null 2>&1
if [ $? -eq 0 ]; then
(cd $OUTDIR && \
pandoc -s report.html -f html -t pdf -o report.pdf \
2> /dev/null)
for HTML in $(find $OUTDIR -name "*.html"); do
HTMLDIR=$(dirname "$HTML")
HTMLNAME=$(basename "$HTML")
NAME="${HTMLNAME%.*}"
pandoc -s "$HTML" -f html -t pdf \
-o "${HTMLDIR}/${NAME}.pdf" 2> /dev/null
done
if [ $GENERATE_PDF -eq 1 ]; then
# A pdf could be produced other ways, but I believe pandoc produces the
# most minimally styled LaTeX looking document, as opposed to using
# rst2latex, with default settings.
if which pandoc > /dev/null 2>&1; then
(cd "${OUTDIR}" && \
pandoc -s report.html -f html -t pdf -o report.pdf \
2> /dev/null)
find "${OUTDIR}" -name "*.html" | while IFS= read -r HTML; do
HTMLDIR=$(dirname "$HTML")
HTMLNAME=$(basename "$HTML")
NAME="${HTMLNAME%.*}"
pandoc -s "$HTML" -f html -t pdf \
-o "${HTMLDIR}/${NAME}.pdf" 2> /dev/null
done

echo "Generated top level PDF reports:"
find "${OUTDIR}" -maxdepth 1 -name '*.pdf'
echo ""
else
echo "WARNING: pandoc required to generate pdf report"
exit 0
echo "Generated top level PDF reports:"
find "${OUTDIR}" -maxdepth 1 -name '*.pdf'
echo ""
else
echo "WARNING: pandoc required to generate pdf report"
exit 0
fi
fi
else
echo "WARNING: rst2html5 required to generate html report"