Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RTL Simulation Enhancements #562

Closed
wants to merge 15 commits into from
2 changes: 1 addition & 1 deletion .circleci/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ version: 2.1
parameters:
tools-cache-version:
type: string
default: "v5"
default: "v6"
ssteffl marked this conversation as resolved.
Show resolved Hide resolved

# default execution env.s
executors:
Expand Down
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ target
*.stamp
*.vcd
*.swp
*.swo
*.log
*#
*~
Expand Down
74 changes: 74 additions & 0 deletions bin/enable_printfs
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
#!/usr/bin/env python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is useful, but I'd argue this is also out of scope for the PR. Is this worked into the build flow at all? That would be a nice feature to add.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think its in the scope of this pr. this is my "verilator enhancements" pr. i'm removed the build-script stuff

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this script is a hack. this could be done in a firrtl pass, and hopefully the firrtl-pass would not be too slow compared to this hacky solution.

however, the firrtl pass does not exist, so we must live with this hack for now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR title is Multithread verilator. This is what goes into the release notes, merge commit, etc. If this is a Verilator enhancement PR you need to adjust the title/description.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with John here. I think the multi-threading work will take significantly longer to get in/reviewed, whereas the printf stuff would be a quick merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR scope adjusted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like to see this worked into the build flow in this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the other hand, it is nice to not have to invoke the JVM and run all the firrtl passes again just to change which printfs get displayed. So I can see a case for having the printf filtering happen after the make verilog target completes.

Although I admit again, that the current solution is a hack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you need to invoke the JVM? Just have another intermediate make target that runs this script

ssteffl marked this conversation as resolved.
Show resolved Hide resolved
#=============================================================================
# this enables the PRINTF_COND around points of interest
#=============================================================================

import sys
import re
from textwrap import dedent as dd

#=============================================================================
# subroutines
#=============================================================================
def print_usage():
print(dd("""
enable_printfs <in_file> <out_file> [pattern1] [pattern2] ...
-------------------------------------------------------------
Selectively enables verbose printf() in your verilog. Right
now, you either get no verbose logging, or 100% verbose logging, but many
times, you might want to just enable logging in a specific module. THIS
SCRIPT IS A TEMPORARY HACK: post-processes the verilog file to remove the
`ifdef PRINTF_COND wrappers around the specific printf()s you care about.
You must manually specify python-compatible regex patterns on the cmdline
to match() the strings in the printf you care about. Multiple patterns are
treated as logical OR, not AND.
"""))
exit(1)

#=============================================================================
if __name__ == "__main__":
if len(sys.argv) < 3:
print_usage()

# parse inputs
in_file = sys.argv[1]
out_file = sys.argv[2]
trigger = re.compile(".*`ifdef PRINTF_COND.*")
targets = []
for idx in range(3, len(sys.argv)):
if len(sys.argv[idx]) > 0:
targets.append(re.compile(sys.argv[idx]))

# read input lines
in_lines = []
with open(in_file, "r") as f:
in_lines = f.readlines()

# enable PRINTF_COND lines around points of interest
enables = 0
out_lines = []
idx = 0
idx_end = len(in_lines)
while(idx < idx_end):
if trigger.match(in_lines[idx]):
matches_target = False
for target in targets:
if target.match(in_lines[idx+4]):
matches_target = True
break
if matches_target:
enables += 1
out_lines += in_lines[idx+3:idx+6]
else:
out_lines += in_lines[idx:idx+9]
idx += 9
else:
out_lines.append(in_lines[idx])
idx += 1

# write to output file
with open(out_file, "w") as f:
f.writelines(out_lines)

print("INFO: enabled {} printfs in {}".format(enables, out_file))

67 changes: 67 additions & 0 deletions bin/numa_prefix
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
#!/usr/bin/env perl
ssteffl marked this conversation as resolved.
Show resolved Hide resolved

#============================================================================
# - really simple script, which just prints out the numactl cmd to
# prefix before your actual command. it determines this based on free
# memory size attached to every node.
# - when you run this on a machine without `numactl`, the output is empty,
# so `$(numa_prefix) <cmd> <args>` turns in to `<cmd> <args>`.
# - when the machine has `numactl` installed, regardless of the socket-count
# on the machine, the resulting command is:
# `numactl -m <socket> -C <core-id list> -- <cmd> <args>`
# - example output from `numactl -H` on a 2 socket machine:
# available: 2 nodes (0-1)
# node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
# node 0 size: 131026 MB
# node 0 free: 7934 MB
# node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
# node 1 size: 65536 MB
# node 1 free: 429 MB
# node distances:
# node 0 1
# 0: 10 20
# 1: 20 10
#============================================================================

use strict;
use warnings;

my $path = `which numactl`;
if(length($path) > 0) {
my ($head_line, @rest) = map {chomp; $_} `numactl -H`;

if($head_line =~ /available: (\d+) nodes/) {
my $node_count = $1;
my $best_node_id = undef
my $best_cpus = undef;
my $best_free_size = undef;

# loop through available nodes, selecting the node with the most free mem
foreach my $num (1..$node_count) {
my $cpus_line = shift(@rest);
my $mem_size_line = shift(@rest);
my $mem_free_line = shift(@rest);

if($cpus_line =~ /node (\d+) cpus: (\d.*\d)$/) {
my ($node_id, $cpus) = ($1, $2);
$cpus =~ s/\s+/,/g;

if($mem_free_line =~ /node $node_id free: (\d+) \S+$/) {
my $free_size = $1;
if(!defined($best_free_size) || ($free_size > $best_free_size)) {
$best_node_id = $node_id;
$best_cpus = $cpus;
$best_free_size = $free_size;
}
} else {
die("malformed mem-free line: $mem_free_line\n");
}
} else {
die("malformed cpus line: $cpus_line\n");
}
}
print("numactl -m $best_node_id -C $best_cpus --");
} else {
die("malformed head line: $head_line\n");
}
}
105 changes: 86 additions & 19 deletions common.mk
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,43 @@
SHELL=/bin/bash

#########################################################################################
# extra make variables/rules from subprojects
#
# EXTRA_GENERATOR_REQS - requirements needed for the main generator
# EXTRA_SIM_FLAGS - runtime simulation flags
# EXTRA_SIM_CC_FLAGS - cc flags for simulators
# EXTRA_SIM_SOURCES - simulation sources needed for simulator
# EXTRA_SIM_REQS - requirements to build the simulator
# specify user-interface variables
#########################################################################################
HELP_COMPILATION_VARIABLES += \
" EXTRA_GENERATOR_REQS = requirements needed for the main generator" \
" EXTRA_SIM_CFLAGS = CFLAGS for building simulators" \
" EXTRA_SIM_CXXFLAGS = CXXFLAGS for building simulators" \
" EXTRA_SIM_LDFLAGS = LDFLAGS for building simulators" \
" EXTRA_SIM_SOURCES = simulation sources needed for simulator" \
" EXTRA_SIM_REQS = requirements to build the simulator"

EXTRA_GENERATOR_REQS ?=
EXTRA_SIM_CXXFLAGS ?=
EXTRA_SIM_CFLAGS ?=
EXTRA_SIM_LDFLAGS ?=
EXTRA_SIM_SOURCES ?=
EXTRA_SIM_REQS ?=

#----------------------------------------------------------------------------
HELP_SIMULATION_VARIABLES += \
" EXTRA_SIM_FLAGS = runtime simulation flags (passed within +permissive)" \
" NUMACTL = set to '1' to wrap simulator in the appropriate numactl command"
ssteffl marked this conversation as resolved.
Show resolved Hide resolved

EXTRA_SIM_FLAGS ?=
NUMACTL ?= 0

NUMA_PREFIX = $(if $(filter $(NUMACTL),0),,$(shell numa_prefix))

#----------------------------------------------------------------------------
HELP_COMMANDS += \
" run-binary = run [./$(shell basename $(sim))] and output instructions" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not immediately obvious to me what "output instructions" means. I might suggest "log instructions to a file"

" run-binary-fast = run [./$(shell basename $(sim))] and don't output instructions" \
" run-binary-debug = run [./$(shell basename $(sim_debug))] and output instructions and waveform" \
" run-binary-debug-fast = run [./$(shell basename $(sim_debug))] and don't output instructions or waveform" \
" verilog = generate intermediate verilogs from chisel elaboration and firrtl passes"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is incomplete, which might actually do more harm than good. It's also probably important to note that there are Make targets that come from the included .d files (like the run-*-tests targets).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of run-binary-debug-fast?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not trying to list all 1000 make targets, just the ones that non-power-users find useful. i can add the run-*-tests. if any other generally useful targets are missing, please enumerate them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i use run-binary-debug-fast all the time to run the debug binary without generating the waveform or instruction trace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think run-binary-debug-fast makes sense. It should just be run-binary-fast if you don't want the waveform or instruction trace.


#########################################################################################
# include additional subproject make fragments
#########################################################################################
include $(base_dir)/generators/ariane/ariane.mk
include $(base_dir)/generators/tracegen/tracegen.mk
Expand Down Expand Up @@ -128,21 +158,58 @@ verilog: $(sim_vsrcs)
#########################################################################################
# helper rules to run simulations
#########################################################################################
.PHONY: run-binary run-binary-fast run-binary-debug run-fast
run-binary: $(sim)
(set -o pipefail && $(sim) $(PERMISSIVE_ON) $(SIM_FLAGS) $(EXTRA_SIM_FLAGS) $(VERBOSE_FLAGS) $(PERMISSIVE_OFF) $(BINARY) </dev/null 2> >(spike-dasm > $(sim_out_name).out) | tee $(sim_out_name).log)
.PHONY: run-binary run-binary-fast
.PHONY: run-binary-debug run-binary-debug-fast
.PHONY: run-fast

#########################################################################################
# helper rules to run simulator as fast as possible
#########################################################################################
# run normal binary with hardware-logged insn dissassembly
run-binary: $(sim)
(set -o pipefail && $(NUMA_PREFIX) $(sim) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this is less readable than it was before, but even if I didn't, style changes make PRs harder to review, as it's less obvious where functionality has and has not changed. I'm not just saying this to be stubborn; you're taking more of your reviewers' time by doing this, and it's annoying that I keep having to repeat this.

$(PERMISSIVE_ON) \
$(SIM_FLAGS) \
$(EXTRA_SIM_FLAGS) \
$(VERBOSE_FLAGS) \
$(PERMISSIVE_OFF) \
$(BINARY) \
</dev/null \
2> >(spike-dasm > $(sim_out_name).out) \
| tee $(sim_out_name).log)

# run simulator as fast as possible (no insn disassembly)
run-binary-fast: $(sim)
(set -o pipefail && $(sim) $(PERMISSIVE_ON) $(SIM_FLAGS) $(EXTRA_SIM_FLAGS) $(PERMISSIVE_OFF) $(BINARY) </dev/null | tee $(sim_out_name).log)

#########################################################################################
# helper rules to run simulator with as much debug info as possible
#########################################################################################
(set -o pipefail && $(NUMA_PREFIX) $(sim) \
$(PERMISSIVE_ON) \
$(SIM_FLAGS) \
$(EXTRA_SIM_FLAGS) \
$(PERMISSIVE_OFF) \
$(BINARY) \
</dev/null \
| tee $(sim_out_name).log)

# run simulator with as much debug info as possible
run-binary-debug: $(sim_debug)
(set -o pipefail && $(sim_debug) $(PERMISSIVE_ON) $(SIM_FLAGS) $(EXTRA_SIM_FLAGS) $(VERBOSE_FLAGS) $(WAVEFORM_FLAG) $(PERMISSIVE_OFF) $(BINARY) </dev/null 2> >(spike-dasm > $(sim_out_name).out) | tee $(sim_out_name).log)
(set -o pipefail && $(NUMA_PREFIX) $(sim_debug) \
$(PERMISSIVE_ON) \
$(SIM_FLAGS) \
$(EXTRA_SIM_FLAGS) \
$(VERBOSE_FLAGS) \
$(WAVEFORM_FLAG) \
$(PERMISSIVE_OFF) \
$(BINARY) \
</dev/null \
2> >(spike-dasm > $(sim_out_name).out) \
| tee $(sim_out_name).log)

# run debug simulator as fast as possible (no insn disassembly or waveform)
run-binary-debug-fast: $(sim_debug)
(set -o pipefail && $(NUMA_PREFIX) $(sim_debug) \
$(PERMISSIVE_ON) \
$(SIM_FLAGS) \
$(EXTRA_SIM_FLAGS) \
$(PERMISSIVE_OFF) \
$(BINARY) \
</dev/null \
| tee $(sim_out_name).log)

run-fast: run-asm-tests-fast run-bmark-tests-fast

Expand Down
32 changes: 30 additions & 2 deletions docs/Simulation/Software-RTL-Simulation.rst
Original file line number Diff line number Diff line change
Expand Up @@ -40,8 +40,7 @@ For a proprietry VCS simulation, enter the ``sims/vcs`` directory
# Enter VCS directory
cd sims/vcs


.. _sim-default:
.. _sw-sim-help:

Simulating The Default Example
-------------------------------
Expand Down Expand Up @@ -76,6 +75,22 @@ For example:

.. _sw-sim-custom:

Makefile Variables and Commands
-------------------------------
You can get a list of useful Makefile variables and commands available from the Verilator or VCS directories. simply run ``make help``:

.. code-block:: shell

# Enter Verilator directory
cd sims/verilator
make help

# Enter VCS directory
cd sims/vcs
make help

.. _sim-default:

Simulating A Custom Project
-------------------------------

Expand Down Expand Up @@ -138,3 +153,16 @@ An open-source vcd-capable waveform viewer is `GTKWave <http://gtkwave.sourcefor
For a VCS simulation, this will generate a vpd file (this is a proprietary waveform representation format used by Synopsys) that can be loaded to vpd-supported waveform viewers.
If you have Synopsys licenses, we recommend using the DVE waveform viewer.

.. _sw-sim-verilator-opts:

Additional Verilator Options
-------------------------------
When building the verilator simulator there are several additional options:

.. code-block:: shell

make VERILATOR_THREADS=8 ENABLE_PRINTF_PATTERN='<python-regex>'

The ``VERILATOR_THREADS=<num>`` option enables the compiled verilator simulator to use ``<num>`` parallel threads. on a multi-socket machine, you will want to make sure all threads are on the same socket by using ``numactl``. You can also just use the ``numa_prefix`` wrapper, which is a simple wrapper around ``numactl`` that runs your verilated simulator like this: ``$(numa_prefix) ./simulator-<name> <simulator-args>``.

The ``ENABLE_PRINTF_PATTERN`` option selectively enables hardware printf's based on a text match of the format string. For example, if you wanted to enable the rocket-core's instruction log, and nothing else, you could run ``make VERILATOR_THREADS=8 ENABLE_PRINTF_PATTERN='.*DASM\(%x\)'``. Don't forget to start the line with a ``.*``, since the pattern matches start at the beginning of the line.
Loading