RTL Simulation Enhancements #562

ssteffl · 2020-05-22T00:14:40Z

Related issue:

Type of change: new features

Impact: simulation infrastructure

Release Notes

Enabled verilator multithreading, profiling, fst waveforms
Added bin/numa_prefix wrapper for multi-socket multithreaded verilator
Enabled selective hardware printfs in verilator.
Created make help command with variable and command usage
Changed $DIR evaluation strategy in scripts/build-toolchains.sh and env.sh

bin/numa_prefix

jwright6323

I can't comment much on the actual verilator stuff, but this looks like it is based on an older commit, so we should make sure we capture all the changes back into this. Some general questions/style nits too.

sims/verilator/Makefile

scripts/build-util.sh

generators/utilities/src/main/resources/csrc/emulator.cc

scripts/build-isa-sim.sh

scripts/build-toolchains.sh

sims/verilator/Makefile

colinschmidt

There are a lot of changes here that seem to regress functionality we currently have.

generators/utilities/src/main/resources/csrc/emulator.cc

sims/verilator/Makefile

Co-authored-by: John Wright

scripts/build-toolchains.sh

abejgonzalez · 2020-05-22T02:16:00Z

Do you know if Verilator multi-threading works for BOOM targets? Last time I tried doing it, it failed.

ssteffl · 2020-05-22T02:23:24Z

i'm pretty sure it failed for this reason: #562 (comment)

jwright6323 · 2020-05-22T02:23:48Z

A few other questions/comments in addition to @abejgonzalez's.

Does this run in multithreaded mode by default now, or do we need to set flags (documentation looks like flags)?
If the latter, we probably need to add CI for it. If the former, that needs to go in the documentation (and probably needs to be discussed in the chipyard meeting).
Have you benchmarked this? Does it actually improve performance? I was under the impression that these types of simulations don't parallelize well, even in proprietary simulators, but if it does that would be good to note.
Is VL8 a commonly used prefix for verilator? I don't necessarily think you need to change it back, but I find it a bit unintuitive. Maybe others can comment.

common.mk

ssteffl · 2020-05-22T02:40:48Z

@jwright6323

default is 1 thread
i've already updated the docs, and we can surely add some MT test case to CI
yes. i get a 6.7x speedup on a bwrc using 20 cores (25 kHz vs 4 kHz)
i like it, since you can use 3 letters for the digital simulators: VCS, IVL, VL8

bin/numa_prefix

Co-authored-by: Abraham Gonzalez

jwright6323 · 2020-05-22T02:44:30Z

yes. i get a 6.7x speedup on a bwrc using 20 cores (25 kHz vs 4 kHz)

@abejgonzalez how many cores does ferry have? Will this speed up CI?

ssteffl · 2020-05-22T02:48:15Z

yes. i get a 6.7x speedup on a bwrc using 20 cores (25 kHz vs 4 kHz)

@abejgonzalez how many cores does ferry have? Will this speed up CI?

@jwright6323 keep in mind that compilation for me took 40% longer when compiling for 20 threads on my test case.

but verilator compilation is only < 20% of the total latency in the chisel->verilated flow anyways, so it's not really that big of a compile-time hit
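
As a rough check of the claim above: if Verilator compilation is at most 20% of the chisel->verilated flow and that step becomes 40% slower, the whole flow grows by at most about 8%. A minimal sketch of that arithmetic, using only the upper-bound figures quoted in this thread (the function name is illustrative, not something from the PR):

```python
def total_flow_slowdown(compile_fraction, compile_slowdown):
    """Relative increase in end-to-end build time when only the
    Verilator compile step gets slower.

    compile_fraction: share of total build time spent compiling (0..1)
    compile_slowdown: relative slowdown of that step (0.40 means 40% longer)
    """
    return compile_fraction * compile_slowdown


# Figures quoted above: compile is < 20% of the flow and took 40% longer.
overhead = total_flow_slowdown(0.20, 0.40)
print(f"upper bound on total build-time increase: {overhead:.0%}")
```

Because 20% is an upper bound on the compile share, the real end-to-end overhead would be smaller than this figure.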

sims/verilator/Makefile

jwright6323 · 2020-05-22T02:52:20Z

bin/enable_printfs

@@ -0,0 +1,74 @@
+#!/usr/bin/env python


This file is useful, but I'd argue this is also out of scope for the PR. Is this worked into the build flow at all? That would be a nice feature to add.

i think it's in the scope of this pr. this is my "verilator enhancements" pr. i've removed the build-script stuff

this script is a hack. this could be done in a firrtl pass, and hopefully the firrtl-pass would not be too slow compared to this hacky solution.

however, the firrtl pass does not exist, so we must live with this hack for now

The PR title is Multithread verilator. This is what goes into the release notes, merge commit, etc. If this is a Verilator enhancement PR you need to adjust the title/description.

I tend to agree with John here. I think the multi-threading work will take significantly longer to get in/reviewed, whereas the printf stuff would be a quick merge.

PR scope adjusted.

I'd still like to see this worked into the build flow in this PR

on the other hand, it is nice to not have to invoke the JVM and run all the firrtl passes again just to change which printfs get displayed. So I can see a case for having the printf filtering happen after the make verilog target completes.

Although I admit again, that the current solution is a hack.

Why would you need to invoke the JVM? Just have another intermediate make target that runs this script
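
For readers following the printf discussion: the contents of bin/enable_printfs are not shown in this thread, so the following is only a sketch of the general idea of filtering printfs on the already-generated Verilog. It assumes each hardware printf appears as a single-line $fwrite(...) statement, and the function name and sample input are hypothetical.

```python
import re

# Assumption: each hardware printf is emitted as a one-line $fwrite(...) call.
FWRITE = re.compile(r'^(\s*)(\$fwrite\(.*\);)\s*$')


def filter_printfs(verilog_text, pattern):
    """Comment out every single-line $fwrite that does not match `pattern`.

    All other lines, and the $fwrite lines that do match, are left untouched.
    """
    keep = re.compile(pattern)
    out = []
    for line in verilog_text.splitlines():
        m = FWRITE.match(line)
        if m and not keep.search(line):
            # Preserve indentation and disable the statement with a comment.
            out.append(f"{m.group(1)}// {m.group(2)}")
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical input, for illustration only.
example = '''\
always @(posedge clock) begin
  $fwrite(32'h80000002, "commit pc=%x\\n", pc);
  $fwrite(32'h80000002, "cache miss addr=%x\\n", addr);
end'''

print(filter_printfs(example, r"commit"))
```

Since a filter like this only rewrites text, it could run as a separate step after the Verilog is generated, without re-running the JVM or the FIRRTL passes.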

bin/enable_printfs

sims/verilator/Makefile

generators/utilities/src/main/resources/csrc/emulator.cc

sims/verilator/Makefile

generators/utilities/src/main/resources/csrc/emulator.cc

ssteffl · 2020-05-22T19:03:58Z

@colinschmidt i agree that multithreading should not be blindly applied in all scenarios. and it seems less useful when running a large batch of tiny simulations.

a 6.7x speedup is assuming you're only running 1 simulation at a time. i never did a performance comparison of full machine utilization with 1-thread per task in parallel, versus all-cores per task serially, but it probably won't be a 6.7x speed difference.
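
To make the batch-versus-single-run trade-off concrete, here is a back-of-the-envelope sketch using the rounded rates quoted earlier in the thread (4 kHz single-threaded, 25 kHz on 20 cores). It assumes independent single-threaded simulations scale perfectly across cores, which nobody in this thread measured and which memory-bandwidth and cache contention would likely reduce. The function name is illustrative.

```python
def batch_throughput_khz(cores, threads_per_sim, sim_rate_khz):
    """Aggregate simulated kHz across a batch when the machine is kept full.

    Assumes concurrent simulations do not slow each other down, which is
    optimistic on a real multi-socket machine.
    """
    concurrent_sims = cores // threads_per_sim
    return concurrent_sims * sim_rate_khz


CORES = 20
single_threaded = batch_throughput_khz(CORES, 1, 4.0)    # 20 sims at 4 kHz each
multi_threaded = batch_throughput_khz(CORES, 20, 25.0)   # 1 sim at 25 kHz

print(f"1 thread per sim, 20 in parallel: {single_threaded} kHz aggregate")
print(f"20 threads per sim, run serially: {multi_threaded} kHz aggregate")
```

Under that idealized assumption, a large batch of independent simulations gets more total throughput from one thread per task, while multithreading mainly shortens the wall-clock time of a single long simulation.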

scripts/build-toolchains.sh

sims/verilator/Makefile

ssteffl

incorporated feedback

ssteffl

finished review

common.mk

sims/verilator/Makefile

abejgonzalez

I would prefer to have this in the next Chipyard release instead of the most recent one so that we have more time to dogfood this and update formatting/VCS simulation. So that means wait to merge this for a week or two.

sims/verilator/Makefile

variables.mk

jwright6323

This is still kind of all over the place, and hard to review. I think the build system changes need to be in with this, and VCS needs to mirror the verilator changes (e.g. printf stuff, etc), or they will diverge. We want to try to keep them as similar as possible, or features get dropped and it causes a bigger maintenance burden. I haven't really reviewed the verilator Makefile changes, because there are too many formatting changes to really tell what's different.

.circleci/config.yml

jwright6323 · 2020-05-24T04:46:29Z

bin/enable_printfs

@@ -0,0 +1,74 @@
+#!/usr/bin/env python


I'd still like to see this worked into the build flow in this PR

jwright6323 · 2020-05-24T04:55:37Z

common.mk

-#########################################################################################
+# run normal binary with hardware-logged insn dissassembly
+run-binary: $(sim)
+	(set -o pipefail && $(NUMA_PREFIX) $(sim) \


I actually think this is less readable than it was before, but even if I didn't, style changes make PRs harder to review, as it's less obvious where functionality has and has not changed. I'm not just saying this to be stubborn; you're taking more of your reviewers' time by doing this, and it's annoying that I keep having to repeat this.

common.mk

jwright6323 · 2020-05-24T04:59:12Z

common.mk

+
+#----------------------------------------------------------------------------
+HELP_COMMANDS += \
+"   run-binary             = run [./$(shell basename $(sim))] and output instructions" \


It's not immediately obvious to me what "output instructions" means. I might suggest "log instructions to a file"

jwright6323 · 2020-05-24T05:14:22Z

sims/verilator/Makefile

 sim_prefix = simulator
 sim = $(sim_dir)/$(sim_prefix)-$(MODEL_PACKAGE)-$(CONFIG)
 sim_debug = $(sim_dir)/$(sim_prefix)-$(MODEL_PACKAGE)-$(CONFIG)-debug

+# verilator doesn't use +permissive, but common.mk expects the simulator's


This is incorrect

colin's PR was going to fix this.

This statement is not true. +permissive is a valid HTIF argument, and is currently able to be passed to verilator if you set it so, so this comment is wrong, and there's no reason to add it.

sims/verilator/Makefile

jwright6323 · 2020-05-24T05:18:38Z

sims/verilator/Makefile

 	--unroll-count 256 \
-	-Werror-PINMISSING \
-	-Werror-IMPLICIT \


Are these important?

so... why are they removed if these are important?

jwright6323 · 2020-05-24T05:22:20Z

sims/verilator/Makefile

@@ -1,112 +1,207 @@
-#########################################################################################
+#############################################################################


There are just too many formatting changes to give this file the detailed review it needs. Many of these flags are not exhaustively tested by CI, so we need to be pretty careful about what we change.

the solution sounds like beefing up the regression tests. i will do that so we can be confident that formatting changes are not causing regressions.

No, the right way to do this is to separate the refactor into multiple PRs.

jwright6323 · 2020-05-24T06:51:42Z

@ssteffl Please stop marking conversations as resolved, it's hard to find them to reply.

abejgonzalez · 2020-08-18T03:33:48Z

I have just started splitting this work into multiple PRs to get individual feedback / fast-track portions of this PR. #649 #650.

abejgonzalez · 2020-08-21T20:43:20Z

Closing since this was split into #655 #654 #650

Sam Steffl added 4 commits May 21, 2020 14:16

multithreading verilator

1012e15

added docs, verified ENABLE_PRINTF_PATTERN

57a4b52

reverted gemmini change

d2de2e7

cleaned up whitespace in verilator Makefile

40d4d47

jwright6323 reviewed May 22, 2020

bin/numa_prefix

added gemmini branch for CISC isa

376656a

jwright6323 reviewed May 22, 2020

colinschmidt requested changes May 22, 2020

Update sims/verilator/Makefile

69b455f

Co-authored-by: John Wright

a0u reviewed May 22, 2020

scripts/build-toolchains.sh (outdated)