add llama perf test case #555
Conversation
SharzyL commented May 3, 2024
- [nix] use single-float ABI compiler-rt
- [tests] fix t1.ld DDR map
- [emurt] support uart/print, export header (see the sketch below)
- [cases] add perf.llama
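A minimal sketch of what the emurt uart/print support might look like, assuming a memory-mapped UART; the base address, register offsets, and ready bit below are hypothetical, not the actual emurt layout:

#include <stdint.h>

/* Hypothetical MMIO layout -- the real emurt UART registers may differ. */
#define UART_BASE     0x10000000u
#define UART_TX       (*(volatile uint8_t *)(UART_BASE + 0x0))
#define UART_STATUS   (*(volatile uint8_t *)(UART_BASE + 0x5))
#define UART_TX_READY 0x20u

static void uart_putc(char c) {
  /* Busy-wait until the transmitter can accept another byte. */
  while (!(UART_STATUS & UART_TX_READY))
    ;
  UART_TX = (uint8_t)c;
}

void uart_puts(const char *s) {
  while (*s)
    uart_putc(*s++);
}

A print wrapper exported from the emurt header could then forward to uart_puts.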
checkpoint_bin = fetchurl {
  url = "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin";
  sha256 = "sha256-zVkGRNljhnorbloRB/UfrWY8QdecFJ++y7sflfqB9Jo=";
};

tokenizer_bin = fetchurl {
  url = "https://github.com/karpathy/llama2.c/raw/b3c4b6c3c4bbff42e5211293280307019368ccb5/tokenizer.bin";
  sha256 = "sha256-UKUu+CLunoPeXOnQvgoCWnc9AZQ39Ytf+dyvsGPs42E=";
};
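(These are fixed-output fetches: Nix verifies the downloaded files against the pinned sha256, so the model checkpoint and tokenizer are reproducible. If either upstream file changes, the hash can be regenerated, e.g. with nix-prefetch-url.)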
oh my
tests/perf/llama/run.c (outdated)
#if defined _WIN32
#include "win.h"
#else
remove it?
Just following the upstream, but it's totally OK to remove:
https://github.com/karpathy/llama2.c/blob/master/run.c#L10-L12
ssize_t file_size; // size of the checkpoint file in bytes
} Transformer;

void malloc_run_state(RunState* s, Config* p) {
We may need to allocate them to a specific memory (SRAM) range in the future.
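A possible sketch of that direction, assuming the linker script (t1.ld) exports symbols delimiting the SRAM range; the symbol names and bump allocator here are hypothetical, not part of this PR:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical linker-script symbols marking the SRAM region. */
extern char __sram_start[], __sram_end[];
static char *sram_cursor = NULL;

/* Simple bump allocator over SRAM; returns NULL once the region is exhausted. */
static void *sram_alloc(size_t n) {
  if (sram_cursor == NULL)
    sram_cursor = __sram_start;
  uintptr_t p = ((uintptr_t)sram_cursor + 15u) & ~(uintptr_t)15u; /* 16-byte align */
  if (p + n > (uintptr_t)__sram_end)
    return NULL;
  sram_cursor = (char *)(p + n);
  return (void *)p;
}

malloc_run_state could then call sram_alloc instead of calloc for the activation buffers it sizes from Config.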
nice!
Please add documentation on how to run the test case.
The RTL still has some bugs running llama; they will be fixed in following PRs.
llama still hangs on the Spike side. In order not to block others' commits, let's get this into the master branch and debug it later.