Assertion failure during transpiling in demo_chameleon #95

Open
wmdi opened this issue Oct 4, 2024 · 0 comments
wmdi (Collaborator) commented Oct 4, 2024

I ran into the following assertion failure on the catalyst cluster:

$ python demo/demo_chameleon.py 
========== Search Configuration ==========
  max num threadblock graph op: 9
  max num kernel_graph op: 7
  max num threadblock graphs: 1
  max num threadblock graph inputs: 3
  max num threadblock graph outputs: 2
  search_thread: 8
  imaps to explore:
  imap combs to explore:
  omaps to explore:
  grid dims to explore:
  block dims to explore:
  fmaps to explore:
  franges to explore:4 16 64 
[Search] States: 901, Random tests: 1, Valid mugraphs: 0, Time: 4.766257
[Search] First step finished. Time elapsed: 4.766711sec
[Search] States: 743301, Random tests: 4307, Valid mugraphs: 16, Time: 172.965338
[Search] Second step finished. Time elapsed: 172.990405sec
[Search] Total states explored: 743323
[Search] Random tests performed: 4307
[Serach] Valid kernel graphs explored: 16
Transpiling muGraph 0...
Profiling muGraph 0 performance (ms) = 0.06813900756835937
Transpiling muGraph 1...
Profiling muGraph 1 performance (ms) = 0.06811238098144531
Transpiling muGraph 2...
python: /home/mengdiwu/mirage/src/threadblock/element_binary.cc:51: mirage::threadblock::STensor mirage::threadblock::Graph::elementbinary(const mirage::threadblock::STensor&, const mirage::threadblock::STensor&, mirage::type::TBOperatorType): Assertion `op != nullptr' failed.
Aborted (core dumped)

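A potential cause is an inconsistency between the memory check in the search and the one in the transpiler: we modify the graph before transpilation, which may increase its memory usage and cause an OOM failure. A graph that passed the check during search would then fail it when rebuilt, which would be consistent with the `op != nullptr` assertion failing.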
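
To illustrate the suspected mechanism, here is a minimal toy sketch in Python. It is not Mirage code: `ToyGraph`, `SMEM_BUDGET`, `rewrite_before_transpile`, and the byte sizes are all hypothetical names and numbers chosen for the example. It only shows how a graph that fits the budget during search can stop fitting after a rewrite adds one more buffer, so that building an operator returns `None` (the analogue of `nullptr`).

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical per-threadblock memory budget in bytes (not Mirage's real limit).
SMEM_BUDGET = 1024


@dataclass
class ToyGraph:
    """Toy stand-in for a threadblock graph; it only tracks tensor sizes."""
    tensor_bytes: List[int] = field(default_factory=list)

    def used(self) -> int:
        return sum(self.tensor_bytes)

    def add_op(self, out_bytes: int) -> Optional[int]:
        """Return the new tensor's index, or None if the budget would be exceeded."""
        if self.used() + out_bytes > SMEM_BUDGET:
            return None
        self.tensor_bytes.append(out_bytes)
        return len(self.tensor_bytes) - 1


def search_accepts(op_sizes: List[int]) -> bool:
    """Memory check as done during search: every op must fit."""
    g = ToyGraph()
    return all(g.add_op(size) is not None for size in op_sizes)


def rewrite_before_transpile(op_sizes: List[int], extra_bytes: int) -> List[int]:
    """Hypothetical pre-transpilation rewrite that inserts one extra buffer."""
    return [extra_bytes] + list(op_sizes)


def first_failing_op(op_sizes: List[int]) -> Optional[int]:
    """Rebuild the graph; return the index of the first op that does not fit."""
    g = ToyGraph()
    for i, size in enumerate(op_sizes):
        if g.add_op(size) is None:
            return i
    return None


ops = [512, 256, 256]                      # exactly fills the toy budget
rewritten = rewrite_before_transpile(ops, 128)
print(search_accepts(ops), first_failing_op(rewritten))
```

If this is indeed the cause, one option would be to run the same memory check on the rewritten graph during search, so that candidates that no longer fit are rejected before transpilation instead of hitting the assertion.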
