require an input file #7

Closed
ptheywood opened this issue Mar 15, 2024 · 23 comments

@ptheywood
Member

This example does not create a default non-empty map if no input file containing navmap agents is provided.

In this case, the visualisation shows the loading splash screen and never progresses.

It would probably be a good idea to error and exit if no input map file is passed, or if no navmap agents exist (exit condition?).

Adding some usage info to the README wouldn't hurt either.

@Robadob
Member

Robadob commented Mar 15, 2024

I'll have overlooked this because I use Visual Studio heavily.

I've never had access to the code that was used to generate the included map (although I made a crappy implementation during my undergraduate dissertation). It would be nice to explain that too, and maybe link the original FLAME GPU 1 pedestrian navigation paper?

@lj-cug

lj-cug commented Sep 30, 2024

I have also run the code and, as described above, the program hung at the loading ... screen. Could you tell me how to run the code?
Li Jian

@Robadob
Member

Robadob commented Sep 30, 2024

Hi

You need to tell it where to find the input file that contains the map, e.g.

./flamegpu -i ../../map.xml -s 0

The -s 0 in this command also requests unlimited steps; otherwise the model will only run 1 step.

This should be added to the readme.
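
Until the example itself checks for a missing input file, a small wrapper can make the failure explicit instead of leaving the loading screen up forever. This is only a sketch: the function name run_with_map is hypothetical, and the binary and map paths depend on your own build. Only the -i and -s 0 arguments come from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of FLAME GPU): refuse to launch the example
# unless the map file exists.
# Usage: run_with_map <path-to-binary> <path-to-map.xml>
run_with_map() {
    bin="$1"
    map="$2"
    if [ ! -f "$map" ]; then
        echo "error: input map '$map' not found" >&2
        return 1
    fi
    # -i selects the input file; -s 0 requests unlimited steps.
    "$bin" -i "$map" -s 0
}
```

For instance, run_with_map ./flamegpu ../../map.xml would launch the model only if ../../map.xml is present.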

@gubbsjuk

Seems the provided map.xml might be faulty as well?
.\pedestrian_navigation.exe -i ..\..\..\map.xml -s 0:

Warning: Input file '..\..\..\map.xml' does not contain macro environment node.
Warning: Agent 'navmap' variable 'is_active' is missing from, input file '..\..\..\map.xml'.

@Robadob
Member

Robadob commented Nov 20, 2024

The reason for this is that this particular model is a copy of one from FLAMEGPU1 and uses the exact same input file. As noted earlier in this issue, the code that generated the input file is, to the best of my knowledge, long since lost (unless @twinkarma knows). So although we designed the FLAMEGPU2 input file format to be compatible, both the input file format and the model have some minor differences.

Warning: Input file '..\..\..\map.xml' does not contain macro environment node.

This particular model doesn't use the macro environment (it predates it), so that node would just be empty; this is a case of an overzealous warning. I'm not sure whether the IO code has the information needed to check that when it reports the warning.

Warning: Agent 'navmap' variable 'is_active' is missing from, input file '..\..\..\map.xml'.

This variable is initialised at the start of the simulation, so it doesn't matter that it's not set.

@gubbsjuk

Aha, gotcha.

I can't get the SDL window to open when map.xml is provided, but it opens (loading continuously) when it's not provided.
I guess that's probably a problem on my end though.

@Robadob
Member

Robadob commented Nov 20, 2024

It's working for me (on Windows/Debug).

image

I don't have a Linux box that I can run the visualisation on, which I presume is what you're using; @ptheywood may be able to test that side though.

I can reproduce the no-input-file behaviour. Off the top of my head, that's a side effect of the loading screen waiting for agents to exist (there's been no end of back and forth around when it should be closed).

It's less clear why the SDL window would fail to open when an input file is provided though, particularly if no errors are being reported to the console. It could be a case of the first CUDA kernel hanging when agents exist? That would be related to CUDA itself getting stuck dynamically building device code, but it's not something I've seen in a long while, so this is very speculative and probably wrong.

@ptheywood
Member Author

The visualisation window opens for me with the input map.xml file under linux.

  • Ubuntu 22.04
  • single discrete 3060ti
  • driver 565.57.01

image

@gubbsjuk are you on an Optimus laptop, a multi-GPU system, or attempting to forward over x?

@gubbsjuk

I'm on Windows. I've now tried running Debug through Visual Studio.

CUDA error 1 [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include\cub/device/dispatch/dispatch_radix_sort.cuh, 2323]: invalid argument
CUDA error 1 [C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include\cub/device/dispatch/dispatch_radix_sort.cuh, 2742]: invalid argument
C:\dev\FLAMEGPU2-pedestrian_navigation-example\build\_deps\flamegpu2-src\include\flamegpu/simulation/detail/CUDAErrorChecking.cuh(28): CUDA Error: C:\dev\FLAMEGPU2-pedestrian_navigation-example\build\_deps\flamegpu2-src\src\flamegpu\simulation\detail\CUDAAgent.cu(209): cudaErrorInvalidValue invalid argument

I haven't looked into this yet, but it points towards something iffy with my CUDA install.
@ptheywood It's a Dell Precision laptop, not multi-GPU (aside from Intel integrated graphics, I guess), with an RTX 3500 Ada GPU. Unfortunately I'm unsure what you mean by " or attempting to forward over x?".

@Robadob
Member

Robadob commented Nov 20, 2024

" or attempting to forward over x?"

X forwarding is when you are streaming graphics from a Linux machine (typically over ssh). It's unrelated if you're a Windows user.

..src\flamegpu\simulation\detail\CUDAAgent.cu(209): cudaErrorInvalidValue invalid argument

That appears to be coming from cub, a 3rd party library we use. It's possible there's a particular bad version that we've missed, @ptheywood ?

https://github.com/FLAMEGPU/FLAMEGPU2/blob/v2.0.0-rc.1/src/flamegpu/simulation/detail/CUDAAgent.cu#L209

@ptheywood
Member Author

It could also be triggered by something in our code which is unchecked prior to the appropriate CUB call.


@gubbsjuk If you're on Windows then don't worry about my X forwarding comment.

With laptops that include integrated graphics (used to be called optimus), if the visualisation is running on the integrated graphics then errors can occur. That's (probably) not the current issue though.

Do you know which version of CUDA you have installed and built FLAME GPU with?

Knowing the CUB/Thrust version found by CMake would be helpful too. In your build directory there should be a file CMakeCache.txt, which should contain a line starting with THRUST_VERSION: that includes the version found.
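
If you have a POSIX shell available (e.g. Git Bash on Windows), something like the following can pull those values out of the cache. It is a rough sketch: cache_value is a hypothetical helper name, and it simply relies on CMake cache entries having the form NAME:TYPE=VALUE.

```shell
#!/bin/sh
# Hypothetical helper (not part of FLAME GPU or CMake): print the value of a
# CMake cache entry such as THRUST_VERSION or CMAKE_CUDA_ARCHITECTURES.
# Usage: cache_value <NAME> [path/to/CMakeCache.txt]
cache_value() {
    name="$1"
    cache="${2:-CMakeCache.txt}"
    # Keep lines beginning with "NAME:" and print everything after the first "=".
    grep "^${name}:" "$cache" | cut -d= -f2-
}
```

Run from the build directory, cache_value THRUST_VERSION and cache_value CMAKE_CUDA_ARCHITECTURES would print the two values asked about in this thread.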

@gubbsjuk

gubbsjuk commented Nov 20, 2024

Cuda v12.6
THRUST_VERSION:INTERNAL=2.5.0.0

I'm sorry for hijacking this issue with a seemingly unrelated problem...

Is there any way of forcing which GPU the viz is running on then?

@Robadob
Member

Robadob commented Nov 20, 2024

It could also be triggered by something in our code which is unchecked prior to the appropriate CUB call.

No, there's a checked cudaMalloc() on the preceding line.

Is there any way of forcing which GPU the viz is running on then?

On Windows, if the executable is linked against CUDA, Optimus already forces the NVIDIA GPU to be used.

This fun is required to force Optimus otherwise

@ptheywood
Member Author

ptheywood commented Nov 20, 2024

Cuda v12.6
THRUST_VERSION:INTERNAL=2.5.0.0

I've been unable to reproduce the issue using the same CUDA version and thrust version on my Windows install on a 3060ti (which is Ampere not Ada like your 3500).

I don't believe we currently have access to any Ada devices to try and reproduce this ourselves unfortunately.


Trying an alternate version of thrust would be my next suggestion, but right now for this repository the only way to do that is to downgrade to an older CUDA version.
However, with recent MSVC updates only 12.4, 12.5 and 12.6 are supported.

There is a newer thrust/cub release, 2.6.0 which will presumably ship with CUDA 12.7, but there's nothing in the release notes clearly related to this particular problem (but there might be a difference).

Currently, changing the version of thrust/cub used by FLAME GPU 2 from within this example repository is non-trivial, but we do have a change in progress which should actually enable that. The goal is to get a new release out this week including that change, which would then make it simple to try an older cub/thrust. Correction: I mixed up which dependencies I'm adjusting the handling of at the moment; the Thrust version would be changed in FLAMEGPU/FLAMEGPU2#cccl, which is not actively being worked on.

Have you tried to run any other FLAME GPU models than this one? I.e. the examples in the FLAMEGPU/FLAMEGPU2 repository?

Do you also know which compute capability you compiled for (CMAKE_CUDA_ARCHITECTURES in CMakeCache.txt)? There's a low chance that changing this to another compatible value might help.

@ptheywood
Member Author

I've potentially managed to encounter/reproduce this within the FLAME GPU 2 test suite rather than this pedestrian navigation example, using CUDA 12.6 on Windows.

For my SM86 device, compiling for SM 80 or SM 86 behaves as expected, while compiling for SM 50 or SM 70 with PTX embedding for JIT compilation produces invalid argument errors in dispatch_radix_sort.cuh.

This implies you might be able to resolve these issues by setting CMAKE_CUDA_ARCHITECTURES to 80, or possibly 89 for your Ada GPU (if it is not already).

It'll take a bit more time to work around this, but I'll promote this to its own separate issue for tracking tomorrow.

@gubbsjuk

gubbsjuk commented Nov 20, 2024

Hiya!
Cheers @ptheywood !
I pulled down the main repo and built the Boids_bruteforce example with visualisation=ON. Worked perfectly. Left CUDA architectures blank for (I'm assuming) the all-major default.
Rebuilding this repo with the same settings now, as well as enforcing CUDA arch 80 or 89 (although I believe I've tried both).

Will edit this comment in a few.

EDIT:
CUDA arch omitted - Same result. Crashing.
-DCUDA_ARCH=80 - CMake configure outputs Cuda Architecture 52 whilst configuring project flamegpu. Might not be relevant, just an observation whilst waiting for the build to complete...

  Manually-specified variables were not used by the project:

    CUDA_ARCH

@ptheywood
Member Author

The CMake variable for setting the architectures flag is -DCMAKE_CUDA_ARCHITECTURES=80.
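
If you're unsure which value to pass, reasonably recent NVIDIA drivers can report the compute capability through nvidia-smi (the compute_cap query field), printed as e.g. 8.9; dropping the dot gives the form CMake expects. A rough sketch, where cap_to_arch is a hypothetical helper name and the build directory name is just an assumption:

```shell
#!/bin/sh
# Hypothetical helper: convert a compute capability such as "8.9" into the
# form CMAKE_CUDA_ARCHITECTURES expects ("89").
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

# Example use (commented out so that sourcing this sketch has no side effects):
# cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
# cmake -S . -B build -DCMAKE_CUDA_ARCHITECTURES="$(cap_to_arch "$cap")"
```

An existing build directory may keep its previously cached value, so a fresh build directory is the safer way to try a different architecture.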

I've downgraded to CUDA 12.5, which unfortunately still shows the error.

@gubbsjuk

Whoooops! Rookie mistake...

No matter, figured I'd mainly use this example as building blocks. As the main repo is functional I guess I can start from scratch using inspiration from this repo.

Let me know if there is anything specific you want me to test with an Ada GPU at some point :-) I also have a 4060 (Ti?) available, if that's of any interest. :-)

@Robadob
Member

Robadob commented Nov 21, 2024

the Boids_bruteforce example with visualisation=ON. Worked perfectly.

The error is occurring when agents are being sorted, this behaviour occurs when spatial messaging is being used (which isn't the case in the brute force models).

No matter, figured I'd mainly use this example as building blocks. As the main repo is functional I guess I can start from scratch using inspiration from this repo.

This is by far the most complete example if you want a visualisation, e.g. it has the controls shown in my screenshot above.

The intended template is here, but again it's circles spatial3D, so will suffer from the agent sort.

You can replace the contents of src/main.cu with any of the brute force models for the time being. Likewise line 5 of CMakeLists.txt can be updated to point to master or a recent commit (this will require a clean build directory to take effect). v2.0.0-rc.1 is now about 12 months old, and we're hoping to push out v2.0.0-rc.2 before the end of the month.

@gubbsjuk

gubbsjuk commented Nov 21, 2024

The intended template is here, but again it's circles spatial3D, so will suffer from the agent sort.

Interesting, as the intended template builds and runs (Debug) just fine...!
image

By the way (haven't looked into it yet) any idea why IntelliSense might be misbehaving on a fresh clone and configure of the template repo? "namespace flamegpu has no member 'XXX'"

@Robadob
Member

Robadob commented Nov 21, 2024

Ah, correction. Your error is occurring when agent IDs are being validated for collisions (after agents are loaded from an input file), though I expect it should also occur with agent sort.

IntelliSense might be misbehaving

You may want to try Project -> Rescan Solution and/or restart Visual Studio.

image

IntelliSense has had poor support for CUDA since about 2008 (back then the CUDA add-in was seemingly able to modify IntelliSense to recognise CUDA). Likewise, it probably gets a little confused by some of the esoteric approaches we use to abstract away what's required for the simulation. So I've never known it to be perfect when working with CUDA (e.g. the 3rd? chevron in a kernel launch is always red lined).

@gubbsjuk

gubbsjuk commented Nov 21, 2024

Finally!

-DCMAKE_CUDA_ARCHITECTURES=89 built perfectly in Visual Studio x64/Windows/Debug

@ptheywood
Member Author

-DCMAKE_CUDA_ARCHITECTURES=89 built perfectly in Visual Studio x64/Windows/Debug

Glad to hear that's worked. I've opened FLAMEGPU/FLAMEGPU2#1253 with my notes on this issue so far, but will have limited time to look into this in the immediate future now we have an apparent workaround.

You can also set the CUDAARCHS environment variable on your machine to 89, which will provide the default value without having to specify -DCMAKE_CUDA_ARCHITECTURES for every build directory you configure (requires CMake >= 3.20).
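
As a minimal sketch for a POSIX shell session (on Windows you would set the variable through the system environment settings instead), assuming 89 is the right value for your GPU:

```shell
#!/bin/sh
# Provide a default CUDA architecture for new CMake configures in this shell.
# CMake only uses CUDAARCHS to initialise CMAKE_CUDA_ARCHITECTURES on the
# first configure of a build directory, and only if it is not already set.
export CUDAARCHS=89

# A fresh configure would then pick up 89 without the -D flag, e.g.:
# cmake -S . -B build
```

Passing -DCMAKE_CUDA_ARCHITECTURES explicitly still takes precedence over the environment variable.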

Re: CUDA_ARCH vs CMAKE_CUDA_ARCHITECTURES, that's on us, as we seem to have missed updating this repository's README when we adopted the modern CMake way to specify CUDA architectures. I've opened #8 for that.

Robadob added a commit that referenced this issue Nov 22, 2024
Exit early if input file not provided at runtime.

Closes #7

Closes #8