very unstable #1
Comments
We haven't tested it on Windows. The error seems to come from pointnet2_ops_lib (https://github.com/erikwijmans/Pointnet2_PyTorch). We will find a Windows machine and test it later.
Thank you very much. Almost all errors occur during the model generation phase. Several Python processes start, and then either an error occurs or there is no error but also no feedback for a long time (the app becomes unresponsive).
I ran into a similar issue and resolved it by editing setup.py before installing pointnet2_ops so that it includes the CUDA architecture I am using. It appears not to be configured for architectures from anything newer than 20xx cards. My steps:
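For readers hitting the same "no kernel image is available" error: the exact steps from this comment are not preserved here, so the following is only a minimal sketch of the general idea, not the commenter's procedure. It assumes that setup.py builds its architecture list as a semicolon-separated string, the format used by PyTorch's TORCH_CUDA_ARCH_LIST environment variable. The helper name extend_arch_list and the sample list are hypothetical.

```python
# Hypothetical sample of an architecture list that stops at 20xx-series cards.
SAMPLE_ARCH_LIST = "3.7+PTX;5.0;6.0;6.1;6.2;7.0;7.5"


def extend_arch_list(arch_list: str, capability: tuple[int, int]) -> str:
    """Append the given compute capability to a ';'-separated arch list.

    The list is returned unchanged if the capability is already present,
    with or without a "+PTX" suffix.
    """
    arch = f"{capability[0]}.{capability[1]}"
    entries = [entry for entry in arch_list.split(";") if entry]
    known = {entry.removesuffix("+PTX") for entry in entries}
    if arch not in known:
        entries.append(arch)
    return ";".join(entries)


if __name__ == "__main__":
    # (8, 9) is the compute capability of an RTX 4090.
    print(extend_arch_list(SAMPLE_ARCH_LIST, (8, 9)))
```

The capability tuple can be read on the target machine with torch.cuda.get_device_capability(). The resulting string would then go wherever setup.py defines its architecture list, before reinstalling pointnet2_ops.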
I ran the script with administrator privileges, but I still have this issue. The errors probably include the following:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
The above exception was the direct cause of the following exception: (this one does not seem to have affected operation)
Traceback (most recent call last):
Hi @salier, maybe try this and see if it solves the problem?
I followed these steps and there were still errors.
I am using CUDA 12.1 and PyTorch 2.1.2. Deployment succeeded, but I have not yet managed to generate a model. Any help would be appreciated.
The errors include the following:
CUDA kernel failed : no kernel image is available for execution on the device
void group_points_kernel_wrapper(int, int, int, int, int, const float *, const int *, float *) at L:38 in D:\TriplaneGaussian\tgs\models\snowflake\pointnet2_ops_lib\pointnet2_ops\_ext-src\src\group_points_gpu.cu
(It seems that my GPU architecture is not supported) (Occasional)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\TriplaneGaussian\gradio_app.py", line 39, in <module>
    model = TGS(cfg=base_cfg.system).to(device)
  File "D:\TriplaneGaussian\infer.py", line 94, in __init__
    self.load_weights(self.cfg.weights, self.cfg.weights_ignore_modules)
  File "D:\TriplaneGaussian\infer.py", line 50, in load_weights
    state_dict = load_module_weights(
  File "D:\TriplaneGaussian\tgs\utils\misc.py", line 37, in load_module_weights
    ckpt = torch.load(path, map_location=map_location)
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\serialization.py", line 993, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\serialization.py", line 447, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
(Loading the model weights fails) (Occasional)
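A "failed finding central directory" error from torch.load usually means the checkpoint is not a complete zip archive, for example because a download was cut short or the file was read while still being written. As a minimal sketch, not part of the project, the check below uses only the standard library to test whether a file at least ends with a valid zip directory record. The function name and the example path are hypothetical.

```python
import zipfile
from pathlib import Path


def looks_like_complete_checkpoint(path) -> bool:
    """Return True if the file exists, is non-empty, and is a readable zip.

    Checkpoints written by torch.save in its default format are zip
    archives, so a truncated file normally fails this check.
    """
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    return zipfile.is_zipfile(p)


if __name__ == "__main__":
    # Hypothetical path; substitute the checkpoint the app actually loads.
    print(looks_like_complete_checkpoint("checkpoints/model.ckpt"))
```

If this returns False for the weights file, downloading it again is a reasonable first step. A True result only rules out truncation; it does not prove the contents are valid.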
Traceback (most recent call last):
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\utils\data\dataloader.py", line 1132, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "C:\Users\Sariel\AppData\Local\Programs\Python\Python310\lib\multiprocessing\queues.py", line 114, in get
    raise Empty
_queue.Empty
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "D:\TriplaneGaussian\env\lib\site-packages\gradio\queueing.py", line 456, in call_prediction
    output = await route_utils.call_process_api(
  File "D:\TriplaneGaussian\env\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\TriplaneGaussian\env\lib\site-packages\gradio\blocks.py", line 1522, in process_api
    result = await self.call_function(
  File "D:\TriplaneGaussian\env\lib\site-packages\gradio\blocks.py", line 1144, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "D:\TriplaneGaussian\env\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\TriplaneGaussian\env\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "D:\TriplaneGaussian\env\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "D:\TriplaneGaussian\env\lib\site-packages\gradio\utils.py", line 674, in wrapper
    response = f(*args, **kwargs)
  File "D:\TriplaneGaussian\gradio_app.py", line 111, in run
    infer(image_path, cam_dist, only_3dgs=True)
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\TriplaneGaussian\gradio_app.py", line 96, in infer
    for batch in dataloader:
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\utils\data\dataloader.py", line 630, in __next__
    data = self._next_data()
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\utils\data\dataloader.py", line 1328, in _next_data
    idx, data = self._get_data()
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\utils\data\dataloader.py", line 1294, in _get_data
    success, data = self._try_get_data()
  File "D:\TriplaneGaussian\env\lib\site-packages\torch\utils\data\dataloader.py", line 1145, in _try_get_data
    raise RuntimeError(f'DataLoader worker (pid(s) {pids_str}) exited unexpectedly') from e
RuntimeError: DataLoader worker (pid(s) 33816) exited unexpectedly
(The two tracebacks above were then printed a second time, verbatim.)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "D:\TriplaneGaussian\env\lib\site-packages\gradio\queueing.py", line 501, in process_events
    response = await self.call_prediction(awake_events, batch)
  File "D:\TriplaneGaussian\env\lib\site-packages\gradio\queueing.py", line 465, in call_prediction
    raise Exception(str(error) if show_error else None) from error
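One possible reading of these tracebacks, offered as a guess rather than a diagnosis: on Windows, DataLoader workers are started with the spawn method, and the earlier traceback shows spawn.py re-running gradio_app.py up to line 39, where the model is built at module level. If that is what happens, every worker would try to reload the model, which would fit the observation that several Python processes start. Below is a minimal sketch of two common mitigations, assuming nothing about the project's actual code: keep heavy setup under an `if __name__ == "__main__":` guard, and fall back to `num_workers=0` on Windows. The helper name safe_num_workers is hypothetical.

```python
import sys


def safe_num_workers(requested: int, platform: str = sys.platform) -> int:
    """Return 0 on Windows so batches are loaded in the main process.

    With num_workers=0 the DataLoader starts no worker processes, so
    nothing re-imports the main script.
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    return 0 if platform.startswith("win") else requested


def main() -> None:
    workers = safe_num_workers(4)
    # A real app would build the model and the DataLoader here, e.g.
    # DataLoader(dataset, batch_size=1, num_workers=workers)
    print(f"DataLoader would be created with num_workers={workers}")


if __name__ == "__main__":
    # Spawned worker processes import this file under a different module
    # name, so they skip everything behind this guard.
    main()
```

Loading in the main process is slower for large datasets, but for single-image inference the difference is likely negligible.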
This is accompanied by various crashes that produce no error message.
I checked that the online version runs on an A10G. I am using a 4090 with 64 GB of memory, so resources should not be insufficient, but the process often stops.