Issues: triton-inference-server/server
Hangs when using the Triton client and multiprocessing simultaneously
#7690 opened Oct 9, 2024 by Soul-Code
Possible bug in reference counting with shared memory regions
Labels: investigating (The development team is investigating this issue)
#7688 opened Oct 8, 2024 by hcho3
Are FP8 models supported in Triton?
Labels: question (Further information is requested)
#7678 opened Oct 4, 2024 by jayakommuru
Triton ONNX Runtime backend slower than the onnxruntime Python client on CPU
Labels: performance (A possible performance tune-up)
#7677 opened Oct 3, 2024 by Mitix-EPI
Histogram metric for multi-instance tail-latency aggregation
#7672 opened Oct 1, 2024 by AshwinAmbal
DCGM unable to start: DCGM initialization error, Error: Failed to initialize NVML
Labels: verify to close (Verifying if the issue can be closed)
#7670 opened Sep 29, 2024 by coder-2014
Is an ensemble of tensorrt + python_be + tensorrt supported on Jetson?
#7667 opened Sep 27, 2024 by olivetom
When there are multiple GPUs, only one GPU is used
Labels: question (Further information is requested), verify to close (Verifying if the issue can be closed)
#7664 opened Sep 27, 2024 by gyr66
Direct Streaming of Model Weights from Cloud Storage to GPU Memory
Labels: enhancement (New feature or request)
#7660 opened Sep 26, 2024 by azsh1725
Deploying a TTS model with Triton and the ONNX backend fails: Protobuf parsing failed
Labels: investigating (The development team is investigating this issue), question (Further information is requested)
#7654 opened Sep 25, 2024 by AnasAlmana
Large performance drop when using an ensemble model instead of separate calls
Labels: investigating (The development team is investigating this issue)
#7650 opened Sep 24, 2024 by jcuquemelle
[Critical] Triton stops processing requests and crashes
Labels: bug (Something isn't working)
#7649 opened Sep 24, 2024 by appearancefnp
python_backend PyTorch example: as_numpy() error
Labels: investigating (The development team is investigating this issue)
#7647 opened Sep 24, 2024 by flian2
Make State Tensor Stay in Device Memory
Labels: question (Further information is requested)
#7643 opened Sep 24, 2024 by poor1017
What is the maximum number of instances Triton can support for parallel inference?
#7641 opened Sep 22, 2024 by wwdok
Incompatible constructor arguments for c_python_backend_utils.InferenceRequest
Labels: investigating (The development team is investigating this issue), question (Further information is requested)
#7639 opened Sep 20, 2024 by adrtsang
Triton GPU deployment suddenly becomes very slow (from 0.03 s to 12 s); how can this be solved?
Labels: question (Further information is requested)
#7638 opened Sep 20, 2024 by yiluzhuimeng
[Feature request] ffmpeg backend to simplify decoding of audio/video inputs
Labels: investigating (The development team is investigating this issue)
#7629 opened Sep 19, 2024 by vadimkantorov
Does Triton Inference Server support custom user features without modifying the original code, e.g. via a plugin mechanism?
Labels: question (Further information is requested)
#7627 opened Sep 19, 2024 by GGBond8488
Dynamic batching doesn't work on the first invocation of a model (Python backend)
#7623 opened Sep 18, 2024 by ChristosCh00