Issues: triton-inference-server/server
#7958 · All Models Fail with Message: Internal: unable to create stream: operation not supported · opened Jan 21, 2025 by huggingfacename
#7954 · How to start/expose the metrics endpoint of the Triton Server via openai_frontend/main.py arguments · opened Jan 21, 2025 by shuknk8s
#7953 · Segmentation fault when crafting a pb_utils.Tensor object in a Triton BLS model · opened Jan 18, 2025 by carldomond7
#7950 · Failed to launch triton-server: "error: creating server: Internal - failed to load all models" [module: backends] · opened Jan 17, 2025 by pzydzh
#7938 · Triton crashes with SIGSEGV [crash] · opened Jan 15, 2025 by ctxqlxs
#7932 · [Question] Are the libnvinfer_builder_resources necessary in the Triton image? · opened Jan 14, 2025 by MatthieuToulemont
#7925 · Server build with Python BE failing due to missing Boost lib · opened Jan 9, 2025 by buddhapuneeth
#7914 · OpenAI-Compatible Frontend should support world_size larger than 1 [enhancement] · opened Jan 3, 2025 by cocodee
#7912 · vllm_backend: What is the right way to use a downloaded model + model.json together? · opened Jan 2, 2025 by kyoungrok0517
#7907 · Python backend with multiple instances causes unexpected and non-deterministic results · opened Dec 25, 2024 by NadavShmayo
#7906 · MIG deployment of Triton causes "CacheManager Init Failed. Error: -17" · opened Dec 25, 2024 by LSC527
#7893 · How Triton Inference Server compares the current frame's inference result with the previous one · opened Dec 19, 2024 by Komoro2023
#7885 · Error when using ONNX with TensorRT (ORT-TRT) Optimization on Multi-GPU · opened Dec 16, 2024 by efajardo-nv
#7884 · Manual warmup per model instance / specify warmup config dynamically using the C API · opened Dec 16, 2024 by asaff1
#7877 · Segfault/coredump in grpc::ModelInferHandler::InferResponseComplete [crash, grpc] · opened Dec 12, 2024 by andyblackheel