📄️ NVIDIA Triton
The Shakudo Platform comes with a built-in NVIDIA Triton Inference Server that simplifies the deployment of AI models at scale in production. Triton is open-source inference serving software that lets teams deploy trained AI models from any framework (TensorFlow, NVIDIA® TensorRT®, PyTorch, ONNX Runtime, or custom), from local storage or a cloud platform, on any GPU- or CPU-based infrastructure (cloud, data center, or edge).
📄️ TorchServe
Coming soon
📄️ TensorFlow Serving
Coming soon
📄️ FastAPI
Coming soon
📄️ Flask
Coming soon