How much will I be charged to deploy a model?

Last updated: October 10, 2025

Deployed models incur continuous per-minute hosting charges even when not actively processing requests. This applies to both fine-tuned models and dedicated endpoints. When you deploy a model, you should see a pricing prediction. This will change based on the hardware you select, as dedicated endpoints are charged based on the hardware used rather than the model being hosted.

You can find full details of our hardware pricing on our pricing page.

To avoid unexpected charges, make sure to set an auto-shutdown value, and regularly review your active deployments in the models dashboard to stop any unused endpoints. Remember that serverless endpoints are only charged based on actual token usage, while dedicated endpoints and fine-tuned models have ongoing hosting costs.