published-list estimate · runs in your browser
Model GPU infrastructure savings (utilization, batching, spot, rightsizing, quantization, commitment), gated on workload feasibility. Not per-token model pricing. Published-list estimate only. The math runs in your browser; your numbers never leave this page.
Computed on your device. No accounts, no bill upload, nothing sent to a server or any vendor.
Representative published-list prices, re-verified on a freshness schedule. When a rate goes stale or a resource is not in the verified table, the total is suppressed rather than guessed.
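The suppress-rather-than-guess rule above can be pictured roughly as follows. This is a minimal sketch, not the page's actual code: the `VerifiedRate`, `LineItem`, and `Estimate` shapes, the `estimateTotal` function, and the `maxAgeDays` freshness window are all hypothetical names chosen for illustration.

```typescript
// Hypothetical shapes; the page's real data model is not shown in the source.
interface VerifiedRate {
  usdPerHour: number; // representative published-list price
  verifiedAt: Date;   // when the rate was last re-verified
}

interface LineItem {
  resource: string;   // key into the verified table
  hours: number;
}

// Either a real total or an explicit suppression with reasons; never a guess.
type Estimate =
  | { kind: "total"; usd: number }
  | { kind: "suppressed"; reasons: string[] };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function estimateTotal(
  items: LineItem[],
  table: Map<string, VerifiedRate>,
  now: Date,
  maxAgeDays: number, // assumed freshness window, in days
): Estimate {
  const reasons: string[] = [];
  let usd = 0;

  for (const item of items) {
    const rate = table.get(item.resource);
    if (rate === undefined) {
      reasons.push(`${item.resource}: not in verified table`);
      continue;
    }
    const ageDays = (now.getTime() - rate.verifiedAt.getTime()) / MS_PER_DAY;
    if (ageDays > maxAgeDays) {
      reasons.push(`${item.resource}: rate is stale`);
      continue;
    }
    usd += rate.usdPerHour * item.hours;
  }

  // One unverifiable line item is enough to withhold the whole total.
  return reasons.length > 0
    ? { kind: "suppressed", reasons }
    : { kind: "total", usd };
}
```

Returning a tagged union instead of a bare number means the caller has to handle the suppressed case explicitly, so a partial sum cannot be displayed as if it were a full total.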