huggingface-blog

DeepInfra on Hugging Face Inference Providers 🔥

May 12, 2026 · English original

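Summary

DeepInfra is now an Inference Provider on the Hugging Face Hub, offering serverless AI inference with per-token billing and a catalog of more than 100 models. The initial integration supports the conversational and text-generation tasks, which can be called through the website UI, the Python and JS SDKs, and the HF router. Billing works either through your own provider API key or through routing via your HF account.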

https://huggingface.co/araikin

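We're excited to announce that DeepInfra is now a supported Inference Provider on the Hugging Face Hub!

DeepInfra joins our growing ecosystem, extending the breadth and capabilities of serverless inference directly on the Hub's model pages. Inference Providers are also integrated into our client SDKs (for both JS and Python), which makes it easy to call a wide range of models with the provider you prefer.

DeepInfra is a serverless AI inference platform offering some of the most cost-effective per-token pricing in the industry. With a catalog of more than 100 models, DeepInfra lets developers integrate a broad range of AI capabilities into their applications with minimal setup.

DeepInfra supports many model types, from LLMs to text-to-image, text-to-video, embeddings, and more. As part of this initial integration, DeepInfra is launching on Hugging Face with support for the conversational and text-generation tasks, giving access to popular open-weight LLMs such as DeepSeek V4, Kimi-K2.6, GLM-5.1, and others. Support for more tasks (text-to-image, text-to-video, embeddings, and so on) is coming soon!

To learn how to use DeepInfra as an Inference Provider, read its dedicated documentation page.

Browse the full list of models supported by DeepInfra here.

Follow DeepInfra on Hugging Face: https://huggingface.co/DeepInfra.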

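How it works

In the website UI

In your user account settings, you can:

Set your own API keys for the providers you have signed up with. If no custom key is set, your requests are routed through HF.
Order providers by preference. This order applies to the widget and the code snippets on model pages.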

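Image 10: Inference Providers

As mentioned above, there are two modes when calling Inference Providers:

Custom key (calls go directly to the inference provider, using your own API key for that provider)
Routed by HF (in this case you don't need a token from the provider, and charges are applied to your HF account rather than to a provider account)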
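
The two modes boil down to one rule: a custom provider key, when present, takes precedence, and otherwise the call is routed through HF. The sketch below only illustrates that rule. `CallPlan` and `plan_call` are hypothetical names invented for this example and are not part of any Hugging Face SDK, and the Hub's actual implementation may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallPlan:
    """Hypothetical summary of how one inference call would be handled."""
    mode: str        # "custom_key" or "routed_by_hf"
    credential: str  # the secret that would authenticate the call
    billed_to: str   # which account ends up being charged


def plan_call(provider: str, hf_token: str,
              provider_key: Optional[str] = None) -> CallPlan:
    """Pick a mode following the rule described in the text (illustrative only)."""
    if provider_key:
        # Custom key: the call goes straight to the provider and is
        # billed on the provider account.
        return CallPlan("custom_key", provider_key, f"{provider} account")
    # No custom key: the call is routed through HF and billed on the HF account.
    return CallPlan("routed_by_hf", hf_token, "Hugging Face account")
```

For example, `plan_call("deepinfra", hf_token)` yields the routed mode, while passing `provider_key=` switches to the custom-key mode.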

Image 11: Inference Providers

Model pages showcase third-party inference providers (those compatible with the current model, sorted by user preference).
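
As a rough illustration of "compatible providers, sorted by user preference", the sketch below orders a model's compatible providers by a user's preference list. `order_providers` and the provider names other than `deepinfra` are hypothetical, and placing unranked providers last in their original order is an assumption of this example, not documented Hub behavior.

```python
from typing import List


def order_providers(compatible: List[str], preferred: List[str]) -> List[str]:
    """Sort compatible providers by the user's preference list (illustrative only)."""
    rank = {name: position for position, name in enumerate(preferred)}
    # Providers missing from the preference list all get the same lowest rank;
    # sorted() is stable, so they keep their original relative order.
    return sorted(compatible, key=lambda name: rank.get(name, len(rank)))


# Hypothetical provider names, used only for the example.
ordered = order_providers(
    compatible=["provider-b", "deepinfra", "provider-c"],
    preferred=["deepinfra", "provider-c"],
)
```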

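From the client SDKs

DeepInfra is available through the Hugging Face SDKs: huggingface_hub (>= 1.11.2) for Python and @huggingface/inference for JavaScript.

The examples below show how to use DeepSeek V4 Pro through DeepInfra. Authenticate with a Hugging Face token, and the request is routed to DeepInfra automatically.

From your favorite Agent Harness

Hugging Face Inference Providers is integrated into most Agent Harnesses, including Pi, OpenCode, Hermes Agents, OpenClaw, and others. This means you can plug DeepInfra-hosted models straight into your preferred tool without extra glue code. Browse the full list of integrations here.

From Python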

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro:deepinfra",
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that returns the nth Fibonacci number using memoization."
        }
    ],
)

print(completion.choices[0].message)
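
For readers who want to see what the client sends, here is a minimal sketch that builds the equivalent HTTP request with only the standard library. It assumes the router accepts OpenAI-style chat completion requests at `/chat/completions` under the base URL used above, which is the path the `openai` client posts to. `build_chat_request` is a hypothetical helper name, and the function only constructs the request without sending it.

```python
import json
import urllib.request

# Base URL taken from the example above.
ROUTER_BASE_URL = "https://router.huggingface.co/v1"


def build_chat_request(model: str, provider: str, prompt: str,
                       token: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        # The ":provider" suffix selects the provider, as in the example above.
        "model": f"{model}:{provider}",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{ROUTER_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` with a real token read from the `HF_TOKEN` environment variable and parse the JSON response body.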

From JS

import { OpenAI } from "openai";

const client = new OpenAI({
    baseURL: "https://router.huggingface.co/v1",
    apiKey: process.env.HF_TOKEN,
});

const chatCompletion = await client.chat.completions.create({
    model: "deepseek-ai/DeepSeek-V4-Pro:deepinfra",
    messages: [
        {
            role: "user",
            content: "Write a Python function that returns the nth Fibonacci number using memoization.",
        },
    ],
});

console.log(chatCompletion.choices[0].message);

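Billing

For direct requests, that is, when you use a key from an inference provider, you are billed by that provider. For instance, if you use a DeepInfra API key, charges are applied to your DeepInfra account.

For routed requests, that is, when you authenticate through the Hugging Face Hub, you pay only the standard provider API rates. We add no markup; we simply pass the provider costs through directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)

Important note ‼️ PRO users get $2 worth of Inference credits every month. You can use these credits across providers. 🔥

Subscribe to the Hugging Face PRO plan to get Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.

We also offer free inference with a small quota for signed-in free users, but please upgrade to PRO if you can!

Feedback and next steps

We would love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49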