huggingface-blog

CyberSecQwen-4B: Why Defensive Cyber Needs Small, Specialized, Locally-Runnable Models

May 8, 2026

Summary

CyberSecQwen-4B is an Apache 2.0 cybersecurity specialist model built for the AMD Developer Hackathon. It is based on Qwen3-4B-Instruct-2507 and fine-tuned with LoRA, bf16, and FlashAttention-2 on a single AMD Instinct MI300X 192 GB. On CTI-Bench it scores 0.5868 on CTI-MCQ and 0.6664 on CTI-RCM.

athena129 (https://huggingface.co/athena129)

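Built for the AMD Developer Hackathon · Trained on a single AMD Instinct MI300X · Apache 2.0

Why this matters

Frontier models are very good at many things. They are also expensive to call, they send every prompt to someone else's datacenter, and they are explicitly trained to refuse the messy edge cases real defenders run into: incident write-ups, attacker-grade payloads found in your own logs, and vulnerability disclosure drafts.

Defensive cybersecurity is not a field that can accept those trade-offs.

Sensitive evidence should stay in-house. A SOC analyst working through a leaked credential dump, a malware reverse-engineer dissecting a sample, a vulnerability researcher drafting a CVE: none of them should be pasting that material into a hosted API. The data itself can constitute the leak.
Per-call API costs add up. A mid-sized SOC handles thousands of low-confidence alerts per day. Paying hosted-API rates for "explain this CVE" or "which CWE applies here" turns defensive automation into a budget problem.
Air-gapped and partially connected environments are the norm, not the exception, in critical infrastructure, healthcare, and government work. If your tool cannot run on a laptop or a single on-prem GPU, it will not land there.
Adversaries are becoming more automated. Ransomware crews use LLMs to write phishing in 30 languages; bug-bounty automators chain agentic tools to fuzz, triage, and exploit faster than humans can review. Defending at the same speed requires models that defenders own and can run themselves.

So: local matters. But "local" alone is not enough.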

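Why a small specialized model, not just a small model

A 70B generalist running locally on four GPUs is "local", but it is not deployable. A 4B generalist running locally on a single consumer GPU is deployable, but it loses to an 8B specialist on the work you actually need done.

The bet behind CyberSecQwen-4B: on narrow, well-evaluated cyber threat intelligence tasks (CWE classification, CVE-to-CWE mapping, structured CTI Q&A), a carefully fine-tuned 4B model can match or beat an 8B specialist while fitting on a 12 GB consumer card.

We tested against the strongest public baseline we could find, Cisco's Foundation-Sec-Instruct-8B, evaluated under its published protocol on CTI-Bench.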

Metric (CTI-Bench, n=5, temp 0.3)	CyberSecQwen-4B	Foundation-Sec-Instruct-8B	Δ
CTI-MCQ (2,500 items)	0.5868 ± 0.0029	0.4996	+8.7 pp
CTI-RCM (1,000 CVE→CWE items)	0.6664 ± 0.0023	0.6850	−1.9 pp
Parameters	4 B	8 B	half the size

CyberSecQwen-4B retains 97.3% of Foundation-Sec-Instruct-8B's CTI-RCM accuracy while scoring 8.7 points higher on CTI-MCQ, at half the parameter count. For defenders choosing what to deploy, that is the number that matters most.
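The deltas and the retention figure follow directly from the table. As a quick sanity check, the sketch below recomputes them from the reported means; the function and dictionary names are illustrative, not taken from the project's code.

```python
# Scores copied from the CTI-Bench table above (5-trial means for the 4B model).
SCORES = {
    "CTI-MCQ": {"cybersecqwen_4b": 0.5868, "foundation_sec_8b": 0.4996},
    "CTI-RCM": {"cybersecqwen_4b": 0.6664, "foundation_sec_8b": 0.6850},
}


def delta_pp(ours: float, baseline: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round((ours - baseline) * 100, 1)


def retention_pct(ours: float, baseline: float) -> float:
    """Share of the baseline score that is retained, as a percentage."""
    return round(ours / baseline * 100, 1)


for task, s in SCORES.items():
    print(task, delta_pp(s["cybersecqwen_4b"], s["foundation_sec_8b"]), "pp")

rcm = SCORES["CTI-RCM"]
print("CTI-RCM retention:", retention_pct(rcm["cybersecqwen_4b"], rcm["foundation_sec_8b"]), "%")
```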

5-minute walkthrough

The 5-minute video below covers the training methodology, the AMD MI300X workflow, and the benchmark results in a more visual form. If you prefer the full written detail, the rest of this post covers the same ground with exact configs.

Why AMD MI300X

The whole pipeline (training, adapter merging, evaluation) ran end to end on a single AMD Instinct MI300X 192 GB instance through AMD Developer Cloud. 192 GB of HBM3 combined with ROCm 7's vLLM stack meant we never had to think about quantization tricks, gradient checkpointing, or splitting the model across devices. Full bf16, FlashAttention-2 forward+backward, batch size 4, sequence length 4096: all on one GPU.

Component	Version
Hardware	AMD Instinct MI300X 192 GB · gfx942
ROCm	7.0
Docker	`vllm/vllm-openai-rocm:latest`
PyTorch	2.6.0 (ROCm)
flash-attn	2.8.3
vLLM	0.10.1
transformers / peft / trl	latest at training time

The recipe in train.sh is hardware-agnostic. To run it on other 40 GB+ datacenter GPUs, drop the AMD-specific environment variables (they are no-ops elsewhere) and reinstall flash-attn from the appropriate wheel. We tested portability by training a sister model on a different stack; more on that below.

Training data

Two corpora, both Apache-2.0-clean and releasable:

2021 CVE → CWE mappings from MITRE / NVD public records. Critically, everything that overlaps with the CTI-Bench evaluation set was deduplicated out before training, so the benchmark numbers above are credible out-of-distribution holdouts, not contamination.
Synthetic defensive-analyst Q&A grounded in the deduplicated CVE descriptions. Generated by a stronger teacher and redistributed under Apache-2.0.
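The post does not spell out how the overlap with CTI-Bench was removed, so the following is only one plausible sketch of such a filter. It assumes each row is a dict with a hypothetical `text` field, and it drops a training row if it shares either a CVE ID or a whitespace- and case-normalized description with any evaluation row. The CVE IDs in the usage example are made-up placeholders.

```python
import re

# Matches identifiers such as CVE-2021-12345 (case-insensitive).
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def cve_ids(text: str) -> set:
    """Return every CVE ID mentioned in the text, upper-cased."""
    return {match.upper() for match in CVE_ID.findall(text)}


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def drop_eval_overlap(train_rows, eval_rows):
    """Keep training rows that share neither a CVE ID nor a description with eval."""
    eval_ids = set()
    eval_texts = set()
    for row in eval_rows:
        eval_ids |= cve_ids(row["text"])
        eval_texts.add(normalize(row["text"]))

    kept = []
    for row in train_rows:
        if cve_ids(row["text"]) & eval_ids:
            continue  # same CVE appears in the evaluation set
        if normalize(row["text"]) in eval_texts:
            continue  # same description, only formatting differs
        kept.append(row)
    return kept


# Placeholder rows, for illustration only.
train = [
    {"text": "CVE-2021-11111 buffer overflow in foo"},
    {"text": "cve-2021-22222 SQL injection in bar"},
    {"text": "Path  traversal in baz"},
]
held_out = [
    {"text": "CVE-2021-22222: SQL injection"},
    {"text": "path traversal in BAZ"},
]
print(drop_eval_overlap(train, held_out))
```

An ID-based filter is the stricter of the two checks, since it also removes rows whose description was reworded; the text check catches entries that carry no CVE ID at all.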

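The base model is Qwen3-4B-Instruct-2507, an Apache-2.0 instruction-tuned 4B model and the best-performing of the 4B-class IT models available at training time. We deliberately fine-tuned from the IT checkpoint rather than the base: it keeps the concise-answer, multiple-choice format priors already established during IT, priors that an IT-then-SFT collapse could otherwise wipe out.

There is a measurable effect here worth pointing out: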

Model	CTI-RCM	CTI-MCQ
Qwen3-4B-Instruct-2507 (raw IT)	0.519	0.473
CyberSecQwen-4B (this fine-tune)	0.6664	0.5868

Compared with the underlying pre-trained base, the IT base shows a clear drop in MCQ accuracy, exactly the "instruction-tuning collapses MCQ" pattern Cisco reported for Foundation-Sec-Instruct vs Foundation-Sec base. Our fine-tune recovers and exceeds the IT starting point on both benchmarks, restoring the format binding that IT weakened while adding domain lift.

Recipe

LoRA r       = 64
LoRA alpha   = 64        # alpha/r = 1.0
LoRA dropout = 0.05
LR           = 5e-5      # cosine, warmup ratio 0.03
Epochs       = 10
Precision    = bf16
Attention    = FlashAttention-2 (forward + backward)
Max seq len  = 4096
Batch        = 4 (no accumulation)
Optimizer    = paged_adamw_8bit
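To make the LR line concrete, here is a minimal sketch of a linear-warmup, cosine-decay schedule with the recipe's peak LR and warmup ratio, plus the LoRA scaling factor alpha/r. It assumes the common shape of warmup to the peak followed by cosine decay to zero; the total step count is a hypothetical input, since the post does not state it.

```python
import math

PEAK_LR = 5e-5        # from the recipe
WARMUP_RATIO = 0.03   # from the recipe
LORA_R, LORA_ALPHA = 64, 64


def lr_at(step: int, total_steps: int,
          peak_lr: float = PEAK_LR, warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step (linear warmup, then cosine decay to 0)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# In standard LoRA the low-rank update is scaled by alpha / r, here 1.0.
lora_scale = LORA_ALPHA / LORA_R

TOTAL_STEPS = 1000  # hypothetical, for illustration only
for step in (0, 15, 30, 500, 1000):
    print(step, lr_at(step, TOTAL_STEPS))
```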

FlashAttention-2 is enabled on Qwen because its head dimension (128) fits comfortably within the shared-memory budget of the MI300X (gfx942). Under this config, step time stabilizes at about 7.85 s/step, roughly 1.6× faster than the same recipe on the companion Gemma-4-E2B base model, which cannot use FA2 on its global-attention layers (head_dim=512 exceeds the LDS budget) and falls back to sdpa.
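The fallback rule described above can be written as a small helper. This is a sketch only: the post tells us that head_dim=128 works with FA2 and head_dim=512 does not, so the 256 cutoff used here is an assumed placeholder, not a measured limit for gfx942. The returned strings are the values the transformers `attn_implementation` argument accepts.

```python
def pick_attn_implementation(head_dim: int, max_fa2_head_dim: int = 256) -> str:
    """Choose FlashAttention-2 when the head dimension fits, otherwise fall back to SDPA.

    max_fa2_head_dim is an assumption for illustration; adjust it to whatever
    your flash-attn build and GPU actually support.
    """
    if head_dim <= max_fa2_head_dim:
        return "flash_attention_2"
    return "sdpa"


print(pick_attn_implementation(128))  # Qwen head dimension from the post
print(pick_attn_implementation(512))  # Gemma global-attention head dimension from the post
```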

Companion model: same recipe, different substrate

To check whether the result is driven by the recipe or by something substrate-specific, we trained a sister model, Gemma4Defense-2B, with exactly the same training corpus and hyperparameters, swapping only the base model to Gemma-4-E2B-it.

Model	CTI-RCM (5-trial mean ± std)	CTI-MCQ
CyberSecQwen-4B (Qwen base)	0.6664 ± 0.0023	0.5868 ± 0.0029
Gemma4Defense-2B (Gemma base)	0.6754 ± 0.0035	0.6042 ± 0.0090

The two models converge to within 0.9 points on CTI-RCM. The recipe transfers: what matters is how you fine-tune the IT checkpoint, not which family it belongs to. CyberSecQwen-4B is Apache 2.0 and is the right choice when Gemma's terms of use are a problem; Gemma4Defense-2B is the right choice when 2B fits the deployment budget better than 4B.

Challenges and fixes

No AMD ROCm project is complete without a war-stories section. Here is our short version:

Issue	Fix
FA2 fails on Gemma-4 because `head_dim=512`	Fall back to sdpa for global-attention layers. Local-attention layers still use FA2. About 1.6× slower than Qwen under the same recipe.
AITER kernels conflict with CyberPal-2.0-20B serving	Set `VLLM_ROCM_USE_AITER=0` for that specific eval. The AMD env var is a no-op outside ROCm, so it stays in the recipe.
bitsandbytes is not officially supported on ROCm	We did not need 4-/8-bit anyway; 192 GB leaves plenty of headroom. We use `paged_adamw_8bit` (bnb's optimizer-only path works).
vLLM ROCm + chat template for evaluation	Use the `TRITON_ATTN` backend; explicitly pass the `chat_template.jinja` from the merged model dir so the IT base's template does not override it.
HF-Spaces ZeroGPU quota for the demo	Anonymous visitors hit a cap of 2 min per IP per day. The demo Space (cybersecqwen-chat) uses HF OAuth on the client side, so each visitor's calls count against their own quota (3.5 min/day free, 25 min/day Pro).
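For the AITER row, the toggle is just an environment variable, so it can be scoped to the one evaluation that needs it rather than set globally. A minimal sketch, assuming the eval is launched as a subprocess; the helper name is hypothetical and the launch command itself is left out because it lives in the repo.

```python
import os


def eval_env(disable_aiter: bool, base_env=None) -> dict:
    """Return a copy of the environment, optionally with AITER kernels switched off."""
    env = dict(os.environ if base_env is None else base_env)
    if disable_aiter:
        # Variable name taken from the table above; the value must be a string.
        env["VLLM_ROCM_USE_AITER"] = "0"
    return env


# Pass the result as env=... to subprocess.run for the affected eval only.
print(eval_env(True, base_env={}).get("VLLM_ROCM_USE_AITER"))
```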

Try it yourself

Live demo (sign in with HF for free quota): 👉 https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/cybersecqwen-chat

Model: 👉 https://huggingface.co/lablab-ai-amd-developer-hackathon/CyberSecQwen-4B

Three-line inference (any 12 GB+ GPU):

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "lablab-ai-amd-developer-hackathon/CyberSecQwen-4B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a defensive cybersecurity assistant. Answer with the canonical CWE-ID first, then 1-3 sentences of justification."},
    {"role": "user", "content": "Path traversal in a Java web app where user-controlled input concatenates into a File() path. What's the CWE?"},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(prompt, return_tensors="pt").to(model.device)
# do_sample=True is set explicitly so that temperature=0.3 takes effect
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, not the echoed prompt
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

For high-throughput serving, vLLM works out of the box on AMD MI300X via the official vllm/vllm-openai-rocm image. See the GitHub repo for the exact serving command and pinned config.
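Once a vLLM server is up, it exposes an OpenAI-compatible chat endpoint, so a client needs nothing beyond the standard library. The sketch below assumes the server listens on localhost port 8000 and serves the model under its Hugging Face repo id; both are assumptions about your deployment, and the helper names are illustrative.

```python
import json
import urllib.request

MODEL_ID = "lablab-ai-amd-developer-hackathon/CyberSecQwen-4B"
SYSTEM_PROMPT = (
    "You are a defensive cybersecurity assistant. Answer with the canonical "
    "CWE-ID first, then 1-3 sentences of justification."
)


def build_chat_request(question: str,
                       base_url: str = "http://localhost:8000",  # assumed address
                       model: str = MODEL_ID) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/chat/completions route."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        "temperature": 0.3,
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, **kwargs) -> str:
    """Send the question to a running server and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(question, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `ask("Which CWE covers SQL injection?")` requires a live server; `build_chat_request` can be inspected offline to check the payload before anything is sent.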

Intended use

CyberSecQwen-4B is aimed at security practitioners doing the following work:

CWE classification: mapping vulnerability descriptions (CVEs, advisories) to MITRE CWE categories
CTI Q&A: answering structured questions about cybersecurity concepts, attacks, and controls
Defensive triage assistance: supporting human analysts who triage CVEs, prioritize patches, and document threat-actor behavior

It is explicitly not intended for: generating exploit code or weaponized PoCs; automating security decisions without qualified human review; legal/medical/regulated-advice contexts; or general chat / code generation outside cybersecurity. The recipe is built for narrow utility, not breadth.

Next steps

There are several directions we want to extend, roughly in priority order:

1B variant for laptop-class deployment. Qwen2.5-1.5B or Llama-3.2-1B as the base, same recipe, targeting ≥0.55 CTI-RCM (within 6 pp of the 4B).
Quantized GGUF release (Q4_K_M, Q5_K_M) so the model can run on phones / edge boxes. Q4_K_M comes to roughly 2.5 GB, well within ARM laptop memory.
Continual evaluation as new CVE-to-CWE mappings are published. The 2021 cohort is a deliberate distribution cap; future versions will track NVD's growth.
Adversarial-example resilience. A specialist model is only as good as its worst case. We want to ship a hardening pass that targets the prompt-injection patterns common in CVE-description-as-input attacks.

If any of these would unblock your team, open an issue on the GitHub repo. That is the fastest way to move them up.
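As a rough sanity check on the memory figures in this post (the 12 GB card target and the ~2.5 GB Q4_K_M estimate), weight size is approximately parameter count times bits per weight. The sketch below uses a round 4e9 parameters and assumes about 4.85 bits per weight for Q4_K_M; that average is an assumption, since the effective rate varies by model, and the estimate covers weights only, not the KV cache or activations.

```python
def weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes), one decimal place."""
    return round(n_params * bits_per_weight / 8 / 1e9, 1)


N_PARAMS = 4e9              # round figure for a 4B model
BF16_BITS = 16
Q4_K_M_BITS = 4.85          # assumed average, not an exact figure

print("bf16 weights  :", weight_gb(N_PARAMS, BF16_BITS), "GB")
print("Q4_K_M weights:", weight_gb(N_PARAMS, Q4_K_M_BITS), "GB")
```

Under these assumptions the bf16 weights leave a few gigabytes of headroom on a 12 GB card, and the quantized estimate lands close to the ~2.5 GB quoted above.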

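Closing thoughts

For the past two years, the frontier-model conversation has been about scale. The defensive cyber conversation should be about what fits where you actually need it to be. A 4B specialist that matches an 8B at half the size, runs on a card a researcher can afford, and never sends sensitive evidence off-site is a useful corner of the design space, and the AMD MI300X + ROCm 7 + Hugging Face training stack let us get there in a single training run.

Try the demo, read the model card, file issues. If this recipe transfers to something we have not tried yet, that would be the most interesting next data point.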

— athena129 · AMD Developer Hackathon submission
