Run DeepSeek in Your Browser with Transformers.js!

I've been using transformers for a while but had never tried transformers.js, so I took it for a spin with the currently popular DeepSeek.
transformers.js
transformers.js lets you run models directly in the browser, with no server to set up. HuggingFace provides the transformers library for Python; transformers.js is its JavaScript counterpart.
Model Selection
Under the hood, transformers.js uses ONNX Runtime to run models in the browser, so the model we load must be in ONNX format. To find models that work with transformers.js, filter by transformers.js in the Libraries column on the left of the Hugging Face model hub; every model then listed on the right supports transformers.js.
https://huggingface.co/models?library=transformers.js

Loading the Model
First we create a pipeline. Its first argument is the task; the supported tasks are listed at https://huggingface.co/docs/transformers.js/pipelines#available-tasks
Since we want to do text generation, we pick text-generation.
For the model, we use DeepSeek's onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX.
Since onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX provides several quantized variants, we choose the q4f16 dtype.
For the compute device we use webgpu, both for speed and because wasm was problematic in my tests anyway.
import { pipeline } from "https://cdn.jsdelivr.net/npm/@huggingface/transformers@latest";
const generator = await pipeline("text-generation", "onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX",
  { dtype: "q4f16", device: "webgpu" });
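Since WebGPU is not available in every browser, it can be worth falling back to wasm when `navigator.gpu` is missing. A minimal sketch of that check; the `pickDevice` helper is my own, not part of transformers.js:

```javascript
// Hypothetical helper: choose a transformers.js device based on WebGPU support.
// `nav` is the browser's `navigator` object, passed in so the logic is testable.
function pickDevice(nav) {
  // WebGPU exposes itself as `navigator.gpu`; fall back to wasm otherwise.
  return nav && nav.gpu ? "webgpu" : "wasm";
}
```

In the browser you would call `pickDevice(navigator)` and pass the result as the `device` option to `pipeline`.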
Inference
With initialization and model loading done, we can prepare the input and run inference:
- First, build the prompt (messages) to feed to the model.
- Then use TextStreamer to stream the output piece by piece (callback_function).
import { TextStreamer } from "https://cdn.jsdelivr.net/npm/@huggingface/transformers@latest";
const messages = [
{ role: "user", content: "Solve the equation: x^2 - 3x + 2 = 0" },
];
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true,
  callback_function: (text) => { /* handle each streamed chunk here */ }
});
const output = await generator(messages, { max_new_tokens: 512, do_sample: false, streamer });
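Besides streaming, the awaited pipeline call also resolves with the full result. As I understand the transformers.js text-generation output for chat-style input, each result's generated_text is the input message list extended with the assistant's reply, so the final answer can be pulled out roughly like this (the lastAssistantReply helper is my own):

```javascript
// Hypothetical helper: extract the assistant's reply from a text-generation result.
// Assumes the chat-style output shape:
//   [{ generated_text: [ ...input messages, { role: "assistant", content: "..." } ] }]
function lastAssistantReply(output) {
  const messages = output[0].generated_text;
  return messages[messages.length - 1].content;
}
```

With the snippet above, `lastAssistantReply(output)` would give the generated answer as a string.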
Marked + KaTeX
The model's output is Markdown, so I additionally run it through marked to convert the Markdown into HTML. The LaTeX math is handled by KaTeX; besides the necessary libraries, remember to include its CSS as well.
Usage:
- First use KaTeX's renderMathInElement to render the math inside the element.
- Then use marked.parse to convert the string into HTML elements.
<!-- markdown -->
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
<!-- katex -->
<script src="https://cdn.jsdelivr.net/npm/katex@latest/dist/katex.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/katex@latest/dist/contrib/auto-render.min.js"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@latest/dist/katex.min.css" crossorigin="anonymous">
<script>
  renderMathInElement(result_ele, {
    delimiters: [
      { left: '$$', right: '$$', display: true },
      { left: '$', right: '$', display: false },
      { left: '\\(', right: '\\)', display: false },
      { left: '\\[', right: '\\]', display: true }
    ],
    throwOnError: false
  });
  result_ele.innerHTML = marked.parse(result_ele.innerHTML);
</script>
Here, result_ele is a div element.
Demo
Sample Code:
<!DOCTYPE html>
<html>
<head>
  <title>Example</title>
  <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
  <script src="https://cdn.jsdelivr.net/npm/katex@latest/dist/katex.min.js"></script>
  <script src="https://cdn.jsdelivr.net/npm/katex@latest/dist/contrib/auto-render.min.js"></script>
  <script src="https://cdn.jsdelivr.net/npm/dompurify@latest/dist/purify.min.js"></script>
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@latest/dist/katex.min.css" crossorigin="anonymous">
  <style>
    #prompt_textarea {
      width: 100%;
      height: 150px;
      padding: 12px 16px;
      box-sizing: border-box;
      border: 2px solid #ccc;
      border-radius: 4px;
      font-size: 16px;
      resize: none;
      margin-top: 10px;
    }
    .model_btn {
      border-radius: 20px;
      padding: 4px 8px;
      border: 2px solid #ccc;
    }
  </style>
  <script type="module">
    import { pipeline, TextStreamer } from "https://cdn.jsdelivr.net/npm/@huggingface/transformers@latest";

    let generator = undefined;
    let cached_output = "";
    const status_ele = document.getElementById("status");
    const init_btn_ele = document.getElementById("init_btn");
    const infer_btn_ele = document.getElementById("infer_btn");
    const result_ele = document.getElementById("result");
    const prompt_ele = document.getElementById("prompt_textarea");

    async function initial_handler() {
      status_ele.textContent = "Initializing...";
      init_btn_ele.disabled = true;
      await initial("text-generation",
        "onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX",
        { dtype: "q4f16", device: "webgpu" });
      status_ele.textContent = "Ready";
      infer_btn_ele.disabled = false;
    }

    async function infer_handler() {
      status_ele.textContent = "Inferring...";
      result_ele.textContent = "";
      infer_btn_ele.disabled = true;
      cached_output = "";
      await generate(prompt_ele.value);
      status_ele.textContent = "Ready";
      infer_btn_ele.disabled = false;
    }

    async function initial(task, model_name, config) {
      generator = await pipeline(task, model_name, config);
    }

    async function generate(prompt) {
      if (generator === undefined) {
        return;
      }
      // Define the list of messages
      const messages = [
        { role: "user", content: prompt },
      ];
      // Create text streamer
      const streamer = new TextStreamer(generator.tokenizer, {
        skip_prompt: true,
        callback_function: (text) => {
          cached_output += text;
          cached_output = DOMPurify.sanitize(cached_output);
          result_ele.innerHTML = cached_output;
          renderMathInElement(result_ele, {
            delimiters: [
              { left: '$$', right: '$$', display: true },
              { left: '$', right: '$', display: false },
              { left: '\\(', right: '\\)', display: false },
              { left: '\\[', right: '\\]', display: true }
            ],
            throwOnError: false
          });
          result_ele.innerHTML = marked.parse(result_ele.innerHTML);
        },
      });
      return generator(messages, { max_new_tokens: 1024, do_sample: false, streamer });
    }

    init_btn_ele.addEventListener("click", initial_handler);
    infer_btn_ele.addEventListener("click", infer_handler);
  </script>
</head>
<body>
  <h1 id="status">Uninitialized</h1>
  <button id="init_btn" class="model_btn">Initial</button>
  <button id="infer_btn" class="model_btn" disabled>Start</button>
  <div>
    <textarea id="prompt_textarea" rows="4" cols="50">Solve the equation: x^2 - 3x + 2 = 0</textarea>
  </div>
  <p id="result"></p>
</body>
</html>
- Model: onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX
- Quantization: q4f16
- Max Token Length: 1024
- Device: webgpu
Usage notes:
- Press the Initial button to start loading the model; the first run takes longer because the model has to be downloaded.
- Type your prompt into the text box and press Start to begin generating output.
Reference
- https://yucj.gitbooks.io/ecmascript-6/content/docs/module-loader.html
- https://github.com/markedjs/marked
- https://github.com/KaTeX/KaTeX
- https://github.com/cure53/DOMPurify
- https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.TextGenerationPipeline
- https://huggingface.co/onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX
If you found this article useful, consider sponsoring a drink for the big cat.