Contents
1. Introduction
1.1 CTranslate2
1.2 Intel MKL
1.3 cuDNN
1.4 Transformer
2. Download and installation
2.1 Command line
2.2 Code
3. Model download
3.1 Online test
3.1.1 tiny
3.1.2 large-v2
3.2 Offline test
3.2.1 tiny
3.2.2 large-v2
Conclusion
1. Introduction
https://github.com/SYSTRAN/faster-whisper
https://pypi.org/project/faster-whisper/

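faster-whisper is a third-party project that appeared after Whisper was open-sourced. It does not change the Whisper model itself; it reimplements OpenAI's Whisper inference on top of CTranslate2, a fast inference engine for Transformer models.

According to the project, this implementation is up to 4 times faster than openai/whisper at the same accuracy while using less memory. Efficiency can be improved further with 8-bit quantization on both CPU and GPU.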

1.1 CTranslate2
https://github.com/OpenNMT/CTranslate2/
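CTranslate2 is a C++ and Python library for efficient inference with Transformer models.
The project implements a custom runtime that applies many performance optimization techniques, such as weight quantization, layer fusion, and batch reordering, to speed up Transformer models and reduce their memory usage on CPU and GPU.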
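To make the idea of weight quantization concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is an illustration only and is not CTranslate2's actual implementation; the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero for an all-zero weight vector
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [int(round(w / scale)) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 0.0, 0.49]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
# Rounding moves each value by at most half a quantization step
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(quantized, max_error)
```

Storing one byte per weight instead of four is where the memory saving comes from; the price is the small rounding error measured at the end of the sketch.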

CTranslate2 can be installed with pip:

pip install ctranslate2
1.2 Intel MKL
Get the Intel® oneAPI Math Kernel Library (oneMKL):

https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html
https://www.intel.cn/content/www/cn/zh/developer/tools/oneapi/onemkl-download.html

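At the core of CTranslate2 is a highly optimized implementation of model inference. It supports several hardware platforms, including CPU and GPU, and relies on low-level parallel computing libraries such as Intel MKL or cuDNN to maximize performance.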

1.3 cuDNN
https://developer.nvidia.com/cudnn
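The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations of standard routines such as forward and backward convolution, attention, matmul, pooling, and normalization.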

1.4 Transformer
https://arxiv.org/pdf/1706.03762
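Natural language processing currently has three main kinds of feature extractor: convolutional neural networks, recurrent neural networks, and the more recent Transformer. The Transformer has overtaken its two predecessors. It drops convolution and recurrence, and the whole network is built on attention. More precisely, a Transformer consists only of self-attention and feed-forward layers.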
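As a rough illustration of the self-attention step, the sketch below computes scaled dot-product attention with plain Python lists. It is a simplified assumption-laden version: it leaves out the learned query, key, and value projections, multiple heads, and the feed-forward layers, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    width = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors
        mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(width)]
        outputs.append(mixed)
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
print(attended)
```

Each output row is a weighted mix of all input rows, which is how every position can draw on every other position without recurrence or convolution.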

2. Download and installation
2.1 Command line
https://github.com/Purfview/whisper-standalone-win
Standalone executables of Whisper and Faster-Whisper, for those who would rather not deal with Python.

After extracting the archive, test it as follows:

whisper-faster.exe "D:\videofile.mkv" --language English --model medium --output_dir source
whisper-faster.exe "D:\videofile.mkv" -l English -m medium -o source --sentence
whisper-faster.exe "D:\videofile.mkv" -l Japanese -m medium --task translate --standard
whisper-faster.exe --help

faster-whisper-xxl.exe --language zh --model "large-v2" --compute_type=int8 --sentence -prompt auto --beep_off --print_progress --vad_alt_method pyannote_v3 --ff_mdx_kim2 --mdx_device cpu "yxy_audio.mp3"
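If the standalone executable needs to be driven from a script, one option is to call it through Python's `subprocess` module. The sketch below only uses the long-form flags (written with two hyphens) from the first command; it assumes `whisper-faster.exe` is on the PATH or in the working directory, and the helper names `build_command` and `run_transcription` are hypothetical.

```python
import subprocess


def build_command(media_path, language, model, output_dir, exe="whisper-faster.exe"):
    """Assemble the argument list for one transcription run."""
    return [
        exe,
        media_path,
        "--language", language,
        "--model", model,
        "--output_dir", output_dir,
    ]


def run_transcription(media_path, language="English", model="medium", output_dir="source"):
    """Run the executable and raise CalledProcessError on a non-zero exit code."""
    command = build_command(media_path, language, model, output_dir)
    return subprocess.run(command, check=True)


print(build_command(r"D:\videofile.mkv", "English", "medium", "source"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.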

2.2 Code
Install from PyPI:

pip install faster-whisper
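A quick way to confirm that the installation worked is to query the installed package versions. This is a minimal sketch using the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("faster-whisper", "ctranslate2"):
    print(name, installed_version(name) or "not installed")
```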

3. Model download
3.1 Online test
3.1.1 tiny
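from faster_whisper import WhisperModel

# Passing a model name downloads the model on first use
model = WhisperModel("tiny")

segments, info = model.transcribe("yxy_audio.mp3")
# Print each segment with its start and end time in seconds
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))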
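In faster-whisper, `segments` is a generator: transcription only runs while it is being iterated, and it can be consumed only once. If the text is needed more than once, collect it into a list first. The sketch below shows the pattern with a stand-in `FakeSegment` type (hypothetical, used so the example runs without a model); real segments expose the same `start`, `end`, and `text` attributes used in the loop above.

```python
from collections import namedtuple

# Stand-in with the same three attributes the transcription loop reads
FakeSegment = namedtuple("FakeSegment", ["start", "end", "text"])


def format_segment(segment):
    """Format one segment as "[start -> end] text"."""
    return "[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text)


def collect_lines(segments):
    """Consume the segment iterable once and return the formatted lines."""
    return [format_segment(segment) for segment in segments]


def fake_segments():
    # A generator, so nothing is produced until it is iterated
    yield FakeSegment(0.0, 2.5, "hello")
    yield FakeSegment(2.5, 4.0, "world")


lines = collect_lines(fake_segments())
print("\n".join(lines))
```

With a real model, `collect_lines(segments)` is the point where the transcription work actually happens.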

3.1.2 large-v2
from faster_whisper import WhisperModel

# Same test with the larger large-v2 model
model = WhisperModel("large-v2")

segments, info = model.transcribe("yxy_audio.mp3")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

3.2 Offline test
large-v3 model: https://huggingface.co/Systran/faster-whisper-large-v3/tree/main
large-v2 model: https://huggingface.co/guillaumekln/faster-whisper-large-v2/tree/main
large-v1 model: https://huggingface.co/guillaumekln/faster-whisper-large-v1/tree/main
medium model: https://huggingface.co/guillaumekln/faster-whisper-medium/tree/main
small model: https://huggingface.co/guillaumekln/faster-whisper-small/tree/main
base model: https://huggingface.co/guillaumekln/faster-whisper-base/tree/main
tiny model: https://huggingface.co/guillaumekln/faster-whisper-tiny/tree/main

or
https://aifasthub.com/models/guillaumekln
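After downloading a model folder by hand, it can help to check that the main files arrived before pointing faster-whisper at it. The sketch below is a hedged example: the file list in `REQUIRED_FILES` is an assumed minimum (compare it with the file listing of the repository you downloaded from), and the folder name `tiny` and the helper `missing_model_files` are hypothetical.

```python
from pathlib import Path

# Assumed minimum set of files; the repository listing is the authority
REQUIRED_FILES = ("model.bin", "config.json")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir."""
    folder = Path(model_dir)
    return [name for name in required if not (folder / name).is_file()]


missing = missing_model_files("tiny")
print("missing files:", missing or "none")
```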

3.2.1 tiny
Here the tiny model has been downloaded to a local folder:

from faster_whisper import WhisperModel

model_size = "tiny"
path = r"C:\Users\tomcat\Desktop\tiny"

# Load the model from the local folder and run on CPU
model = WhisperModel(model_size_or_path=path, device="cpu", local_files_only=True)

# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")

# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("yxy_audio2.mp3", beam_size=5, language="zh", vad_filter=True, vad_parameters=dict(min_silence_duration_ms=1000))

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))


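local_files_only=True loads the model from local files only
model_size_or_path=path sets the path of the model to load
device="cuda" selects the device, cuda or cpu
compute_type="int8_float16" uses 8-bit quantization (INT8 weights with FP16 computation)
language="zh" sets the language of the audio
vad_filter=True enables voice activity detection (VAD)
vad_parameters=dict(min_silence_duration_ms=1000) sets the VAD parameters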
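The parameters above can be grouped into two small helpers so that the CPU and GPU variants differ in one place only. This is a sketch under assumptions: the helper names `model_options`, `transcribe_options`, and `transcribe_file` are hypothetical, and the device and compute type pairs are the ones used in this article rather than the only valid combinations.

```python
def model_options(model_dir, use_gpu):
    """Keyword arguments for WhisperModel when loading a local model folder."""
    return {
        "model_size_or_path": model_dir,
        "device": "cuda" if use_gpu else "cpu",
        # INT8 weights; on GPU the remaining computation runs in FP16
        "compute_type": "int8_float16" if use_gpu else "int8",
        "local_files_only": True,
    }


def transcribe_options(language="zh", min_silence_ms=1000):
    """Keyword arguments for model.transcribe with VAD enabled."""
    return {
        "language": language,
        "vad_filter": True,
        "vad_parameters": {"min_silence_duration_ms": min_silence_ms},
    }


def transcribe_file(audio_path, model_dir, use_gpu=False):
    # Imported here so the helpers above work without faster-whisper installed
    from faster_whisper import WhisperModel

    model = WhisperModel(**model_options(model_dir, use_gpu))
    segments, info = model.transcribe(audio_path, **transcribe_options())
    return [(segment.start, segment.end, segment.text) for segment in segments]


print(model_options("tiny", use_gpu=False))
```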

3.2.2 large-v2
Here the large-v2 model has been downloaded to a local folder:

from faster_whisper import WhisperModel

model_size = "large-v2"
path = r"C:\Users\tomcat\Desktop\large-v2"

# Load the model from the local folder and run on CPU
model = WhisperModel(model_size_or_path=path, device="cpu", local_files_only=True)

# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")

# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("yxy_audio2.mp3", beam_size=5, language="zh", vad_filter=True, vad_parameters=dict(min_silence_duration_ms=1000))

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Download cuBLAS and cuDNN:

https://github.com/Purfview/whisper-standalone-win/releases/tag/libs
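These libraries are needed for GPU execution. Before passing device="cuda", it can be useful to ask CTranslate2 whether it sees a GPU at all. The sketch below uses `ctranslate2.get_cuda_device_count()` and falls back to CPU on any failure; the helper names are hypothetical, and a non-zero count does not by itself guarantee that cuBLAS and cuDNN will load at inference time.

```python
def cuda_device_count():
    """Number of CUDA devices visible to CTranslate2, or 0 if it cannot be queried."""
    try:
        import ctranslate2
        return ctranslate2.get_cuda_device_count()
    except Exception:
        # ctranslate2 is missing or the CUDA runtime could not be queried
        return 0


def pick_device():
    """Choose "cuda" when at least one GPU is visible, otherwise "cpu"."""
    return "cuda" if cuda_device_count() > 0 else "cpu"


print(pick_device())
```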

————————————————
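Copyright notice: this is an original article by CSDN blogger 「爱看书的小沐」, licensed under CC 4.0 BY-SA. When reposting, include a link to the original and this notice.
Original link: https://blog.csdn.net/hhy321/article/details/139895906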
