coqui-ai/TTS

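I. Check and prepare the environment

# At the time of writing, TTS works with Python up to 3.10; on newer versions some dependencies may fail to install
python3 --version
# Create and activate a virtual environment
python3 -m venv tts_env
source tts_env/bin/activate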
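The version check can also be scripted so that an unsuitable interpreter is caught before any dependencies are installed. The sketch below is only an illustration: the upper bound of 3.10 comes from the note in this section, while the lower bound of 3.9 and the function name `is_supported` are assumptions.

```python
import sys

# Upper bound taken from the note in this section; the lower bound is an
# assumption and should be checked against the project's own requirements.
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if (major, minor) lies within the assumed supported range."""
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if is_supported():
        print(f"Python {current} is within the assumed supported range")
    else:
        print(f"Python {current} may fail to install the TTS dependencies")
```

Running it with the interpreter that will create the virtual environment is enough. If it reports an unsupported version, point `python3 -m venv` at a different interpreter.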

II. Install coqui-ai/TTS from source

# Clone the repository
git clone https://github.com/coqui-ai/TTS.git
cd TTS
# Install the dependencies
pip install -r requirements.txt
# Editable install; this also creates the tts CLI tool
pip install -e .

III. Run the project

# Check that the installation succeeded
tts --list_models

Getting started (Chinese model)

1. Use a built-in voice

tts --model_name "tts_models/multilingual/multi-dataset/xtts_v2" \
    --text "你好，这是一段中文语音测试。" \
    --language_idx zh-cn \
    --speaker_idx "Asya Anara" \
    --out_path output.wav

2. Use your own voice sample

# Single sample
tts --model_name "tts_models/multilingual/multi-dataset/xtts_v2" \
    --text "你好，这是一段中文语音测试。" \
    --language_idx zh-cn \
    --speaker_wav sample.wav \
    --out_path output.wav

# Multiple samples
tts --model_name "tts_models/multilingual/multi-dataset/xtts_v2" \
    --text "你好，这是一段中文语音测试。" \
    --language_idx zh-cn \
    --speaker_wav ./sample/*.wav \
    --out_path output.wav
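
When synthesis is driven from a script rather than typed by hand, the same command can be assembled as an argument list. This is a minimal sketch that uses only the flags shown in the commands above. The helper names `build_tts_command` and `synthesize` and the sample folder layout are hypothetical.

```python
import glob
import subprocess

MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"


def build_tts_command(text, speaker_wavs, out_path, language="zh-cn"):
    """Build the argument list for the tts CLI from the flags used above."""
    if not speaker_wavs:
        raise ValueError("at least one reference WAV file is required")
    return [
        "tts",
        "--model_name", MODEL_NAME,
        "--text", text,
        "--language_idx", language,
        "--speaker_wav", *speaker_wavs,
        "--out_path", out_path,
    ]


def synthesize(text, sample_dir, out_path):
    """Collect every WAV in sample_dir (a hypothetical folder) and run tts."""
    wavs = sorted(glob.glob(f"{sample_dir}/*.wav"))
    subprocess.run(build_tts_command(text, wavs, out_path), check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with the text argument, and expanding the glob in Python makes an empty sample folder fail early with a clear error.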

Common commands

How to check the available models and languages

1. List all model names: tts --list_models

2. List the language IDs (language_idx) supported by xtts_v2: tts --model_name "tts_models/multilingual/multi-dataset/xtts_v2" --list_language_idxs

You should see output similar to the following, where zh-cn is the language ID for Chinese:

0: en (English)
1: zh-cn (Simplified Chinese)
2: es (Spanish)
...

3. List the speakers (built-in voices) supported by the model: tts --model_name "tts_models/multilingual/multi-dataset/xtts_v2" --list_speaker_idxs

Audio conversion

Use ffmpeg to convert audio or video files to WAV:

ffmpeg -i input.mp3 output.wav
ffmpeg -i input.mp4 output.wav
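
For more than a couple of files, the conversion can be wrapped in a small helper. The sketch below assumes ffmpeg is on the PATH. The helper names and the output folder are hypothetical, and `-y` (overwrite the output) and `-vn` (drop any video stream) are optional additions to the plain commands above.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst_dir="."):
    """Return the ffmpeg argument list that converts src to a WAV in dst_dir."""
    src = Path(src)
    dst = Path(dst_dir) / (src.stem + ".wav")
    # -y overwrites an existing output file; -vn drops any video stream.
    return ["ffmpeg", "-y", "-i", str(src), "-vn", str(dst)]


def convert_all(paths, dst_dir="."):
    """Convert each input file in turn, stopping on the first ffmpeg error."""
    for path in paths:
        subprocess.run(build_ffmpeg_command(path, dst_dir), check=True)
```

For example, `convert_all(["input.mp3", "input.mp4"], "sample")` would write `sample/input.wav` twice, the second overwriting the first, so inputs that share a stem should go to separate folders.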

Problems encountered

1. PyTorch changed a default: in PyTorch 2.6, the default value of weights_only in torch.load changed to True, so some model checkpoints no longer load.

2. Safety restriction on the custom class XttsConfig: PyTorch encounters the XttsConfig class stored in the checkpoint and blocks it because it has not been allowlisted as a safe global.

The PyTorch error message offers two options:

(1) In PyTorch 2.6, we changed the default value of the weights_only argument in torch.load from False to True. Re-running torch.load with weights_only set to False will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.

(2) Alternatively, to load with weights_only=True please check the recommended steps in the following error message.

Solution:

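# Open xtts.py and find the following line:
checkpoint = load_fsspec(model_path, map_location=torch.device("cpu"))["model"]
# Replace it with the following (weights_only=False can execute arbitrary code, so only load checkpoints from a trusted source):
checkpoint = torch.load(model_path, map_location=torch.device("cpu"), weights_only=False)["model"]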
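
Option (2) from the error message keeps weights_only=True and instead allowlists the classes stored in the checkpoint. The following is a hedged sketch of that route using torch.serialization.add_safe_globals. The module paths and the exact set of classes are assumptions about the coqui-ai/TTS source tree, and the error message may name further classes that need to be added. The function name `allow_xtts_globals` is hypothetical.

```python
def allow_xtts_globals():
    """Try to allowlist the XTTS config classes for torch.load(weights_only=True).

    Returns True if the classes were registered, and False if torch or TTS
    cannot be imported or this torch version lacks add_safe_globals.
    """
    try:
        import torch
        # Assumed module paths in coqui-ai/TTS; adjust if your checkout differs.
        from TTS.config.shared_configs import BaseDatasetConfig
        from TTS.tts.configs.xtts_config import XttsConfig
        from TTS.tts.models.xtts import XttsArgs, XttsAudioConfig
    except ImportError:
        return False
    if not hasattr(torch.serialization, "add_safe_globals"):
        return False
    torch.serialization.add_safe_globals(
        [XttsConfig, XttsAudioConfig, XttsArgs, BaseDatasetConfig]
    )
    return True
```

This only helps when it is called in the same Python process before the model is loaded, for instance at the top of your own script that uses TTS. It does not change the behavior of an unmodified tts command.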