CMUSphinx-4: A Speech Recognition Framework for Java

2021-12-15

Introduction
Setup steps
1. Create a Maven project
2. Download the Chinese speech recognition model
3. Write test code
4. FAQ

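Introduction

CMUSphinx-4 is the latest Java version of Carnegie Mellon University's open-source speech recognition system. The project is available on its official website, Gitee, and GitHub.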

Setup steps

1. Create a Maven project

Add the dependency and the snapshot repository to pom.xml:

<dependencies>
    <dependency>
        <groupId>edu.cmu.sphinx</groupId>
        <artifactId>sphinx4-core</artifactId>
        <version>5prealpha-SNAPSHOT</version>
    </dependency>
</dependencies>
<repositories>
    <repository>
        <id>snapshots-repo</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>

2. Download the Chinese speech recognition model

Download: cmusphinx-zh-cn-5.2.tar.gz
Extract the downloaded archive into the resources directory of the Maven project.
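
Before running the recognizer, it can help to confirm that the extracted model files are actually visible on the classpath. The following is a minimal sketch, not part of Sphinx-4: the class name ModelResourceCheck and its method are hypothetical, and the paths assume the archive was extracted as cmusphinx-zh-cn-5.2 directly under resources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a Sphinx-4 class): reports model files
// that cannot be found on the classpath.
final class ModelResourceCheck {

    /** Returns the entries of paths that the class loader cannot locate. */
    static List<String> missing(ClassLoader loader, List<String> paths) {
        List<String> notFound = new ArrayList<>();
        for (String path : paths) {
            // ClassLoader paths are relative to the classpath root (no leading slash).
            if (loader.getResource(path) == null) {
                notFound.add(path);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // Assumed layout: resources/cmusphinx-zh-cn-5.2/...
        List<String> expected = Arrays.asList(
                "cmusphinx-zh-cn-5.2/zh_cn.dic",
                "cmusphinx-zh-cn-5.2/zh_cn.lm.bin");
        List<String> notFound =
                missing(ModelResourceCheck.class.getClassLoader(), expected);
        System.out.println(notFound.isEmpty()
                ? "All model files found"
                : "Missing model files: " + notFound);
    }
}
```

If a file is reported missing, the usual cause is an extra directory level created when unpacking the archive.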

3. Write test code

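import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.SpeechResult;
import edu.cmu.sphinx.api.StreamSpeechRecognizer;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SphinxDemo {
    public static void main(String[] args) throws IOException {
        Configuration configuration = new Configuration();
        configuration.setAcousticModelPath("resource:/cmusphinx-zh-cn-5.2/zh_cn.cd_cont_5000");
        configuration.setDictionaryPath("resource:/cmusphinx-zh-cn-5.2/zh_cn.dic");
        configuration.setLanguageModelPath("resource:/cmusphinx-zh-cn-5.2/zh_cn.lm.bin");
        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
        // Sphinx-4 reads 16 kHz, 16-bit, mono PCM; convert MP3 files to WAV first.
        try (InputStream stream = new FileInputStream("E:/zq/1.wav")) {
            recognizer.startRecognition(stream);
            SpeechResult result;
            // getResult() returns null once the stream is exhausted.
            while ((result = recognizer.getResult()) != null) {
                System.out.format("Recognized: %s\n", result.getHypothesis());
            }
            recognizer.stopRecognition();
        }
    }
}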
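
Sphinx-4's default front end expects 16 kHz, 16-bit, mono, little-endian PCM audio (typically a WAV file), so compressed input such as MP3 generally has to be converted first. The sketch below uses only the JDK's javax.sound.sampled package to check a file before it is passed to the recognizer. The class name SphinxAudioCheck and its methods are hypothetical helpers, not part of Sphinx-4.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper (not a Sphinx-4 class): checks whether audio
// matches the format the default Sphinx-4 models are trained for.
final class SphinxAudioCheck {

    /** True for signed 16-bit, 16 kHz, mono, little-endian PCM. */
    static boolean isSupported(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && format.getSampleRate() == 16000f
                && format.getSampleSizeInBits() == 16
                && format.getChannels() == 1
                && !format.isBigEndian();
    }

    /** Reads the file header and checks its format; MP3 input is rejected. */
    static boolean isSupported(File file) throws IOException {
        try {
            return isSupported(AudioSystem.getAudioFileFormat(file).getFormat());
        } catch (UnsupportedAudioFileException e) {
            // The JDK has no built-in MP3 reader, so MP3 files end up here.
            return false;
        }
    }
}
```

Calling SphinxAudioCheck.isSupported(new File(path)) before startRecognition makes a format mismatch visible immediately instead of showing up as empty or nonsensical recognition results.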

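4. FAQ

Q: Some audio files fail with an array index out-of-bounds error.
A: Modify the official source code. At line 260 of the class edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstLookaheadSearchManager, increase the array size: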

// TODO more precise range of baseIds, remove magic number
float[] frameCiScores = new float[100];

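Q: Recognition accuracy for Chinese is low.
A: Train your own small-domain language model. Create a lang.txt file containing your Chinese vocabulary, one word per line. Upload lang.txt to the website to generate a dictionary and a language model, then download and extract the result. The two files you need are the .dic and .lm files. Edit the .dic file to add the pronunciation of each word, following the format of the official Chinese dictionary file zh_cn.dic. Finally, change the Java code so that the dictionary and language model paths point to the two new files.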
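
Adding pronunciations by hand makes it easy to overlook a word. As a rough sketch, the helper below lists the vocabulary words that still have no pronunciation in the .dic file. It assumes each dictionary line has the form "word phone1 phone2 ..." separated by whitespace, as in zh_cn.dic. The class name DictionaryCheck is hypothetical and not part of Sphinx-4.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a Sphinx-4 class): finds vocabulary words
// that have no pronunciation in the dictionary.
final class DictionaryCheck {

    /**
     * @param vocabulary lines of lang.txt, one word per line
     * @param dicLines   lines of the .dic file, assumed "word phone1 phone2 ..."
     * @return words from vocabulary with no pronounced entry in dicLines
     */
    static List<String> wordsWithoutPronunciation(List<String> vocabulary,
                                                  List<String> dicLines) {
        Set<String> pronounced = new HashSet<>();
        for (String line : dicLines) {
            String[] tokens = line.trim().split("\\s+");
            // An entry counts only if at least one phone follows the word.
            if (tokens.length >= 2) {
                pronounced.add(tokens[0]);
            }
        }
        List<String> missing = new ArrayList<>();
        for (String word : vocabulary) {
            String trimmed = word.trim();
            if (!trimmed.isEmpty() && !pronounced.contains(trimmed)) {
                missing.add(trimmed);
            }
        }
        return missing;
    }
}
```

Both lists can be read with Files.readAllLines(path, StandardCharsets.UTF_8). Once the returned list is empty, point setDictionaryPath and setLanguageModelPath at the new files.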