[Repost] [Information Technology] [2005.02] Audio Segmentation and Classification

1007 views | 2021-1-16 17:22 | Category: Research notes | Source: Repost

This project describes the development of an audio segmentation and classification system. Many existing works on audio classification deal with the problem of classifying known homogeneous audio segments. In this work, audio recordings are divided into acoustically similar regions, and each region is classified into a basic audio type such as speech, music or silence.

The audio features used in this project include Mel Frequency Cepstral Coefficients (MFCC), Zero Crossing Rate (ZCR) and Short Term Energy (STE). These features were extracted from audio files stored in WAV format. The possible use of features extracted directly from MPEG audio files is also considered. Statistical methods based on these features are used to segment and classify the audio signals. The classification methods used include the General Mixture Model (GMM) and the k-Nearest Neighbour (k-NN) algorithms. The implemented system is shown to achieve an accuracy of more than 95% for discrete audio classification.
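To make two of the features and one of the classifiers named in the abstract more concrete, here is a minimal Python sketch using NumPy. It is not the implementation from the original project: the frame length, hop size, the synthetic clips (silence, sine tones, white noise) and every function name are assumptions chosen for illustration. MFCC extraction and the GMM classifier are left out to keep the example short.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=200):
    """Split a 1-D signal into overlapping frames (rows of the result)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def clip_features(x):
    """Describe a whole clip by its average STE and average ZCR."""
    frames = frame_signal(np.asarray(x, dtype=float))
    return np.array([short_time_energy(frames).mean(),
                     zero_crossing_rate(frames).mean()])

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return str(labels[np.argmax(counts)])

def make_toy_training_set(sr=16000):
    """Hypothetical training data: synthetic stand-ins, not real recordings."""
    t = np.arange(sr) / sr
    rng = np.random.default_rng(0)
    clips, labels = [], []
    for _ in range(3):
        clips.append(np.zeros(sr))
        labels.append("silence")
    for freq in (220.0, 440.0, 880.0):
        clips.append(0.5 * np.sin(2 * np.pi * freq * t))
        labels.append("tone")
    for _ in range(3):
        clips.append(0.3 * rng.standard_normal(sr))
        labels.append("noise")
    X = np.array([clip_features(c) for c in clips])
    return X, np.array(labels)

if __name__ == "__main__":
    X, y = make_toy_training_set()
    t = np.arange(16000) / 16000
    query = 0.5 * np.sin(2 * np.pi * 600.0 * t)  # an unseen tone
    print(knn_predict(X, y, clip_features(query)))
```

The two features are used here without any scaling, which works for these toy clips but would likely need normalisation on real speech and music recordings, where energy and zero-crossing rate can differ widely in range.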

Blog: 大工至善|大学至真, http://blog.sciencenet.cn/u/lcj2212916

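1. Introduction

2. Audio Feature Extraction

3. Audio Classification

4. Audio Segmentation

5. Perceptually Coded Audio

6. Experimental Results and Conclusions

7. Conclusion and Future Work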

Blog author: 刘春静
