Introduction to the PCM Format

article/2025/9/30 19:23:23

Reposted from: http://www.cnblogs.com/cheney23reg/archive/2010/08/08/1795067.html

http://wiki.multimedia.cx/index.php?title=PCM

PCM Data Format

 

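PCM stands for Pulse Code Modulation. The audio data in PCM is not compressed. In a mono file, the samples are stored one after another in time order. (The basic storage unit is a BYTE (8 bits) or a WORD (16 bits).)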

 

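PCM is usually processed in fixed-size blocks ("frames"). A typical frame consists of 2048 samples, which is 4096 bytes for 16-bit mono data (see http://discussion.forum.nokia.com/forum/showthread.php?129458-请问PCM格式的音频流,每次读入或输出的块的大小是必须固定为4096B么&s=e79e9dd1707157281e3725a163844c49 ). This is a buffering convention, not a property of the PCM format itself.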

 

In a stereo file, the samples of the two channels are stored interleaved in time order, as shown in the figure:

[Figure: pcm_format_1]

 


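Each PCM sample value is held in an integer i, whose length is the minimum number of bytes needed to hold the specified sample size.

The least significant byte is stored first. The bits that represent the sample amplitude occupy the most significant bits of i, and the remaining bits are set to 0. The data formats for 8-bit and 16-bit PCM waveform samples are therefore as follows.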

 

    Sample size    Data format      Minimum    Maximum
    8-bit PCM      unsigned int     0          255
    16-bit PCM     int              -32768     32767
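As a rough illustration of the storage rules in the table above, the following Python sketch packs one sample of each size with the standard struct module. The helper names are hypothetical, and little-endian byte order is assumed because the text says the least significant byte is stored first.

```python
import struct

def pack_pcm8(value: int) -> bytes:
    """Pack one unsigned 8-bit sample (0..255) into a single byte."""
    return struct.pack("<B", value)

def pack_pcm16(value: int) -> bytes:
    """Pack one signed 16-bit sample (-32768..32767), low byte first."""
    return struct.pack("<h", value)

if __name__ == "__main__":
    print(pack_pcm8(128).hex())     # midpoint of the unsigned 8-bit range
    print(pack_pcm16(-32768).hex()) # most negative 16-bit sample
```

struct.pack raises struct.error for out-of-range values, which makes the limits in the table easy to check by hand.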




PCM Parameters

PCM audio is coded using a combination of various parameters.

Resolution/Sample Size

This parameter specifies the amount of data used to represent each discrete amplitude sample. The most common values are 8 bits (1 byte), which gives a range of 256 amplitude steps, or 16 bits (2 bytes), which gives a range of 65536 amplitude steps. Other sizes, such as 12, 20, and 24 bits, are occasionally seen. Some king-sized formats even opt for 32 and 64 bits per sample.

Byte Order

When more than one byte is used to represent a PCM sample, the byte order (big endian vs. little endian) must be known. Due to the widespread use of little-endian Intel CPUs, little-endian PCM tends to be the most common byte orientation.

Sign

It is not enough to know that a PCM sample is, for example, 8 bits wide. Whether the sample is signed or unsigned is needed to understand the range. If the sample is unsigned, the sample range is 0..255 with a centerpoint of 128. If the sample is signed, the sample range is -128..127 with a centerpoint of 0. If a PCM type is signed, the sign encoding is almost always 2's complement. In very rare cases, signed PCM audio is represented as a series of sign/magnitude coded numbers.
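The three 8-bit encodings mentioned above can be compared with a small Python sketch. The function names are hypothetical. For sign/magnitude the sketch assumes that a set top bit means a negative value, and real hardware may use the opposite convention.

```python
def unsigned8_to_int(byte: int) -> int:
    """Unsigned PCM: range 0..255, centerpoint 128 maps to 0."""
    return byte - 128

def twos_complement8_to_int(byte: int) -> int:
    """Signed 2's complement PCM: range -128..127."""
    return byte - 256 if byte & 0x80 else byte

def sign_magnitude8_to_int(byte: int) -> int:
    """Sign/magnitude PCM (assumed: bit 7 set means negative)."""
    magnitude = byte & 0x7F
    return -magnitude if byte & 0x80 else magnitude
```

The same byte value, for example 0x81, decodes to 1, -127, or -1 depending on which encoding is assumed, which is why the sign convention has to be known in advance.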

Channels And Interleaving

If the PCM type is monaural, each sample will belong to that one channel. If there is more than one channel, the channels will almost always be interleaved: Left sample, right sample, left, right, etc., in the case of stereo interleaved data. In some rare cases, usually when optimized for special playback hardware, chunks of audio destined for different channels will not be interleaved.
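A minimal Python sketch of splitting interleaved stereo data into its two channels follows. It assumes signed, little-endian, 16-bit samples, and the function name is hypothetical.

```python
import struct

def deinterleave_stereo16(data: bytes) -> tuple[list[int], list[int]]:
    """Split L, R, L, R, ... 16-bit little-endian samples into two lists."""
    count = len(data) // 2                      # whole 16-bit samples available
    samples = struct.unpack(f"<{count}h", data[:count * 2])
    left = list(samples[0::2])                  # even positions: left channel
    right = list(samples[1::2])                 # odd positions: right channel
    return left, right
```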

Frequency And Sample Rate

This parameter measures how many samples/channel are played each second. Frequency is measured in samples/second (Hz). Common frequency values include 8000, 11025, 16000, 22050, 32000, 44100, and 48000 Hz.
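Together with the sample size and channel count, the sample rate determines the raw data rate of a PCM stream. The following Python sketch shows the arithmetic, and the function name is hypothetical.

```python
def pcm_bit_rate(sample_rate: int, channels: int, bits_per_sample: int) -> int:
    """Raw PCM data rate in bits per second (no container overhead)."""
    return sample_rate * channels * bits_per_sample

if __name__ == "__main__":
    # 16-bit stereo at 44100 Hz
    rate = pcm_bit_rate(44100, 2, 16)
    print(rate, "bit/s =", rate // 8, "bytes/s")
```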

Integer Or Floating Point

Most PCM formats encode samples using integers. However, some applications which demand higher precision will store and process PCM samples using floating point numbers.

Floating-point PCM samples (32 or 64 bits in size) are zero-centred and vary in the interval [-1.0, 1.0]; they are therefore signed values.
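Converting between 16-bit integer samples and floating-point samples can be sketched as follows. The scaling factors are one common convention rather than a standard (other software divides and multiplies by different constants), and the function names are hypothetical.

```python
def int16_to_float(sample: int) -> float:
    """Map -32768..32767 into [-1.0, 1.0) by dividing by 32768."""
    return sample / 32768.0

def float_to_int16(value: float) -> int:
    """Clip to [-1.0, 1.0], then scale by 32767 and round."""
    clipped = max(-1.0, min(1.0, value))
    return round(clipped * 32767)
```

Clipping before scaling keeps out-of-range floating-point values from wrapping around when they are stored as integers.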

PCM Types

Linear PCM

The most common PCM type.

Logarithmic PCM

Rather than representing sample amplitudes on a linear scale as linear PCM coding does, logarithmic PCM coding plots the amplitudes on a logarithmic scale. Log PCM is more often used in telephony and communications applications than in entertainment multimedia applications.

There are two major variants of log PCM: mu-law (u-law) and A-law. Mu-law coding uses the format number 0x07 in Microsoft multimedia files (WAV/AVI/ASF) and the fourcc 'ulaw' in Apple QuickTime files. A-law coding uses the format number 0x06 in Microsoft multimedia files and the fourcc 'alaw' in Apple QuickTime files.

Every byte of a log PCM data chunk maps to a signed 16-bit linear PCM sample.
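For mu-law, the byte-to-sample mapping can be sketched in Python as below. This follows the widely circulated G.711-style expansion (invert the bits, then rebuild the value from sign, 3-bit exponent, and 4-bit mantissa with a bias of 0x84). It is offered as an assumption about the intended conversion, not as something stated in the original text, and the function name is hypothetical.

```python
ULAW_BIAS = 0x84  # bias added during encoding, removed during decoding

def ulaw_to_linear(byte: int) -> int:
    """Expand one mu-law byte into a signed 16-bit linear sample."""
    byte = ~byte & 0xFF                 # mu-law bytes are stored inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude

# A full 256-entry lookup table can be built once and reused.
ULAW_TABLE = [ulaw_to_linear(b) for b in range(256)]
```

Because there are only 256 possible input bytes, decoders typically precompute the table instead of evaluating the formula per sample.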

Differential PCM

Values are encoded as differences between the current and the previous value. This reduces the number of bits required per audio sample by about 25% compared to PCM.
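The idea of difference coding can be shown with a deliberately simplified Python sketch. It stores exact, unquantized differences against the previous sample; a practical DPCM codec would also quantize the differences to save bits, which this sketch does not attempt. The function names are hypothetical.

```python
def dpcm_encode(samples: list[int]) -> list[int]:
    """Replace each sample by its difference from the previous one."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def dpcm_decode(deltas: list[int]) -> list[int]:
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for delta in deltas:
        previous += delta
        samples.append(previous)
    return samples
```

For slowly varying audio the differences are much smaller than the samples themselves, which is what allows them to be stored in fewer bits.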

Adaptive DPCM

The size of the quantization step is varied to allow further reduction of the required bandwidth for a given signal-to-noise ratio.

Platform-Specific PCM Identifiers And Characteristics

This section describes how different computing platforms store PCM audio data and any format identifiers they use.

DOS/Windows

The first widely available PC audio card that could play back PCM audio was the Creative Labs Sound Blaster. This drove the audio format for many early audio-capable DOS applications and games. The original Sound Blaster could only play mono, unsigned 8-bit PCM data. Later Sound Blaster cards were capable of playing back 16-bit audio data. However, while these cards still played unsigned 8-bit PCM data, 16-bit data needed to be signed.

Likely owing to the DOS/Intel little endian architecture, 16-bit PCM for the Sound Blaster also needs to be little endian.

Further, the original Sound Blaster was somewhat limited in the frequencies that it could support. The digital to analog conversion hardware (DAC) had to be programmed with a byte value (frequency divisor) that was processed through the following formula to yield the final playback frequency:

frequency = 1000000 / (256 - frequency_divisor)

A common divisor is 211 which yields an integer frequency of 22222 Hz, a common rate in the days of the Sound Blaster. Note that while very low frequencies (all the way down to 3921 Hz) were supported, frequencies above 45454 Hz were not.
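The divisor formula above is easy to check in Python. The function names below are hypothetical, and the sketch assumes the frequency is truncated to an integer, as in the 22222 Hz example.

```python
def sb_playback_frequency(frequency_divisor: int) -> int:
    """Playback frequency in Hz for a one-byte Sound Blaster divisor."""
    if not 0 <= frequency_divisor <= 255:
        raise ValueError("frequency_divisor must fit in one byte")
    return 1000000 // (256 - frequency_divisor)

def sb_divisor_for(frequency: int) -> int:
    """Nearest divisor byte for a desired playback frequency in Hz."""
    return 256 - round(1000000 / frequency)

if __name__ == "__main__":
    print(sb_playback_frequency(211))   # 22222
    print(sb_divisor_for(22050))        # 211
```

Since the divisor is a single byte, only a coarse grid of frequencies is reachable, and a requested rate such as 22050 Hz is actually played at the nearest available one.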

Microsoft WAV/AVI/ASF Identifiers

Microsoft multimedia file formats such as WAV, AVI, and ASF all share the WAVEFORMATEX data structure. The structure defines, among other properties, a 16-bit little endian audio identifier. The following audio identifiers correspond to various PCM formats:

  • 0x0001 denotes linear PCM
  • 0x0006 denotes A-law logarithmic PCM
  • 0x0007 denotes mu-law logarithmic PCM
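A short Python sketch of reading the identifier is given below. It relies on the format tag being the first field of the WAVEFORMATEX structure, stored as a 16-bit little-endian value. The function and dictionary names are hypothetical.

```python
import struct

PCM_FORMAT_TAGS = {
    0x0001: "linear PCM",
    0x0006: "A-law logarithmic PCM",
    0x0007: "mu-law logarithmic PCM",
}

def describe_format_tag(wave_format: bytes) -> str:
    """Name the PCM type from the first 2 bytes of a WAVEFORMATEX blob."""
    (tag,) = struct.unpack_from("<H", wave_format, 0)
    return PCM_FORMAT_TAGS.get(tag, f"non-PCM or unknown format 0x{tag:04X}")
```

In a WAV file the structure is the payload of the 'fmt ' chunk, so the bytes passed in would be that chunk's contents.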

Apple Macintosh

Native sample rates of early Apple Macintosh audio hardware included 11127 Hz and 22254 Hz. These sample rates are commonly seen in early QuickTime files.

Apple QuickTime Identifiers

Audio information in QuickTime files is stored along with an stsd atom that contains a FOURCC to indicate the format type. Apple QuickTime accommodates a number of different PCM formats:

  • 'raw ' (need space character, ASCII 0x20, to round out FOURCC) denotes unsigned, linear PCM. 16-bit data is stored in little endian format.
  • 'twos' denotes signed (i.e. twos-complement) linear PCM. 16-bit data is stored in big endian format.
  • 'sowt' ('twos' spelled backwards) also denotes signed linear PCM. However, 16-bit data is stored in little endian format.
  • 'in24' denotes 24-bit, big endian, linear PCM.
  • 'in32' denotes 32-bit, big endian, linear PCM.
  • 'fl32' denotes 32-bit floating point PCM. (Presumably IEEE 32-bit; byte order?)
  • 'fl64' denotes 64-bit floating point PCM. (Presumably IEEE 64-bit; byte order?)
  • 'alaw' denotes A-law logarithmic PCM.
  • 'ulaw' denotes mu-law logarithmic PCM.
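The practical difference between 'twos' and 'sowt' is only the byte order, as the following Python sketch for 16-bit data illustrates. The function and table names are hypothetical, and the other FOURCCs in the list are deliberately left out.

```python
import struct

# struct byte-order prefix for signed 16-bit data, keyed by FOURCC
QT_PCM16_BYTE_ORDER = {
    b"twos": ">",   # big endian
    b"sowt": "<",   # little endian
}

def decode_qt_pcm16(fourcc: bytes, data: bytes) -> list[int]:
    """Decode signed 16-bit QuickTime PCM according to its FOURCC."""
    order = QT_PCM16_BYTE_ORDER[fourcc]     # KeyError for unsupported types
    count = len(data) // 2
    return list(struct.unpack(f"{order}{count}h", data[:count * 2]))
```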

Red Book CD Audio

The "Red Book" defines the format of a standard audio compact disc (CD). The audio data on a standard CD consists of 16-bit linear PCM samples stored in little endian format, replayed at 44100 Hz (hence the standard term "CD-quality audio"), with left-right stereo interleaving.

Sega CD

Games made for the Sega CD, an add-on for the Sega Genesis game console, all seem to use sign-magnitude coding to store PCM information. It is a good guess that the Sega CD unit has custom hardware to play this format natively.

Sega Saturn

Games made for the Sega Saturn video game console generally seem to store PCM data as signed, 8-bit data or signed, big endian, 16-bit data. The curious property of the PCM, however, is the stereo handling. Generally, multimedia files on Sega Saturn games (most often stored using the Sega FILM format) would store a block of left channel information followed by a block of right channel information rather than interleaving left and right samples. This is likely due to custom multi-channel audio hardware in which individual channels are assigned pan positions. For playing stereo data, one channel is assigned extreme left and another is assigned extreme right. The correct samples are sent to their respective channels. Interleaved data would require deinterleaving before playback.
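For playback on hardware or APIs that expect interleaved stereo, such block-wise data has to be merged first. The Python sketch below works on already decoded sample lists and does not model the Sega FILM container itself; the function name is hypothetical.

```python
def interleave_blocks(left: list[int], right: list[int]) -> list[int]:
    """Merge a left-channel block and a right-channel block into L, R, L, R, ..."""
    if len(left) != len(right):
        raise ValueError("channel blocks must have the same length")
    interleaved = []
    for left_sample, right_sample in zip(left, right):
        interleaved.extend((left_sample, right_sample))
    return interleaved
```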

DVD PCM

Standard Video-DVDs can contain 16-bit, 20-bit and 24-bit signed, linear PCM (often called LPCM) streams. A stream can consist of up to 8 channels as long as the maximum bandwidth of 6.144 Mbit/s for any LPCM audio stream is not exceeded. Two sample rates are supported: 48 kHz and 96 kHz.
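Whether a given channel count, sample size, and sample rate fits within the bandwidth limit can be checked with a small Python sketch. It assumes the limit means 6,144,000 bits per second of raw sample data and ignores any packet overhead; the names are hypothetical.

```python
DVD_LPCM_MAX_BITS_PER_SECOND = 6_144_000  # assumed reading of the 6.144 Mbit limit

def dvd_lpcm_fits(channels: int, bits_per_sample: int, sample_rate: int) -> bool:
    """Return True if the raw LPCM stream stays within the DVD limits."""
    if not 1 <= channels <= 8:
        return False
    if bits_per_sample not in (16, 20, 24):
        return False
    if sample_rate not in (48000, 96000):
        return False
    bit_rate = channels * bits_per_sample * sample_rate
    return bit_rate <= DVD_LPCM_MAX_BITS_PER_SECOND
```

Under these assumptions, 8 channels of 16-bit audio at 48 kHz exactly reach the limit, while 8 channels of 24-bit audio at the same rate exceed it.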


24-Bit PCM

24-bit linear PCM is stored in blocks. Each block is divided into two parts. The first part contains the most significant two bytes of each channel for two samples in big endian order:

 < ---  sample 1 --- > < ---  sample 2 --- >
 T0 M0 T1 M1 ... Tx Mx T0 M0 T1 M1 ... Tx Mx

The second part contains all least significant bytes of each channel for the two samples in the same order:

 < sample 1 > < sample 2 >
 B0 B1 ... Bx B0 B1 ... Bx

The complete block looks like this:

 < ---  sample 1 --- > < ---  sample 2 --- > < sample 1 > < sample 2 >
 T0 M0 T1 M1 ... Tx Mx T0 M0 T1 M1 ... Tx Mx B0 B1 ... Bx B0 B1 ... Bx


  • T = top byte = bits 23..16
  • M = middle byte = bits 15..8
  • B = bottom byte = bits 7..0
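Reassembling the samples of one such block can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes a block holds exactly two samples per channel, that is, 6 bytes per channel.

```python
def unpack_dvd_lpcm24_block(block: bytes, channels: int) -> list[list[int]]:
    """Decode one block into [sample][channel] signed 24-bit integers."""
    if len(block) != 6 * channels:
        raise ValueError("a block holds 2 samples of 3 bytes per channel")
    low_base = 4 * channels              # start of the bottom-byte part
    samples = []
    for s in range(2):                   # two samples per block
        frame = []
        for ch in range(channels):
            hi = 2 * (s * channels + ch) # position of the T, M pair
            top, middle = block[hi], block[hi + 1]
            bottom = block[low_base + s * channels + ch]
            value = (top << 16) | (middle << 8) | bottom
            if value & 0x800000:         # sign-extend from 24 bits
                value -= 1 << 24
            frame.append(value)
        samples.append(frame)
    return samples
```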

[Figure: Dvd-24bit-pcm.png]

20-Bit PCM

The 20-bit packing is similar to the 24-bit packing. The only difference is that 2 channels use the nibbles of one byte as their least significant bits.

  • stereo 20-bit example:
 < s1 >  < s2 >  < s1 >  < s2 >  <s1>  <s2>
 T0  M0  T1  M1  T0  M0  T1  M1  L01   L01

The 4 high bits of L01 (the higher nibble) are the least significant bits of channel 0, and the 4 low bits (the lower nibble) are the least significant bits of channel 1.

 byte = 2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
 L01  = B0   B0   B0   B0   B1   B1   B1   B1  (bitwise)

Two samples are always coded together, so nothing needs to be padded with 0s.
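The stereo case described above can be unpacked with the following Python sketch. It handles only the 10-byte stereo block from the example, and the function name is hypothetical.

```python
def unpack_dvd_lpcm20_stereo_block(block: bytes) -> list[list[int]]:
    """Decode a 10-byte stereo block into [sample][channel] signed 20-bit integers."""
    if len(block) != 10:
        raise ValueError("a stereo 20-bit block is 10 bytes long")
    samples = []
    for s in range(2):                        # two samples per block
        shared = block[8 + s]                 # L01 byte for this sample
        nibbles = (shared >> 4, shared & 0x0F)  # channel 0 high, channel 1 low
        frame = []
        for ch in range(2):
            hi = 4 * s + 2 * ch               # position of the T, M pair
            value = (block[hi] << 12) | (block[hi + 1] << 4) | nibbles[ch]
            if value & 0x80000:               # sign-extend from 20 bits
                value -= 1 << 20
            frame.append(value)
        samples.append(frame)
    return samples
```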

16-bit PCM

With this coding the block consists only of the first part described above.

Identifying PCM Data


