Digital Audio

Digital audio recording works by measuring, or sampling, an electronic audio signal at regular time intervals. An analog-to-digital (A/D) converter measures and stores each sample as a numerical value that represents the audio amplitude at that particular moment. Converting the amplitude of each sample to a binary number is called quantization. The number of bits used for quantization is referred to as bit depth. Sample rate and bit depth are two of the most important factors in determining the quality of a digital audio system.
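The two steps described above, sampling and quantization, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name sample_and_quantize, the 1 kHz test tone, and the use of signed integer codes are choices made for this example, not a description of how any particular A/D converter works.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Sample a full-scale sine wave and quantize each sample to a signed integer.

    Hypothetical helper for illustration; a real converter works on an
    electrical signal, not on a math function.
    """
    max_code = 2 ** (bit_depth - 1) - 1  # largest positive code, e.g. 32767 for 16-bit
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz                            # time of the n-th sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # "analog" value in [-1, 1]
        codes.append(round(amplitude * max_code))         # quantization: nearest integer code
    return codes

# One full cycle of a 1 kHz tone sampled at 48 kHz with 16-bit quantization.
codes = sample_and_quantize(freq_hz=1000, sample_rate_hz=48000, bit_depth=16, num_samples=48)
print(codes[:5])
```

Each entry in codes is one stored sample: the sample rate decides how many entries are produced per second, and the bit depth decides how many distinct values each entry can take.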

Sample Rate

The sample rate is the number of times an analog signal is measured, or sampled, per second. You can also think of the sample rate as the number of electronic snapshots made of the sound wave per second. Higher sample rates result in higher sound quality because the discrete samples approximate the analog waveform more closely and can capture higher frequencies. Which sample rate you choose depends on your source material, the capabilities of your audio interface, and the final destination of your audio.

For years, the digital audio sample rate standards have been 44,100 Hz (44.1 kHz) and 48 kHz. However, as technology improves, 96 kHz and even 192 kHz sample rates are becoming common.

Audio sample rates and when they are used:

8 kHz-22.225 kHz: These lower sample rates are used strictly for multimedia files.
32 kHz: 32 kHz is generally used with 12-bit audio on DV.
44.1 kHz: This sample rate is used for music CDs and some DAT recorders.
48 kHz: Almost all digital video formats use this sample rate.
88.2 kHz: A multiple of 44.1 kHz. This is useful for high-resolution audio that needs to be compatible with 44.1 kHz. For example, if you eventually plan to burn an audio CD, this sample rate is a good choice.
96 kHz: A multiple of 48 kHz. This is becoming the professional standard for audio post-production and music recording.
192 kHz: A multiple of 48 and 96 kHz, this is a very high-resolution sample rate used mostly for professional music recording and mastering.
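
To make the sample rates listed above more concrete, the following sketch computes how much time passes between samples at each rate, along with the highest frequency each rate can represent. The second figure relies on the standard sampling-theory result that a system can capture frequencies only up to half its sample rate, which is a fact added here rather than one stated in this section. The function names are hypothetical.

```python
# Sample rates from the list above, in Hz (the 8 kHz entry stands in for the low range).
RATES_HZ = [8000, 32000, 44100, 48000, 88200, 96000, 192000]

def sample_interval_us(rate_hz):
    """Time between two consecutive samples, in microseconds."""
    return 1_000_000 / rate_hz

def highest_frequency_hz(rate_hz):
    """Highest representable frequency: half the sample rate."""
    return rate_hz / 2

for rate in RATES_HZ:
    print(
        f"{rate / 1000:g} kHz: one sample every {sample_interval_us(rate):.2f} microseconds, "
        f"frequencies up to {highest_frequency_hz(rate) / 1000:g} kHz"
    )
```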

Bit Depth

Unlike analog signals, which have an infinite range of volume levels, digital audio samples use binary numbers (bits) to represent the strength of each audio sample. The accuracy of each sample is determined by its bit depth. Higher bit depths mean your audio signal is more accurately represented when it is sampled. Most digital audio systems use a minimum of 16 bits per sample, which can represent 65,536 possible levels (24-bit samples can represent over 16 million possible levels).
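
The level counts quoted above follow from the fact that n bits can encode 2 to the power n distinct values. A minimal check, with a function name chosen only for this example:

```python
def quantization_levels(bit_depth):
    """Number of distinct values a sample of the given bit depth can hold."""
    return 2 ** bit_depth

for bits in (8, 16, 20, 24):
    print(f"{bits}-bit: {quantization_levels(bits):,} levels")
```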

To better understand bit depth, think of each digital audio sample as a ladder with equally spaced rungs that climb from silence to full volume. Each rung on the ladder is a possible volume that a sample can represent, while the spaces between rungs are in-between volumes that a sample cannot represent.

When a sample is made, the audio level of the analog signal often falls in the spaces between rungs. In this case, the sample must be rounded to the nearest rung. The bit depth of a digital audio sample determines how closely the rungs are spaced. The more rungs available (or, the less space between rungs), the more precisely the original signal can be represented.

Quantization errors occur when a digital audio sample does not exactly match the analog signal strength it is supposed to represent (in other words, the digital audio sample is slightly higher or lower than the analog signal). Quantization errors are also called rounding errors because the stored number is only the nearest available approximation of the original analog level. For example, suppose an audio signal is exactly 1.15 volts, but the analog-to-digital converter rounds this to 1 volt because this is the closest value available. This rounding error causes noise in your digital audio signal. While quantization noise may be imperceptible, it can be exacerbated by further digital processing. Always try to use the highest bit depth possible to minimize quantization errors.
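
The 1.15-volt example can be reproduced with a small rounding function. This is a sketch only: the quantize helper and the step sizes (the spacing between rungs) are assumptions made for illustration, and real converters work with binary codes rather than voltages stored as decimal numbers.

```python
def quantize(value, step):
    """Round value to the nearest multiple of step (the nearest 'rung')."""
    return round(value / step) * step

signal_volts = 1.15

# A coarse ladder (1 V between rungs) and two finer ladders.
for step in (1.0, 0.25, 0.125):
    stored = quantize(signal_volts, step)
    error = stored - signal_volts
    print(f"rung spacing {step} V: stored {stored} V, rounding error {error:+.3f} V")
```

With 1 V between rungs, the 1.15 V signal is stored as 1 V, as in the example above. Closer rung spacing, which is what a higher bit depth provides, leaves a smaller rounding error.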

In the following figure, the diagram on the far right uses the highest bit depth, so its audio samples most accurately reflect the shape of the original analog audio signal.

Figure. Illustrations showing data points representing high, medium, and low bit depths.

For example, a 1-bit system (a ladder with only two rungs) can represent either silence or full volume, and nothing in between. Any audio sample that falls between these rungs must be rounded to full volume or silence. Such a system would have absolutely no subtlety, rounding smooth analog signals to a square-shaped waveform.

Figure. Illustrations showing sine and square waves.
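
The effect of a 1-bit system can be sketched by forcing every sample of a sine wave onto one of two values. As an assumption for this example, the two rungs are modeled as the negative and positive extremes of the waveform (-1.0 and +1.0), and the samples are taken half a step off the zero crossings so that no sample lands exactly on zero.

```python
import math

def one_bit_quantize(sample):
    """Map any sample in [-1, 1] to one of only two rungs."""
    return 1.0 if sample >= 0 else -1.0

SAMPLES_PER_CYCLE = 16

# One cycle of a sine wave, sampled at the midpoint of each interval.
sine = [math.sin(2 * math.pi * (n + 0.5) / SAMPLES_PER_CYCLE) for n in range(SAMPLES_PER_CYCLE)]
square = [one_bit_quantize(s) for s in sine]

print([round(s, 2) for s in sine])
print(square)
```

The smooth rise and fall of the sine values is lost: the output holds one value for the first half of the cycle and the other value for the second half, which is a square wave.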

When the number of bits per sample is increased, each sample can more accurately represent the audio signal.

Figure. Illustrations showing 1-bit, 2-bit, 4-bit, and 16-bit representations of an audio signal.
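
The same comparison can be made numerically. The sketch below quantizes one cycle of a sine wave at 1, 2, 4, and 16 bits and reports the largest rounding error found at each bit depth. The quantizer is a simple uniform one with equally spaced levels across the range -1 to 1; that layout, and the helper names, are assumptions for this example rather than a description of a specific converter.

```python
import math

def quantize_to_bits(sample, bit_depth):
    """Round a sample in [-1, 1] to the nearest of 2**bit_depth equally spaced levels."""
    levels = 2 ** bit_depth
    step = 2.0 / (levels - 1)              # spacing between adjacent rungs
    index = round((sample + 1.0) / step)   # nearest rung, counted from the bottom
    return index * step - 1.0

def worst_error(bit_depth, num_samples=1000):
    """Largest rounding error over one cycle of a full-scale sine wave."""
    worst = 0.0
    for n in range(num_samples):
        s = math.sin(2 * math.pi * n / num_samples)
        worst = max(worst, abs(quantize_to_bits(s, bit_depth) - s))
    return worst

for bits in (1, 2, 4, 16):
    print(f"{bits}-bit: worst rounding error {worst_error(bits):.6f} of full scale")
```

The worst-case error can never exceed half the rung spacing, so it shrinks quickly as bits are added.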

To minimize rounding errors, you should always use the highest bit depth your equipment supports. Most digital video devices use 16- or 20-bit audio, so you may be limited to one of these bit depths. However, professional audio recording devices usually support 24-bit audio, which has become the industry standard.

Bit depths and when they are used:

32-bit floating point: This allows audio calculations, such as fader levels and effects processing, to be performed at very high resolution with a minimum of error, which preserves the quality of your digital audio.
24-bit: This has become the audio industry standard for most audio recording formats. Most professional audio interfaces and computer audio editing systems can record with 24-bit precision.
20-bit: Used in some video formats such as Digital Betacam and audio formats such as ADAT Type II.
16-bit: DAT recorders, Tascam DA-88 and ADAT Type I multitracks, and audio CDs all use 16-bit samples. Many digital video formats, such as DV, use 16-bit audio (see the note on 12-bit mode that follows this list).
8-bit: In the past, 8-bit audio was often used for CD-ROM and web video. Today, 16-bit audio is usually preferred, but available bandwidth and compatibility with your target user’s system are your chief considerations when outputting audio for multimedia use.
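
To illustrate why processing is often done in 32-bit floating point, the sketch below lowers a sample's level with a fader and then raises it back, once with the intermediate result stored as a whole-number sample value and once with it stored as a 32-bit float. This is a simplified model under stated assumptions: the helper names, the sample value 12345, and the gain of 0.01 are invented for this example, and Python's struct module is used only to round values to 32-bit float precision.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit floating-point value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def fade_down_up_integer(code, gain):
    """Apply a gain, store the result as a whole-number sample, then undo the gain."""
    lowered = round(code * gain)      # fractional part is lost when stored as an integer
    return round(lowered / gain)

def fade_down_up_float32(code, gain):
    """Apply a gain, store the result as a 32-bit float, then undo the gain."""
    lowered = to_float32(code * gain)
    return to_float32(lowered / gain)

original = 12345
gain = 0.01  # a large level reduction

print("integer path:", fade_down_up_integer(original, gain))
print("float32 path:", fade_down_up_float32(original, gain))
```

The integer path cannot recover the original value because the detail below one step was rounded away at the lower level, while the floating-point path returns a value very close to the original.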

Note: Many consumer DV camcorders allow you to record four audio channels using 12-bit mode, but this is not recommended for professional work.