GENERAL
INTEREST
Basics and Chipsets
Staging a DiY Stand-Alone MP3 Player
By Prof. F. P. Volpe and P. Elsesser
Traditionally, an MP3 player requires a semiconductor memory to store MP3 data. MP3 players capable of reading data directly from a CD and without ‘help’ from a PC are few and far between. That may change, however, with the home-brew MP3 player described in this short series of articles.
Elektor Electronics
5/2000
Music programme material disguised as MP3 files has become big business on the Internet. However, before you can actually play a music CD with your own selection of MP3 titles you typically have to provide a link between your stereo and your PC, by way of the soundcard. Arguably, a stand-alone MP3 player with an internal standard CD-ROM drive and a hardware MP3 decoder, ready for connection to your stereo, represents a much better solution because it avoids the tedious process of having to ‘boot’ your PC every time you want to play some music. Also, the often objectionable noise level added by the PC, and its very presence in the living room, should be considered.
Such an MP3 player will be described in the July/August 2000 issue of Elektor Electronics, and the present article is intended as an introduction to it. This month we will describe the outlines and basics of MP3 compression, as well as some frequently seen chip sets for MP3 decoding. We will also gaze into the crystal ball for a bit in relation to future audio compression technologies.
The follow-up article, to be published in our annual Summer Circuits issue (i.e., July/August 2000), will cover ‘hands-on’ matters like the construction of the MP3 decoder board, the ATAPI support by a microcontroller and the link to a CD-ROM drive.
Why compression?
A digital audio signal typically consists of samples with a size of 16 bits. According to the sampling theorem, a signal has to be sampled at a rate that equals at least twice the highest frequency in the programme material. If CD quality is required (16-bit stereo samples at a sampling rate of 44.1 kHz), then a data rate of about 1.4 Mbit/s is needed to convey an audio signal. In other words, one minute of music requires about 10 Mbytes of data to be conveyed or stored on a data carrier. Unfortunately, that is not practicable even with the huge capacity of today’s hard disks. Obviously, the resultant transmission (= download) times using media like the Internet, Internet radio or Music On Demand systems would be very long and therefore prohibitive. In this case, the only solution is to find a way to reduce the immense size of the relevant data, the ‘real-world’ aim being to convey a stereo music signal across an ISDN link with a capacity of 64 kbit/s per channel. In view of the required compression ratio of 1:12, that would seem to be possible only if losses are accepted.
Over a relatively short period, a system called MPEG-1 Layer 3 has established itself as the de facto standard for audio transmission via the Internet. MP3 employs compression algorithms that take the real-life response of the human ear into account. The resultant quality of reproduced sound is so good that even trained experts are unable to hear the difference between a copy and its original. MP3 is also clearly superior to the other MPEG Layers, as used in the Digital Compact Cassette (MP1), Digital Audio Broadcasting and the Video-CD (MP2), and to simpler systems like CELP, µ-Law or ADPCM.
[Figure 1 diagram labels: Strong Tone Signal; Region where Weaker Signals are Masked; horizontal axis: Frequency.]
Figure 1. A frequency range is masked (obliterated) by a loud signal.
The essential point about MP3 is that the system is based on a psycho-acoustic model whose elementary structure will be discussed below. A far more extensive tutorial about MPEG and audio compression techniques is available in the form of a downloadable file (mpeg-tutorial.pdf) from the Download Area on the Elektor Electronics website at
http://www.elektor-electronics.co.uk
(document reproduced courtesy IEEE). Fortunately, this document does not rely on higher mathematics to explain the underlying principles!
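The data rates quoted under ‘Why compression?’ can be checked with a few lines of arithmetic. The sketch below is in Python and assumes 16-bit stereo PCM and two ISDN channels of 64 kbit/s each; the function and variable names are illustrative only and do not come from the article.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample=16, channels=2):
    """Raw (uncompressed) PCM data rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels

cd_rate = pcm_bit_rate(44_100)            # CD audio: 16-bit stereo at 44.1 kHz
bytes_per_minute = cd_rate * 60 // 8      # storage needed for one minute
isdn_rate = 2 * 64_000                    # assumption: two 64 kbit/s channels
ratio_cd = cd_rate / isdn_rate            # reduction needed for CD audio
ratio_48k = pcm_bit_rate(48_000) / isdn_rate  # same, at 48 kHz sampling

print(f"CD data rate    : {cd_rate} bit/s ({cd_rate / 2**20:.2f} x 2^20 bit/s)")
print(f"One minute      : {bytes_per_minute / 1e6:.1f} million bytes")
print(f"Reduction needed: {ratio_cd:.1f} (44.1 kHz), {ratio_48k:.1f} (48 kHz)")
```

The CD figure comes to 1,411,200 bit/s, which is about 1.35 when expressed in units of 2^20 bit/s, and one minute takes roughly 10.6 million bytes. The reduction factor is about 11 for CD audio and exactly 12 at a sampling rate of 48 kHz.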
[Figure 2 graph: curves labelled with loudness level (20 to 120 phon) plotted against frequency (Hz); lowest curve: Minimum Audible Field (MAF).]
Figure 2. The human ear is particularly sensitive to speech signals.
Psycho-acoustics
The science of Psycho-Acoustics studies the
behaviour of the human hearing system in
relation to processing of acoustic information
in the brain. Psycho-acoustic principles have
been extensively used in the development of
MP3 and indeed many other compression
techniques.
The audible spectrum may be thought of as consisting of 26 frequency bands. The frequency range below 500 Hz is subdivided into five bands of 100 Hz each. Above this range, the bandwidth is about 1/5th of the centre frequency. In the human hearing system, soft sounds become less distinct and even inaudible when loud, discrete sound levels occur within these ‘critical’ bands. As an aside, you should note that the frequency resolving capacity of the human hearing system is much more accurate than the critical bands.
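The rule of thumb for the critical bands can be tried out numerically. The following Python sketch simply applies the two statements above (five bands of 100 Hz, then a bandwidth of one fifth of the centre frequency); the upper limit of 20 kHz and the function name are assumptions made for the illustration, and real critical-band tables are based on measurements rather than on this simple rule.

```python
def critical_band_edges(f_max_hz=20_000.0):
    """Return band edges in Hz following the rule of thumb in the text."""
    edges = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0]  # five bands of 100 Hz
    while edges[-1] < f_max_hz:
        lower = edges[-1]
        # width = centre / 5 and centre = lower + width / 2,
        # which rearranges to width = lower / 4.5
        edges.append(lower + lower / 4.5)
    return edges

edges = critical_band_edges()
bands = list(zip(edges, edges[1:]))
for lower, upper in bands:
    print(f"{lower:8.0f} .. {upper:8.0f} Hz  (width {upper - lower:6.0f} Hz)")
print(len(bands), "bands in total")
```

Under these assumptions the rule produces roughly two dozen bands, which is of the same order as the 26 bands mentioned above; the exact count depends on where the upper limit of hearing is placed.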
The above phenomenon is a precondition for the masking of a soft sound by a loud sound which occurs at a nearby frequency and/or instant. The spectral masking range of an individual loud sound is shown in Figure 1. The masking (or, if you like, obliterating) sound may occur simultaneously with the soft sound or even after it.
The actual effect of the masking depends not only on the spectral and time-related arrangement of the masking and masked sound, but also on their tonality. Noise is far easier to mask than a discrete frequency. Conversely, a discrete frequency is a much better mask than noise.
[Figure 3 graph: vertical scale 0 to 100; horizontal axis: frequency f (Hz), from 20 Hz to 20 kHz.]
Figure 3. Three masks are applied to raise the co-hearing threshold.
The starting point for a psycho-acoustic model is the frequency-dependent characteristic of the human hearing system. The curves in Figure 2 illustrate that sounds within the frequency range of human speech are perceived with greater resolution than very high or very low sounds. The lower, dashed, curve represents the absolute hearing threshold, below which no sounds are perceived by most of us. Masking raises the absolute threshold to the so-called co-hearing threshold, which appears three times as dashed lines in Figure 3, for three different masking signals.
A psycho-acoustic model analyses the audio signal and employs complex algorithms to compute the usability of masking sounds in the relevant frequency range. The closer the model gets to reality, the higher the compression rate that can be achieved at a given quality level of the output signal. However, the rules of the model are ‘relaxed’ to an extent required by the transmission rate and the signal quality. In cases where no compression is required, the model is even switched off completely.
Functional blocks
The elementary architecture of an MPEG audio encoder is illustrated in Figure 4. Decompression is the reverse operation of encoding. To be able to apply the psycho-acoustic model to the digital audio input signal (a PCM datastream arriving at 768 kbit/s), the signal first has to be transposed to the frequency domain. A fast Fourier transformation (FFT) with 1024 coefficients is employed as part of the computation of the psycho-acoustic model.
[Figure 4 block labels: Audio Signal (PCM 768 kbit/s); Filter Bank (32 sub-bands); MDCT (32 blocks, outputs numbered 1 to 576); FFT; Psycho-acoustic Model; Non-linear Quantisation and Bitrate Control; Huffman Encoding; Aux. Info Encoding; Bitstream formatting and Error Correction; Encoded Bitstream.]
Figure 4. Block diagram of an encoder to MPEG 2.5 Layer III.
Fully synchronous with this process, a filter bank divides the audible spectrum into 32 sub-bands of equal width (750 Hz at a sampling rate of 48 kHz). Next, 32-times sub-sampling is applied to each sub-band, so that every 32 PCM input samples yield one sample in each of the 32 sub-bands. The filter bands correspond to the previously mentioned critical frequency bands of the human hearing system, with three differences. Firstly, the constant width of the sub-bands is unlike that of the human example; secondly, the filter bank and its complementary function in the decoder are not loss-free; and thirdly, an individual frequency may affect the output signals of two sub-bands because of a certain overlap of the sub-bands. Fortunately, the errors introduced by the filter bank are small enough to be inaudible (<0.07 dB). This is mainly thanks to the final MDCT (Modified Discrete Cosine Transformation), which divides each of these 32 sub-bands into 18 sub-sub-bands. In the end we have 32×18 = 576 sub-bands of 42 Hz each (at a sampling rate of 48 kHz).
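The frequency resolution figures for the filter bank and the MDCT follow directly from the sampling rate. A minimal Python sketch of the arithmetic is given here; the function name and its parameters are made up for the illustration, and it assumes the usable spectrum runs from 0 Hz to half the sampling rate.

```python
def layer3_resolution(sample_rate_hz, subbands=32, mdct_lines_per_subband=18):
    """Return (sub-band width in Hz, total MDCT lines, width per line in Hz)."""
    nyquist_hz = sample_rate_hz / 2           # highest representable frequency
    total_lines = subbands * mdct_lines_per_subband
    return nyquist_hz / subbands, total_lines, nyquist_hz / total_lines

subband_hz, lines, line_hz = layer3_resolution(48_000)
print(f"48 kHz  : {subband_hz:.0f} Hz per sub-band, {lines} lines, {line_hz:.1f} Hz per line")

subband_cd, _, line_cd = layer3_resolution(44_100)
print(f"44.1 kHz: {subband_cd:.0f} Hz per sub-band, {line_cd:.1f} Hz per line")
```

At 48 kHz this gives 750 Hz per sub-band and 576 lines of about 41.7 Hz each, which the text rounds to 42 Hz. At the CD sampling rate of 44.1 kHz the bands are correspondingly narrower.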
In the next operation, noise allocation, the ratio of the quantisation noise and the co-hearing threshold is recovered from the difference of the signal-to-noise ratio (filter bank) and the signal-to-mask ratio of the psycho-acoustic model. The number of bits available in the output datastream is then determined (depending on the selected overall data rate minus data on scaling factor, headers and other auxiliary data). The encoder varies the quantisation in a certain sequence, weighs the spectral values, and counts the number of bits required for the output signal (the number being further reduced by Huffman encoding). The outcome is used to compute the permissible quantisation noise level. If, after quantisation, bands are found with unacceptably high distortion, the encoder amplifies these bands so that they are quantised more finely. The operation is repeated until distortion levels are below the acceptable level. MP3 works with a variable transmission rate and so creates a kind of ‘buffer’ during passages that require only a few bits. The buffer is drawn on for ‘difficult’ passages, in which the data rate is fully exhausted.
The block diagram of the encoder is completed with a formatting unit that serves to add the auxiliary data, and pack the lot into frames ready for sending to the decoder.
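The principle of varying the quantisation until the available number of bits is met can be illustrated with a toy example. The Python sketch below is not the algorithm from the MPEG standard: the bit-count function is only a crude stand-in for Huffman coding, the distortion check and band amplification described above are left out, and all names and numbers are invented for the illustration.

```python
import math

def quantise(spectrum, step):
    """Uniform quantiser: divide by the step size and round to integers."""
    return [int(round(value / step)) for value in spectrum]

def bits_needed(quantised):
    """Crude stand-in for Huffman coding: larger values cost more bits."""
    return sum(1 + 2 * math.ceil(math.log2(abs(q) + 1)) for q in quantised)

def fit_to_budget(spectrum, bit_budget, step=1.0, growth=1.19):
    """Coarsen the quantiser step by step until the bits fit the budget."""
    if bit_budget < len(spectrum):
        raise ValueError("budget too small: every value costs at least 1 bit")
    while True:
        quantised = quantise(spectrum, step)
        used = bits_needed(quantised)
        if used <= bit_budget:
            return step, quantised, used
        step *= growth  # coarser quantisation: fewer bits, more noise

# Hypothetical spectral values for one block
spectrum = [100.0, -40.0, 12.5, 3.0, 0.5]
step_30, q_30, used_30 = fit_to_budget(spectrum, bit_budget=30)
step_12, q_12, used_12 = fit_to_budget(spectrum, bit_budget=12)
print(step_30, q_30, used_30)
print(step_12, q_12, used_12)
```

A smaller bit budget forces a larger step size and hence more quantisation noise. A real encoder additionally compares that noise against the masking threshold in every band and amplifies the bands that fall short.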
[Figure 5 block diagram labels: CD-ROM; ISO9660; ATAPI; I²C-Bus; AUX-In; LCD; PIC 16F874; MAS3507; DAC3550; Clock; Keyboard; Audio Signal; Filter left; Filter right.]
Figure 5. MP3 CD-player based on the MAS3507D decoder and DAC3550 digital-to-analogue converter, both from Micronas Intermetall. The setup is controlled by a PIC17C756.
[Figure 6 block diagram labels: CD-ROM; ISO9660; ATAPI; LCD; Mic.; KS17C4000; DAC3550; Keyboard; Smart/media Card; Audio; Filter left; Filter right.]
Figure 6. MP3 CD player using the Samsung TL7231MD decoder. Besides decoding MP3 data this chip also handles ADPCM-based audio signal compression.
From chipset to MP3 player
Different chipsets are currently available for the purpose of decoding MP3 signals. In this article, three options are discussed for realizing a stand-alone CD player for MP3 and audio data, based on three different chip sets from different manufacturers. The data source is invariably a standard ATAPI-compatible CD-ROM drive. A microcontroller is used to look after the ATAPI protocol and supply the data for the MP3 decoder. The same controller also drives an LCD, and scans keys for navigation within the CD directory structure. A digital-to-analogue converter converts decompressed data supplied by the MP3 decoder into plain audio signals.
[Figure 7 block diagram labels: CD-ROM; ISO9660; ATAPI; I²C-Bus; LCD; PIC 17C756; STA013; CS4331; Keyboard; Audio; Filter left; Filter right.]
Figure 7. MP3 CD player based on the STA013 decoder from STMicroelectronics and the CS4331 DAC from Crystal Semiconductor.
At the time of writing, MP3 chipsets are offered by three companies. Figure 5 shows an MP3 player based on a chipset supplied by Micronas. This player is controlled by a PIC17C756 which reads the MP3 data from a CD-ROM drive and transmits them serially to a decoder chip type MAS3507D. You can browse the CD contents and navigate by means of a keyboard and a readout (LCD). The display also indicates the ID3 tag (track information). Decoded data are applied to a digital-to-analogue converter (DAC) type DAC3550, which supplies the analogue audio signals. After cleaning in a third-order low-pass filter, the audio signals are allowed to leave the decoder.
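As an illustration of the kind of track information the controller can show on the display, here is a Python sketch that decodes a tag in the widely used ID3 version 1 layout (a 128-byte block at the very end of the file). The article does not say which tag version the player reads, so the choice of ID3v1 is an assumption, as are the function name and the sample data. The firmware of the actual player runs on a microcontroller and is of course not written in Python.

```python
def parse_id3v1(tail):
    """Decode a 128-byte ID3v1 block; return None if no tag is present."""
    if len(tail) != 128 or tail[:3] != b"TAG":
        return None

    def text(field):
        # Fields are fixed-width, padded with zero bytes or spaces
        return field.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tail[3:33]),     # 30 bytes
        "artist": text(tail[33:63]),   # 30 bytes
        "album": text(tail[63:93]),    # 30 bytes
        "year": text(tail[93:97]),     # 4 bytes
        "comment": text(tail[97:127]), # 30 bytes
        "genre": tail[127],            # 1 byte, index into a genre list
    }

# Hypothetical tag built by hand for demonstration purposes
sample = (b"TAG" + b"Song".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00")
          + b"Album".ljust(30, b"\x00") + b"2000" + bytes(30) + bytes([12]))
tag = parse_id3v1(sample)
print(tag)
```

In a player, the 128 bytes would be taken from the last sectors of the MP3 file on the CD before playback starts.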
Besides a digital input, the converter also sports an analogue input. An I²C command is used to select between these inputs. If the microcontroller detects a music CD in the CD-ROM drive, the DAC is ‘ordered’ to switch to its analogue input.
Another chipset is shown in Figure 6. Here, the Samsung TL7231MD is used as the MP3 decoder. Because this chip integrates a digital-to-analogue converter, the number of components is drastically reduced. In addition to the MP3 decoder function, the TL7231MD also has an ADPCM codec. Using this chip it is possible to store audio signals in compressed form, and play them back again.
If PCB space is at a premium, the solution is to employ the STA013 from STMicroelectronics (Figure 7). This chip comes in a SO-28 case. The CS4331 digital-to-analogue converter from Crystal Semiconductor comes in a SO-8 enclosure. It should be noted that this MP3 decoder needs to be loaded with a configuration file after every reset. This file is supplied by the ST7 central controller in the system.
(000044-1)
Note:
The construction of a Stand-Alone MP3 Player will be discussed in the July/August 2000 issue of Elektor Electronics.
Compression and the future
A number of other audio compression methods exist besides MPEG Layer III. Some of these are well established, others are still ‘being designed’. At least for the near future MP3 seems to have won the battle for dominance on the Internet. However, Dolby’s AC-3 system has pushed MPEG aside in the DVD market.
MPEG-2
allows ‘low’ sampling frequencies like 16 kHz, 22.05 kHz and 24 kHz besides the more usual 32 kHz, 44.1 kHz and 48 kHz standards. It also supports 5.1-Surround as well as multi-language broadcasts. Otherwise downwards compatible with MPEG-1.
MPEG-2.5
again halves the sampling rates, allowing bit rates down to 8 kbit/s to be achieved. Otherwise this standard is downwards compatible with MPEG-1. The Micronas chipset used in the Stand-Alone MP3 Player is designed for MPEG-2.5.
MPEG-2-AAC (Advanced Audio Coding)
is not downwards compatible. It allows sampling frequencies of 32 kHz, 44.1 kHz and 48 kHz as well as single-channel, dual-channel, 5.1-Surround and multi-language broadcasts. Provision is made for error detection/correction. AAC is increasingly used by broadcast stations and, interestingly, for a new satellite radio station covering the southern hemisphere. In Japan, HDTV and all broadcast systems are to employ AAC.
MPEG-4
allows acoustic events in a room to be described. This is of particular interest to multimedia applications because a ‘speed control’ enables different media to be synchronized.
ATRAC
is used in MiniDisc systems and supplies a data rate of 140 kbit/s/channel. The main difference with MPEG-1 is that the filter bank uses a different method to resolve the spectrum. In higher frequency bands, the frequency resolution is reduced and the time resolution is increased (the ‘higher’ filter sections operate faster).
Dolby AC-3
is MPEG’s biggest competitor and extensively applied with DVD and HDTV. AC-3 supports various formats including 5.1-Surround. The bitrate may lie between 32 kbit/s and 640 kbit/s. The quality is about the same as MPEG-2; however, it is below that of AAC with 5.1-Surround.
MS Audio
Impossible compression system devised by Microsoft.
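The three families of sampling rates listed for MPEG-1, MPEG-2 and MPEG-2.5 show up in the first bytes of every Layer III frame. The Python sketch below reads the version, layer and sampling rate from a four-byte frame header. The bit positions and the rate table follow the commonly documented MPEG audio header layout rather than anything stated in this article, and the function name and test headers are invented for the illustration.

```python
VERSIONS = {0b11: "MPEG-1", 0b10: "MPEG-2", 0b00: "MPEG-2.5"}
LAYERS = {0b11: "Layer I", 0b10: "Layer II", 0b01: "Layer III"}
SAMPLE_RATES_HZ = {
    "MPEG-1": (44_100, 48_000, 32_000),
    "MPEG-2": (22_050, 24_000, 16_000),   # half the MPEG-1 rates
    "MPEG-2.5": (11_025, 12_000, 8_000),  # halved once more
}

def describe_header(header):
    """Return (version, layer, sampling rate in Hz) for a 4-byte header."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(header[:4], "big")
    if word >> 21 != 0x7FF:                 # 11 sync bits, all set
        raise ValueError("frame sync not found")
    version = VERSIONS.get((word >> 19) & 0b11)
    layer = LAYERS.get((word >> 17) & 0b11)
    rate_index = (word >> 10) & 0b11
    if version is None or layer is None or rate_index == 0b11:
        raise ValueError("reserved value in header")
    return version, layer, SAMPLE_RATES_HZ[version][rate_index]

print(describe_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
print(describe_header(bytes([0xFF, 0xE3, 0x48, 0x00])))
```

The first example header describes an MPEG-1 Layer III frame at 44.1 kHz, the second an MPEG-2.5 Layer III frame at 8 kHz. A decoder chip performs this kind of check in hardware; the sketch merely shows how the halved sampling rates are signalled.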
Literature
- Micronas Intermetall: MAS 3507D MPEG 1/2 Layer 2/3 Audio Decoder. Preliminary Data Sheet, Edition Oct. 21, 1998.
- STMicroelectronics: AN1090 STA013 MPEG 2.5 Layer III Source Decoder. Application Note, March 1999.
- Micronas Intermetall: DAC 3550 Stereo Audio DAC. Preliminary Data Sheet, Edition April 23, 1999.
- Draft ISO/IEC 9660: Information technology – Volume and file structure of CD-ROM for information interchange.
- Samsung Semiconductor: TL7231MD Full Layer-III ISO/IEC 11172-3 Audio Decoder. Data Sheet.
- A Tutorial on MPEG/Audio Compression, published in IEEE Multimedia Journal, Summer 1995.
- STMicroelectronics: STA013/STA013T MPEG 2.5 Layer III Audio Decoder. Data Sheet, September 1999.