Abstract— Beamforming with a microphone array is a method for identifying
sound sources. It can visualize the sound field of the source plane and reveal
useful acoustic information. This paper presents a tutorial on the
fundamental array processing and beamforming theory relevant to microphone
arrays. Microphone arrays have great potential in practical speech processing
applications.
I. INTRODUCTION
Array processing involves the use
of multiple sensors to receive or transmit a signal carried by propagating
waves. Sensor arrays have applications in a diversity of fields such as sonar,
the defense industry, seismology, astronomy, tomography and smart home systems.
In this paper we focus on microphone arrays, which are used to receive acoustic
signals, or more specifically speech signals. We then examine a
technique called "beamforming".
Since the invention of
telephone systems in the late 19th century, sound signal acquisition has been
an essential part of speech processing. Most early sound acquisition systems
used only a single microphone, but such systems were found to perform poorly
in challenging acoustic environments with noise, echo, reverberation
and interference. To better control these problems and to
preserve spatial sound realism, multiple-microphone systems were
developed.
In the literature, microphone
arrays are generally classified into two major categories: additive and
differential. The former refers to arrays with large sensor spacing whose
outputs are responsive to the acoustic pressure field, whereas the latter refers
to arrays with small sensor spacing whose outputs are responsive to the differential
acoustic pressure field of different orders. Both types of arrays have their
own advantages and disadvantages.
A microphone array system
consists of two important components: hardware and algorithms. The
former, i.e. the selection of sensors, amplifiers and multichannel converters,
is outside the scope of this paper. For the latter, a large variety of
processing algorithms have been studied in the literature in order to enhance
certain signals or signal components from the microphones' outputs, e.g.
channel identification, channel equalization, multichannel noise reduction,
blind source separation and beamforming.
Beamforming is a technique used
to process microphone array data in order to find the direction of incident
acoustic waves and estimate the power of the sound source [1]. Beamforming
consists of designing a spatial filter that takes advantage of the
spatiotemporal information embedded in the microphone array outputs to form a
response with different sensitivities to sounds arriving from different
directions.
Research in microphone array
beamforming started in the late 1960s, although some of the fundamental
principles can be traced back to the 1930s, when directional microphones were
invented [2]. Early work in this area was strongly influenced by the sensor
array theory developed in the fields of radar and sonar.
Beamforming techniques can be
broadly classified as being either data-independent or data-dependent.
Data-independent, or fixed, beamformers are so named because their parameters
are fixed during operation. Data-dependent, or adaptive, beamforming
techniques instead continuously update their parameters based on the received
signals. The next section summarises the main beamforming techniques, and
Section III indicates their advantages and disadvantages.
II. Beamforming Types
A. Delay-sum Beamforming
The simplest of all microphone
array beamforming techniques is delay-sum beamforming. Delay-sum beamforming is
so named because the time-domain sensor inputs are first delayed by τ_n
seconds, where n is the sensor index, and then summed to give a single array
output. Usually, each channel is given an equal amplitude weighting in the
summation so that the directivity pattern demonstrates unity gain in the
desired direction.
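
The delay-then-sum operation can be sketched in a few lines of Python. This is
a minimal illustration rather than a reference implementation: the function
name `delay_sum`, the sampling-rate argument `fs` and the choice of applying
the delays as phase shifts in the frequency domain (which makes them circular
but allows fractional-sample delays) are all assumptions made for the example.

```python
import numpy as np

def delay_sum(x, delays, fs):
    """Delay each channel by delays[n] seconds, then average the channels.

    x      : array of shape (num_mics, num_samples), time-domain sensor inputs
    delays : array of shape (num_mics,), steering delays tau_n in seconds
    fs     : sampling rate in Hz (hypothetical parameter for this sketch)
    """
    num_mics, num_samples = x.shape
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
    X = np.fft.rfft(x, axis=1)
    # A delay of tau seconds is a phase shift of exp(-j*2*pi*f*tau).
    # Because the FFT is used, the delay wraps around (circular delay).
    phase = np.exp(-2j * np.pi * freqs[None, :] * np.asarray(delays)[:, None])
    # Equal weights of 1/N give unity gain for the aligned (desired) signal.
    Y = (X * phase).mean(axis=0)
    return np.fft.irfft(Y, n=num_samples)
```

If every channel contains the same source signal with a different arrival
time, choosing the delays so that all channels line up returns that signal
with unit gain, while signals from other directions add incoherently.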
B. Filter-sum Beamforming
The delay-sum beamformer belongs
to a more general class known as filter-sum beamformers, in which both the
amplitude and phase weights are frequency dependent. In practice, most
beamformers are filter-sum beamformers of some kind.
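
To make the relationship concrete, the sketch below filters each channel with
its own FIR filter before summing. The helper `delay_sum_filters` is a
hypothetical name introduced here; it builds the special case in which every
filter is a shifted impulse scaled by 1/N, which reduces the filter-sum
structure to delay-sum with integer-sample delays. FIR filtering is only one
possible realisation of frequency-dependent weights.

```python
import numpy as np

def filter_sum(x, filters):
    """Filter each channel with its own FIR filter, then sum across channels.

    x       : array of shape (num_mics, num_samples)
    filters : array of shape (num_mics, num_taps), one FIR filter per channel
    """
    num_mics, num_samples = x.shape
    y = np.zeros(num_samples)
    for n in range(num_mics):
        # Keep only the first num_samples outputs of the full convolution.
        y += np.convolve(x[n], filters[n])[:num_samples]
    return y

def delay_sum_filters(delays_in_samples, num_taps):
    """FIR filters that turn filter_sum into an integer-delay delay-sum."""
    num_mics = len(delays_in_samples)
    h = np.zeros((num_mics, num_taps))
    for n, k in enumerate(delays_in_samples):
        h[n, k] = 1.0 / num_mics  # shifted impulse with equal weight 1/N
    return h
```

Replacing the shifted impulses with general filter coefficients gives each
channel a frequency-dependent amplitude and phase weight.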
C. Subarray Beamforming
The response of a beamformer depends on the operating
frequency, which means that the response characteristics (beam-width, sidelobe
level) will only remain constant for narrow-band signals, where the bandwidth
is not a significant proportion of the centre frequency. However, speech is a
broad-band signal, so a single linear array design is inadequate if a
frequency-invariant beam-pattern is desired. One simple way to cover a broad
frequency range is to implement the array as a series of subarrays. These
subarrays are designed to give the desired response characteristics for a given
frequency range. The subarrays are generally implemented in a nested fashion,
such that any given sensor may be used in more than one subarray. Each subarray
is restricted to a different frequency range by applying band-pass filters. An
illustration of a design covering 4 different frequency bands is shown in
Figure 1.
Figure 1: Sample nested subarray structure
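
A nested layout of this kind can be generated programmatically. The sketch
below rests on assumptions that are not taken from Figure 1: octave-wide bands,
a sensor spacing of half a wavelength at the top of each band, five sensors
per subarray, and a speed of sound of 340 m/s. The function name
`nested_subarrays` is likewise hypothetical.

```python
import numpy as np

C = 340.0  # assumed speed of sound in m/s

def nested_subarrays(f_max, num_bands=4, mics_per_subarray=5):
    """Sensor positions for octave-band subarrays centred on the origin.

    Returns (all_positions, subarrays), where subarrays is a list of
    (f_low, f_high, positions) tuples, one per frequency band.
    """
    d0 = C / (2.0 * f_max)  # half a wavelength at the highest frequency
    offsets = np.arange(mics_per_subarray) - (mics_per_subarray - 1) / 2.0
    subarrays = []
    for b in range(num_bands):
        f_high = f_max / 2 ** b
        spacing = d0 * 2 ** b  # spacing doubles as the band drops an octave
        subarrays.append((f_high / 2.0, f_high, spacing * offsets))
    # Sensors shared between subarrays appear only once in the full array.
    stacked = np.concatenate([pos for _, _, pos in subarrays])
    all_positions = np.unique(np.round(stacked, 9))
    return all_positions, subarrays
```

With these assumptions, four subarrays of five sensors each need only 11
physical microphones, because each wider subarray reuses three sensors of the
one before it.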
D. Superdirective Beamforming
Conventional linear arrays with sensors spaced at λ/2
have a directivity that is approximately proportional to the number of sensors,
N. It has been found that, in a diffuse noise field, the directivity of linear
endfire arrays theoretically approaches N^2 as the spacing approaches zero.
Beamforming techniques that exploit this property of closely spaced endfire
arrays are termed superdirective beamformers. For speech processing
applications, superdirective methods are useful for obtaining acceptable array
performance at low frequencies with realistic array dimensions. The wavelength
of acoustic waves at 500 Hz is approximately 0.68 m (taking the speed of sound
as 340 m/s), so sensor elements spaced closer than 0.34 m (λ/2) in an endfire
configuration can be used in the low frequency range to improve performance.
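
The N versus N^2 behaviour can be checked numerically. The sketch below
computes the directivity factor of a uniform linear endfire array in a diffuse
noise field, once with delay-sum weights and once with weights that maximise
the directivity. The formulation (a sin(kr)/(kr) noise coherence matrix and
weights proportional to the inverse coherence matrix times the steering
vector) is a standard textbook one assumed for this example, as are the
function name and the 340 m/s speed of sound. No regularisation is applied, so
the matrix becomes ill-conditioned for many sensors at very small spacings.

```python
import numpy as np

C = 340.0  # assumed speed of sound in m/s

def directivity_factor(num_mics, spacing, freq, superdirective):
    """Directivity factor of a uniform linear endfire array in diffuse noise."""
    pos = spacing * np.arange(num_mics)
    k = 2.0 * np.pi * freq / C
    # Steering vector for a far-field source on the array axis (endfire).
    d = np.exp(-1j * k * pos)
    # Diffuse-field coherence sin(k r)/(k r); np.sinc(x) = sin(pi x)/(pi x).
    dist = np.abs(pos[:, None] - pos[None, :])
    gamma = np.sinc(2.0 * freq * dist / C)
    if superdirective:
        w = np.linalg.solve(gamma, d)   # gamma^{-1} d
        w = w / np.vdot(d, w)           # normalise to unity gain at endfire
    else:
        w = d / num_mics                # delay-sum weights
    signal_gain = np.abs(np.vdot(w, d)) ** 2
    noise_gain = np.real(np.vdot(w, gamma @ w))
    return signal_gain / noise_gain
```

At half-wavelength spacing the coherence matrix is the identity and the
delay-sum directivity factor equals N; for a two-sensor superdirective array
the value tends towards 4 as the spacing shrinks.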
E. Near-field Superdirective Beamforming
Low frequency performance is
problematic for conventional beamforming techniques because large wavelengths
give negligible phase differences between closely spaced sensors, leading to
poor directive discrimination. Tager [3] states that delay-weight-sum
beamformers can roughly cover the octave band 1 < L/λ < 2, where L is the
array dimension, before excessive loss of directivity occurs. A frequency of
100 Hz corresponds to a wavelength of 3.4 m, which would require an array
dimension of 3.4 m < L < 6.8 m; this is impractical for many applications.
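
The array-dimension estimate is simple arithmetic and can be reproduced as
follows. The sketch assumes a speed of sound of 340 m/s and reads the octave
band as an array dimension between one and two wavelengths, which is
consistent with the 3.4 m to 6.8 m range quoted above. The function name and
the ratio parameters are hypothetical.

```python
C = 340.0  # assumed speed of sound in m/s

def aperture_range(freq, low_ratio=1.0, high_ratio=2.0):
    """Wavelength and array-dimension bounds (in metres) at a given frequency."""
    wavelength = C / freq
    return wavelength, low_ratio * wavelength, high_ratio * wavelength

wavelength, l_min, l_max = aperture_range(100.0)
print(f"lambda = {wavelength:.1f} m, {l_min:.1f} m < L < {l_max:.1f} m")
# prints: lambda = 3.4 m, 3.4 m < L < 6.8 m
```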
One method that addresses this problem is near-field superdirectivity,
a technique proposed by Tager [4]. It takes the amplitude differences between
sensors into account as well as the phase
differences. While the phase differences are negligible
at low frequencies, the amplitude differences are significant for a source in
the near field, particularly when the sensors are placed in an endfire
configuration, as this maximises the difference in distance from the source to
each microphone.
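
A short calculation illustrates why amplitude carries the useful information
at low frequencies. The sketch assumes a point source on the array axis, 1/r
spherical spreading and a speed of sound of 340 m/s. The function name and the
distances in the usage note are hypothetical values chosen for illustration.

```python
import numpy as np

C = 340.0  # assumed speed of sound in m/s

def near_field_differences(freq, source_dist, spacing):
    """Phase (rad) and level (dB) difference between two endfire microphones.

    source_dist : distance from the point source to the nearer microphone (m)
    spacing     : distance between the two microphones (m)
    """
    r1 = source_dist
    r2 = source_dist + spacing
    phase_diff = 2.0 * np.pi * freq * (r2 - r1) / C
    # Spherical spreading: pressure amplitude falls off as 1/r.
    level_diff_db = 20.0 * np.log10(r2 / r1)
    return phase_diff, level_diff_db
```

For a 100 Hz source 0.1 m from the first of two microphones spaced 0.1 m
apart, the phase difference is only about 0.18 rad, while the level difference
is about 6 dB.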
F. Generalised Sidelobe Canceler (GSC)
Fixed beamformers cannot track changes in the noise field. The best-known
adaptive beamforming technique that addresses this limitation is due to
Frost [5]. Frost's
algorithm belongs to a class of beamformers known as linearly constrained
minimum variance (LCMV) beamformers. Perhaps the most commonly used LCMV
beamforming technique is the generalised sidelobe canceler (GSC) [6]. It
separates the adaptive beamformer into two main processing paths. The first of
these implements a standard fixed beamformer, with constraints on the desired
signal. The second path is the adaptive portion, which provides a set of filters
that adaptively minimise the power in the output.
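
The two-path structure can be sketched as follows. This is a simplified
illustration under several assumptions that do not come from the cited papers:
the inputs are taken to be already time-aligned towards the desired source,
the blocking stage is a plain difference of adjacent channels, and the
adaptive filters are updated with a normalised LMS (NLMS) rule. The function
name and the parameters `num_taps`, `mu` and `eps` are hypothetical.

```python
import numpy as np

def gsc(x, num_taps=8, mu=0.1, eps=1e-8):
    """Simplified generalised sidelobe canceller.

    x : array of shape (num_mics, num_samples), assumed already time-aligned
        so that the desired signal is identical in every channel.
    Returns (output, fixed), where fixed is the fixed-beamformer output.
    """
    num_mics, num_samples = x.shape
    # Path 1: fixed beamformer (delay-sum on aligned channels, unity gain).
    fixed = x.mean(axis=0)
    # Path 2: adjacent differences cancel the aligned desired signal,
    # leaving noise-only reference signals.
    blocked = x[:-1] - x[1:]
    w = np.zeros((num_mics - 1, num_taps))    # adaptive filter coefficients
    buf = np.zeros((num_mics - 1, num_taps))  # most recent reference samples
    y = np.zeros(num_samples)
    for t in range(num_samples):
        buf = np.roll(buf, 1, axis=1)
        buf[:, 0] = blocked[:, t]
        # Subtract the adaptive noise estimate from the fixed path.
        y[t] = fixed[t] - np.sum(w * buf)
        # NLMS step: move the filters so as to reduce the output power.
        w += mu * y[t] * buf / (np.sum(buf * buf) + eps)
    return y, fixed
```

Because the reference signals ideally contain no desired signal, minimising
the output power removes noise from the fixed path without cancelling the
target. In practice, imperfect alignment lets some of the desired signal leak
into the second path, which is one way such a structure can distort the
output.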
III. Overview of Beamforming Techniques
This section summarises the
important characteristics of the beamforming techniques discussed in the
previous section. Table I indicates whether each technique is fixed or
adaptive, the noise conditions for which it is optimal, and its optimal array
configuration. Table II lists the advantages and disadvantages of the
techniques.
TABLE I. Properties of Beamforming Techniques

Technique        Fixed/Adaptive   Noise Condition   Array Configuration
Delay-sum        Fixed            Incoherent        Broadside
Subarray         Fixed            Incoherent        Broadside
Superdirective   Fixed            Diffuse           Endfire
Near-field       Fixed            Diffuse           Endfire
GSC              Adaptive         Coherent          Broadside
TABLE II. Advantages and Disadvantages of Beamforming Techniques

Technique        Advantages                           Disadvantages
Delay-sum        Simplicity                           Low frequency performance, narrow band
Subarray         Broadband                            Low frequency performance
Superdirective   Optimised array gain                 Assumes diffuse noise
Near-field       Optimised array gain, near-field     Assumes diffuse noise, assumes noise
                 sources, low frequency performance   in far-field
GSC              Adapts to noise conditions,          Low frequency performance, can
                 minimises output noise power         distort in practice
Acknowledgment
This work was supported by the Mechanical Engineering Department of
Hacettepe University.
References
[1] S. Haykin, Array Signal Processing. Prentice Hall, 1985.
[2] J. Chen, J. Benesty, and C. Pan, On the Design and Implementation of Linear
Differential Microphone Arrays, J. Acoust. Soc. Am., 136:3097-3113.
[3] W. Tager, Near Field Superdirectivity (NFSD), in Proceedings of ICASSP,
pp. 2045-2048, 1998.
[4] W. Tager, Etudes en Traitement d'Antenne pour la Prise de Son, PhD thesis,
Universite de Rennes, 1998.
[5] O. L. Frost, An Algorithm for Linearly Constrained Adaptive Array
Processing, Proceedings of the IEEE, vol. 60, pp. 926-935, August 1972.
[6] L. Griffiths and C. Jim, An Alternative Approach to Linearly Constrained
Adaptive Beamforming, IEEE Trans. on Antennas and Propagation, vol. 30(1),
pp. 27-34, January 1982.