Abstract— Beamforming based on a microphone array is a method to identify
sound sources. It can visualize the sound field of the source plane and reveal
interesting acoustic information. This paper presents a tutorial on
fundamental array processing and beamforming theory relevant to microphone
arrays. Microphone arrays have great potential in practical applications of
speech processing operations.
Array processing involves the use
of multiple sensors to receive or transmit a signal carried by propagating
waves. Sensor arrays have applications in a diversity of fields such as sonar, defense industry, seismology,
astronomy, tomography, smart home systems etc. In this paper we are going to
shed some light on microphone arrays which are used to receive acoustic
signals, or more specifically speech signals. Then we are going to examine a
technique called “Beamforming”.
Starting from the invention of
telephone systems in the late 19th century, sound signal acquisition has been
an essential part of speech processing. Most early sound acquisition systems
used only a single microphone, but such systems perform poorly in challenging
acoustic environments where there are noise, echo, reverberation
and interference. For better control of the mentioned problems and
preservation of spatial sound realism, multiple microphone systems were developed.
In the literature, microphone
arrays are generally classified into two major categories: additive and
differential. The first refers to arrays with large sensor spacing whose
outputs respond to the acoustic pressure field, whereas the second refers
to arrays with small sensor spacing whose outputs respond to the differential
acoustic pressure field of different orders. Both types of arrays have their
own pros and cons and they will be investigated later on.
A microphone array system
consists of two important components: hardware and algorithms. The
selection of sensors, amplifiers and multichannel converters is out of our
scope. For the latter, a large variety of processing algorithms has been
studied in the literature in order to enhance certain signals or signal
components from the microphones’ outputs, e.g. channel identification, channel
equalization, multichannel noise reduction, blind source separation and
beamforming.
Beamforming is a technique used
to process microphone array data in order to find the direction of incident
acoustic waves and estimate the power of a sound source [1]. Beamforming consists
of designing a spatial filter that can take advantage of the spatiotemporal
information embedded in the microphone array outputs to form a response with
different sensitivities to sounds arriving from different directions.
Research in microphone array
beamforming started in the late 1960s although some of the fundamental
principles can be traced back to the 1930s when directional microphones were invented [2].
Early works in this area were strongly influenced by the sensor array theory
developed in the field of radar and sonar.
Beamforming techniques can be
broadly classified as being either data-independent or data-dependent.
Data-independent or fixed beamformers are so named because their parameters are
fixed during operation. However, data-dependent or adaptive beamforming
techniques continuously update their parameters based on the received signals.
The next chapter summarises beamforming techniques, indicating their
advantages and disadvantages.
A. Delay-sum Beamforming
The simplest of all microphone
array beamforming techniques is delay-sum beamforming. It is
so named because the time-domain sensor inputs are first delayed by τ_n
seconds, and then summed to give a single array output. Usually, each channel
is given an equal amplitude weighting in the summation so that the directivity
pattern demonstrates unity gain in the desired direction.
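As an illustration, a minimal delay-sum beamformer for a uniform linear array might look like the following sketch; the function name, array geometry and the integer-sample delay approximation are illustrative assumptions, not taken from the text:

```python
import numpy as np

def delay_sum_beamform(signals, mic_positions, doa_deg, fs, c=343.0):
    """Minimal delay-sum beamformer sketch for a linear array.

    signals: (N, T) array, one row of samples per microphone
    mic_positions: (N,) sensor positions along the array axis, in metres
    doa_deg: desired look direction, measured from broadside, in degrees
    fs: sampling rate in Hz; c: speed of sound in m/s
    """
    n_mics, n_samples = signals.shape
    # Relative plane-wave arrival delay tau_n at each sensor
    delays = np.asarray(mic_positions) * np.sin(np.deg2rad(doa_deg)) / c
    delays -= delays.min()           # make every delay non-negative
    out = np.zeros(n_samples)
    for x, tau in zip(signals, delays):
        shift = int(round(tau * fs)) # crude integer-sample delay
        out[shift:] += x[:n_samples - shift]
    return out / n_mics              # equal weights: unity gain in look direction
```

Fractional delays would normally be implemented with interpolation or in the frequency domain; integer rounding is used here only to keep the sketch short.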
B. Filter-sum Beamforming
The delay-sum beamformer belongs
to a more general class known as filter-sum beamformers, in which both the
amplitude and phase weights are frequency dependent. In practice, most
beamformers are implemented as filter-sum beamformers.
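A frequency-domain sketch of this idea, assuming FFT-bin-wise complex weights (the helper name and weight layout are hypothetical):

```python
import numpy as np

def filter_sum_beamform(signals, weights):
    """Frequency-domain filter-sum beamformer sketch.

    signals: (N, T) time-domain sensor inputs
    weights: (N, F) complex weights, one per sensor and rFFT bin,
             carrying both frequency-dependent amplitude and phase
    """
    spectra = np.fft.rfft(signals, axis=1)                 # (N, F)
    out_spectrum = np.sum(np.conj(weights) * spectra, axis=0)
    return np.fft.irfft(out_spectrum, n=signals.shape[1])
```

Delay-sum falls out as the special case w_n(f) = exp(-2πj f τ_n)/N, i.e. constant amplitude and linear phase across frequency.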
C. Subarray Beamforming
The dependency on the operating
frequency means that the response characteristics (beam-width, sidelobe level)
will only remain constant for narrow-band signals where the bandwidth is not a
significant proportion of the centre frequency. However, speech is a broad-band
signal, meaning that a single linear array design is inadequate if a
frequency-invariant beam-pattern is desired. One simple method is to implement the array
as a series of subarrays. These subarrays are designed to give desired response
characteristics for a given frequency range. The subarrays are generally
implemented in a nested fashion such that any given sensor may be used in more
than one subarray. Each subarray is restricted to a different frequency range
by applying band-pass filters. An illustration of a design covering 4 different
frequency bands is shown in Figure 1.
Figure 1: Sample nested subarray design.
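The nesting idea can be sketched as simple index bookkeeping; all concrete numbers below (33 sensors, 5 sensors per subarray, 4 bands) are illustrative assumptions rather than values taken from the figure:

```python
def nested_subarrays(n_sensors=33, sensors_per_sub=5, n_bands=4):
    """Return sensor index sets for nested subarrays on a uniform line.

    Each successive subarray doubles the inter-sensor spacing, so it
    serves an octave-lower frequency band with a similar beam-width,
    and shares sensors (at least the centre one) with the others.
    """
    centre = n_sensors // 2
    half = sensors_per_sub // 2
    subarrays, step = [], 1
    for _ in range(n_bands):
        subarrays.append([centre + j * step for j in range(-half, half + 1)])
        step *= 2  # double the spacing -> halve the band served
    return subarrays
```

In a full implementation each subarray output would then be band-pass filtered to the frequency range it serves before the outputs are combined.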
D. Superdirective Beamforming
Conventional linear arrays with sensors spaced at λ/2
have directivity that is approximately proportional to the number of sensors,
N. It has been found that the directivity of linear endfire arrays
theoretically approaches N² as the spacing approaches zero in a
diffuse noise field. Beamforming techniques that use this capability for
closely spaced endfire arrays are termed superdirective beamformers. For speech
processing applications, superdirective methods are useful for obtaining
acceptable array performance at low frequencies for realistic array dimensions.
The wavelength for acoustic waves at 500 Hz is approximately 0.66 m and that is
why sensor elements spaced closer than 0.33 m in an endfire configuration can be
used in the low frequency range to improve performance.
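One common formulation (an assumption here, since the text does not give one) obtains superdirective weights as the minimum-variance solution against a spherically isotropic, i.e. diffuse, noise field; the regularisation term `mu` is a hypothetical knob added for robustness:

```python
import numpy as np

def superdirective_weights(positions, f, c=343.0, mu=1e-2):
    """Superdirective weights for an endfire look direction (sketch).

    positions: sensor coordinates along the array axis, in metres
    f: frequency in Hz; mu: diagonal loading for white-noise-gain control
    """
    pos = np.asarray(positions, dtype=float)
    n = len(pos)
    # Diffuse-field coherence: Gamma_ij = sinc(2 f d_ij / c)
    # (np.sinc is the normalised sinc, sin(pi x) / (pi x))
    d_ij = np.abs(pos[:, None] - pos[None, :])
    gamma = np.sinc(2.0 * f * d_ij / c) + mu * np.eye(n)
    # Steering vector toward endfire (wave travelling along the array axis)
    d = np.exp(-2j * np.pi * f * pos / c)
    gi_d = np.linalg.solve(gamma, d)
    return gi_d / (d.conj() @ gi_d)  # distortionless constraint: w^H d = 1
```

At 500 Hz with centimetre-scale spacing the inter-sensor phase differences are small, and it is the inverse coherence matrix that supplies the directivity gain beyond a plain delay-sum array.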
E. Near-field Superdirective Beamforming
Low frequency performance is
problematic for conventional beamforming techniques because large wavelengths
give negligible phase differences between closely spaced sensors, leading to
poor directive discrimination. Tager [3] states that delay-weight-sum
beamformers can roughly cover one octave band before excessive loss of
directivity occurs. A frequency of 100 Hz corresponds to a wavelength of 3.4 m;
this would give an array dimension of 3.4 m < L < 6.8 m, which is impractical for many applications. A technique proposed by Tager [4], called near-field superdirectivity, addresses this problem. It takes the amplitude differences into account as well as the phase differences: while the phase differences are negligible at low frequencies, the amplitude differences are significant, particularly when the sensors are placed in an endfire configuration, as this maximises the difference in the distance from the source to each microphone.

F. Generalised Sidelobe Canceler (GSC)
The most famous adaptive beamforming technique that addresses this limitation is derived from Frost [5]. Frost's algorithm belongs to a class of beamformers known as linearly constrained minimum variance (LCMV) beamformers. Perhaps the most commonly used LCMV beamforming technique is the generalised sidelobe canceler (GSC) [6]. It separates the adaptive beamformer into two main processing paths. The first of these implements a standard fixed beamformer, with constraints on the desired signal. The second path is the adaptive portion, which provides a set of filters that adaptively minimise the power in the output.

III. Overview of Beamforming Techniques
This section summarises the important characteristics of the beamforming techniques discussed in the previous chapter. Table I indicates whether each technique is fixed or adaptive, the optimal noise conditions for its use and its optimal array configuration. Table II indicates the advantages and disadvantages of the techniques.

TABLE I. Properties of Beamforming Techniques

Technique       Fixed/Adaptive   Noise Condition   Array Conf.
Delay-sum       Fixed            Incoherent        Broadside
Subarray        Fixed            Incoherent        Broadside
Superdirective  Fixed            Diffuse           Endfire
Near-field      Fixed            Diffuse           Endfire
GSC             Adaptive         Coherent          Broadside

TABLE II. Advantages/Disadvantages of Beamforming Techniques

Technique       Advantages                                  Disadvantages
Delay-sum       Simplicity                                  Low frequency performance, narrow band
Subarray        Broadband                                   Low frequency performance
Superdirective  Optimised array gain                        Assumes diffuse noise
Near-field      Optimised array gain, near-field sources,   Assumes diffuse noise, assumes noise
                low frequency performance                   in far-field
GSC             Adapts to noise conditions, minimises       Low frequency performance, can
                output noise power                          distort in practice

Acknowledgment
This work was supported by the Mechanical Engineering Department of Hacettepe University.

References
[1] S. Haykin, Array Signal Processing. Prentice Hall, 1985.
[2] J. Chen, J. Benesty and C. Pan, "On the Design and Implementation of Linear Differential Microphone Arrays," J. Acoust. Soc. Am., vol. 136, pp. 3097-3113.
[3] W. Tager, "Near Field Superdirectivity (NFSD)," in Proceedings of ICASSP, pp. 2045-2048, 1998.
[4] W. Tager, Etudes en Traitement d'Antenne pour la Prise de Son, Ph.D. thesis, Universite de Rennes, 1998.
[5] O. L. Frost, "An Algorithm for Linearly Constrained Adaptive Array Processing," Proceedings of the IEEE, vol. 60, pp. 926-935, August 1972.
[6] L. Griffiths and C. Jim, "An Alternative Approach to Linearly Constrained Adaptive Beamforming," IEEE Trans. on Antennas and Propagation, vol. 30, no. 1, pp. 27-34, January 1982.