Design of broadband beamformers with low complexity

In this article, we consider the design of broadband beamformers with low complexity. In fact, the design problem is multi-objective in nature, trading off between speech distortion and noise suppression. Finding a balance between these two objectives is important in order to achieve a desired sound quality. These measures are introduced as the objectives here. The design can then be obtained via a bi-objective integer programming problem, where the coefficients of the filters are expressed as sums of signed powers-of-two terms. We study two different integer spaces and penalty functions for solving the problem. Then, an algorithm based on a discrete filled function is developed for finding the optimal design. In order to illustrate the effectiveness of the algorithm, real data is used and two broadband beamformers are demonstrated.


Introduction
The increased popularity of wireless cellular telephones and their uses in a variety of occasions has motivated the development of handsfree communication devices. In this particular acoustic environment, the microphone array is required to suppress the car noise as well as the interference from the handsfree loudspeaker, while keeping the distortion of the speech low. Since the mathematical model of this problem is very difficult to construct, sequences of calibration signals are used instead for the design of beamformers [1]. Under this signal model, the least-squares technique (LS) and the signal-to-noise plus interference ratio (SNIR) are often used [2,3] to optimize the performance of the beamformer. However, evaluation results [3] have shown that beamformers designed by LS have very good distortion controls, while the ones designed by SNIR have better suppression levels of both noise and interference when compared to LS. Both approaches cannot control directly the individual level of speech distortion, noise suppression and interference suppression. This problem is partially overcome in [4], where it was proposed to use nonlinear programming to design beamformers with multi-criteria and demonstrated the set of Pareto optimal. Indeed, tradeoffs between different performance indicators in different beamforming systems are important and have been studied in the literature [5][6][7][8][9][10]. However, after achieving the required performances, coefficients of the designed beamformers are of very high precision and hence require significant efforts in order to preserve the performance and trade off the computational efficiency. Indeed, in many designs, the truncation method is still widely employed and quantization errors play a significant role in the dynamic range of filter gain and increase with filter order [11,12]. Furthermore, an efficient implementation of the beamformers in fixed-point arithmetic hardware (such as FPGA) is essential [13] in the production stage. However, there is very little result for finding the finite precision beamformers subject to achieving a certain speech quality.
One way to achieve low complexity is to express the coefficients of the filters as sums of signed powers-of-two (SPT) terms, and minimize the number of SPT terms required. This problem has received a great attention. For the design of high-pass or low-pass finite impulse response (FIR) filters, the use of SPT terms via the least squares criterion or the minimax criterion have been widely studied in the literature. Several optimization methods, such as branch and bound [14], simulated annealing [15], and searching techniques [16,17] have been proposed to tackle this class of problems. For the design of the filter bank, several algorithms have been proposed, such as the genetic algorithm [18], and the tree search algorithm [19]. However, these approaches have not been applied for broadband beamformer designs yet, especially when we consider the model of optimizing signal distortion and noise suppression directly. Effective algorithms are required to tackle this new constrained optimization problem.
The problem of designing FIR filters with low complexity is always formulated as a constrained optimization problem, where the variables take values in -1, 0, and 1. However, the number of variables for this formulation is very large and the proposed methods, such as branch and bound and the tree search algorithm, are very expensive. The heuristic methods such as the genetic algorithm and simulated annealing are not only expensive but also unpredictable. To reduce the number of variables, it is necessary to find a fast conversion between an integer and its minimum combination of signed power of two terms. In [20], a method is proposed to allocate the number of SPT terms to each coefficient value, but the optimum assignment scheme is not guaranteed. In this article, a fast and optimal conversion based on a recursive function is introduced and the problem is transformed into an integer programming problem with the number of variables greatly reduced. For this problem, the optimal solution should be close to the infinite precision solution, which can easily be obtained by gradient-based methods. Hence, the truncation of the infinite precision solution can be set as a good initial point and we are required to search the neighborhood of the initial point in the integer space. A suitable technique is the filled function method, which was first introduced in [21] for global optimization with continuous variables employed in a hybrid setting similar to [22]. It searches for a better minimizer among local minimizers by means of a function, which is called a filled function. A discrete filled function method was later developed in [23] for solving discrete global optimization problems. In this article, a novel method is proposed for the finite precision beamformer design using a discrete filled function. We formulate the design problem and transform it into an unconstrained integer programming problem. By incorporating a procedure for choosing initial points and using a discrete filled function, we develop an efficient algorithm to tackle the problem.
The rest of the article is organized as follows. In Section 2, with the calibration signal, we present the signal model and formulate the beamformer design problem. In Section 3, we transform the original problem into an unconstrained integer programming problem. An algorithm is then developed in Section 4, which is then applied for the off-line design of beamformers. Two examples are given in Section 5 and the numerical results obtained are compared with other methods.

Problem formulation
Unlike the single-channel case, multi-channel beamforming techniques exploit the properties of spatial and temporal distributions of both the speech and noise sources to enhance the performance. The structure of a linear FIR beamformer with several channels is shown in Figure 1.
We assume that there are M elements in the microphone array. In general, the signals received by the ith microphone element can be represented as , and x i I (n) are the source signal, the noise signal and the interference signal, respectively, and n is the discrete time index. Assume that known calibration sequence observations are used for each of these signals. The output of the beamformer is given by where L is the length of the filters and w = ((w 1 ) ⊺ ,..., where N 2 is the allowable number of SPT terms for each coefficient w i (j). Let W denote the set of all those w such that the constraints (2.4) and (2.5) are satisfied.
Basically, the LS formulation and the SNIR formulation are two different kinds of methods for determining the weight matrix. In both approaches, there is no direct control over the level of distortion and the level of noise and interference suppression. In particular, LS is very good at the distortion level, but poor in the suppression level. On the other hand, SNIR is always very good at the suppression level but left a consistently high level of distortion.
For beamformers, it is desired to maximize the noise and interference suppression, while keeping the distortion caused by the beamforming filters at the minimum. In order to measure various quantities, calibration signals are used. DefineP x r s x r s (ω) as the power spectrum estimate of the source signal without filtering, where the rth microphone is chosen as reference, andP y s y s (ω) is the power spectrum estimate of the beamformer output with filtering when the source signal is active alone. The normalized distortion measure can be defined as: CdPy sys (ω) −Pxr where ε{•} denotes the mean value and ω is the frequency. The constant C d is defined as: Similarly, the normalized noise suppression measure and interference suppression measure are, respectively, given by and where C s = 1 C d . Note that in (2.8),P y N y N (ω) and P x r N x r N (ω) are power spectrum estimates of the beamformer output with and without filtering, when the surrounding noise is active alone. In the same manner, P y I y I (ω) andP  when the source signal is active alone, i.e., when the beamformer attenuates the source signal by a specific amount, the noise and interference suppression quantities are reduced by the same amount.
In order to control the suppression and the distortion simultaneously, the filter design problem is formulated as a nonlinear programming problem given below.
Problem 1. Find a w ∈ Wsuch that F 1 (w) ≡ −10 * (log 10 S N (w) + log 10 S I (w)) (2:10) is minimized, subject to where c 1 is a pre-defined distortion level in dB scale. Similar to Problem 1, we can also formulate the constrained nonlinear programming problem as: is minimized, subject to where c 2 is a pre-defined suppression level in dB scale. Remark 1. In some situations, the influence of interference noise can be ignored. In such cases, the term S I in both Problems 1 and 2 can be removed.

Infinite precision solution
Both Problems 1 and 2 are discrete optimization problems. There is a lack of gradient information and hence they are much more difficult to be solved than continuous optimization problems. Here, we will first solve the infinite precision solution of the continuous version of the optimization problem, and use it as an initial guess to search for a finite precision solution in the second stage.

Equivalent form
Basically, the computation ofP y s y s ,P y N y N andP y I y I are very expensive if we calculate them by using the filtered signals y s ,y N , and y I for every w [4]. To simplify the computation, we take the discrete time Fourier transform of both sides of (2.2) yielding where W k (ω), X k (ω), and Y(ω) are the Fourier transforms of w k , x k , and y, respectively. Denote¯as the conjugate symbol, we havê whereP x k x j (ω) is the cross power spectrum of x k and x j . Since for each k, W k (ω) can be expressed as where ξ (ω)(ξ (ω)) can be computed directly as a Toeplitz matrix function: Let R ξ (ω) and I ξ (ω) be, respectively, the real part and imaginary part of ξ (ω)(ξ (ω)) given by and let R x k x j (ω) and I x k x j (ω) be the real part and imaginary part ofP x k x j (ω) , respectively. Then, (3.2) can be rewritten aŝ Let Ξ(ω) be a ℝ ML×ML matrix function given by Then, by (3.7)-(3.9), the constant C d defined in (2.7) can be simplified as (3:10) The normalized distortion measure defined in (2.6) can then be rewritten as 11) and the normalized noise suppression measure and interference suppression measure defined in (2.8) and (2.9) can be, respectively, simplified as (3:13) With (2.6), (2.8), and (2.9) replaced by (3.11)-(3.13), the computation of distortion and suppression values has been greatly simplified. For the suppression given by (3.12) (respectively, (3.13)), if the constraint of distortion level is ignored, we obtain the optimal solution of (3.12) (respectively, (3.13)). Denotẽ

Properties
Then, maximizing (3.12) is equivalent to maximizing (3:14) Considering (3.14), the optimal value is given by the maximal eigenvalue ofR and the optimal solutionw * is given by the respective eigenvector. Then, the optimal solution w* maximizing (3.12) is given by We obtain the optimal solution w* by maximizing (3.13) using the same principle.
The continuous versions of the optimization problems of Problems 1 and 2 are nonlinear programming problems. One well-known method for solving this type of problems is the sequential quadratic programming (SQP) which will be employed here. An overview of SQP can be found in [24].

Finite precision solution
For the coefficients of beamformers, it is advantageous to express each one as the sum of signed power-of-two terms. In general, we hope to use less wordlength and less SPT terms to obtain a satisfying performance when compared with the infinite precision solution. However, if the wordlength is not long enough, the truncated finite precision solution will produce very poor performance. As a result, we need to develop a method to find a satisfying finite precision solution by using a relatively small number of SPT terms.

Problem transformation
Let us first construct a transformation to convert Problem 1 to an equivalent integer programming problem.
Ignoring the constraints (2.4) and (2.5), it is easy to see that the set of all (4:1) Then, when the wordlength is taken as b-bit, the minimal number of SPT terms forw is defined as: Then, for each i and j, z i (j) {1 -2 b ,..., 2 b -1}. Thus, when the wordlength is taken as b-bit, the minimal number of SPT terms for some integerz can be defined by (4:3) Remark 2. The computation of P z (z, b)and the conversion from an integer to signed digit code can be implemented by recursive functions. These are given in the Appendix.
Clearly, the constraints (2.4) and (2.5) are equivalent to The cost functions (2.10) and (2.12) are equivalent tō To deal with the constraints, we use the penalty function method. Let Q 1 , Q 2 , Q 3 , Q 4 and Q 5 denote five sufficiently large positive real numbers. The constraints (4.4), (4.5), (2.11), and (2.13) are replaced with and respectively. Then, by adding the penalized terms, the objective functions (4.6) and (4.7) become Problem 3. Find a z ∈ Zsuch that E 1 (z), which is defined by (4.12), is minimized.
Similarly, Problem 2 is transformed into: Problem 4. Find a z ∈ Zsuch that E 2 (z), which is defined by (4.13), is minimized.
Problems 3 and 4 are unconstrained integer programming problems. To solve Problems 3 and 4, we will develop a three-step algorithm. The first step is the selection of an initial point, the second step is a local search, while the third step is a global approach.

Initial point
We will develop an efficient computational method to search for the optimal solutions of Problems 3 and 4. But first, we need to find a good initial point.
In general, it is much more difficult to solve the discrete optimization problem than to solve the continuous version of the optimization problem. In order to select good initial points for Problems 3 and 4, we first find the solutions (infinite precision solutions) of the continuous version of Problems 1 and 2 and then round them to the nearest discrete solutions.
Then, after the optimal solutionŵ of the continuous version of Problem 1 or 2 is obtained, we denotê z = 2 bŵ and select the initial point z 0 as the nearest point in Z toẑ, that is, where T is a function defined by (4:15) in which [z] denotes the largest integer less than or equal toz.

Local search
With an initial point, we can start to search for a local minimizer. The steepest descent algorithm is applied here to find a local minimizer by selecting the point which produces the largest reduction in the value of the objective function over the current point's neighborhood. The definition of neighborhood is given in Definition 1. For any z ∈ Z, the neighborhood of z is defined by (4:16) where e i is the ith unit vector (the ℝ ML vector with the ith component equal to one and all other components equal to zero).
If we have found a point which minimizes the objective function over its neighborhood, then the local search stops and the point obtained is called a local minimizer. The precise definition of local minimizers is given as follows: Based on the two definitions given above, we present a discrete steepest descent algorithm, which will be used as a subroutine, to search for a local minimizer of the cost function E i in the following algorithm.

Global approach
With Algorithm 1, we can find a local minimizer from any initial point. For this problem, there exist many local minimizers, and not all of them are useful in practice. Thus, we shall derive a discrete filled function method to search for the best local minimizer. We introduce the following function based on the one constructed in [23]. (4:17) where ║ ⋅ ║ denotes the usual Euclidean norm. When r > 0, 0 <μ <r/K (K is a sufficiently large real number), (4.17) is called a discrete filled function.
It is not necessary to define the function F μ,p when E i (z) < E i (z*). In this case, we can use the discrete steepest descent method directly with z as the initial point and will obtain a local minimizer, which is better than z* with reference to the objective function.
The search according to this discrete filled function (4.17) takes place as follows. With the starting point z*, the μ[E i (z) -E i (z*)] 2 term favors a solution with lower objective function value while the -r║z -z*║ 2 term favors a solution far away from z*. Combining the effects, the discrete filled function favors a solution whose objective function value is not too much greater than that of z* and at a considerable distance away from z*. The idea is to direct its search towards the direction with the least increase in the objective function value.
To address the situation when we fail to find a point z such that E i (z) <E i (z*) using the discrete filled function (4.17) as the objective function, we choose a positive integer number n s . When the number of searching steps is greater than n s , we stop and return the current local minimizer. This is because for the problem considered in this article, the optimal solution should be in the neighborhood of the continuous solution. Therefore, when the number of searching steps is greater than n s , we can consider that the best local minimizer has been obtained and it is not necessary to continue searching.
Then, with the current local minimizer z* of E i , we present a discrete steepest descent algorithm to search for a point better than z* in the following algorithm.
Algorithm 2. (Subroutine) 1. Set the current local minimizer z* as the initial point z 0 . Set l = 0 be the number of searching steps. 2. For each point z ∈ N (z l )\{z l }, compute the corresponding objective function value F μ,p (z; z*). Choose z' such that F μ,p (z'; z*) is the minimum. Go to Step 3.
3. If Ei(z') <Ei(z*), then stop and return z' which is better than z*, else set l = l + 1 and goto Step 4. 4. If l <n s , go to Step 2, else stop and return z' = z*, which means there is no point better than z*.

Algorithm
The proposed algorithm solving Problem 3 (or Problem 4) is summarized in the following: Algorithm 3. (Main program) 1. Calculate the infinite precision solutionŵof Problem 1 (or Problem 2) by using the SQP method. Compute the initial point z 0 .
2. Apply Algorithm 1 to obtain a local minimizer of the objective function E i with the initial point z 0 . Let the local minimizer be denoted as z*.
3. Apply Algorithm 2 to find a point z' better than z*. If z' ≠ z*, set z 0 = z' and go to Step 2, else stop search and obtain the solution z*, go to Step 4. 4. If the best value E i (z*) is sufficiently large, it means some constraints are not satisfied and the solution is infeasible, else return the solution z* and its value E i (z*). Stop.
It can be seen that Algorithm 3 includes two searches: the search of local minimizer and the search for switching to the better local minimizer. Both searches are based on the discrete steepest descent algorithm.

Simulation results
The proposed method has been used to solve several examples. The results obtained are consistently favorable when compared with results obtained by the truncation method. In this section, we illustrate the results for two examples. The computation was performed in Matlab, where the coefficients in this article are set as μ = 10 −6 , ρ = 1, n s = 10, Q 1 = Q 2 = Q 5 = 100, Q 3 = Q 4 = 10.
In the first example, the measurements were performed in a car environment. The calibration signals were recorded with a sample rate of 12 kHz and with a 300-3400 Hz bandwidth. A linear microphone array with four elements is considered. Noise calibration signals were used, which are emitted individually from the artificial talker and the handsfree loudspeaker as the source and the interference calibration signals, respectively. Interference signals were recorded by emitting an independent sequence of white noise, from the handsfree loudspeaker alone, within the bandwidth. This recording serves as the point source interference calibration signal. Recordings with real speech signals were recorded both individually and while driving. In order to gather background noise signals, the car was driving at a speed of 110 km/h on a paved road. The duration of these signals was 8 s [3].
We consider Problem 1 with the filter length given as 8. The total allowable number of SPT terms is constrained by N 1 = 40 and the allowable number for each coefficient is constrained by N 2 = 3. To see how our method studies, we take an example when the distortion level is taken as c 1 = -25 dB and the wordlength is taken as b = 5. We use the function fmincon in Matlab to solve the infinite precision solution, whose objective value is -23.8079 dB. The truncation solution is obtained, where the objective value is -20.7160 and the number of SPT terms is 60. For the proposed method, we first obtain a local minimizer by Algorithm 1, where the objective value is -21.9350 dB. By the discrete filled function, we apply Algorithm 1 to search for a solution better than the current local minimizer. Then, setting this solution as the initial point, we apply Algorithm 1 to obtain a better local minimizer whose objective value is -22.1598 dB. Again, by applying the discrete filled function four times, we obtain the local minimizer whose objective value is -23.4359 dB. This is then the best solution we found. Hence, we can see that from the truncation method to the proposed method, the total improvement of performance is 2.72 dB and the improvement of the number of SPT terms is 20. Furthermore, the improvement of performance from local search to global search is 1.50 dB.
Next, by choosing N 1 = 50, N 2 = 3, and b = 6, we apply the proposed method to solve the infinite precision solution with the distortion level taken as c 1 = -20 dB, -22 dB, and -25 dB, respectively. The infinite and finite precision solutions and their values are reported in Table 1.
The power spectrum estimates of the unprocessed and processed noise and interference signal using Welch's periodogram in the case of c 1 = -20 dB are depicted in Figures 2 and 3. In the suppression figure, if the line of beamformer output is lower than that of the single sensor observation, then the noise signal or interference signal is suppressed. The much the dashed line is lower than the real line, the much the signal is suppressed, especially at some frequency when the intensity is strong (dB value is high). It can be seen from Figures 2 and 3 that the noise or interference signal has been suppressed.
In the second example, the calibration signals were recorded with a two-element microphone arrays in an anechoic environment. A sample rate of 6 kHz is used. The duration of the signals was 5 s. We consider Problem 2 with the filter length given as 20. The total allowable number of SPT terms is constrained by N 1 = 50 and the allowable number for each coefficient is constrained by N 2 = 3. The wordlength is taken as b = 6. We use the function fmincon in Matlab to solve the infinite precision solution with the suppression level taken as c 2 = -20 dB, -22 dB, and -25 dB, respectively. The infinite and finite precision solutions and their values are reported in Table  2. The power spectrum estimates of the unprocessed and processed noise and interference signal using Welch's periodogram in the case of c 2 = -20 dB are shown in Figure 4. As the dashed line is lower than the real line for all frequencies in Figure 4, the noise signal has been suppressed remarkably.
We can see from Tables 1 and 2 that the sums of SPT terms of the directly truncated solutions are always very large, and neither their values are satisfactory when compared with the infinite precision solutions, nor are the constraints. For the solutions obtained by the discrete filled function method, the sums of SPT terms are reduced remarkably and their values are very close to that of the infinite precision solutions.

Conclusion
In this article, a novel design method is developed for the finite precision beamformers. The design is based on a bi-criteria formulation trading off speech distortion against noise suppression, and the method is based on  the concept of discrete filled functions, where filter coefficients are expressed as sums of SPT terms. An algorithm is derived using gradient descent techniques together with a procedure. From our numerical studies, the proposed method is highly effective, producing superior performance and reasonable design complexity.

Appendix 1: computation of P z (z, b)
When the wordlength is taken as b-bit, the minimal number of SPT terms forz, which is defined as P z (z, b) in (4.3), is computed as a recursive function: Recursive Function (P z (z, b)).
For this recursive function, the first two terms are the terminal conditions and the last two terms are the recursive rules. It is not difficult to see that the recursive rules preserve the property that P z (z, b) is the minimal number of SPT terms forz. For example, in the third term, if P z (z − 1, b) is the minimal number of SPT terms forz − 1 and P z (z + 1, b) is the minimal number of SPT terms forz + 1, then P z (z, b) is the minimal