A refined affine approximation method of multiplication for range analysis in wordlength optimization
EURASIP Journal on Advances in Signal Processing volume 2014, Article number: 36 (2014)
Abstract
Affine arithmetic (AA) is widely used in range analysis in wordlength optimization of hardware designs. To reduce the uncertainty in the AA and achieve efficient and accurate range analysis of multiplication, this paper presents a novel refined affine approximation method, Approximation Affine based on Space Extreme Estimation (AASEE). The affine form of multiplication is divided into two parts. The first part is the approximate affine form of the operation. In the second part, the equivalent affine form of the estimated range of the difference, which is introduced by the approximation, is represented by an extra noise symbol. In AASEE, it is proven that the proposed approximate affine form is the closest to the result of multiplication based on linear geometry. The proposed equivalent affine form of AASEE is more accurate since the extreme value theory of multivariable functions is used to minimize the difference between the result of multiplication and the approximate affine form. The computational complexity of AASEE is the same as that of trivial range estimation (AATRE) and lower than that of Chebyshev approximation (AACHA). The proposed affine form of multiplication is demonstrated with polynomial approximation, B-splines, and multivariate polynomial functions. In experiments, the average of the ranges derived by AASEE is 59% and 89% of that by AATRE and AACHA, respectively. The integer word lengths derived by AASEE are at most 2 b and 1 b shorter than those derived by AATRE and AACHA, respectively.
1 Introduction
Floating point represents real numbers with a wide dynamic range and high precision. It is therefore commonly used to represent signals in applications such as image processing, speech processing, and digital signal processing. When these applications are implemented in hardware for high speed and stability, the signals need to be represented in fixed point to optimize the area, power, and speed of the hardware. Hence, the floating-point values need to be converted to fixed-point values. This process is called wordlength optimization. Its goal is to achieve optimal system performance while satisfying the specification on the system output precision. Wordlength optimization involves range analysis and precision analysis. The former finds the minimum word length of the integer part of a value, while the latter optimizes the word length of the fractional part.
Wordlength optimization has been proven to be an NP-hard problem [1]. Existing approaches can be classified into dynamic analysis [2–7] and static analysis [8–20]. Dynamic analysis simulates a large set of stimulus signals and is applicable to all types of systems. However, it needs long simulation times to provide sufficient confidence, and the precision cannot be guaranteed for signals that were not simulated. By comparison, static analysis is automated and efficient, and it is more applicable to large designs. Static analysis mainly uses the characteristics of the input signals to estimate the word length conservatively, which can result in some overestimation [12]. As a part of wordlength optimization, range analysis can also be classified in the same way.
Affine arithmetic (AA) [21] is often used for range analysis in static analysis. In AA, every signal is represented in an affine form, which is a first-degree polynomial. As AA tracks the correlations among the range intervals of signals, it can provide a more accurate range. This makes it suitable for range analysis of the results of linear operations. It is noted that, besides linear operations, nonlinear operations such as multiplication are also involved in hardware designs, typically in linear time-invariant (LTI) systems. AA cannot provide an exact affine form for nonlinear operations. To solve this problem, Stolfi and de Figueiredo [22] proposed affine approximation methods for multiplication, which include trivial range estimation (AATRE) and Chebyshev approximation (AACHA). AATRE is computationally efficient, but the range it produces can be up to four times the true range. The uncertainty accumulated over all signals in the computational chain may result in an error explosion, which is unacceptable in practice. Such overestimation cannot satisfy the accuracy requirement of the system, which limits the application of AATRE in large systems. The uncertainty of AACHA is smaller than that of AATRE; however, AACHA is too complex to be used in large systems. Since LTI operations are accurately covered by AA, the proposed method is applied to the range analysis of wordlength optimization in this paper.
In this paper, a novel affine approximation method, Approximation Affine based on Space Extreme Estimation (AASEE), is proposed to reduce the uncertainty of multiplication and achieve accurate and efficient range analysis of multiplication. To analyze the uncertainty conveniently, we split every approximation method for multiplication, including AATRE, AACHA, and AASEE, into two parts. The first part is the approximate affine form, which approximates the nonlinear operation. The second part is the equivalent affine form of the estimated range of the difference between the result of multiplication and the approximate affine form. The more accurate the two parts are, the more accurate the approximation method is. Based on linear geometry [23], it is proven that the proposed approximate affine form is the closest to the result of multiplication. To derive the equivalent affine form, we use the extreme value theory of multivariable functions [24] to estimate the upper and lower bounds, in space, of the difference introduced by the approximation of the first part. The uncertainty of the proposed method is thereby minimized. The accuracy of the affine form produced by AASEE is higher than that of AATRE and, on average, higher than that of AACHA. Meanwhile, the computational complexity of AASEE is equivalent to that of AATRE and lower than that of AACHA.
The rest of this paper is organized as follows. The background of range analysis for multiplication is presented in Section 2. Section 3 presents the derivation of the two parts for multiplication. The refined affine form of multiplication, AASEE, is presented in Section 4. In Section 5, we compare the computational complexity and the accuracy of AASEE with those of AATRE and AACHA. The case studies and experimental results are presented in Section 6. Section 7 concludes the paper.
2 Background
2.1 Related work
Interval arithmetic (IA) and affine arithmetic (AA) have been widely used in range analysis in wordlength optimization.
IA [25] is a range arithmetic theory that was first presented by Moore in 1962. Cmar [2] employs it for range analysis of digital signal processing (DSP) systems. Carreras [20] presents a method based on IA. To reduce the oversized word length, the method provides the probability density functions that can be used when some truncation must be performed due to constraints in the specification. IA is not suitable for most real-world applications, since it can lead to drastic overestimation of the true range.
AA [21] was proposed by Stolfi in 1993 to overcome the weakness of IA. In [8, 9], Fang uses AA to analyze wordlength optimization. Both range and precision are represented by the same affine form, which limits the optimization. Pu and Ha [10] also use AA for wordlength optimization. They use two different affine forms for range analysis and precision analysis, respectively, and achieve a more refined result of wordlength optimization. Similarly, Lee et al. [11] develop an automatic optimization approach, called MiniBit, which produces accuracy-guaranteed solutions and minimizes area while meeting an error constraint. Osborne [12] uses both IA and AA for range analysis in different situations. Computation using either of the two methods in the design is time-consuming. The problem of overestimation is serious due to the approximation of the nonlinear operations.
Since AA cannot be used in systems with an infinite number of loops, an improved approach, quantized AA (QAA), has been proposed in [13] for linear time-invariant systems with feedback loops. This method can provide fast and tight estimation of the evolution of large sets of numerical inputs, using only an affine-based simulation, but it does not provide the exact bounds.
AATRE [22] is adopted for multiplication in most of these works because of its low computational complexity, but the uncertainty of the range produced by AATRE is very large. To adjust the tradeoff between the accuracy of approximation and computational complexity, Zhang [14] introduces a new parameter N in the N-level simplified affine approximation (NSAA). This method is faster than AACHA and more accurate than AATRE, but it is more complex than AATRE. Furthermore, it is troublesome to choose a suitable N. A method of range analysis is proposed by Pang [26]. This method combines IA, AATRE, and arithmetic transform (AT); its result is more accurate than that of AATRE, while its CPU time is longer than that of AATRE. To deal with applications from the scientific computing domain, Kinsman [17, 18] uses computational methods based on Satisfiability Modulo Theory. The search efficiency of this method is improved, leading to tighter bounds and thus smaller word lengths.
For all the existing methods, the accuracy of approximation is improved at the expense of the computational complexity. This paper presents an affine approximation method for multiplication, which achieves better tradeoff between accuracy and computational complexity.
2.2 Range analysis
Range analysis involves studying the data range of every signal and minimizing the integer word lengths for signals on the premise that the signals in the design have enough bits to accommodate this range. The range of signal x is represented by x= [x_{min}, x_{max}], where the two real numbers, x_{min} and x_{max}, denote the lower and upper bounds of x, respectively. The required integer part of the word length for signal x, which is represented as IWL_{ x }, can be derived by:
In (1), all the signals in the design are assumed to be expressed as signed numbers, and the sign bit is taken into account in IWL_{ x }. According to (1), once the range of a signal is decided, the integer part of word length of the signal can be derived.
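The step from a range to an integer word length can be sketched as follows. This is a minimal illustration under one common convention, assumed here and not taken from (1): a signed two's-complement format with n integer bits, sign bit included, covers the span [-2^(n-1), 2^(n-1)). The function name `integer_word_length` is hypothetical.

```python
def integer_word_length(x_min: float, x_max: float) -> int:
    """Smallest signed integer word length n (sign bit included) whose span
    [-2**(n-1), 2**(n-1)) contains [x_min, x_max].

    Assumption of this sketch: two's-complement convention; the paper's
    equation (1) may use a different but equivalent-in-spirit formula.
    """
    if x_min > x_max:
        raise ValueError("x_min must not exceed x_max")
    n = 1  # the sign bit alone covers [-1, 1)
    while not (-2 ** (n - 1) <= x_min and x_max < 2 ** (n - 1)):
        n += 1
    return n


print(integer_word_length(0.0, 0.6931))   # 1
print(integer_word_length(-30.0, 28.0))   # 6
print(integer_word_length(-42.0, 42.0))   # 7
```

The last two calls show why tight ranges matter: widening the estimated range from [-30, 28] to [-42, 42] costs one extra integer bit under this convention.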
2.3 Affine arithmetic
AA is widely applied for range analysis. In AA, an uncertain signal x is represented by an affine form, which is a first-degree polynomial [22]:
\widehat{x} = x_{0} + \sum_{i=1}^{n} x_{i}\epsilon_{i}, \quad \epsilon_{i} \in [-1, 1] \qquad (2)
For the signal x, x_{0} is the central value, and ε_{i} is the ith noise symbol. ε_{i} denotes an independent uncertainty source that contributes to the total uncertainty of the signal x, and x_{i} is its coefficient.
The upper and lower bounds for the range of x can be represented as
x_{max} = x_{0} + \sum_{i=1}^{n} |x_{i}|, \quad x_{min} = x_{0} - \sum_{i=1}^{n} |x_{i}| \qquad (3)
With x_{min} and x_{max}, the input interval \bar{x} = [x_{min}, x_{max}] can be converted into an equivalent affine form as (4), using only one independent noise symbol.
\widehat{x} = \frac{x_{max} + x_{min}}{2} + \frac{x_{max} - x_{min}}{2}\epsilon_{1} \qquad (4)
AA can keep correlations among the signals of the computational chain by contributing the same noise symbol ε_{i} to each signal [22].
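The affine form, its bounds, and the interval conversion described above can be sketched in a few lines. The class name `AffineForm` and its methods are hypothetical names chosen for this illustration; the sketch stores the coefficients in a dictionary keyed by noise-symbol index so that shared symbols line up across signals.

```python
from dataclasses import dataclass, field


@dataclass
class AffineForm:
    """x0 + sum_i coeffs[i] * eps_i, with every eps_i in [-1, 1]."""
    x0: float
    coeffs: dict = field(default_factory=dict)  # noise-symbol index -> coefficient

    @staticmethod
    def from_interval(x_min, x_max, symbol):
        # Midpoint plus half-width times one independent noise symbol.
        return AffineForm((x_max + x_min) / 2, {symbol: (x_max - x_min) / 2})

    def bounds(self):
        radius = sum(abs(c) for c in self.coeffs.values())
        return self.x0 - radius, self.x0 + radius

    def __add__(self, other):
        keys = set(self.coeffs) | set(other.coeffs)
        merged = {k: self.coeffs.get(k, 0.0) + other.coeffs.get(k, 0.0) for k in keys}
        return AffineForm(self.x0 + other.x0, merged)

    def __neg__(self):
        return AffineForm(-self.x0, {k: -c for k, c in self.coeffs.items()})

    def __sub__(self, other):
        return self + (-other)


x = AffineForm.from_interval(0.0, 1.0, symbol=1)  # 0.5 + 0.5 * eps_1
print(x.bounds())        # (0.0, 1.0)
print((x - x).bounds())  # (0.0, 0.0): the shared noise symbol cancels
```

The second result illustrates the correlation tracking: plain interval subtraction of [0, 1] from itself would give [-1, 1], whereas the shared noise symbol makes the affine result collapse to zero.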
For multiplication, AATRE and AACHA are typical approximation methods.
The affine form of AATRE is
\widehat{z} = x_{0}y_{0} + \sum_{i=1}^{n} (x_{0}y_{i} + y_{0}x_{i})\epsilon_{i} + \left(\sum_{i=1}^{n} |x_{i}|\right)\left(\sum_{i=1}^{n} |y_{i}|\right)\epsilon_{n+1} \qquad (5)
Suppose M_{1} = max(n_{1}, n_{2}), in which n_{1} and n_{2} denote the numbers of noise symbols with nonzero coefficients in \widehat{x} and \widehat{y}, respectively. The computational complexity of AATRE is O(M_{1}).
AACHA provides a better approximation result, but it is more complex. The affine form of AACHA is
\widehat{z} = x_{0}y_{0} + \sum_{i=1}^{n} (x_{0}y_{i} + y_{0}x_{i})\epsilon_{i} + \frac{a + b}{2} + \frac{b - a}{2}\epsilon_{n+1} \qquad (6)
where a and b denote the minimum and the maximum of the range of \left(\sum_{i=1}^{n} x_{i}\epsilon_{i}\right)\left(\sum_{i=1}^{n} y_{i}\epsilon_{i}\right). Suppose M_{2} = n_{1} + n_{2}. The complexity of computing both extremal values, a and b, is O(M_{2} log M_{2}). As M_{1} ≤ M_{2}, the computational complexity of AATRE is lower than that of AACHA [22].
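A minimal sketch of the AATRE product may help make the trivial range estimate concrete. The function names `aatre_multiply` and `bounds` are hypothetical, and the operands are those of the Figure 1 example in Section 3. The affine part is x_0 y_0 plus the cross terms, and the whole quadratic remainder is bounded by the product of the two coefficient sums and attached to one fresh noise symbol.

```python
def aatre_multiply(x0, xs, y0, ys):
    """Multiply x0 + sum xs[i]*eps_i by y0 + sum ys[i]*eps_i (equal-length lists).

    Returns (z0, zs); zs has n + 1 entries, the last one being the coefficient
    of the fresh noise symbol eps_{n+1} that absorbs the quadratic remainder.
    """
    if len(xs) != len(ys):
        raise ValueError("both operands must use the same noise symbols")
    z0 = x0 * y0
    zs = [x0 * yi + y0 * xi for xi, yi in zip(xs, ys)]
    zs.append(sum(abs(xi) for xi in xs) * sum(abs(yi) for yi in ys))
    return z0, zs


def bounds(c0, cs):
    """Range of the affine form c0 + sum cs[i]*eps_i for eps_i in [-1, 1]."""
    radius = sum(abs(c) for c in cs)
    return c0 - radius, c0 + radius


# x = 1 + eps_1 + 5*eps_2 and y = 3 - 6*eps_1 + eps_2
z0, zs = aatre_multiply(1, [1, 5], 3, [-6, 1])
print(z0, zs)          # 3 [-3, 16, 42]
print(bounds(z0, zs))  # (-58, 64)
```

Each coefficient is computed in a single pass over the noise symbols, which is consistent with the linear cost stated above.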
2.4 Extreme value theory
The proposed approximation is based on the extreme value theory of multivariable functions [24].
According to the extreme value theory of multivariable functions, the Hessian matrix of the function, H, and the Jacobian matrix of the function, J, can be used to find the local maxima and the local minima. The Hessian matrix of function f(x_{1}, x_{2}, …, x_{n}) is
H = \begin{bmatrix} \frac{\partial^{2} f}{\partial x_{1}^{2}} & \cdots & \frac{\partial^{2} f}{\partial x_{1}\partial x_{n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial^{2} f}{\partial x_{n}\partial x_{1}} & \cdots & \frac{\partial^{2} f}{\partial x_{n}^{2}} \end{bmatrix} \qquad (7)
Here we use H_{f^{α}} to represent H at a point f^{α} = (x_{1}^{α}, x_{2}^{α}, …, x_{n}^{α}) and J_{f^{α}} to represent J at the point f^{α}.
A stationary point of f, f^{α}, is a point where J_{f^{α}} = 0. H_{f^{α}} is indefinite when it is neither positive semidefinite nor negative semidefinite. If H_{f^{α}} is positive definite, then f^{α} is a local minimum point. If H_{f^{α}} is negative definite, then f^{α} is a local maximum point. If H_{f^{α}} is indefinite, then f^{α} is neither a local maximum nor a local minimum; it is a saddle point. In all other cases, f^{α} is not used in this paper.
The principal minor determinants are used to determine if a matrix is positive or negative definite or semidefinite.
It is necessary and sufficient for a positive semidefinite matrix that all the principal minor determinants of the matrix are nonnegative real numbers.
It is necessary and sufficient for a negative semidefinite matrix that all the odd order principal minor determinants of the matrix are nonpositive real numbers and all the even order principal minor determinants of the matrix are nonnegative real numbers.
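The two principal-minor criteria can be turned into a small classifier. This is an illustrative sketch for small symmetric matrices with exact (integer or rational) entries; the function names are hypothetical, and floating-point inputs would need a tolerance in the sign tests.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def principal_minors(h):
    """Return (order, determinant) for every principal submatrix of h."""
    n = len(h)
    return [(k, det([[h[i][j] for j in idx] for i in idx]))
            for k in range(1, n + 1)
            for idx in combinations(range(n), k)]


def classify(h):
    minors = principal_minors(h)
    if all(d >= 0 for _, d in minors):
        return "positive semidefinite"
    if all((d <= 0 if k % 2 else d >= 0) for k, d in minors):
        return "negative semidefinite"
    return "indefinite"


print(classify([[2, 1], [1, 2]]))     # positive semidefinite
print(classify([[-2, 1], [1, -2]]))   # negative semidefinite
print(classify([[1, 0], [0, -1]]))    # indefinite
```

Note that the test enumerates all principal minors, not only the leading ones; the leading minors alone are sufficient for strict definiteness but not for semidefiniteness.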
3 Derivation of the two parts for multiplication
A generic nonlinear operation z ← f(\widehat{x}, \widehat{y}) proposed in [22] can be described by (8):
z = f(\widehat{x}, \widehat{y}) = f(x_{0} + x_{1}\epsilon_{1} + \cdots + x_{n}\epsilon_{n},\; y_{0} + y_{1}\epsilon_{1} + \cdots + y_{n}\epsilon_{n}) = f^{*}(\epsilon_{1}, \ldots, \epsilon_{n}) \qquad (8)
Since the operation f is nonlinear, f^{∗}(ε_{1}, …, ε_{n}) cannot be expressed exactly as an affine combination of the noise symbols ε_{i}. In this case, an approximate affine form of the operation, denoted f_{z}, must be used to approximate f^{∗}(ε_{1}, …, ε_{n}). The difference introduced by this approximation, d_{f} = f^{∗} − f_{z}, can be expressed by an equivalent affine form of the estimated range of the difference, denoted \widehat{d}. Hence, the affine form of z can be expressed as
\widehat{z} = f_{z} + \widehat{d} \qquad (9)
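In (9), f_{z} is a first-degree function of ε_{i} and can be expressed as (10)
f_{z} = z_{0} + \sum_{i=1}^{n} z_{i}\epsilon_{i} \qquad (10)
The computational complexity of computing the true range of d_{f} is very high in a practical application. The estimated range of d_{f} is utilized instead of the true range. Suppose d_{max} and d_{min} denote the upper and lower bounds of the estimated range of d_{f}, respectively. According to (4), \widehat{d} can be expressed as (11)
\widehat{d} = \frac{d_{max} + d_{min}}{2} + \frac{d_{max} - d_{min}}{2}\epsilon_{n+1} \qquad (11)
With (10) and (11), the affine form of z can be represented as
\widehat{z} = z_{0} + \sum_{i=1}^{n} z_{i}\epsilon_{i} + \frac{d_{max} + d_{min}}{2} + \frac{d_{max} - d_{min}}{2}\epsilon_{n+1} \qquad (12)
For multiplication, z can be expressed as
z = \widehat{x}\widehat{y} = x_{0}y_{0} + x_{0}\sum_{i=1}^{n} y_{i}\epsilon_{i} + y_{0}\sum_{i=1}^{n} x_{i}\epsilon_{i} + \left(\sum_{i=1}^{n} x_{i}\epsilon_{i}\right)\left(\sum_{i=1}^{n} y_{i}\epsilon_{i}\right) \qquad (13)
The first three terms of (13) form an affine form and the last term is a quadratic term. Its affine form can also be represented as (12).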
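According to the definition of f_{z} in (10) and \widehat{d} in (11), AATRE and AACHA can also be represented by f_{z} and \widehat{d}. For AATRE in (5), f_{z} and \widehat{d} are defined as
f_{z} = x_{0}y_{0} + \sum_{i=1}^{n} (x_{0}y_{i} + y_{0}x_{i})\epsilon_{i} \qquad (14)
\widehat{d} = \left(\sum_{i=1}^{n} |x_{i}|\right)\left(\sum_{i=1}^{n} |y_{i}|\right)\epsilon_{n+1} \qquad (15)
For AACHA in (6), f_{z} and \widehat{d} are defined as
f_{z} = x_{0}y_{0} + \sum_{i=1}^{n} (x_{0}y_{i} + y_{0}x_{i})\epsilon_{i} \qquad (16)
\widehat{d} = \frac{a + b}{2} + \frac{b - a}{2}\epsilon_{n+1} \qquad (17)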
In the existing affine approximation methods of AATRE and AACHA, d_{max} and d_{min} are estimated in the XY plane. In these methods, the same noise symbol of different variables is considered to be independent. Hence, the range of \widehat{d} is much larger than that of d_{ f }. The difference between \widehat{d} and d_{ f } will propagate to \widehat{z} and result in uncertainty.
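The effect of treating shared noise symbols as independent can be checked numerically on the operands of the Figure 1 example, \widehat{x} = 1 + ε_{1} + 5ε_{2} and \widehat{y} = 3 − 6ε_{1} + ε_{2}. The sketch below compares the AATRE bound on the quadratic term with the range found by sampling that term on a grid. The grid step of 0.05 and the helper name `quadratic_term` are choices of this illustration, and sampling is used only as a reference, not as a method proposed in the paper.

```python
def quadratic_term(xs, ys, eps):
    """(sum_i xs[i]*eps[i]) * (sum_i ys[i]*eps[i]) at one point eps."""
    return sum(x * e for x, e in zip(xs, eps)) * sum(y * e for y, e in zip(ys, eps))


xs, ys = [1, 5], [-6, 1]  # noise coefficients of the two operands

# AATRE-style bound: the two sums are bounded separately, as if independent.
radius = sum(abs(x) for x in xs) * sum(abs(y) for y in ys)
aatre_range = (-radius, radius)
print(aatre_range)    # (-42, 42)

# Sampled range with eps_1 and eps_2 shared by both factors.
grid = [k / 20 for k in range(-20, 21)]
values = [quadratic_term(xs, ys, (e1, e2)) for e1 in grid for e2 in grid]
sampled_range = (min(values), max(values))
print(sampled_range)  # (-30.0, 28.0)
```

For this example the extremes of the quadratic term are attained at the corners of the domain, so the grid captures them exactly, and the AATRE interval is wider than the sampled one on both sides.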
To describe the multiplication accurately, we use ε_{i} as the input arguments and estimate the range of z in the (n + 1)-dimensional space E^{n+1}, whose axes are labeled (ε_{1}, …, ε_{n}, z). In E^{n+1}, a first-degree polynomial function is an (n + 1)-dimensional hyperplane, and a nonlinear polynomial function is an (n + 1)-dimensional curved surface. The approximate affine form in (10) is therefore a hyperplane in E^{n+1}. Each hyperplane in E^{n+1} can be viewed as a parallel translation of the tangent hyperplane at a certain point of the curved surface. Hence, all possible approximate affine forms for z can be regarded as the tangent hyperplanes at all points of the curved surface in E^{n+1}. The translation amount is taken into account in d_{f}, which is approximated by \widehat{d}. In E^{n+1}, d_{f} can be viewed as the function of the distance between the points of the curved surface and the tangent hyperplane.
Figure 1 shows an example with \widehat{x} = 1 + ε_{1} + 5ε_{2} and \widehat{y} = 3 − 6ε_{1} + ε_{2}. The space is labeled as (ε_{1}, ε_{2}, z). The red mesh surface represents the function z = \widehat{x}\widehat{y} = (1 + ε_{1} + 5ε_{2})(3 − 6ε_{1} + ε_{2}). The blue plane represents the tangent plane f_{z}, z = 3 − 3ε_{1} + 16ε_{2}, at the point z^{α} = (0, 0, 3). All the possible approximate affine forms for z are the tangent planes at all the points of the surface. d_{f} is a function of the distance between z and f_{z}.
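The tangent plane quoted for this example can be checked numerically. The sketch below evaluates the product at the origin and estimates the two partial derivatives by central differences; the helper names and the step size h are choices of this illustration.

```python
def product(e1, e2):
    """z = (1 + eps_1 + 5*eps_2) * (3 - 6*eps_1 + eps_2)."""
    return (1 + e1 + 5 * e2) * (3 - 6 * e1 + e2)


def gradient_at(e1, e2, h=1e-6):
    """Central-difference estimate of (dz/d eps_1, dz/d eps_2)."""
    d1 = (product(e1 + h, e2) - product(e1 - h, e2)) / (2 * h)
    d2 = (product(e1, e2 + h) - product(e1, e2 - h)) / (2 * h)
    return d1, d2


z0 = product(0, 0)
g1, g2 = gradient_at(0.0, 0.0)
print(z0, round(g1, 6), round(g2, 6))  # 3 -3.0 16.0
```

The value 3 and the slopes -3 and 16 reproduce the plane z = 3 − 3ε_{1} + 16ε_{2}; central differences are exact for a quadratic function up to rounding, which is why a modest step size suffices.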
Here we use f_{z^{α}} in (18) to represent the tangent hyperplane at the point z^{α} = (ε_{1}^{α}, ε_{2}^{α}, …, ε_{n}^{α}). Then, the possible approximate affine forms can be represented as f_{z^{α}}, too.
f_{z^{\alpha}} = z^{\alpha} + \sum_{i=1}^{n} z'_{\epsilon_{i}}(\epsilon_{i} - \epsilon_{i}^{\alpha}) \qquad (18)
In (18), z′_{ε_{i}} are the partial derivatives of z with respect to the variables ε_{i} at the point z^{α}.
With the estimated range of d_{ f }, the maximum absolute error of d_{ f } can be expressed as
To reduce the uncertainty, f_{z} must be the closest to the result of multiplication. Hence, f_{z} is the tangent hyperplane whose maximum absolute error is the minimum among those of all the possible affine forms f_{z^{α}}, that is,
Geometrically, f_{z} is the tangent hyperplane whose maximum absolute error is minimized.
f_{ z } is derived by the range of d_{ f }, while \widehat{d} is the equivalent affine form of d_{ f }. It is very complex to compute the true range of d_{ f }. With \widehat{d} in (11), the uncertainty in AA for nonlinear operations is generated due to the difference between the true range of d_{ f } and the estimated range of d_{ f }.
It is much tighter and easier to estimate range of d_{ f } in E^{n+1} space than in the XY plane. Based on the extreme value theory of multivariable functions, the estimated range of d_{ f } in AASEE is derived.
With more accurate d_{max} and d_{min}, f_{ z } and \widehat{d} can be calculated more precisely, and AASEE can achieve a refined affine approximation result.
In the next sections, the estimated range of d_{ f } will be derived firstly, and the two parts will be derived later.
4 AASEE for multiplication
4.1 Estimated range of the difference
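For multiplication, which is expressed as (13), the value of z at the point z^{α} is
z^{\alpha} = \left(x_{0} + \sum_{i=1}^{n} x_{i}\epsilon_{i}^{\alpha}\right)\left(y_{0} + \sum_{i=1}^{n} y_{i}\epsilon_{i}^{\alpha}\right) \qquad (21)
The partial derivatives of z with respect to the variable ε_{i} at the point z^{α} are
z'_{\epsilon_{i}} = x_{i}\left(y_{0} + \sum_{j=1}^{n} y_{j}\epsilon_{j}^{\alpha}\right) + y_{i}\left(x_{0} + \sum_{j=1}^{n} x_{j}\epsilon_{j}^{\alpha}\right) \qquad (22)
Upon substitution for z^{α} and z′_{ε_{i}}, the tangent hyperplane f_{z^{α}} can be expressed as
f_{z^{\alpha}} = \left(x_{0} + \sum_{j=1}^{n} x_{j}\epsilon_{j}^{\alpha}\right)\left(y_{0} + \sum_{j=1}^{n} y_{j}\epsilon_{j}^{\alpha}\right) + \sum_{i=1}^{n}\left[x_{i}\left(y_{0} + \sum_{j=1}^{n} y_{j}\epsilon_{j}^{\alpha}\right) + y_{i}\left(x_{0} + \sum_{j=1}^{n} x_{j}\epsilon_{j}^{\alpha}\right)\right](\epsilon_{i} - \epsilon_{i}^{\alpha}) \qquad (23)
The difference between the (n + 1)-dimensional quadratic surface z and the tangent hyperplane f_{z^{α}} is
d_{f} = z - f_{z^{\alpha}} = \sum_{i,j=1}^{n} x_{i}y_{j}(\epsilon_{i} - \epsilon_{i}^{\alpha})(\epsilon_{j} - \epsilon_{j}^{\alpha}) \qquad (24)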
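The claim that the gap between the product surface and any tangent hyperplane is the double sum of x_{i} y_{j} (ε_{i} − ε_{i}^{α})(ε_{j} − ε_{j}^{α}) can be checked numerically. The sketch below does so for the operands of the Figure 1 example and the tangent point (0.1, 0.1); the helper names, the random seed, and the sample count are choices of this illustration.

```python
import random

x0, xs = 1.0, [1.0, 5.0]    # x = 1 + eps_1 + 5*eps_2
y0, ys = 3.0, [-6.0, 1.0]   # y = 3 - 6*eps_1 + eps_2


def affine(c0, cs, eps):
    return c0 + sum(c * e for c, e in zip(cs, eps))


def z(eps):
    return affine(x0, xs, eps) * affine(y0, ys, eps)


def tangent(eps, alpha):
    """Tangent hyperplane of z at alpha, evaluated at eps."""
    xa, ya = affine(x0, xs, alpha), affine(y0, ys, alpha)
    slopes = [xi * ya + yi * xa for xi, yi in zip(xs, ys)]  # dz/d eps_i at alpha
    return xa * ya + sum(s * (e - a) for s, e, a in zip(slopes, eps, alpha))


def d_f(eps, alpha):
    """Double sum of x_i * y_j * (eps_i - alpha_i) * (eps_j - alpha_j)."""
    return sum(xi * yj * (ei - ai) * (ej - aj)
               for xi, ei, ai in zip(xs, eps, alpha)
               for yj, ej, aj in zip(ys, eps, alpha))


rng = random.Random(0)
alpha = (0.1, 0.1)
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(1000)]
max_gap = max(abs(z(e) - tangent(e, alpha) - d_f(e, alpha)) for e in samples)
print(max_gap < 1e-9)  # True
```

The residual is at rounding level because the product of two affine forms is exactly quadratic, so its second-order expansion around any point has no further remainder.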
Suppose d_{emax} and d_{emin} denote the estimated maximum and minimum of the function value at the domain boundary respectively, and d_{fimax} and d_{fimin} denote the local maxima and the local minima, respectively. The estimated maximum and minimum of multivariable function d_{ f }, d_{max} and d_{min}, can be expressed as
According to (24), the function value at the domain boundary, d_{ fe }, is represented by
To simplify, we observe the extreme case of ∀ε_{ i } = ±1. Under this case, for the first item, it is always positive when i = j. Hence, the estimated function value at the domain boundary, d_{e}, is expressed as
Hence, the maximum and minimum of d_{e}, d_{emax} and d_{emin} are derived as
To simplify the comparison, d_{fimax} and d_{fimin} in (25) and (26) can be expressed as
For the example in Section 3, Figure 2 shows the function d_{f} = −6(ε_{1} − 0.1)^{2} − 29(ε_{1} − 0.1)(ε_{2} − 0.1) + 5(ε_{2} − 0.1)^{2} when ε_{1}^{α} = 0.1 and ε_{2}^{α} = 0.1. The estimated maximum and minimum of d_{f} at the domain boundary, d_{emax} and d_{emin}, are also marked in the figure. Since the values of ε_{i} in (27) are substituted by ∀ε_{i} = ±1, d_{emax} is larger than the maximum of d_{f} and d_{emin} is smaller than the minimum.
The extreme value theory of multivariable functions is used to compare d_{emax}, d_{fimax}, d_{emin}, and d_{fimin}.
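The Hessian matrix of the function d_{f} = \sum_{i,j=1}^{n} x_{i}y_{j}(\epsilon_{i} - \epsilon_{i}^{\alpha})(\epsilon_{j} - \epsilon_{j}^{\alpha}) is
H = \begin{bmatrix} 2x_{1}y_{1} & x_{1}y_{2} + x_{2}y_{1} & \cdots & x_{1}y_{n} + x_{n}y_{1} \\ x_{2}y_{1} + x_{1}y_{2} & 2x_{2}y_{2} & \cdots & x_{2}y_{n} + x_{n}y_{2} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n}y_{1} + x_{1}y_{n} & x_{n}y_{2} + x_{2}y_{n} & \cdots & 2x_{n}y_{n} \end{bmatrix} \qquad (33)
From (33), we can see that H is independent of ε_{i}. It is an expression of x_{i} and y_{i} only. This means that H is the same for all the points in the domain.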
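Because d_{f} is a quadratic form, its second partial derivative with respect to ε_{i} and ε_{j} is x_{i} y_{j} + x_{j} y_{i}, a constant. The sketch below builds this matrix for the operands of the Figure 1 example; the function name `hessian` is hypothetical.

```python
def hessian(xs, ys):
    """H[i][j] = x_i*y_j + x_j*y_i, the constant Hessian of
    d_f = sum_{i,j} x_i * y_j * (eps_i - a_i) * (eps_j - a_j)."""
    n = len(xs)
    return [[xs[i] * ys[j] + xs[j] * ys[i] for j in range(n)] for i in range(n)]


h = hessian([1, 5], [-6, 1])
print(h)     # [[-12, -29], [-29, 10]]

det2 = h[0][0] * h[1][1] - h[0][1] * h[1][0]
print(det2)  # -961
```

For a symmetric 2 × 2 matrix, a negative determinant means the two eigenvalues have opposite signs, so this H is indefinite and the stationary point of d_{f} in the example is a saddle point.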
To determine if H is positive or negative definite or semidefinite, its principal minor determinants are derived as
As introduced in Section 2.4, H is a positive semidefinite matrix, iff it satisfies
H is a negative semidefinite matrix, iff it satisfies
If it satisfies neither (37) nor (38), which means it satisfies (39), H is an indefinite matrix as
According to (37), (38), and (39), we can compare d_{emax}, d_{emin}, d_{fimax}, and d_{fimin}, which are expressed as (29), (30), (31), and (32), respectively. Based on (25) and (26), d_{max} and d_{min} can be identified.
Lemma 1.
The estimated maximum of function d_{f}, d_{max}, equals the estimated maximum of the function value at the domain boundary, and the estimated minimum of function d_{f}, d_{min}, equals the estimated minimum of the function value at the domain boundary. This can be expressed as
d_{max} = d_{emax}, \quad d_{min} = d_{emin} \qquad (40)
Proof.
There are two cases to consider, as ∃x_{ i }y_{ i } < 0 and ∀x_{ i }y_{ i } ≥ 0.
For ∃x_{ i }y_{ i } < 0, (39) is satisfied and H is indefinite. The stationary point is a saddle point, such as the point P in Figure 2. Neither d_{fimax} nor d_{fimin} exists in d_{ f }, that is,
According to (41), Lemma 1 can be proven in this case.
For ∀x_{ i }y_{ i } ≥ 0, H may be positive semidefinite or negative semidefinite. d_{ f } may have local minima or local maxima under this condition.
As ε_{i} ∈ [−1, 1], the following inequalities are established:
If a local maximum lies at z^{α}, the difference between d_{emax} and d_{fimax} is
∀x_{ i }y_{ i } ≥ 0, there exists
According to (25) and (46), we can prove that
Similarly, if a local minimum lies at z^{α}, the difference between d_{emin} and d_{fimin} is
As ∀x_{ i }y_{ i } ≥ 0 in (48), the inequality (49) can be proven:
According to (26) and (49), we can prove that
As (47) and (50) are established, Lemma 1 can be proven in the case of ∀x_{i}y_{i} ≥ 0.
Combining these two cases, Lemma 1 is proven.
According to Lemma 1, d_{max} and d_{min} at a point z^{α} can be computed as d_{emax} and d_{emin} in (29) and (30).
4.2 Expression of the approximate affine form in AASEE
Lemma 2.
When f_{z} represents a tangent hyperplane at the point z^{0} = z_{0} = (0, 0, …, 0), it satisfies (20).
Proof.
According to Lemma 1, (29), and (30), the maximum absolute error of d_{ f } is
So the maximum absolute error between the tangent hyperplane f_{z^{0}} at the point z^{0} = z_{0} = (0, 0, …, 0) and the (n + 1)-dimensional quadratic surface z is
Suppose that there is another point z^{α} ≠ z^{0}, represented by z^{α} = (ε_{1}, ε_{2}, …, ε_{n}), where ε_{i} ∈ [−1, 1] and the ε_{i}, i = 1 … n, are not all equal to 0. The maximum absolute error between the tangent hyperplane f_{z^{α}} at the point z^{α} and the (n + 1)-dimensional quadratic surface \widehat{x}\widehat{y} is
e_{a}(z^{α}) and e_{a}(z^{0}) can be compared by
Because e_{a}(z^{0}) ≤ e_{a}(z^{α}), the tangent hyperplane {f}_{{z}^{0}} at the point z^{0} = z_{0} = (0, 0, …, 0) is the tangent hyperplane whose maximum absolute error is minimized.
It is proven that the chosen f_{ z } is a tangent hyperplane at the point z^{0} = z_{0} = (0, 0, …, 0).
According to Lemma 2, f_{ z } of AASEE denotes the tangent hyperplane at the point z_{0} = (0, 0, …, 0) and can be expressed as
This f_{ z } is the same as the f_{ z }s in AATRE and AACHA.
4.3 Expression of the equivalent affine form in AASEE
According to (55), the d_{f} between the tangent hyperplane f_{z^{0}} and the quadratic surface is
According to Lemma 1, (29), and (30), the estimated maximum and estimated minimum of d_{ f }, d_{max} and d_{min} can be expressed as
n = 1 is a special case and d_{max} and d_{min} can be optimized as
By combining the two cases, d_{emax} and d_{emin} are rewritten as
When n > 1, the range of \widehat{d} can be expressed as
According to (11), the affine form of \widehat{d} can be expressed as
When n = 1, the range of \widehat{d} can be expressed as
The affine form of \widehat{d} can be expressed as
4.4 Formulary of AASEE
According to (12), the affine form of AASEE for multiplication is
It is impossible to obtain the exact affine form for multiplication in AA. The result of multiplication must be approximated by an affine form. Using ε_{i} as the input arguments, the uncertainty of multiplication in AASEE is reduced. The proposed f_{z} is the closest to the result of multiplication among all the possible approximate affine forms, and the upper and lower bounds of \widehat{d} in AASEE are much closer to the true bounds of d_{f}. Hence, the uncertainty in AASEE is smaller than that in AATRE and AACHA. Formed by such f_{z} and \widehat{d}, AASEE creates a refined affine form of multiplication.
5 Comparison of AASEE to AATRE and AACHA
5.1 Computational complexity
The computational complexity of an expression is determined by its most complex item. For n > 1, the most complex item is the coefficient of ε_{n+1}. To make the analysis convenient, we transform this coefficient:
The computational complexity of the minuend is O(M_{1}), where M_{1} is defined in Section 2.3, while the computational complexity of the subtrahend is less than O(M_{1}).
Hence, the computational complexity of AASEE is O(M_{1}). We can see that it is the same as that of AATRE and is lower than that of AACHA.
5.2 Accuracy
The accuracy of \widehat{d} determines the accuracy of the affine approximation methods of multiplication. A more accurate \widehat{d} leads to a more accurate affine approximation result.
For AATRE, \widehat{d} = \sum_{i=1}^{n} |x_{i}| \sum_{i=1}^{n} |y_{i}| \epsilon_{n+1}. In this method, the same noise symbol of different variables is considered to be independent. The range of this \widehat{d} is
\left[-\sum_{i=1}^{n} |x_{i}| \sum_{i=1}^{n} |y_{i}|,\; \sum_{i=1}^{n} |x_{i}| \sum_{i=1}^{n} |y_{i}|\right]
It is much larger than the range of \widehat{d} by AASEE, which is expressed in (62) and (64).
In AACHA, \widehat{d} = \frac{a + b}{2} + \frac{b - a}{2}\epsilon_{n+1}, where a and b represent the estimated range of \widehat{d}. In this method, a polygon in the XY plane is used to find a and b. The domain of \widehat{x}\widehat{y} is bounded by the polygon. However, the polygon is larger than the true domain, and the same noise symbols of different variables are not all taken into account together.
In AASEE, all the same noise symbols of different variables are considered together by \widehat{d}. It is therefore more accurate than \widehat{d} of AATRE. In most cases, it is more accurate than \widehat{d} of AACHA, too.
6 Case studies
The following nonlinear system cases are used to demonstrate the efficiency of the proposed refined affine form of multiplication. These cases are commonly used in signal processing. The first two cases are univariate and come from [11]. The remaining cases are multivariate polynomial functions and come from [27–29].
6.1 Introduction of the cases
Case 1. Polynomial approximation. The first case study is a degree-four polynomial approximation of y = ln(1 + x), where x = [0, 1]. Horner's rule evaluates the polynomial as
where the coefficients are obtained by polynomial curve fitting technique.
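Horner's rule turns a degree-four polynomial into a chain of four multiply-add steps, and each multiplication in that chain is where an affine approximation of the product is applied during range analysis. The sketch below shows the evaluation scheme only. The coefficients are hypothetical stand-ins (the degree-four Taylor polynomial of ln(1 + x)), not the curve-fitted coefficients used in the paper.

```python
def horner(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + ... + coeffs[n]*x**n
    with n multiplications and n additions."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c  # one intermediate signal per step
    return acc


# Hypothetical coefficients: Taylor expansion of ln(1 + x) up to x**4,
# used only to exercise the scheme.
taylor = [0.0, 1.0, -1 / 2, 1 / 3, -1 / 4]
print(horner(taylor, 0.5))

# Sanity check on an exact case: 1 + 2*2 + 3*2**2 = 17
print(horner([1, 2, 3], 2))  # 17.0
```

In a range analysis, every value of `acc` in the loop corresponds to one intermediate signal whose range, and therefore integer word length, has to be determined.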
Case 2. B-splines. Uniform cubic B-splines are commonly used for image warping [30]. The basis functions B_{0}, B_{1}, B_{2}, and B_{3} of the B-spline are defined as
where u = [0, 1].
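For readers who want to experiment with this case, the sketch below evaluates the four basis functions using the standard textbook definition of the uniform cubic B-spline basis. That these match the paper's own definitions term by term is an assumption of this illustration, and the function name `bspline_basis` is hypothetical.

```python
def bspline_basis(u):
    """Standard uniform cubic B-spline basis functions (B0, B1, B2, B3) at u."""
    b0 = (1 - u) ** 3 / 6
    b1 = (3 * u**3 - 6 * u**2 + 4) / 6
    b2 = (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6
    b3 = u**3 / 6
    return b0, b1, b2, b3


for u in (0.0, 0.25, 0.5, 1.0):
    basis = bspline_basis(u)
    print(u, basis, sum(basis))
```

The four values sum to 1 for every u (up to rounding), which is the partition-of-unity property of the basis. Each function is a cubic polynomial in u, so its range analysis again reduces to chains of multiplications and additions.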
Case 3. Multivariate polynomial functions. In the third case, eight multivariate polynomial functions are examined. They are as follows:

1. Savitzky-Golay filter:
f_{1}(X) = 7x_{1}^{3} - 984x_{2}^{3} - 76x_{1}^{2}x_{2} + 92x_{1}x_{2}^{2} + 7x_{1}^{2} - 39x_{1}x_{2} - 46x_{2}^{2} + 7x_{1} - 46x_{2} - 75, where the input range is X = [-2, 2]^{2}
2. Image rejection unit:
f_{2}(X) = 16384(x_{1}^{4} + x_{2}^{4}) + 64767(x_{1}^{2} - x_{2}^{2}) + x_{1} - x_{2} + 57344x_{1}x_{2}(x_{1} - x_{2}), where the input range is X = [0, 1]^{2}
3. A random function:
f_{3}(X) = (x_{1} - 1)(x_{1} + 2)(x_{2} + 1)(x_{2} - 2)x_{3}^{2}, where the input range is X = [-2, 2]^{3}
4. Mitchell function:
f_{4}(X) = 4[x_{1}^{4} + (x_{2}^{2} + x_{3}^{2})^{2}] + 17x_{1}^{2}(x_{2}^{2} + x_{3}^{2}) - 20(x_{1}^{2} + x_{2}^{2} + x_{3}^{2}) + 17, where the input range is X = [-2, 2]^{3}
5. Matyas function:
f_{5}(X) = 0.26(x_{1}^{2} + x_{2}^{2}) - 0.48x_{1}x_{2}, where the input range is X = [-100, 100]^{2}
6. Three-hump function:
f_{6}(X) = 12x_{1}^{2} - 6.3x_{1}^{4} + x_{1}^{6} + 6x_{2}(x_{2} - x_{1}), where the input range is X = [-10, 10]^{2}
7. Goldstein-Price function:
f_{7}(X) = [1 + (x_{1} + x_{2} + 1)^{2}(19 - 14x_{1} + 3x_{1}^{2} - 14x_{2} + 6x_{1}x_{2} + 3x_{2}^{2})] \times [30 + (2x_{1} - 3x_{2})^{2}(18 - 32x_{1} + 12x_{1}^{2} + 48x_{2} - 36x_{1}x_{2} + 27x_{2}^{2})]
8. Ratscheck function:
f_{8}(X) = 4x_{1}^{2} - 2.1x_{1}^{4} + \frac{1}{3}x_{1}^{6} + x_{1}x_{2} - 4x_{2}^{2} + 4x_{2}^{4}, where the input range is X = [-100, 100]^{2}
6.2 Analysis of case 1
For the input range x = [0, 1], the equivalent affine form is \widehat{x}=0.5+0.5{\epsilon}_{1}. For case 1, the intermediate and output signals are defined as
Using AATRE, the affine forms of intermediate and output are
Using AACHA, the affine forms of intermediate and output are
Using AASEE, the affine forms of intermediate and output are
Table 1 shows the variable ranges and the range intervals, (y_{max} − y_{min}), of the intermediates and the output obtained by the three methods. The true range of y is [0, 0.6931], so the true range interval of the output is 0.6931. Let R(T), R(C), and R(A) denote the ratios of the range interval obtained by AATRE, AACHA, and AASEE, respectively, to the true range interval. The closer this ratio is to 1, the more accurate the method. In this case, R(T) = 1.33, R(C) = 1.15, and R(A) = 1.03, so the range obtained by AASEE is closer to the true range than those obtained by AATRE and AACHA.
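The ratio of an estimated range interval to the true one can be illustrated with a minimal Python sketch. It assumes the standard affine arithmetic multiplication rule with the trivial range estimate, and it uses the product x·x on x = [0, 1] as a hypothetical stand-in, because the signals of case 1 are not reproduced here. The names `AffineForm`, `mul_trivial`, and `range_ratio` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class AffineForm:
    """x0 + sum_i coeffs[i] * eps_i, with every eps_i in [-1, 1]."""
    center: float
    coeffs: dict = field(default_factory=dict)  # noise symbol index -> coefficient

    @classmethod
    def from_interval(cls, lo, hi, symbol):
        # [lo, hi] -> midpoint + half-width * eps_symbol
        return cls((lo + hi) / 2, {symbol: (hi - lo) / 2})

    def radius(self):
        # Total deviation: sum of the absolute noise coefficients
        return sum(abs(c) for c in self.coeffs.values())

    def interval(self):
        r = self.radius()
        return (self.center - r, self.center + r)

    def mul_trivial(self, other, new_symbol):
        # Linear part: x0*y0 + sum_i (x0*y_i + y0*x_i) * eps_i
        coeffs = {}
        for i in set(self.coeffs) | set(other.coeffs):
            coeffs[i] = (self.center * other.coeffs.get(i, 0.0)
                         + other.center * self.coeffs.get(i, 0.0))
        # Nonlinear part: bounded by rad(x) * rad(y) on a fresh noise symbol
        coeffs[new_symbol] = self.radius() * other.radius()
        return AffineForm(self.center * other.center, coeffs)


def range_ratio(estimated, true):
    """Ratio of the estimated range interval to the true range interval."""
    return (estimated[1] - estimated[0]) / (true[1] - true[0])


x = AffineForm.from_interval(0.0, 1.0, symbol=1)  # 0.5 + 0.5*eps_1
z = x.mul_trivial(x, new_symbol=2)                # 0.25 + 0.5*eps_1 + 0.25*eps_2
estimated = z.interval()                          # (-0.5, 1.0)
ratio = range_ratio(estimated, (0.0, 1.0))        # true range of x*x on [0, 1]
```

In this stand-in example the estimated interval is 1.5 times as wide as the true one, which is the kind of overestimation that the ratios R(T), R(C), and R(A) quantify.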
6.3 Comparison of range and computational complexity by the three cases
The output ranges of case 2 and case 3 under the three methods can be obtained by following the same process as for case 1.
Table 2 lists the ranges and integer word lengths derived by AASEE and compares them with those derived by AATRE and AACHA. Column c.fun identifies the case study and the function of each row. The true output ranges, which serve as reference values, are obtained by numerical methods or nonlinear programming; these techniques are time-consuming and impractical for finding the true bounds of a large number of signals. The table shows that, for all the functions, the ranges derived by AASEE cover the true ranges and are smaller than those derived by AATRE. Of these thirteen functions, the ranges derived by AASEE are smaller than those by AACHA for nine functions and equal to those by AACHA for two. According to (1), the integer word length is determined by the range. The integer word length derived by AASEE is at most 2 b less than that by AATRE and at most 1 b less than that by AACHA. Compared with AATRE, AASEE and AACHA save 0.54 b on average.
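How a tighter range translates into fewer integer bits can be sketched in Python. Equation (1) is not reproduced in this section, so the sketch assumes a signed two's-complement convention in which the integer part, sign bit included, has n bits and must satisfy −2^(n−1) ≤ min and max < 2^(n−1). The function name `integer_bits` and the example ranges are hypothetical.

```python
def integer_bits(lo, hi):
    """Smallest n >= 1 with -2**(n-1) <= lo and hi < 2**(n-1).

    Assumed convention (not taken from the paper): signed two's
    complement, with the sign bit counted among the integer bits.
    """
    n = 1
    while not (-(2 ** (n - 1)) <= lo and hi < 2 ** (n - 1)):
        n += 1
    return n


# Hypothetical ranges for the same signal from a looser and a tighter analysis
loose_bits = integer_bits(-100.0, 300.0)   # needs 2**(n-1) > 300
tight_bits = integer_bits(-100.0, 200.0)   # needs 2**(n-1) > 200
saved_bits = loose_bits - tight_bits
```

With these made-up ranges, narrowing the upper bound from 300 to 200 crosses a power of two and removes one integer bit. A narrower range that stays between the same powers of two saves nothing.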
To calculate the estimated range of d_{ f }, AASEE replaces the values ∃ε_{ i } = ±1, ∀i = 1, 2, …, n in (27) with ∀ε_{ i } = ±1. This approximation introduces the difference between the estimated range and the true range of d_{ f }. In most applications, the ranges estimated by AASEE are closer to the true ranges than those estimated by AACHA. However, the estimated minimum and maximum of \widehat{x}\widehat{y} on the boundary of the polygon are independent of the value of ε_{ i }. In some applications, such as functions f_{2} and f_{8} in Table 2, the results of AASEE are almost the same as those of AACHA.
Table 3 compares the ratios of range intervals and the computational complexity of AATRE, AACHA, and AASEE. The computational complexity is measured by the numbers of multiplications and additions. For AACHA, the extreme value of a quadratic function in one variable on a bounded interval must also be calculated. N_{m}, N_{a}, and N_{e} denote the numbers of multiplications, additions, and extreme value computations of each case, respectively.
Table 3 shows that R(T) ranges from 1.04 to 281.2, R(C) from 1.03 to 233.7, and R(A) from 1.03 to 192.9. The ratios of R(A) to R(T) and of R(A) to R(C) show the accuracy of AASEE relative to AATRE and AACHA, respectively, and their averages can be used to evaluate the accuracy of the affine approximation methods. The ratios of R(A) to R(T) range from 0.18 to 0.99, with an average of 0.59. The ratios of R(A) to R(C) range from 0.33 to 1.17, with an average of 0.89. Taking the reciprocals of these averages, AASEE is on average 1.69 times as accurate as AATRE and 1.12 times as accurate as AACHA for these 13 cases.
The extreme value computation of the quadratic function, which is necessary only for AACHA, is the most complex and time-consuming of the operations. Hence, the computational complexity of AACHA is much higher than that of AATRE and AASEE. The relative increase in the number of multiplications, N_{m}, of AASEE over AATRE ranges from 0.091 to 1.75, with an average of 0.450; over AACHA, it ranges from 0.2 to 1.833, with an average of 0.567. The relative increase in the number of additions, N_{a}, of AASEE over AATRE ranges from 0.05 to 3.4, with an average of 0.944; over AACHA, it ranges from 0 to 0.985, with an average of 0.157. The numbers of multiplications and additions of AASEE thus increase only slightly.
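The extra operation counted by N_{e}, the extreme value of a one-variable quadratic on a bounded interval, can be sketched in Python as follows. This is a generic textbook computation and not the AACHA procedure itself, whose construction of the quadratic is not reproduced in this section. The function name `quadratic_extremes` and the coefficients in the example are hypothetical.

```python
def quadratic_extremes(a, b, c, lo, hi):
    """Minimum and maximum of a*t**2 + b*t + c for t in [lo, hi]."""
    def q(t):
        return a * t * t + b * t + c

    # The extremes lie at the endpoints or at an interior stationary point
    candidates = [q(lo), q(hi)]
    if a != 0:
        t_star = -b / (2 * a)  # where the derivative 2*a*t + b vanishes
        if lo < t_star < hi:
            candidates.append(q(t_star))
    return min(candidates), max(candidates)


# Hypothetical example: -t**2 + t on [0, 1] peaks at t = 0.5
q_min, q_max = quadratic_extremes(-1.0, 1.0, 0.0, 0.0, 1.0)
```

Each call needs a division, a comparison of the stationary point against the interval, and up to three evaluations of the quadratic, which suggests why this operation is costlier than a single multiplication or addition.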
As shown in Table 3, AACHA is slightly more accurate for functions c_{3}.f_{2} and c_{3}.f_{8}, but the computational complexity of AACHA is much higher than that of AASEE.
6.4 Comparison of the design cost by the three methods
To compare the design cost of the three methods in terms of system area, the fractional word lengths are obtained by the precision analysis in [11]. As a typical example, we select a random function of case 3, c_{3}.f_{3}, for this section. The design of c_{3}.f_{3} is synthesized on a Xilinx XC2VP30-7FF896 FPGA device (Xilinx, San Jose, CA, USA).
Figure 3 shows how the area for c_{3}.f_{3} varies with increasing target precision. The area obtained with AASEE is smaller than that obtained with AATRE and AACHA, and the area difference between them grows from 265 to 729 as the target precision increases. Such optimization of the integer word length can save area.
Figure 4 shows the percentage area saving of AASEE over AATRE at different target precisions for c_{3}.f_{3}. The percentage area saving falls from 14.34% to 5.62% as the target precision increases. In general, the relative saving is larger at lower precision.
7 Conclusions
This paper presents a novel affine approximation method for multiplication, Approximation Affine based on Space Extreme Estimation. In this method, an extra noise symbol is added to an approximated affine form.
To reduce the uncertainty in AA, we derive this method in the (n + 1)-dimensional space E^{n+1}. In E^{n+1}, the approximate affine form can be regarded as the tangent hyperplane at a certain point of a curved surface in (n + 1)-dimensional space. Using linear geometry, it is proven that the f_{ z } of AASEE is the closest to the result of multiplication among all possible approximate affine forms. Taking ε_{ i } as the input arguments, all occurrences of the same noise symbol in different variables are taken into account together. Hence, the uncertainty of \widehat{d} of AASEE is reduced. Based on the extreme value theory of multivariable functions, we can prove that the range of this \widehat{d} covers the true range of the difference introduced by the approximation and is much tighter than that by AATRE and AACHA.
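The tangent-plane view can be illustrated for the simplest case, the product of two independent variables. The Python sketch below uses only the elementary identity xy − [ab + b(x − a) + a(y − b)] = (x − a)(y − b). Taking the tangency point (a, b) at the centre of the input box is an assumption made for illustration. The sketch does not reproduce the AASEE hyperplane in E^{n+1} or its extreme value estimate, and the names `tangent_plane` and `residual_bounds` are hypothetical.

```python
def tangent_plane(a, b):
    """Tangent plane of f(x, y) = x*y at the point (a, b)."""
    def plane(x, y):
        return a * b + b * (x - a) + a * (y - b)
    return plane


def residual_bounds(rx, ry):
    """Range of (x - a)*(y - b) over |x - a| <= rx and |y - b| <= ry."""
    return (-rx * ry, rx * ry)


# Box centred at (a, b) with half-widths rx and ry (values chosen for illustration)
a, b, rx, ry = 0.5, 0.25, 0.5, 0.25
plane = tangent_plane(a, b)
steps = [k / 4 for k in range(-4, 5)]  # offsets scaled to [-1, 1]
residuals = [
    (a + sx * rx) * (b + sy * ry) - plane(a + sx * rx, b + sy * ry)
    for sx in steps
    for sy in steps
]
lo, hi = residual_bounds(rx, ry)
```

On this grid the sampled residuals stay within [−rx·ry, rx·ry] and reach both ends at the corners of the box. When x and y share noise symbols, the two factors of the residual are correlated, so treating them as independent can overestimate its range.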
The uncertainty in AASEE is much smaller than that in AATRE and AACHA on average. At the same time, the computational complexity of AASEE is the same as that of AATRE and lower than that of AACHA.
In the case studies, AASEE is on average 1.69 times as accurate as AATRE and 1.12 times as accurate as AACHA. The integer word length derived by AASEE is at most 2 b less than that by AATRE and at most 1 b less than that by AACHA. For the case of c_{3}.f_{3}, the area obtained with AASEE is smaller than that obtained with AATRE and AACHA, and the percentage area saving of AASEE over AATRE falls from 14.34% to 5.62% as the target precision increases.
References
Constantinides G, Woeginger G: The complexity of multiple wordlength assignment. Appl. Math. Lett 2002, 15(2):137–140. 10.1016/S0893-9659(01)00107-0
Cmar R, Rijnders L, Schaumont P, Vernalde S, Bolsens I: A methodology and design environment for DSP ASIC fixed point refinement. In Proceedings of Design, Automation and Test in Europe. Munich: IEEE Computer Society; 09–12 March 1999:271–276.
Kum K, Sung W: Combined wordlength optimization and high level synthesis of digital signal processing systems. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2001, 20(8):921–930. 10.1109/43.936374
Roy S, Banerjee P: An algorithm for trading off quantization error with hardware resources for MATLAB-based FPGA design. IEEE Trans. Comput 2005, 54(7):886–896. 10.1109/TC.2005.106
Mallik A, Sinha D, Zhou H: Low-power optimization by smart bitwidth allocation in a SystemC-based ASIC design environment. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2007, 26(3):447–455.
Caffarena G, Carreras C, Lopez JA: SQNR estimation of fixed-point DSP algorithms. Eurasip J. Adv. Signal Process 2010, 21: 1–12.
Banciu A, Casseau E, Menard D: Stochastic modeling for floating-point to fixed-point conversion. In Proceedings of IEEE Workshop on Signal Processing Systems (SiPS). Beirut: IEEE Computer Society; 4–7 October 2011:180–185.
Fang CF, Rutenbar R, Puschel M, Chen T: Toward efficient static analysis of finite-precision effects in DSP applications via affine arithmetic modeling. In Proceedings of Design Automation Conference, Institute of Electrical and Electronics Engineers Inc.. Anaheim; 2–6 June 2003:496–501.
Fang CF, Rutenbar R: Fast, accurate static analysis for fixed-point finite-precision effects in DSP designs. In Proceedings of International Conference on Computer-Aided Design, Institute of Electrical and Electronics Engineers Inc.. San Jose; 9–13 November 2003:275–282.
Pu Y, Ha Y: An automated, efficient and static bitwidth optimization methodology towards maximum bitwidth-to-error tradeoff with affine arithmetic model. In Proceedings of Asia and South Pacific Design Automation Conference, Institute of Electrical and Electronics Engineers Inc.. Yokohama; 24–27 January 2006:886–891.
Lee DU, Gaffar AA, Cheung RC, Mencer O, Luk W, Constantinides GA: Accuracy guaranteed bitwidth optimisation. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2006, 25(10):1990–2000.
Osborne WG, Coutinho JGF, Luk W, Mencer O: Instrumented multistage wordlength optimization. In Proceedings of IEEE International Conference on Field-Programmable Technology, Institute of Electrical and Electronics Engineers Inc.. Kitakyushu; 12–14 December 2007:89–96.
Lopez JA, Carreras C, Nieto-Taladriz O: Improved interval-based characterization of fixed-point LTI systems with feedback loops. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2007, 26(11):1923–1933.
Zhang L, Zhang Y, Zhou W: Tradeoff between approximation accuracy and complexity for range analysis using affine arithmetic. J. Signal Process. Syst 2010, 61(3):279–291. 10.1007/s11265-010-0452-2
Sarbishei O, Radecka K, Zilic Z: Analytical optimization of bitwidths in fixed-point LTI systems. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2012, 31(3):343–355.
Rocher R, Menard D, Scalart P: Analytical approach for numerical accuracy estimation of fixed-point systems based on smooth operations. IEEE Trans. Circuits Syst. I, Reg. Papers 2012, 59(10):2326–2339.
Kinsman AB, Nicolici N: Bitwidth allocation for hardware accelerators for scientific computing using SAT-modulo theory. IEEE Trans. Computer-Aided Design Integr. Circuits Syst 2010, 29(3):406–413.
Kinsman AB, Nicolici N: Computational vector-magnitude-based range determination for scientific abstract data types. IEEE Trans. Comput 2011, 60(11):1652–1663.
Wadekar SA, Parker AC: Accuracy sensitive wordlength selection for algorithm optimization. In Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998. ICCD ‘98, Institute of Electrical and Electronics Engineers Inc.. Austin; 5–7 October 1998:54–61.
Carreras C, Lopez JA, Nieto-Taladriz O: Bitwidth selection for datapath implementations. In Proceedings of the 12th International Symposium on System Synthesis, 1999. Boca Raton: IEEE Computer Society; 1–4 November 1999:114–119.
Comba JLD, Stolfi J: Affine arithmetic and its applications to computer graphics. In Proceedings of SIBGRAPI’93 - VI Simposio Brasileiro de Computacao Grafica e Processamento de Imagens. Recife: IEEE Computer Society; 20–22 October 1993:9–18.
Stolfi J, de Figueiredo LH (eds): Affine arithmetic. In Self-Validated Numerical Methods and Applications. Brazil: Monograph for 21st Brazilian Mathematics Colloquium, IMPA, Rio de Janeiro; 1997:70–74.
Huang K, Yee H: Improved tangent hyperplane method for transient stability studies [of power systems]. In Proceedings of APSCOM91 Conference, Institution of Electrical Engineers. Hong Kong; 5–8 November 1991:363–366.
Eivind E, Gustavsen TS: GRA6035 Mathematics. Oslo: BI Norwegian Business School; 2010.
Moore R: Interval Analysis. New Jersey: Prentice-Hall; 1966.
Pang Y, Radecka K: An efficient algorithm of performing range analysis for fixed-point arithmetic circuits based on SAT checking. In Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS). Rio de Janeiro: IEEE Computer Society; 15–18 May 2011:1736–1739.
Shekhar N, Kalla P, Enescu F: Equivalence verification of arithmetic datapaths with multiple wordlength operands. In Proceedings of Design, Automation and Test in Europe. Munich: IEEE Computer Society; 6–10 March 2006:824–829.
Gopalakrishnan S, Kalla P, Meredith MB, Enescu F: Finding linear building-blocks for RTL synthesis of polynomial datapaths with fixed-size bit-vectors. In Proceedings of International Conference on Computer-Aided Design, Institute of Electrical and Electronics Engineers Inc.. San Jose; 5–8 November 2007:143–148.
Shou H, Song W, Shen J, Martind R, Wang G: A recursive Taylor method for raycasting algebraic surfaces. In Proceedings of International Conference on Computer Graphics and Virtual Reality. Las Vegas: CSREA Press; 26–29 June 2006:196–204.
Jiang J, Luk W, Rueckert D: FPGA-based computation of free-form deformations in medical image registration. In Proceedings of IEEE International Conference on Field-Programmable Technology 2003. Tokyo: IEEE Computer Society; 15–17 December 2003:234–241.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Sun, R., Zhang, Y. & Cui, A. A refined affine approximation method of multiplication for range analysis in wordlength optimization. EURASIP J. Adv. Signal Process. 2014, 36 (2014). https://doi.org/10.1186/1687-6180-2014-36
DOI: https://doi.org/10.1186/1687-6180-2014-36