Realization of Ternary Sigma-Delta Modulated Arithmetic Processing Modules

Sigma-delta modulated systems have a number of very appealing properties and are, therefore, heavily used in analog to digital converters, amplifiers, and modulators. This paper presents new results which indicate that they may also have significant potential for general purpose arithmetic processing. The paper introduces new arithmetic processing structures for ternary (i.e., +1, 0, or −1) sigma-delta modulated signals. Simulations show that these new structures can be implemented very efficiently and have relatively good accuracy.


Introduction
Oversampled sigma-delta modulation (SDM) signal representations have several key advantages over traditional Nyquist rate pulse code modulated formats. When signals are put into SDM format, they typically have very short word-lengths (e.g., binary or ternary). This very simple representation creates the potential for reduced hardware complexity, simpler signal bus routing, and resilience against electronic component inaccuracies [1,2]. There is yet another advantage to the use of SDM (or bitstream) systems. If signals are maintained in bitstream form all the way through the processing chain, one does not need format conversions (or the associated interpolation and decimation filters). This is so because within many systems, the "front end" analog to digital converters and "back end" digital to analog converters use SDM bitstream format, but the intermediate processing stages are typically implemented in multibit format. If the intermediate processing stages of a system can be operated in binary or ternary format as well, then the format conversions are unnecessary.
Several works have proposed digital bitstream arithmetic processing using pulse width modulation, particularly in digital neural networks, for example, [3,4]. O'Leary and Maloberti [5] also presented a binary bitstream adder in which the sum is stored and fed back to the adder to reduce the truncation error (i.e., the carries of the full-adder). This same approach has been adopted by [6] to implement a ternary bitstream adder using 2's complement format. However, the possibility of compensating the ignored carries is confined to the immediate next sample. The compensation fails when the next sample addition generates a carry as well. Another well-cited work on bitstream arithmetic is that conducted in [7], where various binary arithmetic circuits were proposed. However, most of the arithmetic circuits proposed in [7] suffer from two main drawbacks. First, many of these structures do not operate fully in the short word-length domain; they partially operate in the multibit domain. In particular, many of these structures use integrators which consist of a recursive multibit adder followed by an SDM requantizer. The second major drawback of the arithmetic units in [7] is the limited accuracy of the structures.
This paper attempts to address both of the above listed limitations. Ternary quantized SDM processing is the assumed format throughout the paper. Ternary format (i.e., +1, 0, −1) is used rather than binary because the extra zero state reduces quantisation error and so enables greater accuracy. At the same time, the zero state often corresponds to no hardware operation at all, and so ternary formats often require minimal extra hardware components [8].
EURASIP Journal on Advances in Signal Processing
Figure 1: The structure of a first-order ternary sigma-delta modulator. (Figure legend: Q_T(·) is the ternary quantizer; signal paths are marked as 1-bit or multi-bit.)
In this paper, ternary arithmetic processing modules are proposed, and an attempt is made to provide a measure of the accuracy of these systems. This is done by determining the resolution or number of bits in a multibit counterpart with similar accuracy. Both DC and slowly varying input signals are considered in this paper.

Basic 1-Trit Ternary Adder/Subtractor
The adder is a critically important component of an arithmetic processor since it is a fundamental building block upon which many other processing operations are built. Therefore, it is highly desirable to create adders with minimal complexity. Let x(k) and y(k) be two ternary bitstreams. It is assumed that the inputs to the proposed adder are obtained with ternary sigma-delta modulators (TSDMs). A sample first-order TSDM is shown in Figure 1. These modulators may well be incorporated into analog to digital converter hardware. The signal y(k) is the ternary-quantized output of such a modulator, where [−λ/2, +λ/2] is the allowable range of the input; x(k) is defined similarly. Assume that the desired output of the proposed ternary adder is s(k) = x(k) + y(k), such that x(k), y(k), and s(k) ∈ {1, 0, −1}. The task of designing the adder involves a tradeoff between implementation complexity and accuracy. One can implement a very simple adder by truncating to ±1 any result whose magnitude exceeds 1; the resulting quantization error is then relatively large. This simple adder would be defined as

$$ s(k) = \begin{cases} x(k) + y(k), & |x(k) + y(k)| \le 1, \\ \operatorname{sgn}[x(k) + y(k)], & \text{otherwise}. \end{cases} \tag{2} $$

The above "basic" ternary adder can be implemented using a traditional ternary half-adder. It is perfectly accurate except when the two input signal values are identical and nonzero, in which case a carry is generated (and neglected).
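The modulator and the basic adder described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the quantizer thresholds of ±0.5, the function names, and the constant test inputs are all assumptions introduced here, and the modulator is a textbook first-order loop rather than a transcription of Figure 1.

```python
def ternary_quantize(v, threshold=0.5):
    """Map a real value to {-1, 0, +1}; the +/-0.5 thresholds are an assumption."""
    if v > threshold:
        return 1
    if v < -threshold:
        return -1
    return 0


def tsdm_first_order(samples, threshold=0.5):
    """Textbook first-order loop: integrate (input - previous output), then quantize."""
    integrator = 0.0
    previous_output = 0
    stream = []
    for sample in samples:
        integrator += sample - previous_output
        previous_output = ternary_quantize(integrator, threshold)
        stream.append(previous_output)
    return stream


def basic_ternary_add(x, y):
    """Basic adder: saturate x + y to [-1, +1], discarding any carry."""
    return max(-1, min(1, x + y))


# Hypothetical DC inputs, chosen only to exercise the code.
x = tsdm_first_order([0.25] * 64)
y = tsdm_first_order([0.125] * 64)
s = [basic_ternary_add(a, b) for a, b in zip(x, y)]
print(sum(x) / len(x), sum(y) / len(y), sum(s) / len(s))
```

The averages of the three streams approximate the two DC inputs and their sum. The saturation in `basic_ternary_add` only loses information when both inputs are +1 or both are −1, which is the neglected-carry case discussed in the text.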
The quantization noise at the output of the basic ternary adder has two components: (i) the quantization noise inherent in the two input signals and (ii) the quantization noise due to the truncation operation which occurs when the two inputs have identical values. These two components are quantified below.
(i) If there were a white uniformly distributed quantization noise in each of the inputs, then the power spectral density (PSD) of each input would be

$$ P_e(f) = \frac{\Delta^2}{12 f_s}, \tag{3} $$

where $f_s$ is the sampling frequency, and Δ is the quantization step. The quantization noise will not be white, however, because of the noise shaping inherent in SDM signal formats [9,10]. Assume, say, that first-order TSDMs have been used to create x(k) and y(k); then the power spectral density of the quantization noise in both x(k) and y(k) would be

$$ P_q(f) = \frac{\Delta^2}{12 f_s} \left| 1 - e^{-j 2\pi f / f_s} \right|^2 = \frac{\Delta^2}{3 f_s} \sin^2\!\left( \frac{\pi f}{f_s} \right). \tag{4} $$

It is worth noting that we are dealing here with the whole oversampled spectrum (from 0 to $f_s$ Hz) since the operations are achieved within the ternary domain and there is no decimation process.
(ii) Now, it is necessary to determine an expression for the quantization noise corresponding to the truncation introduced by the ternary adder. We denote the truncation error signal as p(k). This signal will only be nonzero when x(k) = y(k) = ±1. This condition would be expected to occur on average about two samples out of every nine, assuming that the probability of obtaining a value of +1, −1, or 0 is equal to 1/3. The total average power of p(k) would, therefore, be about 2/9. The spectral shape of p(k) depends on the correlation between x(k) and y(k). If x(k) and y(k) are perfectly correlated, then the spectrum has a delta function at DC. If the signals are uncorrelated, then the spectrum tends to be white.
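The 2/9 figure can be checked by enumerating all nine input pairs. The sketch below assumes, as the text does, that each trit value has probability 1/3, and additionally assumes that the two inputs are independent; the function name is introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

LEVELS = (-1, 0, 1)


def truncation_error(x, y):
    """Difference between the exact sum and its saturated (basic adder) value."""
    exact = x + y
    truncated = max(-1, min(1, exact))
    return exact - truncated


pairs = list(product(LEVELS, repeat=2))
weight = Fraction(1, len(pairs))  # each of the 9 pairs is equally likely

prob_nonzero = sum(weight for x, y in pairs if truncation_error(x, y) != 0)
avg_power = sum(weight * truncation_error(x, y) ** 2 for x, y in pairs)
print(prob_nonzero, avg_power)  # prints: 2/9 2/9
```

Because the error has unit magnitude whenever it is nonzero, the probability of an error and its average power coincide.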
Then, the expected PSD due to the truncation process is

$$ P_{\mathrm{trun}}(f) = E\left[ \left| \mathcal{F}\{p(k)\} \right|^2 \right], \tag{5} $$

where E[·] is the expectation operator, and $\mathcal{F}$ is the Fourier transform operator. The total power spectral density at the output of the adder will be the sum of the quantization noise corresponding to the two input signals, $2P_q(f)$, and $P_{\mathrm{trun}}(f)$. That is, the total quantization power spectrum is given by

$$ P_{\mathrm{total}}(f) = 2P_q(f) + P_{\mathrm{trun}}(f). \tag{6} $$

The subtraction operation can be easily accomplished by negating one of the ternary bitstreams and using the same proposed adder.

Improved Adder
Knowing the source of errors in the ternary adder specified in (2), one may alleviate this error using a simple technique. If the lost carries are compensated for whenever possible in the next samples, then, in the average sense, the adder would have improved accuracy. This can be done by introducing a ternary flip-flop d(k) in the adder circuit to store any carry overflows and propagate this carry information to subsequent samples. Figure 2 shows a block diagram of the improved ternary adder. The rationale behind the new adder circuit is that any carry arising from the addition of the current two input samples should be stored in a flip-flop and added to the next output sample. If this addition to the next output sample itself generates a carry, then that resulting carry should also be fed back and stored. The operation of the circuit in Figure 2 is described mathematically by (7). The improved adder can be implemented by using three ternary half-adder (THA) modules and one delay element. Each THA performs according to the truth table shown in Table 1. The ternary adder (TA) defined in (7) can easily be implemented with either conventional digital gates (e.g., [11]) or with multiple-valued logic (e.g., [12,13]).
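A behavioural sketch of the improved adder follows. It is one plausible reading of the description (three half-adders and one stored carry), not a transcription of (7) or Table 1, which are not reproduced here. In particular, it assumes that each half-adder returns a saturated sum together with a unit-weight carry, and the names `ternary_half_adder`, `ImprovedTernaryAdder`, and `lost_carry` are hypothetical.

```python
def ternary_half_adder(a, b):
    """Return (sum, carry), both in {-1, 0, +1}, with a + b == sum + carry.

    Assumption: the carry has unit weight, i.e. it is whatever the
    saturated sum fails to represent.
    """
    total = a + b
    s = max(-1, min(1, total))
    return s, total - s


class ImprovedTernaryAdder:
    """Carry-compensating adder: three half-adders plus one delay element."""

    def __init__(self):
        self.d = 0  # stored carry (the ternary flip-flop)

    def step(self, x, y):
        s1, c1 = ternary_half_adder(x, y)        # add the two inputs
        s, c2 = ternary_half_adder(s1, self.d)   # add the carry stored earlier
        # Merge both carries into the flip-flop; anything beyond +/-1 is lost.
        self.d, lost_carry = ternary_half_adder(c1, c2)
        return s, lost_carry


adder = ImprovedTernaryAdder()
x = [1, 0, 1, 0]
y = [1, 0, 1, 0]
outputs = [adder.step(a, b)[0] for a, b in zip(x, y)]
print(outputs, sum(outputs), sum(x) + sum(y))
```

In this toy run the basic adder would output 1, 0, 1, 0 and lose half of the total, whereas the stored carry is released in the following sample so that the output sum matches the input sum.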
According to (7), d(k) can be re-expressed as in (8). By recombining lines 3 and 4 of (7), one obtains the expression (9) for s(k), which consists of the true sum plus an error term, $e_c(k)$. Recall that $d(k) = \operatorname{sgn}[c_1(k) + c_2(k)]$. It should be noted that the error term is only nonzero when $c_1(k)$ and $c_2(k)$ are equal and nonzero. Implicitly, this means that an error will only occur with two consecutive +1s (or −1s). The error $e_c(k)$ is then given by (10). Note that if the probability of +1, −1, and 0 is assumed equal for a trit, then the probability of this condition would be 2/81. Now, assuming ergodicity, the average value of $e_c(k)$ can be calculated as in (11), and the autocorrelation function of $e_c(k)$ is given by (12). The precise form of the autocorrelation function will depend on the nature of the signals and in particular the correlation between them. By taking the Fourier transform of (12), one can determine the power spectral density $P_c(f)$ of $e_c(k)$, given in (13). Note that in (13) the discrete Fourier transform uses cosine basis functions rather than complex exponential ones because $c_m$ is an autocorrelation function and is, therefore, even.

An alternative approach is to recombine lines 3, 4, and 5 of (7), which gives the following expression for d(k):

$$ d(k) = x(k) + y(k) - s(k) + d(k-1) + q(k), \tag{14} $$

where q(k) is the truncation error in d(k) due to "uncompensatable" carries.

Rearranging the above expression for d(k) yields an expression for s(k) as follows:

$$ s(k) = x(k) + y(k) - [d(k) - d(k-1)] + q(k). \tag{15} $$

Taking the z-transform of (15), the output can be expressed as

$$ S(z) = X(z) + Y(z) - (1 - z^{-1})D(z) + Q(z). \tag{16} $$

Examination of the above equation reveals that S(z) comprises a true component, X(z) + Y(z), and an error term, $-(1 - z^{-1})D(z) + Q(z)$. The error term is, in turn, comprised of two components: the first, $-(1 - z^{-1})D(z)$, corresponds to carries which are eventually "compensated", and the second, Q(z), is due to uncompensated or "lost" carries. The structure of the adder causes the compensated carry error component to be high-pass filtered, as per the $(1 - z^{-1})D(z)$ term. This high-pass filtering causes significant attenuation of the error term and accounts for the improvement provided by the adder. The uncompensated carries error term does not get attenuated, but fortunately, it is relatively low in power because it is nonzero with an average probability of only about 2/81. Because of the high-pass filtering of the error term which is inherent in this adder, a significant reduction in the average quantization error can be achieved. This is illustrated in simulations in Section 5.
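As a rough worked illustration of why the high-pass term helps, one can take the time-domain counterpart of the error terms named above and sum it over a window of N samples. The window length N and the initial state d(0) are notation introduced here; this is a sketch of the reasoning, not a derivation taken from the paper.

```latex
% Time-domain form of S(z) = X(z) + Y(z) - (1 - z^{-1}) D(z) + Q(z)
\begin{align}
s(k) &= x(k) + y(k) - \bigl[d(k) - d(k-1)\bigr] + q(k) \\
% Summing over k = 1, ..., N makes the d-terms telescope
\sum_{k=1}^{N} s(k) &= \sum_{k=1}^{N} \bigl[x(k) + y(k)\bigr]
  - \bigl[d(N) - d(0)\bigr] + \sum_{k=1}^{N} q(k) \\
% d(k) takes values in {-1, 0, +1}, so the compensated-carry share of the average is bounded
\left| \frac{d(N) - d(0)}{N} \right| &\le \frac{2}{N}
\end{align}
```

For a window of N = 128 samples, the compensated carries can therefore shift the windowed average by at most 2/128 ≈ 0.016, and this bound shrinks as the window grows. The q(k) term has no such telescoping structure, which is why only the rarely occurring uncompensated carries remain as an unattenuated error.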

Format Conversion via an SDM with Ternary Integrator
As discussed earlier, ternary arithmetic is significantly more accurate than binary arithmetic, at least for pulse width modulation type signal formats. To implement practical ternary arithmetic, it may sometimes be necessary to convert incoming signals from binary format to ternary format, and it is desirable that this be done efficiently. This section proposes an efficient new structure for binary to ternary format conversion. The new structure involves an SDM whose internal integrator is formed from the adder proposed in the previous section.
In the short word-length literature, digital integrators have generally been constructed from a multibit subsystem (such as up-down counters) followed by a 1-bit noise shaper to restore the format to the short word-length domain [7]. This approach is computationally intensive. By using the ternary adder (TA) from the previous section, a novel integrator is proposed that operates entirely in the ternary domain (see Figure 3, inside the box). Simulation results presented in Section 5 show that the new 1-bit ternary integrator outperforms the traditional counterpart in [7].
Having devised a digital integrator, the next step is to construct an SDM-based format conversion structure. This structure uses both the proposed adder and the new integrator and is shown in Figure 3.
Consider first the leftmost TA in Figure 3. Equation (16) provides a general expression for the output of an arbitrary TA, and using this result, one can obtain the output of the leftmost TA as given in (17), where $(1 - z^{-1})D_1(z)$ and $Q_1(z)$ are the errors due to the compensated and uncompensated carries, respectively, in the leftmost TA. (Note that because of the synchronous clocking which is used, there is effectively a single sample delay in the feedback path of the SDM in Figure 3. This effective delay is not explicitly shown in Figure 3 because, by convention, SDM representations normally do not explicitly show a delay.) Now consider the rightmost TA. Again, using the result in (16), one can obtain the z-transform of the output as given in (18), where $(1 - z^{-1})D_2(z)$ is the error due to compensated carries, and $Q_2(z)$ is the error due to uncompensated carries in the rightmost TA. Combining the above two equations yields (19). The performance of this new format conversion structure is evaluated in the next section.
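To make the two-adder loop more concrete, the sketch below wires two carry-compensating adders into a feedback structure. Figure 3 is not reproduced here, so the wiring is an assumption: the leftmost TA is taken to subtract the delayed output from the input, and the rightmost TA is taken to act as an integrator by adding its own previous output. The half-adder model (saturated sum with a unit-weight carry) and all names are likewise hypothetical.

```python
def ternary_half_adder(a, b):
    """Assumed half-adder: saturated sum plus a unit-weight carry."""
    total = a + b
    s = max(-1, min(1, total))
    return s, total - s


class TernaryAdder:
    """Carry-compensating adder: three half-adders plus one stored carry."""

    def __init__(self):
        self.d = 0

    def step(self, x, y):
        s1, c1 = ternary_half_adder(x, y)
        s, c2 = ternary_half_adder(s1, self.d)
        self.d, _ = ternary_half_adder(c1, c2)  # uncompensatable carry is dropped
        return s


def binary_to_ternary(bits):
    """Assumed wiring: difference adder followed by an adder-based integrator."""
    difference_adder = TernaryAdder()  # stands in for the leftmost TA
    integrator_adder = TernaryAdder()  # stands in for the rightmost TA
    y_prev = 0                         # one-sample delay in the feedback path
    out = []
    for x in bits:
        e = difference_adder.step(x, -y_prev)      # input minus delayed output
        y_prev = integrator_adder.step(e, y_prev)  # integrate entirely in ternary
        out.append(y_prev)
    return out


ternary = binary_to_ternary([1, -1] * 6)  # binary stream with zero mean
print(ternary, sum(ternary))
```

With this wiring every internal signal stays in {−1, 0, +1}, so no multibit accumulator is needed, which is the property the text emphasises. Whether the spectra match those reported for Figure 3 depends on details that this sketch does not attempt to capture.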

Simulations
As we are dealing with arithmetic processing, there is a need to determine the resolution of the new ternary bitstream structures. To make a reasonable comparison with the multibit domain, the output stream has to be windowed and averaged to determine an equivalent multibit value. The length of the time window should be greater than or equal to the oversampling ratio (OSR) to ensure a fair comparison. The SNR of this averaged output ternary bitstream is then calculated, and an equivalent resolution (equivalent number of bits) can be obtained.

Table 2: Mean squared error (MSE) and equivalent number of bits for the adders compared.

Adder          MSE (×10^−5)   No. of bits
Used in [6]    4.9            6.4
Used in [7]    45.2           4.8
Proposed       0.72           7.75
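The windowing-and-averaging step can be sketched as a sliding moving average followed by a mean squared error against the true value. The toy stream, the function names, and the choice of a sliding (rather than block) window are assumptions made for illustration. The conversion from MSE to an equivalent number of bits is left out because the exact convention used for it is not stated in this section.

```python
def moving_average(stream, window):
    """Average every run of `window` consecutive samples (needs len(stream) >= window)."""
    running = sum(stream[:window])
    averages = [running / window]
    for k in range(window, len(stream)):
        running += stream[k] - stream[k - window]  # slide the window by one sample
        averages.append(running / window)
    return averages


def mean_squared_error(estimates, references):
    """Mean of squared differences between reconstructed and true values."""
    return sum((e - r) ** 2 for e, r in zip(estimates, references)) / len(estimates)


OSR = 128                      # window length, matching the oversampling ratio
stream = [0, 0, 1, 0] * 64     # toy ternary stream whose mean is 0.25
recon = moving_average(stream, OSR)
mse = mean_squared_error(recon, [0.25] * len(recon))
print(len(recon), mse)
```

In this toy case every window contains exactly 32 ones, so the reconstruction is exact. Real modulator outputs are not perfectly periodic within the window, and the residual error is what the MSE figures in the text measure.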

Simulation Results for the Proposed Adders.
Four different input signals were considered in this work. The first two were sinusoids corrupted by additive white Gaussian noise with an SNR of about 25 dB. These sinusoids had the forms $x_a(k) = 0.5\sin(2\pi k) + \nu_1(k)$ and $y_a(k) = 0.5\sin(4\pi k) + \nu_2(k)$, respectively, with $\nu_1(k)$ and $\nu_2(k)$ representing the additive noise. These two sinusoids were mapped to the symmetrical ternary domain using ternary quantizer sigma-delta modulators. Figure 4 shows the spectra of the bitstreams $x_a(k)$ and $y_a(k)$ and of their sum obtained with the adder specified in (7). Additional simulations have revealed that the adder is quite robust to the presence of DC components. That is, spurious tones do not appear when there is a DC component present.
The final two types of input signals considered were the DC signal and the ramp signal. The former is considered to be one of the most challenging signals for SDMs to deal with, as it can easily produce limit cycles [14,15]. It was specified by $x_a(k) = -0.5$. The ramp signal was specified by $y_a(k) = 2^{-10}k$, where $k = 0, 1, 2, \ldots, 2^{10}$. Figure 5 shows plots of the sum of $x_a(k)$ and $y_a(k)$ obtained with (i) the basic adder defined in (2), (ii) the improved adder defined in (7), and (iii) a 32-bit precision multibit adder. As seen in Figure 5, the curves corresponding to the 32-bit precision adder and the improved ternary adder are almost indistinguishable. The average output signal error power was calculated by subtracting the true signal value from a multibit reconstruction of the ternary signal representation. This reconstruction was achieved by filtering the ternary signal with an L-point moving average filter. In these simulations, L = OSR = 128. The mean squared errors of the basic adder (method 1) and the improved adder (method 2) were $8.3 \times 10^{-4}$ and $7.2 \times 10^{-6}$, respectively. These correspond to equivalent multibit resolutions of 4.4 and 7.75 bits, respectively.
An improvement of 3.35 bits has, therefore, been achieved by using the improved adder rather than the basic adder. This result is consistent with the expectation expressed in Section 3.
To compare the resolution achieved with the proposed adder against those of the adding techniques presented in [7] (1-bit adder) and [6] (ternary), the same DC and ramp inputs as above were used. The adder output was averaged over OSR = 128 samples, and the mean squared error (MSE) was calculated and compared with an equivalent N-bit quantizer that produced the same value of MSE (for the same dynamic range, −0.5 to 0.5). Table 2 summarizes the outcomes. The improved adder proposed in this paper clearly outperforms the existing adders.

Figure 6: Output of the proposed format conversion structure in Figure 3 and output of its traditional counterpart. The oversampling ratio is always 128.

The Proposed Binary to Ternary Format Conversion Structure.
It should be noted that as long as we are dealing with short word-length systems (i.e., with no need to go back and forth between decimation/interpolation stages), one must be concerned about the whole range of the frequency spectrum, that is, $[0, f_s]$.
For the simulations in this section, a 16-bit PCM signal $x_a(k) = 0.5\sin(2\pi f_o k) + \nu_1(k)$ with an SNR of 45 dB was modulated using an SDM to produce x(k) ∈ {1, −1}. This binary bitstream was used as input to the newly proposed SDM format conversion structure (shown in Figure 3). The power spectrum of the output is shown in Figure 6, with the normalized in-band region assumed to be $\pm(f_o/f_s) = \pm 0.0078$. Figure 6 also shows the output obtained if the same format conversion structure and inputs are used, but with a traditional integrator (of the form proposed in [7]) instead of the integrator proposed in Section 3.
For the traditional and proposed format conversion structures, the ensemble average (1000 runs) of the in-band SNR ($\mathrm{SNR}_{\mathrm{inb}}$) was found to be +45.7 dB and +52.2 dB, respectively, while the whole-band SNR ($\mathrm{SNR}_{\mathrm{all}}$) was −3.7 dB for the new integrator and −8.5 dB for the traditional one. That is, improvements of about 6.5 dB in $\mathrm{SNR}_{\mathrm{inb}}$ and about 5 dB in $\mathrm{SNR}_{\mathrm{all}}$ were obtained by using the new integrator within the format converter structure. This improvement is a promising finding, as integrators are common structures in many digital electronic circuits. Moreover, the proposed format converter structure not only outperforms its traditional counterpart but also permits more efficient hardware implementation.

Realization of Exponential/Trigonometric Functions.
This section illustrates the use of the improved adder as a building block for realising practically important functions such as exponential and trigonometric functions. To create these functions, the improved adder was first used to create a multiplier according to the model in [7]. That is, the multiplier in [7] was realized by simply replacing the original adder components with the new adder introduced in Section 3.
Once the multiplier was constructed, the exponential and trigonometric functions were created by using the first two non-DC terms of their series expansions (i.e., $e^x - 1 \approx x + x^2/2!$ and $\cos(x) - 1 \approx -x^2/2! + x^4/4!$). Figure 7(a) shows the averaged (with a 128-sample moving average filter) ternary realization of the function $e^x - 1$ for a ramp input (extending between −1 and +1) compared with its 2-term infinite-precision counterpart. The input was varied in steps of $2 \times 2^{-10}$. A mean squared error of $6 \times 10^{-4}$ was obtained for a dynamic range of 2.0234, which is equivalent to the quantization noise of a ∼5.6-bit system. Figure 7(b) shows averaged ternary sine and cosine functions which were created using the same approach as was used for the exponential function. These functions are drawn versus the index k of the input signal x(k) = A sin[8πk] so as to provide a clear visual assessment of the outcome. The input signal x(k) was varied in steps of $2 \times 2^{-9}$. For a dynamic range of 1, the MSEs were $3.29 \times 10^{-5}$ for the sine and $8.4 \times 10^{-5}$ for the cosine, equivalent to 6.76 and 5.7 bits of resolution, respectively. The reduction in accuracy of the cosine function is attributed to the implementation of the fourth-power term in its series expansion.
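The 2-term infinite-precision counterparts used as references can be written down directly. The sketch below evaluates the two truncated series in floating point, using the standard Taylor signs, and prints their deviation from the exact functions over [−1, +1]. It does not model the ternary multiplier, and the function names are introduced here for illustration.

```python
import math


def exp_minus_one_2term(x):
    """First two non-DC terms of the exponential series: x + x^2/2!."""
    return x + x**2 / 2


def cos_minus_one_2term(x):
    """First two non-DC terms of the cosine series: -x^2/2! + x^4/4!."""
    return -(x**2) / 2 + x**4 / 24


for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    exp_gap = exp_minus_one_2term(x) - math.expm1(x)
    cos_gap = cos_minus_one_2term(x) - (math.cos(x) - 1.0)
    print(f"x={x:+.1f}  exp gap={exp_gap:+.5f}  cos gap={cos_gap:+.5f}")
```

The printed gaps are the series truncation errors alone. The MSE figures quoted in the text measure the ternary realization against these 2-term references, so the two kinds of error should not be confused.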

Conclusions
Novel 1-bit ternary arithmetic structures have been proposed in this paper for adders, integrators, and format converters. The internal processing and the output for these new structures are all kept entirely in the ternary domain. The operation of the proposed adder is assessed in terms of its accuracy (expressed as the equivalent number of bits in a corresponding multibit system). Simulations show that these structures are surprisingly efficient and, therefore, have the potential to realize multiplication, division, and exponential/trigonometric functions.