Graph-Based Channel Detection for Multitrack Recording Channels

We propose a low-complexity detection technique for multihead multitrack recording systems. By exploiting the sparseness of two-dimensional partial response (PR) channels, we develop an algorithm that performs belief propagation (BP) over the corresponding factor graphs. We consider the BP-based detector not only for partial response channels but also for more practical conventional media and bit-patterned media storage systems, with and without media noise. Compared to the maximum likelihood detector, whose complexity is prohibitively high because it grows exponentially with both the number of tracks and the number of intersymbol interference (ISI) taps, the proposed detector has a much lower complexity and a fast parallel structure. For multitrack recording systems with PR equalization, the price is a small performance penalty (less than one dB if the intertrack interference (ITI) is not too high). Furthermore, since the algorithm is soft-input soft-output in nature, turbo equalization can be employed if there is an outer code. We show that a few turbo equalization iterations can provide significant performance improvement even when the ITI level is high.


INTRODUCTION
In the past few years, magnetic recording systems have seen great breakthroughs both in fabrication techniques and in the development of signal processing algorithms. As recording densities increase, data recovery processes become more complicated. Due to the reduction of island separations in both the along-track and cross-track directions, the amount of intersymbol interference (ISI) and intertrack interference (ITI) increases rapidly, which potentially results in significant degradation in the performance of traditional systems. In addition, the implementation of timing and gain control becomes more difficult. Multihead multitrack recording represents a promising direction for improving detection capabilities at increased recording densities. Such methods not only offer better performance by suppressing the ITI and ISI efficiently but also reduce the control information overhead. So far, multihead multitrack recording has been proposed in the context of conventional media recording systems, and several studies have been reported [1-11].
Due to the "superparamagnetic effect," conventional media storage systems have approached their maximum achievable recording densities. To further extend this storage density limit, several new technologies have been proposed recently, among which patterned media recording [12-19] has attracted significant interest due to its potential to achieve ultrahigh recording densities. In this paper, we deal with general multihead multitrack recording systems; however, we also pay special attention to their use for patterned media recording systems.
A major concern in multihead multitrack recording is exploiting its benefits while maintaining a reasonable computational complexity. It is well known that the optimal maximum likelihood (ML) detector [1,2] for this system has a complexity that is exponential in both the number of tracks and the number of ISI taps. Even with a modest number of tracks, the resulting computational complexity of a multidimensional Viterbi algorithm becomes prohibitively high. Therefore, the development of low-complexity alternatives is necessary. In the literature, there are several approaches to this problem. For example, detection can be performed iteratively on rows and columns of the 2D ISI channel [20], or, as an alternative, one can consider a simplified trellis, either by reducing the number of states [8] or by reducing the number of branches per state [10,11].
In this paper, instead of using a multidimensional Viterbi algorithm, we consider the detection problem from a different perspective and develop an efficient algorithm for multitrack systems with an acceptable computational complexity. To this end, we propose to use belief propagation (BP) [21,22] to perform inference on factor graphs of multihead multitrack recording channels. This idea has recently been applied to generalized 2D ISI channels, where it is shown that the detection performance is poor for very loopy channel conditions [23]. In the same work, an enhanced BP detector is proposed to achieve near-optimal performance at the cost of some increase in complexity. Here, we note that although the proposed detector is not optimal in general, it suffers from only a small performance penalty at low to medium ITI levels for multitrack recording with PR equalization. At the same time, due to the inherent sparseness of the channel, the detector maintains a low complexity and a fast parallel structure. It is applicable to both conventional and bit-patterned media storage systems. In our development, instead of working on 2D multitrack recording channels directly, we consider channel detection over an equivalent 1D channel. Furthermore, since the detector is soft-input soft-output in nature, we also consider turbo equalization to further improve the data recovery performance; turbo equalization has also been considered in [24,25] for 2D ISI channels with different detection algorithms.
The rest of the paper is organized as follows. In Section 2, we introduce the system model. In Section 3, we discuss the equivalent 1D channel model and develop the BP-based detection algorithm. Performance evaluations for different channels are illustrated and discussed in Section 4. Finally, some remarks are presented in Section 5 to conclude the paper.

SYSTEM MODEL
We consider a multihead multitrack recording system with an array of $N_R$ heads flying over $N_T$ adjacent tracks. For each group of $N_T$ tracks, there are two guard bands (no information written), one on each side. The signal read by the $r$th head is given by

$$y_r(t) = \sum_{s=1}^{N_T} \sum_{i} x_i^s \, g_{r,s}(t - iT) + n_r(t), \qquad (1)$$

where $x_i^s \in \{+1, -1\}$ is the $i$th bit stored on the $s$th track, $T$ is the bit period, $g_{r,s}(t)$ is the channel response incorporating both ISI and ITI (from the $s$th track to the $r$th head), and the noise $n_r(t)$ is assumed to be additive white Gaussian with a mean of zero and a power spectral density of $N_0/2$. To simplify the description of the algorithm, we assume that interference in the cross-track direction is limited to the two adjacent tracks with an interference factor of $\eta$ [6]. Therefore, the readback signal is modeled as

$$y_r(t) = \sum_{i} \left( x_i^r + \eta\, x_i^{r-1} + \eta\, x_i^{r+1} \right) h(t - iT) + n_r(t), \qquad (2)$$

where $h(t)$ is the channel response in the along-track direction and no bits are written on the guard bands. For bit-patterned media, the readback pulse $h(t)$ can be calculated using the reciprocity principle [14,17].
For conventional media, we consider the longitudinal recording channel as an example, where the transition response is usually modeled as a Lorentzian pulse, that is,

$$p(t) = \frac{1}{1 + \left( 2t / \mathrm{PW}_{50} \right)^2}. \qquad (3)$$

Recording density is defined as $D = \mathrm{PW}_{50}/T$, and the channel response is given by $h(t) = p(t) - p(t - T)$. We consider the case of $N_T = N$, $N_R = N + 2$, where two extra heads are centered over the two guard bands. For other choices of $(N_T, N_R)$, our algorithm is still applicable in a similar fashion. The signal-to-noise ratio is defined as $\mathrm{SNR} = E_h / N_0$, where $E_h = \int h^2(t)\, dt$ is the energy of the channel response.
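As a small numerical illustration of the Lorentzian model, the sketch below evaluates the transition response and the channel response h(t) = p(t) − p(t − T) at symbol-rate instants. It assumes the standard Lorentzian form p(t) = 1/(1 + (2t/PW50)^2); the function names and the `half_span` parameter are hypothetical choices made for this example only.

```python
def lorentzian(t, pw50):
    """Lorentzian transition response: 1 at t = 0 and 1/2 at |t| = PW50 / 2."""
    return 1.0 / (1.0 + (2.0 * t / pw50) ** 2)


def channel_response(t, density, bit_period=1.0):
    """Channel response h(t) = p(t) - p(t - T) for density D = PW50 / T."""
    pw50 = density * bit_period
    return lorentzian(t, pw50) - lorentzian(t - bit_period, pw50)


def sampled_response(density, half_span=5, bit_period=1.0):
    """Symbol-rate samples h(kT) for k = -half_span, ..., +half_span."""
    return [channel_response(k * bit_period, density, bit_period)
            for k in range(-half_span, half_span + 1)]
```

Calling `sampled_response(2.0)` lists the taps that a PR equalizer would have to shape at D = 2; higher densities spread the energy over more samples, which is the ISI growth discussed in the text.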
In magnetic recording systems, in addition to the additive white Gaussian noise, another important source of noise is jitter. In conventional media storage systems, the dominant media noise sources include position fluctuations and pulse width variations. When they are incorporated into the channel model, the transition response in the $k$th time interval is given by

$$p_k(t) = \frac{1}{1 + \left( \dfrac{2\,(t + \Delta t_k)}{\mathrm{PW}_{50} + \Delta w_k} \right)^2}, \qquad (4)$$

where $\Delta t_k$ and $\Delta w_k$ represent the position jitter and width jitter for the $k$th transition. In our model, we assume that they are truncated Gaussian random variables with variances $\sigma_t^2$ and $\sigma_w^2$, respectively. $\Delta t_k$ is limited to $[-T/2, T/2]$ and $\Delta w_k$ must be larger than $-\mathrm{PW}_{50}$. In patterned media storage systems, although the transition noise associated with conventional film media is eliminated, other forms of media noise arise [13,15,17,19]. For example, position jitter, bit size jitter, and magnetization jitter are three major types of media noise. As in the case of conventional media, we can also model them as truncated Gaussian random variables (e.g., in this paper, we will give an example considering the position jitter with $\Delta t_k \in [-T/2, T/2]$).
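The truncated Gaussian jitter model can be sketched with simple rejection sampling, as below. This is only an illustration: rejection sampling, the sign convention t + Δt for the position shift, and all function names are assumptions of this example, not details taken from the paper.

```python
import random


def truncated_gauss(sigma, low, high, rng=random):
    """Draw a zero-mean Gaussian sample, redrawing until low < value <= high."""
    while True:
        value = rng.gauss(0.0, sigma)
        if low < value <= high:
            return value


def jittered_transition(t, pw50, sigma_t, sigma_w, bit_period=1.0, rng=random):
    """Lorentzian transition response with position and width jitter applied."""
    # Position jitter is confined to half a bit period on either side.
    dt = truncated_gauss(sigma_t, -bit_period / 2, bit_period / 2, rng)
    # Width jitter must keep the effective pulse width PW50 + dw positive.
    dw = truncated_gauss(sigma_w, -pw50, float("inf"), rng)
    return 1.0 / (1.0 + (2.0 * (t + dt) / (pw50 + dw)) ** 2)
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run repeatable.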
At the receiver, readback signals are passed through a front-end low-pass filter, sampled at the symbol rate, and then equalized to a certain partial response target using a linear equalizer under the MMSE criterion. The original data is then recovered from the output of the equalizer using a suitable detection algorithm, which is the main focus of this paper.

MULTIHEAD MULTITRACK DETECTION
The multihead multitrack detection problem is 2D in nature; that is, we need to recover data from a 2D ISI channel. Therefore, in order to achieve the full benefits of multitrack systems, joint detection is essential.

Joint detection
The optimal solution to the problem is a full ML-type or MAP-type detector using either the Viterbi algorithm or the BCJR algorithm, respectively, both of which are based on the ISI channel trellis. However, since we need to consider simultaneous detection of multiple tracks, the trellis for the 2D ISI channel is much more complicated than in the single-track case. The computational complexity of the detector increases exponentially with both the number of tracks $N$ and the target length $L_S$. Even for moderate values of $N$ and $L_S$, the complexity becomes prohibitive, so the optimal algorithms are not practical. For example, for a single-track channel with a partial response target of order $L_S = 3$, the trellis is defined by $2^{L_S - 1} = 4$ states, each with 2 branches, which may be reasonable in complexity for practical implementation. However, when we consider a multitrack channel with $N = 5$ tracks and the same ISI length $L_S = 3$, the number of states becomes $2^{N(L_S - 1)} = 1024$ and the number of branches for each state transition is $2^N = 32$.
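The trellis sizes quoted above follow directly from the two exponents. A minimal sketch of the arithmetic (the function name is hypothetical):

```python
def joint_trellis_size(num_tracks, target_length):
    """Return (states, branches per state) of the joint multitrack trellis."""
    states = 2 ** (num_tracks * (target_length - 1))
    branches = 2 ** num_tracks
    return states, branches


# Single-track versus five-track example with target length L_S = 3.
single_track = joint_trellis_size(1, 3)
five_tracks = joint_trellis_size(5, 3)
```

Each additional track multiplies the state count by $2^{L_S - 1}$ and doubles the number of branches per state, which is why joint trellis detection quickly becomes impractical.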
To perform joint detection efficiently, we need to find reduced-complexity alternatives that maintain near-ML performance. With this motivation, several previous papers have either worked with simplified trellises [8,11] or worked on rows and columns iteratively [20]. These approaches provide good tradeoffs between complexity and performance. In this work, we consider the problem from a different perspective: we work with a factor graph representation of the channel and propose a BP-based detector. A major advantage of the proposed detector is that its complexity does not increase with the number of tracks, as we will see in the following sections.
The belief propagation algorithm is an efficient method for solving many probabilistic inference problems over graphs. For example, it has been successfully applied to the decoding of low-density parity-check codes [26]. Although it is not optimal when there are cycles in the graph, it has been shown empirically that the algorithm gives good results for a wide range of cases. Recently, it has also been exploited in data recovery for single-track recording channels [27,28]. It has been proven that the performance of the BP-based detector is the same as that of ML/MAP-based detectors when there are no cycles, and it approaches the optimal ML/MAP performance when there are no short cycles. In [29], the authors still consider single-track recording channels, but extend the use of BP detection to the case of correlated noise. For general 2D ISI channels, [23] proposes to use a generalized BP algorithm for channels with loopy conditions. Here, we consider the use of the BP algorithm for data recovery over multihead multitrack recording channels by considering an equivalent 1D channel model, as described next.

Equivalent channel model
The 2D ISI channel can be represented as a factor graph. This concept can be described from different perspectives. To give a clear and simple picture, we follow the approach of [11], where the 2D ISI channel is viewed as a time-varying 1D channel.
Let us first look at the time-varying 1D channel model. Consider the classical PR4 target, for which the corresponding 2D partial response target (denoted as the 2PR4 channel) is given by

$$H_{2PR4} = \begin{bmatrix} \eta & 0 & -\eta \\ 1 & 0 & -1 \\ \eta & 0 & -\eta \end{bmatrix}, \qquad (5)$$

where rows correspond to the cross-track direction and columns to the along-track direction. The time-varying 1D channel corresponding to this 2D partial response channel is illustrated in Figure 1. As a function of the delay $D$, the target can be represented in causal form as in [11], equation (6), or in noncausal form as in our model, equation (7). The advantage of this channel model over the 2D model is the reduction in the number of branches for each state (from $2^N$ to 2). However, the number of states is still very large, namely $2^{2N+2}$. To solve this problem, the authors in [11] use the M-algorithm, a particular tree search algorithm, for detection.

New approach-BP detection
We observe from (7) that the channel response of this 1D model is very sparse; that is, there are only 6 nonzero taps among $2N + 6$ channel taps. Therefore, it is very inefficient to build a trellis based on such a long channel response. In contrast, algorithms based on factor graphs can avoid this kind of inefficiency by performing belief propagation, whose complexity is related only to the number of nonzero taps. Let us describe our proposed BP detector as a low-complexity solution for the multihead multitrack detection problem.
Figure 2: Factor graph of the $H_{2PR4}$ channel model.

Factor graph
Instead of using a trellis, the time-varying 1D channel can also be characterized using a factor graph, as shown in Figure 2. Each factor node represents an output signal and each bit node denotes a written bit to be recovered. The factor nodes are connected to the bit nodes through the channel model given in (7). Note that the factor node degree and the bit node degree are both equal to 6 for the 2PR4 target.
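The connectivity of such a factor graph can be sketched in a few lines. The example below is illustrative only: it builds the graph directly on the 2D (track, time) grid rather than on the equivalent 1D ordering, and it assumes the 2PR4 target has the six nonzero taps 1, −1 on the main track and η, −η on each adjacent track at delays 0 and 2. Heads 0 and N + 1 are taken to sit over the guard bands, and all names are hypothetical.

```python
def target_taps(eta):
    """Nonzero taps of the assumed 2PR4 target, keyed by (track offset, delay)."""
    return {(-1, 0): eta, (0, 0): 1.0, (1, 0): eta,
            (-1, 2): -eta, (0, 2): -1.0, (1, 2): -eta}


def build_factor_graph(num_tracks, num_bits, eta):
    """Return (factor_edges, bit_edges) for tracks 1..N and heads 0..N+1."""
    factor_edges = {}   # (head, time) -> list of ((track, index), tap weight)
    bit_edges = {}      # (track, index) -> list of connected (head, time) nodes
    for head in range(num_tracks + 2):        # heads 0 and N+1 cover guard bands
        for time in range(num_bits + 2):      # two extra samples flush the target
            for (d_track, delay), coeff in target_taps(eta).items():
                track, index = head + d_track, time - delay
                # Guard-band positions and out-of-range times carry no bits.
                if 1 <= track <= num_tracks and 0 <= index < num_bits:
                    factor_edges.setdefault((head, time), []).append(
                        ((track, index), coeff))
                    bit_edges.setdefault((track, index), []).append((head, time))
    return factor_edges, bit_edges
```

Under these assumptions every bit node has degree 6 and interior factor nodes have degree 6, while factor nodes next to a guard band or at the sequence boundaries have fewer edges.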

Detection algorithm
Based on the factor graph representation, we can use the belief propagation algorithm to perform data recovery, as illustrated in Figure 3 (the procedure is similar to the algorithms employed in [27,28]). $\alpha_{ij}^{(l)}$ denotes the message propagated from bit node $i$ to the neighboring factor node $j$ at the $l$th iteration, and $\beta_{ij}^{(l)}$ denotes the message sent from factor node $i$ to the neighboring bit node $j$ at the $l$th iteration. We use log-domain representations to express all the messages passed along the edges.
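At each factor node $i$, the a posteriori LLR is generated for each connected bit node $j$ based on the received signal, the ISI channel constraint, and the extrinsic information from the other connected bit nodes. It is given by

$$\beta_{ij}^{(l)} = \log \frac{\sum_{B_i : x_j = +1} p(y_i \mid B_i)\, P(B_i \sim x_j)}{\sum_{B_i : x_j = -1} p(y_i \mid B_i)\, P(B_i \sim x_j)}, \qquad (8)$$

where we use $B_i = (x_{i_1}, \ldots, x_{i_F})$ to denote all the bit nodes that are connected to factor node $i$ and $B_i \sim x_j$ to denote the same set of bit nodes excluding bit node $j$. $F$ is the degree of the factor node. All possible choices of $B_i$ are divided into two sets and separately included in the two summations, depending on whether $x_j$ equals $+1$ or $-1$. Given the ISI channel constraint, $p(y_i \mid B_i)$ can be easily expressed using the Gaussian probability density function. If we assume that all the incoming messages are independent (this is true for a cycle-free graph and only approximately valid when the graph has cycles), the second term can be decomposed into (suppose $j = i_n$)

$$P(B_i \sim x_j) = \prod_{k=1,\, k \neq n}^{F} P(x_{i_k}), \qquad (9)$$

where

$$P(x_{i_k} = \pm 1) = \frac{\exp\left( \pm \alpha_{i_k i}^{(l-1)} / 2 \right)}{\exp\left( \alpha_{i_k i}^{(l-1)} / 2 \right) + \exp\left( -\alpha_{i_k i}^{(l-1)} / 2 \right)}. \qquad (10)$$

To further simplify the computation, we may employ the "max-log" approximation, that is, $\log(e^x + e^y) \approx \max(x, y)$. Thus, the update message is given by

$$\beta_{ij}^{(l)} \approx \max_{B_i : x_j = +1} \left\{ -\frac{|y_i - \hat{y}_i|^2}{N_0} + \sum_{k \neq n} \frac{x_{i_k}\, \alpha_{i_k i}^{(l-1)}}{2} \right\} - \max_{B_i : x_j = -1} \left\{ -\frac{|y_i - \hat{y}_i|^2}{N_0} + \sum_{k \neq n} \frac{x_{i_k}\, \alpha_{i_k i}^{(l-1)}}{2} \right\}, \qquad (11)$$

where $\hat{y}_i$ is the noiseless channel output corresponding to each possible choice of $B_i$.

At each bit node $i$, by exploiting the a priori information from outside the detector and the messages obtained from the connected factor nodes $j_1, \ldots, j_V$, the extrinsic information for factor node $j = j_m$ is calculated as

$$\alpha_{ij}^{(l)} = \theta_i + \sum_{k=1,\, k \neq m}^{V} \beta_{j_k i}^{(l)}, \qquad (12)$$

where $V$ denotes the bit node degree (the number of factor nodes that are connected to the bit node), and $\theta_i$ is the a priori information about $x_i$. After a predetermined number of iterations ($L$), the final soft output is computed as

$$\Lambda(x_i) = \theta_i + \sum_{k=1}^{V} \beta_{j_k i}^{(L)}. \qquad (13)$$

This soft information can be used either for making instant decisions if there is no outer code, or as the soft input for the outer decoder. In the latter case, since our detector is a soft-input soft-output module, we can also apply turbo processing techniques to the whole system. In that scenario, the extrinsic information from the outer decoder is treated as the a priori information $\theta$.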
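The message-passing loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the graph is built on the 2D (track, time) grid with six assumed 2PR4 taps (1, −1 on the main track and η, −η on adjacent tracks at delays 0 and 2), LLRs are defined as log P(x = +1)/P(x = −1), bit-to-factor messages are initialized with the a priori LLRs, and all nodes are updated in a flooding schedule. The names `build_graph` and `bp_detect` are hypothetical.

```python
import itertools


def build_graph(num_tracks, num_bits, eta):
    """Map each factor node (head, time) to a list of ((track, index), tap)."""
    taps = {(-1, 0): eta, (0, 0): 1.0, (1, 0): eta,
            (-1, 2): -eta, (0, 2): -1.0, (1, 2): -eta}
    graph = {}
    for head in range(num_tracks + 2):          # heads 0 and N+1 cover guard bands
        for time in range(num_bits + 2):
            for (d_track, delay), coeff in taps.items():
                bit = (head + d_track, time - delay)
                if 1 <= bit[0] <= num_tracks and 0 <= bit[1] < num_bits:
                    graph.setdefault((head, time), []).append((bit, coeff))
    return graph


def bp_detect(y, graph, n0, prior, num_iters=10):
    """Max-log BP detector; returns an a posteriori LLR for every bit."""
    # Bit-to-factor messages start from the a priori LLRs.
    alpha = {(bit, f): prior[bit] for f, edges in graph.items() for bit, _ in edges}
    # Gaussian metrics -|y - y_hat|^2 / N0 do not change, so compute them once.
    metrics = {}
    for f, edges in graph.items():
        metrics[f] = [
            (pattern,
             -(y[f] - sum(c * x for (_, c), x in zip(edges, pattern))) ** 2 / n0)
            for pattern in itertools.product((+1, -1), repeat=len(edges))
        ]
    posterior = dict(prior)
    for _ in range(num_iters):
        beta = {}
        # Factor node update: max over bit patterns with x_j = +1 and x_j = -1.
        for f, edges in graph.items():
            for n, (bit, _) in enumerate(edges):
                best = {+1: float("-inf"), -1: float("-inf")}
                for pattern, metric in metrics[f]:
                    score = metric + sum(
                        0.5 * x * alpha[(other, f)]
                        for k, ((other, _), x) in enumerate(zip(edges, pattern))
                        if k != n)              # exclude the target bit's message
                    best[pattern[n]] = max(best[pattern[n]], score)
                beta[(f, bit)] = best[+1] - best[-1]
        # Bit node update: prior plus all incoming messages except the edge's own.
        posterior = dict(prior)
        for (f, bit), message in beta.items():
            posterior[bit] += message
        for (f, bit), message in beta.items():
            alpha[(bit, f)] = posterior[bit] - message
    return posterior
```

Hard decisions follow from the sign of the returned LLRs. In a turbo equalization setup, `prior` would carry the extrinsic information fed back by the outer decoder.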

Computational complexity
Let us now study the computational complexity of the BP-based detector. In the following discussion, we consider the "max-log" version of the algorithm, where the message updates and the final output are given by (11)-(13). At each factor node, we can observe from (11) that $|y_i - \hat{y}_i|^2 / N_0$ only needs to be computed at the first iteration and does not change in the following iterations. To calculate this term, for each $B_i$, 2 multiplications (one for the square operation and one for the division by $N_0$) and $F$ additions (to calculate $y_i - \hat{y}_i$) are needed. Here, $F$ is the factor node degree, which is also the number of nonzero PR channel taps. Since there are $2^{F-1}$ possible choices of $B_i$ for each of $x_j = +1$ and $x_j = -1$, the total number of operations to compute $|y_i - \hat{y}_i|^2 / N_0$ is approximately $F 2^F$ additions and $2^{F+1}$ multiplications. In addition to these operations, in every iteration (the first and all subsequent ones), $F 2^F$ additions (with the updated $\alpha_{ki}$) and $2^F$ comparisons are necessary. At each bit node, the message updates are relatively simple and only additions are needed (about $V$ additions per iteration). Therefore, the total computational complexity of the BP-based detector is mainly determined by the complexity at the factor nodes; it is linear in the number of iterations and exponential in the number of nonzero taps.
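The operation counts stated above can be tallied for a given factor node degree and iteration count. The sketch below simply encodes those approximate figures; the function name and the dictionary layout are hypothetical.

```python
def factor_node_operations(degree, num_iters):
    """Approximate operation counts at one factor node (max-log version)."""
    patterns = 2 ** degree                      # all choices of the bit set B_i
    return {
        # Squaring and dividing by N0, done once for every pattern.
        "multiplications": 2 * patterns,
        # F * 2^F additions for the metrics, then F * 2^F more per iteration.
        "additions": degree * patterns * (1 + num_iters),
        # 2^F comparisons per iteration for the two max operations.
        "comparisons": patterns * num_iters,
    }


# 2PR4 target (F = 6) with L = 10 iterations.
ops_2pr4 = factor_node_operations(6, 10)
```

None of these counts depends on the number of tracks, which is the point made in the complexity discussion.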
Compared with the ML/MAP detector, the complexity of the proposed algorithm is greatly reduced, and it becomes applicable to practical systems. For example, let us consider a PR target with the lengths of the ISI span and the ITI span selected to be $L_S$ and $L_T$, respectively. The number of tracks is $N$; that is, we want to decode $N$ data sequences jointly. For the 2D Viterbi decoder, the numbers of multiplications, additions, and comparisons per bit are all approximately $2^{N L_S}$. For the 1D Viterbi decoder (corresponding to the 1D equivalent channel model introduced in Section 3.2), they are approximately $2^{N(L_S - 1) + L_T}$. Considering the reduced-complexity versions of the Viterbi algorithm [8], for example, the M-algorithm, we need $2^N S$ multiplications/additions and an extra sorting complexity that is approximately linear in $S$ ($S$ is the number of surviving paths). Comparing these with the BP-based detector, we can see that our detector becomes more favorable as $N$ grows. We also note that the structure of the BP algorithm makes parallel implementation possible, which can potentially make the detection faster. Therefore, the proposed BP-based detector provides a promising low-complexity solution for multitrack recording channels with PR equalization.
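To make the growth with the number of tracks concrete, the approximate per-bit Viterbi counts quoted above can be tabulated as follows. The function names are hypothetical, and the figures are the order-of-magnitude estimates from the text rather than exact operation counts.

```python
def viterbi_2d_ops(num_tracks, isi_span):
    """Approximate operations per bit for the 2D Viterbi detector."""
    return 2 ** (num_tracks * isi_span)


def viterbi_1d_ops(num_tracks, isi_span, iti_span):
    """Approximate operations per bit for Viterbi on the equivalent 1D channel."""
    return 2 ** (num_tracks * (isi_span - 1) + iti_span)


def growth_table(track_counts, isi_span=3, iti_span=3):
    """List (N, 2D Viterbi estimate, 1D Viterbi estimate) for each track count."""
    return [(n, viterbi_2d_ops(n, isi_span), viterbi_1d_ops(n, isi_span, iti_span))
            for n in track_counts]
```

For instance, `growth_table([2, 5, 8])` shows both Viterbi estimates growing exponentially in N, whereas the BP factor-node cost depends only on the number of nonzero taps and the number of iterations.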

EXAMPLES
Let us present several performance evaluation results for different channel models and different parameters using the proposed detector. We note that for other suboptimal approaches [8,11,20,23], there exists a tradeoff between complexity and performance, and a direct comparison is generally hard to make. Therefore, we will only compare the error rate performance of the proposed detector with that of the optimal full-complexity ML detector. A rough comparison with other approaches can be obtained by observing the corresponding differences from the performance of the optimal detector.
First, let us look at the system performance with a varying number of iterations $L$. We consider a 2D partial response (2PR4) channel with N = 8 and η = 0.1. As shown in Figure 4, the system performance is closely related to the number of iterations. If $L$ is too small, the performance is severely degraded. On the other hand, if $L$ is too large, the additional performance improvement from further iterations is only marginal. The reason is as follows. With small $L$, each node cannot obtain enough extrinsic information from other nodes to provide an accurate estimate of the actual data. With large $L$, the exchanged messages will either approach the true LLR information if the factor graph is cycle-free, or become correlated with each other if there are cycles. In either case, the improvement that can be provided is limited. Since complexity increases linearly with $L$, we should choose it appropriately. In the following discussions, we will pick L = 10.
We next compare the performance of the BP-based detector with that of the full-complexity ML detector in Figure 5. In this example, we choose N = 5. It is clear that, if the ITI coefficient is not very large, the performance gap between the two detection algorithms is very small. However, when the ITI becomes severe, the gap becomes large. This is due to the existence of short cycles. As we can see from Figure 2, there are many length-4 cycles (a path that starts from one node, passes along four different edges, and then returns to the same node) in the factor graph of the 2D ISI channel. The four edges in a length-4 cycle have different weights. When η is small, only some of the edges are significant and the existence of short cycles does not degrade the performance very much (less than a dB). However, when η becomes large, these four edges become comparable to each other and the existence of short cycles starts to degrade the system performance considerably. Therefore, the proposed BP-based detector is suitable for channels with low to medium ITI levels. For high ITI levels or channel factor graphs with worse loopy conditions, an enhanced BP detector needs to be constructed; for example, one can trade off complexity against performance as done in [23].
In practical magnetic recording channels, because of front-end filtering and equalization, the noise becomes colored. To assess the performance of the proposed algorithm in this scenario, we apply the BP detection algorithm to a more practical storage system. We consider examples for two different systems: conventional media and bit-patterned media.
For conventional media, we give an example based on the Lorentzian channel model. With a signal-to-noise ratio of 15 dB, the system performance for different recording densities and different η is evaluated in Figure 6. It is clear that degradation due to ITI is much more severe at low recording densities.
For patterned media, we choose the following parameters as an example. The read pole head length and width are selected as 4 nm and 20 nm, respectively, and the shield-to-shield gap is chosen to be 18 nm. The head-medium spacing and medium thickness are both fixed at 10 nm, and the islands are assumed to be square with a side length of 15 nm. The bit pitch distance $T_a$ is fixed at 30 nm and the track pitch distance $T_b$ is varied depending on the recording density. Several results are shown in Figure 7. Again, we observe that the performance of the BP-based detector is very close to that of the full-complexity ML detector for large $T_b$ (low ITI levels) and is degraded for small $T_b$ (high ITI levels).
We also consider BP detection over magnetic recording channels with media noise. We again evaluate two different media. In Figure 8, we investigate the Lorentzian channel with different levels of position jitter and pulse width jitter. It is clear from the results that the detection performance is more sensitive to the position jitter. With the same level of position jitter, the error rate curves for different width jitter levels almost overlap with each other. However, when the position jitter increases, the performance is degraded greatly. We note that in this case, our detector can be further improved by performing noise prediction [29]. However, we do not pursue this topic in this paper. Similarly, we evaluate patterned media recording channels with position jitter in Figure 9 as an example, and observe similar behavior.
There is an additional advantage to our proposed detector. Since it is soft-input soft-output, we can easily incorporate turbo equalization if there is an outer code. Let us give two examples for the 2PR4 channel whose performance without coding is given in Figure 5. For the first example, we consider an outer turbo code in which two encoders with the same generator (23/31) in octal are concatenated in parallel using a random interleaver of length 10,000. The code rate is fixed at 8/9 and the number of turbo decoding iterations is 15. In the simulation, we still choose L = 10 and perform $L$ detector iterations before carrying out one turbo equalization iteration. The performance improvement at η = 0.2 is demonstrated in Figure 10 for different SNR values (5.0 dB to 5.4 dB). We observe that the use of turbo equalization helps to further improve the data recovery performance, especially at SNRs over 5.2 dB. We also observe that the gain achieved by using turbo equalization is significant for the first few iterations and then tends to diminish as the number of iterations increases. This behavior agrees with most applications of turbo equalization. Generally, the amount of improvement that can be gained using turbo equalization depends on the specific detector. One way to predict this is through EXIT chart analysis [30]. If the slope of the detector transfer function is steep, more turbo equalization iterations are helpful. Otherwise, more reliable feedback from the decoder makes little difference to the detector output.
For our last example, we consider a simple convolutional code with generator (23/31) in octal as the outer code, and the code rate is still 8/9. Let us look at the case of η = 0.5 and N = 5, where we know from Figure 5 that the uncoded system suffers from severe performance loss at this ITI level. We illustrate how turbo equalization helps to improve the system performance in Figure 11. Both the bit error rate (BER) and the frame error rate (FER) are shown, where the frame length is 1000 and SNR per bit, instead of SNR, is used to make a fair comparison between the uncoded and the coded systems. We can see that, with just one or two turbo equalization iterations, both BER and FER are improved significantly (i.e., BER under $10^{-6}$ and FER under $10^{-4}$).

CONCLUSIONS
In this work, we have considered belief propagation based detection algorithms for multihead multitrack recording systems. By developing a factor graph representation of the channel and exploiting the sparseness of the graph, we have provided a low-complexity solution to the data recovery problem over multitrack recording channels. Compared to ML-type detectors, which have a complexity exponential in both the number of tracks and the number of channel taps, the proposed detector has a complexity that is exponential only in the number of nonzero interfering taps and linear in the number of iterations. At low to medium ITI levels, it suffers only a small performance loss compared to ML-type detectors. Furthermore, it can be employed with turbo equalization for recording systems with an outer code due to its soft-input soft-output nature.

Figure 1: Channel coefficients of the equivalent 1D time-varying channel model for 2PR4 target at the kth time instant, N = 6.

Figure 3: Diagram of the message-passing process.

Figure 6: Performance for conventional media storage system (Lorentzian channel).

Figure 8: Performance for conventional media storage system with media noise levels.

Figure 9: Performance for bit-patterned media storage system with media noise levels.

Figure 10: Performance improvement by employing turbo equalization, turbo coded system.