EURASIP Journal on Applied Signal Processing 2002:12, 1415–1426
© 2002 Hindawi Publishing Corporation

Noncoherent Multiuser Communications: Multistage Detection and Selective Filtering

We consider noncoherent multiuser detection techniques for a system employing nonlinear modulation of nonorthogonal signals. Our aim is to investigate near-optimum noncoherent multiuser-detection techniques that utilize the received signal structure while retaining reasonable complexity. Near-optimum approximations of the maximum-likelihood detector are investigated where the signal structure is reflected in the approximation techniques explored. Several implementations of noncoherent soft interference cancellers are proposed and investigated, each of which exploits the signal structure in a specific way. We propose a class of detectors that employ selective filtering, a technique that exploits the a priori information that each user selects one of M signals for transmission. We show that selective filtering offers improved performance over the noncoherent counterparts of the existing near-optimum multiuser detectors. Both deterministic and blind adaptive implementations of selective filtering are considered. Numerical comparisons are provided to demonstrate the near-optimum performance of the proposed detectors.


INTRODUCTION
Nonlinear M-ary modulation with noncoherent detection is often necessary when phase estimation is difficult due to rapid changes in the channel conditions [1]. In a multiuser setting, the correlated waveforms that are used to transmit the users' messages give rise to interference issues since the receiver observes the superposition of all users' transmissions. Similar to its coherent, linear modulation counterpart [2], the maximum-likelihood (ML) detector for multiuser noncoherent communications with nonlinear modulation, which estimates all users' messages jointly, has prohibitive complexity. ML multiuser detection, for both linear and nonlinear modulation, is NP-hard in the number of users; therefore, no efficient solution methodology is known [3].
For noncoherent systems, the complexity of optimal detection has directed attention toward suboptimal interference suppression techniques [4,5,6,7,8]. The pioneering work [4] introduced a pseudo-linear representation in which the signal space is spanned by MK signals corresponding to the M possible messages for each of the K users. This approach led naturally to two-step detectors in which decorrelative [4,7] or MMSE [6] linear filtering for user separation is followed by noncoherent single-user detection. Alternatively, [5,8] employed decision-directed methods that use prior decisions to suppress the interference. The approach of [5] is to decorrelate against all possible interfering signals. Prior decisions reduce the space of possible interfering signals in that if a decision is made that user k transmitted signal m, then there is no need to decorrelate against the other M − 1 possible transmissions of user k. In [8], the approach is to decorrelate against the known interfering signals identified by previous decisions. The resulting output is then passed to a bank of single-user matched filters followed by a maximum magnitude detector to determine the symbol.
In this paper, we follow the general spirit of [5,8] and examine approaches that combine filtering with decision-directed methods. We propose low-complexity, suboptimal noncoherent detectors that take advantage of certain a priori information available regarding the multiuser signaling. We incorporate this structure into the algorithms of three detector classes: constrained detectors, soft interference cancellers, and selective filtering detectors [9]. The constrained detectors embed maximum amplitude information for the received signal components as constraints for nonlinear programming relaxations of the ML multiuser detector. In the class of interference cancellers, we explore three variations that arise due to the noncoherent nature of the detection scheme: the serial, clipped, and parallel soft-IC (Interference Canceller). These soft-IC detectors employ the same fundamental multistage detection approach as their linear modulation and coherent detection counterparts, for example, [10]. Each of these cancellers embeds the multiuser signal structure in its detection algorithm in a different way. We further improve the performance of the noncoherent multiuser detectors by exploiting additional information in the form of selective filtering. The selective filters use the a priori information that the desired user selects for transmission only one of the M messages available in its constellation. Unlike the nonselective filters of [4,6,7], selective filtering for the desired user attempts to suppress only the possible signals of the interfering users. For the most part, our results show that the soft-ICs yield better performance than the nonselective decorrelating and MMSE filters, especially in near-far scenarios. To illustrate the feasibility of the selective filters in scenarios with limited information regarding the interferers, for example, a CDMA downlink, a blind adaptive implementation of the selective MMSE detector is also presented.
The rest of this paper is organized as follows. Section 2 establishes notation for the additive-noise, synchronous CDMA system model and discusses the ML detector. Section 3 discusses prior work on nonselective decorrelator and MMSE detectors proposed in the literature. This section also introduces the constrained detectors as well as the noncoherent detectors based on soft-ICs. Section 4 applies selective filtering to some of the detectors discussed in Section 3. This section also discusses a blind adaptive implementation of the selective MMSE detector as well as a successive interference suppression (SIS) scheme. Section 5 discusses the numerical results, and concluding remarks are presented in Section 6. The Appendix contains developments for certain results in Sections 3 and 4.

SYSTEM MODEL AND OPTIMAL DETECTION
We consider a synchronous CDMA system with K active users, processing gain N, and a signaling scheme where each user transmits one of M signals. A discrete-time model can be obtained by projecting the received signal onto an N-dimensional orthonormal basis. Using the pseudo-linear representation introduced in [4], we view the signal space as being an expanded signal space spanned by the MK signals: M messages for each of the K users. We concentrate on cases where the possible waveforms for all messages of all users are linearly independent. The channel is assumed to be additive white Gaussian noise (AWGN), and the receiver observes a superposition of the K signals.
For user k, the N×1 vector $s_{k,m}$ denotes the signature corresponding to message m, while the N×M matrix $S_k \triangleq [s_{k,1} \cdots s_{k,M}]$ denotes the signature matrix. It is assumed that the signatures in $S_k$ are complex-valued, have unit norm, and are time limited. The amplitude and phase of message m of user k are denoted by $A_{k,m}$ and $\phi_{k,m}$, with corresponding diagonal matrices $A_k = \mathrm{diag}[A_{k,1}, \ldots, A_{k,M}]$ and $\Phi_k = \mathrm{diag}[e^{j\phi_{k,1}}, \ldots, e^{j\phi_{k,M}}]$. The phases are assumed to be independent and uniformly distributed over [0, 2π]. Let $m_k$ be the transmitted message of user k. We define the vector

$$ b_k = e_{m_k}. \tag{1} $$

We note that $b_k$ belongs to the set $F = \{e_1, \ldots, e_M\}$, where $e_m \in \{0, 1\}^M$ has mth entry equal to one and all other entries equal to zero. It is assumed that the M messages of a user are equiprobable. The received vector at the output of the bank of chip-matched filters can be written as

$$ r = \sum_{k=1}^{K} S_k A_k \Phi_k b_k + n, \tag{2} $$

where n is an appropriately sized AWGN vector with mean zero and covariance matrix $\sigma^2 I$. Further, r can be expressed in terms of the MK-length vector $b = [b_1^T \cdots b_K^T]^T$ as

$$ r = S A \Phi b + n, \tag{3} $$

where $S = [S_1 \cdots S_K]$, $A = \mathrm{diag}[A_1, \ldots, A_K]$, and $\Phi = \mathrm{diag}[\Phi_1, \ldots, \Phi_K]$. The aim of the multiuser detector is to recover the message vector $b \in F^K$. For a given b, let the vector $\phi = [e^{j\phi_1} \cdots e^{j\phi_K}]$ represent the phases corresponding to the K nonzero entries of b. With known A and Φ, the ML estimate of b given r is the solution to the optimum multiuser detector [2]. Writing $f(r \mid b, \Phi)$ for the likelihood of r, the estimate may be written as

$$ \hat{b} = \arg\max_{b \in F^K} f(r \mid b, \Phi). \tag{4} $$

In the AWGN channel, the optimization (4) becomes the familiar distance minimization problem

$$ \hat{b} = \arg\min_{b \in F^K} \| r - S A \Phi b \|^2. \tag{5} $$

Note that (4) and (5) describe a coherent detector since knowledge of Φ is assumed. Next, consider the case where the amplitudes A are known at the receiver as in (4), but both Φ and b are unknown. Since each element of φ must belong to the unit circle $C_1$, the joint ML estimate of b and φ is

$$ (\hat{b}, \hat{\phi}) = \arg\min_{b \in F^K} \; \min_{\phi \in C_1^K} \| r - S A \Phi b \|^2. \tag{6} $$

The implementation of this detector requires an exhaustive search over possible vectors b. Further, since each element of φ lies on the unit circle, the inner optimization in (6) is over a nonconvex set, and hence there is no guarantee of finding its global minimum.
However, relaxing the constraints and allowing each element of φ to lie on the unit disk $\bar{C}_1$ yields a convex set for the inner optimization. The resulting detector is

$$ (\hat{b}, \hat{\phi}) = \arg\min_{b \in F^K} \; \min_{\phi \in \bar{C}_1^K} \| r - S A \Phi b \|^2. \tag{7} $$

This detector will be referred to as the joint detector. The joint detector effectively assumes that A characterizes the maximum amplitudes of the signals. Both the detector (6) and the joint detector (7) are generalized likelihood ratio test (GLRT) detectors that differ in their assumptions regarding the received signal amplitudes. In particular, when all elements of A become large, the maximum amplitude constraint of the joint detector becomes trivial and the joint detector approaches the GLRT detector in [8], which treats the signal amplitudes as unknown.
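As a worked illustration of the circle-to-disk relaxation, consider the special case of a single user (K = 1) and a fixed candidate message m, so that the inner optimization is over one complex scalar c. This is a simplified sketch under that assumption and does not cover the coupled multiuser case; the symbol $\rho_m$ is introduced here only for convenience, and the unit-norm property of the signature is used.

```latex
\begin{align*}
\| r - A_{1,m}\, s_{1,m}\, c \|^2
  &= \|r\|^2 - 2 A_{1,m}\,\mathrm{Re}\{ c^{*} \rho_m \} + A_{1,m}^2 |c|^2,
  \qquad \rho_m \triangleq s_{1,m}^H r, \\
  &= A_{1,m}^2 \,\bigl| c - \rho_m / A_{1,m} \bigr|^2 + \|r\|^2 - |\rho_m|^2 . \\[4pt]
\text{Unit circle } (|c| = 1):\quad
  \hat{c} &= \rho_m / |\rho_m| \quad (\rho_m \neq 0), \\
\text{Unit disk } (|c| \le 1):\quad
  \hat{c} &= \rho_m / \max\bigl( A_{1,m},\, |\rho_m| \bigr).
\end{align*}
```

In both cases the minimizer is the point of the constraint set closest to the unconstrained solution $\rho_m / A_{1,m}$. On the disk this projection is always unique because the set is convex, whereas on the circle it is undefined when $\rho_m = 0$.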

NONSELECTIVE DETECTION
Recently, several detection methods with reasonable complexity have been formulated that approximate the solution of the NP-hard ML multiuser-detection problem [11,12]. Further results using nonlinear programming techniques to approximate the ML multiuser detector for linear modulation can be found in [13,14,15]. In this section, similar to the linear modulation counterparts considered in [11,12,13], we relax the constraint set of the ML multiuser-detection problem. We represent the structure of the signal in the form of a constraint set and explore various detectors with the same objective function yet different constraint sets.

Prior work
To place the constrained multiuser detectors in proper context, we start by examining the decorrelative and the MMSE two-stage detectors proposed in [4,6,7]. These detectors combine two stages: linear filtering and single-user detection. Let $z = \widehat{Ax}$ denote the estimate of the desired vector $Ax$, where $x = \Phi b$, and let the output of the matched filters be

$$ y = S^H r = R A x + S^H n, \tag{8} $$

where $R = S^H S$ is the cross-correlation matrix. The first stage of the decorrelative detector [4] applies the decorrelating filter $R^{-1}$ to y to obtain

$$ z = R^{-1} y. \tag{9} $$

If the signals are linearly dependent, we can replace $R^{-1}$ by the Moore-Penrose generalized inverse [16]; however, for simplicity, we will assume that the signals are linearly independent.
In the first stage of the MMSE detector, we apply the matrix transformation $C^H$ to the output r to obtain the estimate $z = \widehat{Ax} = C^H r$ that minimizes the mean square error (MSE) $E\{\| Ax - C^H r \|^2\}$. The solution is given in [6] as

$$ C = (S E S^H + \sigma^2 I_N)^{-1} S E, \tag{10} $$

where $E = (1/M) A^2$, and $I_n$ is the identity matrix of dimension n. Equivalently, if the MMSE filter is applied to the matched filter output y in (8) instead of r, then $z = \tilde{C}^H y$, where

$$ \tilde{C} = (R + \sigma^2 E^{-1})^{-1}. \tag{11} $$

Note that in the case of linear modulation, $E = A^2$ and the familiar expression $z = (R + \sigma^2 A^{-2})^{-1} y$ is obtained [2,17]. For both decorrelative and MMSE filtering, the filter output z is an MK-length vector, that is, the concatenation $z = [z_1^T \cdots z_K^T]^T$ of the M-length component vectors $z_k$. In the second stage of these detectors, we follow [6,7], which suggest using the kth component vector $z_k$ as a decoupled decision statistic to obtain an estimate $\hat{m}_k$ of the kth user's message. The simplest such method is the maximum magnitude (MM) rule, denoted $\mu(z_k)$, and defined by

$$ \hat{m}_k = \mu(z_k) = \arg\max_{m \in \{1, \ldots, M\}} |z_{k,m}|. \tag{12} $$

In the event of ties, the MM rule arbitrarily chooses one of the maximizing entries. For orthogonal signaling over a single-user AWGN channel, the MM rule is optimum; however, since the decorrelative and MMSE filters introduce correlation in the additive noise and/or interference components of $z_k$, the MM rule is merely a heuristic. Single-user decoding rules for user k that exploit the correlation structure are developed in [4,6].
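The second-stage MM rule can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: the function names `mm_rule` and `detect_all` are hypothetical, messages are indexed from 0 rather than 1, and ties are broken in favor of the lowest index (the text allows any tie-breaking choice).

```python
def mm_rule(z_k):
    """Return the 0-based index of the largest-magnitude entry of z_k."""
    best = 0
    for m in range(1, len(z_k)):
        # Strict inequality: on a tie the earlier index is kept (an arbitrary choice).
        if abs(z_k[m]) > abs(z_k[best]):
            best = m
    return best


def detect_all(z, M):
    """Split the MK-length filter output z into K blocks of length M
    and apply the MM rule to each block independently."""
    K = len(z) // M
    return [mm_rule(z[k * M:(k + 1) * M]) for k in range(K)]


# Example filter output for K = 2 users with M = 4 messages each.
z = [0.1 + 0.2j, -0.9j, 0.3, 0.05,
     0.7 + 0.7j, 0.2, -0.1j, 0.4]
decisions = detect_all(z, M=4)
print(decisions)
```

Because only magnitudes are compared, the unknown phase of each entry plays no role in the decision, which is what makes the rule noncoherent.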

Constrained noncoherent multiuser detection
Our starting point for the relaxations of the constraints is the ML detector (6) in which the amplitudes A are known but the symbols b and phases φ, or equivalently Φ, are unknown. In this case, we estimate them jointly as $x = \Phi b$. We define the set

$$ G = \{ c\, e : c \in C_1,\ e \in F \} \tag{13} $$

and observe that $x \in G^K$. Rewriting (6), the jointly optimal estimate is

$$ \hat{x} = \arg\min_{x \in G^K} \| r - S A x \|^2. \tag{14} $$

We observe that the minimization (14) is difficult because $G^K$ is a nonconvex constraint set. Due to the high complexity associated with the ML detector, reduced complexity approximations can be obtained by solving a relaxation of the original problem [18]. If we relax the constraint set such that the new constraint set is convex, then the optimizer of the quadratic objective function in (14) can be found efficiently via a variety of nonlinear programming methods. This observation is the key towards the formulation of the approximate solutions presented in the remainder of this section.
We start with the case where the vector x containing all the users' messages is constrained. Although the constraint $x^H x = K$ satisfied by every $x \in G^K$ is nonconvex, it can be relaxed to the convex constraint $\|x\|^2 \le K$, which gives

$$ \hat{x} = \arg\min_{\|x\|^2 \le K} \| r - S A x \|^2. \tag{15} $$

The convex set $\|x\|^2 \le K$ can be thought of as the interior of an MK-dimensional hypersphere of radius $\sqrt{K}$. The solution of the above problem, derived in [11,12] in the context of linear modulation, is the generalized MMSE detector

$$ \hat{x} = (A R A + \lambda^* I)^{-1} A y, \tag{16} $$

where $\lambda^*$ is the optimum Lagrange multiplier corresponding to the global constraint (15). Note that (16) reduces to the MMSE solution [17] for $\lambda^* = \sigma^2$. We apply the MM rule to the filter output $\hat{x}$ to obtain the symbol decision $\hat{m}_k = \mu(\hat{x}_k)$. The resulting detector, consisting of the filter (16) followed by the MM rule, will be referred to as the global constrained detector. Now, we consider local constraints for each individual user k.
If we relax the local constraint $x_k^H x_k = 1$ to be the convex set $x_k^H x_k \le 1$, which represents the interior of an M-dimensional hypersphere of unit radius, then the estimate $\hat{x}$ is the solution to

$$ \hat{x} = \arg\min_{x} \| r - S A x \|^2 \quad \text{subject to } x_k^H x_k \le 1,\ k = 1, \ldots, K. \tag{17} $$

The solution of (17) is (see Appendix A)

$$ \hat{x} = (A R A + \Lambda^*)^{-1} A y, \tag{18} $$

where $\Lambda^*$ is a diagonal matrix containing the Lagrange multipliers. We then apply the MM rule described in (12) to the kth component vector $\hat{x}_k$ to obtain $\hat{m}_k = \mu(\hat{x}_k)$. This detector will be referred to as the local constrained detector. Note that the local constrained detector is not the same as the joint detector (7). Although both detectors are obtained by enforcing a maximum amplitude constraint on each user k, the joint detector searches only over vectors x for which each component vector $x_k$ is of the form $a e^{j\phi_k} b_k$ where $b_k \in F$ and $0 \le a \le 1$.
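The filter structure of the constrained detectors can be motivated by a short stationarity argument. The following is a sketch rather than the derivation of Appendix A: it assumes that A is a real diagonal matrix, uses $y = S^H r$ and $R = S^H S$, and omits the complementary-slackness conditions that determine the multipliers. The per-user multipliers $\lambda_k$ are notation introduced here.

```latex
\begin{align*}
\text{Global constraint:}\quad
L(x, \lambda) &= \| r - S A x \|^2 + \lambda \,( x^H x - K ), \qquad \lambda \ge 0, \\
\frac{\partial L}{\partial x^{*}} = 0
  \;&\Longrightarrow\; -A S^H ( r - S A x ) + \lambda x = 0 \\
  \;&\Longrightarrow\; ( A R A + \lambda I )\, x = A y . \\[4pt]
\text{Local constraints:}\quad
L(x, \lambda_1, \ldots, \lambda_K) &= \| r - S A x \|^2
   + \sum_{k=1}^{K} \lambda_k \,( x_k^H x_k - 1 ), \qquad \lambda_k \ge 0, \\
\frac{\partial L}{\partial x^{*}} = 0
  \;&\Longrightarrow\; ( A R A + \Lambda )\, x = A y,
  \qquad \Lambda = \mathrm{diag}[ \lambda_1 I_M, \ldots, \lambda_K I_M ] .
\end{align*}
```

The only difference between the two cases is that a single scalar loading $\lambda I$ is replaced by a block-diagonal loading with one multiplier per user, which lets the regularization adapt to each user's received power.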
Note that there may be other suboptimal schemes with different constraints that yield better performance with lower complexity compared to the detectors proposed here. Also, it is not clear whether using a more adequately constrained search space is better than the expanded search space we have considered with virtually all magnitudes and phases that satisfy a maximum energy bound (local and global constraints). These issues require further research.

Soft-interference cancellation
Multistage detectors, also referred to as multistage interference cancellers, fall in the class of decision-directed multiuser detectors and are viable alternatives to popular linear detectors such as the decorrelator and MMSE detectors, due to their excellent BER performance and reasonably low complexity [2]. Several multistage coherent detectors for linear modulation have been proposed in the literature, including versions using serial and parallel implementations and versions using hard and soft bit estimates [10,12,19,20,21,22,23]. The contributions of this section are: first, the detectors proposed here are noncoherent realizations of the decision directed, nonlinear detectors proposed in [10,22]; second, new techniques are proposed to incorporate the signal structure into the decision algorithms. In particular, we propose three detectors: the serial soft-IC, the clipped soft-IC, and the parallel soft-IC.
In this section, a stage refers to a single pass through the detectors of all users. All implementations here use the decorrelator outputs in the first stage, followed by multiple stages of processing of these outputs. The goal, once again, is to obtain $\hat{x}$, the estimate of all transmitted messages. To obtain the estimate $\hat{x}_k$ for the kth user's message, soft estimates are used to reconstruct the interference that is then subtracted from that user's matched filter output. The differences between these detectors arise in their implementation, for example, serial or parallel, as well as in the types of decisions that are communicated between the users' detectors.
In the serial soft-IC detector, the first step is to determine sequentially the estimates $\tilde{x}_{k,1}, \ldots, \tilde{x}_{k,M}$ of the M possible messages of user k. In the second step, only the entry $\tilde{x}_{k,m}$ with the largest magnitude is retained while the other M − 1 entries are forced to 0. This estimated and mapped vector for user k is denoted by $\hat{x}_k$. The mapping ensures that $\hat{x}_k$ has a structure similar to that of $x_k$. Following from (8), the estimate for message m of user k is given by (19) (see Appendix B), where the components on the right side of (19) are (from left to right) the matched filter output, the estimates of the previous k − 1 users' messages, the previously detected estimates of messages of user k, the not-yet-detected messages of user k, and the not-yet-detected messages of the other users. After the M entries of user k are determined, the estimated vector $\tilde{x}_k$ is then mapped to $\hat{x}_k$ using the maximum magnitude rule

$$ \hat{x}_{k,m} = \begin{cases} \tilde{x}_{k,m}, & m = \mu(\tilde{x}_k), \\ 0, & \text{otherwise}. \end{cases} \tag{20} $$

This vector estimate $\hat{x}_k$ is then used by user k + 1 in (19) for estimating its vector, and so on. The whole procedure can be repeated for multiple stages to refine the estimates. The implementation of the clipped soft-IC detector employs the same first step (19). In the second step, however, we incorporate the relaxed constraint $|x_{k,i}| \le 1$ by clipping in accordance with the rule (21), which limits the magnitude of each entry to at most one. Thus, the difference between the serial soft-IC and the clipped soft-IC lies in the type of decision fed between the users. Lastly, the parallel soft-IC differs from the serial soft-IC only in the first step. Instead of serial estimation of each element $\tilde{x}_{k,m}$, the parallel soft-IC estimates all elements of $\tilde{x}_k$ at once. From (2), we can write the received signal r in terms of the components $x_k = \Phi_k b_k$ as

$$ r = \sum_{k=1}^{K} S_k A_k x_k + n. \tag{22} $$

The matched filter vector output for user k is

$$ y_k = S_k^H r = \sum_{j=1}^{K} R_{kj} A_j x_j + S_k^H n, \tag{23} $$

where $R_{kj} = S_k^H S_j$.
Therefore, $x_k$ can be estimated as in (24), where the components on the right side of (24) are (from left to right) the matched filter output, the k − 1 processed users with their estimated and mapped vectors $\hat{x}_j$, and the users that are yet to be processed. In the second step, the parallel soft-IC obtains the users' message decisions using the same maximum magnitude mapping rule (20) as the serial soft-IC detector.
Since the serial soft-IC estimates message elements x k,m one at a time, its granularity is finer than that of the parallel soft-IC which estimates the entire vector x k in one step. Hence, it is to be expected that the serial soft-IC will perform slightly better. Also, note that the serial and parallel soft-ICs can be implemented without the knowledge of the individual amplitudes A k,m . Instead of estimating just x k,m in (19), the element A k,m x k,m can be jointly estimated, followed by (20). Since the MM rule uses only the relative magnitudes, the individual amplitudes do not have to be known explicitly. It is easy to observe from (19) and (21) that this is not the case for the clipped soft-IC for which amplitude values must be known.
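The two-step structure of the serial and clipped soft-ICs can be sketched as follows. This is an illustrative sketch built from the verbal description of the detectors, not a transcription of equations (19) and (21): the function names are hypothetical, indices are 0-based, the soft estimate subtracts every other signal's current estimate from the matched filter output and divides by the amplitude, and the first stage uses normalized matched filter outputs instead of the decorrelator outputs used in the paper, purely to keep the example free of matrix inversion.

```python
def mm_map(block):
    """Keep the largest-magnitude entry of a block and zero the others."""
    best = max(range(len(block)), key=lambda i: abs(block[i]))
    return [v if i == best else 0j for i, v in enumerate(block)]


def clip_map(block):
    """Limit each entry to magnitude at most one while keeping its phase."""
    return [v if abs(v) <= 1 else v / abs(v) for v in block]


def soft_ic_stage(y, R, A, x, M, mapping="mm"):
    """One serial pass over all users.

    y: matched filter outputs (length MK), R: MK x MK cross-correlations,
    A: amplitudes (length MK), x: current estimates (length MK).
    Returns the refined estimates and the per-user message decisions.
    """
    x = list(x)
    n = len(y)
    decisions = []
    for k in range(n // M):
        for m in range(M):
            i = k * M + m
            # Reconstruct the interference from the latest available estimates.
            interference = sum(R[i][j] * A[j] * x[j] for j in range(n) if j != i)
            x[i] = (y[i] - interference) / A[i]
        block = x[k * M:(k + 1) * M]
        decisions.append(max(range(M), key=lambda t: abs(block[t])))
        # Second step: map the soft block before passing it to the next user.
        x[k * M:(k + 1) * M] = mm_map(block) if mapping == "mm" else clip_map(block)
    return x, decisions


# Noise-free toy example: K = 2 users, M = 2 messages, all cross-correlations 0.1.
M = 2
R = [[1.0 if i == j else 0.1 for j in range(4)] for i in range(4)]
A = [1.0] * 4
x_true = [1j, 0j, 0j, -1 + 0j]          # user 0 sends message 0, user 1 sends message 1
y = [sum(R[i][j] * A[j] * x_true[j] for j in range(4)) for i in range(4)]
x0 = [y[i] / A[i] for i in range(4)]    # assumed first stage (matched filter)

x_mm, dec_mm = soft_ic_stage(y, R, A, x0, M, mapping="mm")
x_clip, dec_clip = soft_ic_stage(y, R, A, x0, M, mapping="clip")
print(dec_mm, dec_clip)
```

The `mm` mapping passes a single nonzero entry per user to the next detector, mirroring the structure of $x_k$, while the `clip` mapping passes all M soft values with bounded magnitude. Running `soft_ic_stage` repeatedly on its own output corresponds to adding further stages.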

SELECTIVE FILTERING
To detect whether user k transmitted message m, the nonselective detectors of Section 3 consider all possible signals of interferers j ≠ k, as well as the other M − 1 possible signals of user k, as sources of interference. However, it is known a priori that user k transmits precisely one of its M messages. Therefore, for m ∈ {1, . . . , M}, one and only one of the $x_{k,m}$ is nonzero for each user k ∈ {1, . . . , K}. Selective filtering makes use of this observation. Note that if the desired user's signatures (associated with the M messages) are mutually orthogonal, then the selective and nonselective detectors for this user yield identical performance. In this section, we will examine selective implementations of the decorrelator detector, the MMSE detector and a blind implementation thereof, and the soft interference canceller. To further enhance the performance of the selective detectors, an SIS scheme is also proposed.
In the following, d(i) will denote the ith element of a vector d, while D(i, j) and D(i, :) will denote the (i, j)th element and ith row of a matrix D, respectively. For notational convenience, all vectors and matrices associated with selective filtering will be denoted by a bar above the entry. Without loss of generality, we assume k = 1 to be the desired user; thus, we focus on the selective detection of $x_{1,m}$. Specifically, $\bar{y}_m$ is constructed from $y_{1,m}$ and the M(K − 1) entries of y belonging to the interferers, the selective signature set $\bar{S}_m$ is constructed from $s_{1,m}$ and the M(K − 1) columns of S belonging to the interferers, and the matrix $\bar{A}_m$ is constructed in a manner similar to $\bar{S}_m$.
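The construction of the selective quantities amounts to an index selection, which can be sketched as follows. The helper names `selective_indices` and `select_columns` are hypothetical, indices are 0-based, and placing the desired user's signature first is an assumption chosen to match the later statement that the desired filter is the first column of the selective filter matrix.

```python
def selective_indices(m, M, K, desired=0):
    """Indices (into the MK signals) kept when testing message m of the desired user:
    the desired user's mth signal followed by all M signals of every interferer."""
    keep = [desired * M + m]
    for k in range(K):
        if k != desired:
            keep.extend(range(k * M, (k + 1) * M))
    return keep


def select_columns(S, idx):
    """Extract the listed columns from a matrix stored as a list of rows."""
    return [[row[j] for j in idx] for row in S]


# K = 3 users with M = 2 messages each; hypothesis: desired user sent message 1.
idx = selective_indices(m=1, M=2, K=3)
print(idx)

# Toy 2 x 6 "signature matrix" whose entries simply label the column number.
S = [[10 * r + c for c in range(6)] for r in range(2)]
S_bar = select_columns(S, idx)
print(S_bar)
```

The selected set always has 1 + M(K − 1) members, compared with MK for the nonselective detectors, so each hypothesis removes M − 1 directions from the subspace that the filter must suppress.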

Selective decorrelation
To formulate the selective decorrelator, we define $H_m$ as the hypothesis that the first user transmitted signal $s_{1,m}$. Our problem is to determine which hypothesis among $\{H_1, \ldots, H_M\}$ is correct. From (22), the received signal under hypothesis $H_m$ is given by (25), and the decorrelating transformation to suppress all users k ≠ 1 is given by $(\bar{S}_m^H \bar{S}_m)^{-1} \bar{S}_m^H$ [2]. Thus, we first construct the decorrelated output under each hypothesis, followed by the selective transformation that yields the vector v. Note that v is also the estimate of $A_1 x_1$; therefore, we apply the MM rule to v to obtain $\hat{m}_1$.

Selective MMSE detection
The MMSE detector is popular due to its amenability to adaptive implementation. Blind adaptive implementations of detectors are useful since they only require the signature and timing of the desired user. They are especially attractive for the CDMA downlink where, due to the dynamic environment, it may be difficult for a mobile user to obtain accurate information regarding signatures and timings of other active users in the system [24,25,26]. In this section, we will first discuss the selective version of the MMSE filter (10) and then we will formulate a blind adaptive implementation. The selective MMSE filter for the first user is obtained using an approach similar to the one used to obtain the selective decorrelator; specifically, we apply an MMSE transformation to the received signal (25) under each hypothesis m ∈ {1, . . . , M}. From (10), the selective MMSE filter $\bar{C}_m$ corresponding to the mth signature is given by (29). The filter vector $\bar{c}_{1,m}$ corresponding to the mth signature of the first user is the first column of $\bar{C}_m$, that is, $\bar{c}_{1,m} = \bar{C}_m(:, 1)$.
Now we will discuss a blind adaptive implementation of (29). A blind adaptive implementation of the noncoherent nonselective MMSE detector was proposed in [6]. We extend that algorithm to implement a blind adaptive version of the selective MMSE detector. Since the first user is the user of interest, the filter coefficients of only this user are adaptively varied. Representing the mth diagonal entry of E by $E_m$, the filter vector $\bar{c}_{1,m}$ corresponding to the mth signature of the first user can be written explicitly; note that $\bar{c}_{1,m}$ corresponds to the first column of $\bar{C}_m$. Expressed in terms of a quantity denoted $\bar{r}_m$, the filter takes the form (32). Note that in the nonselective version in [6], the filter (32) involves the term $rr^H$, which is readily available. In contrast, only a subset of that information, $\bar{r}_m$, is needed here, and it cannot be obtained explicitly due to a lack of knowledge of the signature set $\bar{S}_m$.
This problem can be circumvented by rewriting $\bar{r}_m$ in a form that the receiver can compute. Since the receiver knows the signatures of the desired user, it can construct $\tilde{S}_m \tilde{E}_m \tilde{S}_m^H$ and extract $\bar{r}_m$ from the received signal r. Extending the stochastic gradient algorithm in [6], the adaptation for the mth filter vector may then be expressed as in (35). We use the Normalized Squared Error (NSE) criterion [6] to study the convergence properties of the filter coefficients; the NSE at the nth iteration is defined in (36). Note that since the structures of the nonselective and selective MMSE detectors are similar, the convergence analysis of the former [6] can be easily extended to the latter to obtain the upper bound on the step size µ to ensure convergence.

Selective soft-IC
Next, we consider the selective implementation of the serial soft-IC scheme described in Section 3.3. We use a selective version of the soft estimate (19), given by (37); then (20) is applied to obtain $\hat{x}_{k,m}$. Note that, in going from (19) to (37), selective filtering has suppressed the terms containing the other M − 1 messages of user k. Once all M soft outputs of user k are obtained in this manner, the MM rule is applied to obtain the message decision $\hat{m}_k$.

Selective filtering with SIS
Although it is expected that the selective filters will yield performance improvements over their nonselective counterparts, further improvements are possible through the use of successive decisions. We call the resulting technique selective filtering with SIS. For a user whose message has already been decoded, we need only to suppress the signal corresponding to the decoded message. This is analogous to the successive interference cancellation (SIC) scheme in [27] where a decoded user's signal is reconstructed and explicitly subtracted from the received signal r. The algorithm for SIS is as follows:
(1) for each user k, select the maximum-magnitude matched filter output among those corresponding to the M messages of user k;
(2) sort the users in order of decreasing maximum magnitude;
(3) for each user k ∈ {1, . . . , K} in the sorted list,
    (a) perform selective filtering for the kth user;
    (b) assume the message $\hat{m}_k$ of the kth user is correct, and retain only signature $s_{k,\hat{m}_k}$ in the selective filter matrix used to detect the message for user k + 1.
In the above algorithm, we can potentially employ any of the selective detectors proposed in Section 4. We will present performance results for the selective decorrelator with SIS in Section 5. Note that the selective decorrelator with SIS and the noncoherent decision feedback detector proposed in [5] share the similarity that for the users whose messages have been decoded, both schemes decorrelate only against the signatures corresponding to the decoded messages. However, they differ in that [5] performs nonselective decorrelation against the M − 1 signatures of the desired user and it uses a second-stage single-user GLRT detector instead of the MM rule for symbol decisions.
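The control flow of the SIS algorithm can be sketched as below. This sketch covers only the ordering and the bookkeeping of which interfering signatures remain in the filter; the selective detector itself is supplied by the caller through the hypothetical callback `detect(user, interferers)`. In the example, a plain matched filter MM decision stands in for the selective decorrelator so that the code stays self-contained.

```python
def sis_order(y, M):
    """Steps (1)-(2): rank users by their largest matched filter magnitude,
    strongest first. y holds the MK matched filter outputs."""
    K = len(y) // M
    peak = [max(abs(v) for v in y[k * M:(k + 1) * M]) for k in range(K)]
    return sorted(range(K), key=lambda k: -peak[k])


def sis_detect(y, M, detect):
    """Step (3): detect users in sorted order, shrinking each decoded user's
    contribution to the interferer set to its single decided message."""
    K = len(y) // M
    active = {k: list(range(M)) for k in range(K)}   # messages still treated as possible
    decisions = {}
    for k in sis_order(y, M):
        interferers = {j: msgs for j, msgs in active.items() if j != k}
        decisions[k] = detect(k, interferers)
        active[k] = [decisions[k]]                   # keep only the decided signature
    return decisions


# Toy example: K = 3 users, M = 2 messages.
M = 2
y = [0.2, 0.9j, 2.0, 0.1, 0.5, -0.4]
suppressed_counts = []                               # number of signatures suppressed per call


def stand_in_detector(user, interferers):
    """Placeholder detector (assumption): MM rule on the raw matched filter block."""
    suppressed_counts.append(sum(len(msgs) for msgs in interferers.values()))
    block = y[user * M:(user + 1) * M]
    return max(range(M), key=lambda m: abs(block[m]))


order = sis_order(y, M)
decisions = sis_detect(y, M, stand_in_detector)
print(order, decisions, suppressed_counts)
```

The recorded counts shrink as decoding proceeds, which reflects the source of the SIS gain: users detected later face a smaller space of possible interfering signals.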

NUMERICAL RESULTS
In this section, we evaluate the performance of the proposed constrained detectors, the soft-ICs and the selective detectors, compared with the nonselective detectors proposed in [4,6]. Since the exact symbol error rate expressions are cumbersome or intractable for the detectors considered herein, we resort to simulations for performance evaluations.
In all simulations, we used complex random signatures. The signatures are linearly independent and hence the inverse of the cross-correlation matrix R exists. In all figures, the first user is assumed to be the desired user and $P_s$ represents the probability of symbol error. We also assume that the M messages of user k are received with equal power, that is, $A_{k,m} = A_k$. The SNR of user k is defined as $A_k^2 / 2\sigma^2$. In near-far scenarios, all interferers are assumed to be at the same SNR. Figure 1 shows $P_s$ versus the SNR for K = 2 users, M = 4 messages per user, and a processing gain of N = 20 for the detectors studied in Section 3. The parameters K and M were chosen to be small due to the implementation complexity of the joint detector in (7). However, we note that although the number of users is small, KM itself is a sizeable fraction of the processing gain N (experiments with larger processing gains are considered later on for the detectors proposed in this paper). Note also that the global (16) and local (18) constrained detectors perform very close to the MMSE detector (10). A similar observation has been made in [12] as well, and this may be attributed to the resemblance of the analytical solutions of constrained optimization problems to the generalized MMSE solution. Figure 2 shows $P_s$ of the desired user versus the SNR of the interferer in a near-far scenario. Since the local constrained detector performs only slightly better than the global detector, it has been omitted from this figure. Figure 3 compares the performance of the various soft-ICs proposed in this paper, (19), (20), (21), (22), (23), and (24), to the nonselective decorrelative and MMSE detectors in a near-far scenario. In all the soft-ICs, a decorrelative first stage was followed by two more stages of matched-filter-output processing.
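When reproducing such simulations, the SNR definition above fixes the noise level for a given amplitude. The helper below is a small sketch of that conversion; the function names are hypothetical, and expressing the SNR in decibels is an assumption, since the text does not state the scale used on the figure axes. How the variance $\sigma^2$ is split between the real and imaginary noise components depends on the convention behind the covariance $\sigma^2 I$ and is not addressed here.

```python
import math


def noise_std_from_snr_db(amplitude, snr_db):
    """Solve SNR = A^2 / (2 sigma^2) for sigma, with the SNR given in dB."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return amplitude / math.sqrt(2.0 * snr_linear)


def snr_db_from_noise_std(amplitude, sigma):
    """Inverse mapping: SNR in dB for a given amplitude and noise standard deviation."""
    return 10.0 * math.log10(amplitude ** 2 / (2.0 * sigma ** 2))


sigma = noise_std_from_snr_db(amplitude=1.0, snr_db=10.0)
print(sigma, snr_db_from_noise_std(1.0, sigma))
```

In a near-far experiment the noise level is typically held fixed while the interferers' amplitudes are scaled, so the same relation can be rearranged to obtain the amplitude for a target interferer SNR.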
Interestingly, the nonselective MMSE detector (10) and (11) does not converge to the decorrelator in the high interferer-power region in contrast with the performance obtained by multiuser detectors that employ linear modulation. This is a direct consequence of the fact that, in the near-far situation, the powers of the interferers are high compared to the powers associated with all possible messages of the desired user. Also, the nonselective detectors take the undesired M − 1 messages of the desired user (with relatively low powers) as well as all interferers' signals (with high powers) into account in decoding the desired user's message. Thus, unlike the decorrelator, the nonselective MMSE filter does not zero-force the contributions of the M − 1 undesired messages of the desired user, resulting in a performance improvement in near-far scenarios. Note that this issue does not arise for selective filters and the selective MMSE and decorrelative detectors do converge in the near-far situations. Figure 4a compares the performance of the selective and nonselective filters for a lightly loaded system with K = 2 users, M = 4 signals per user, and N = 20 dimensions; Figure 4b compares the selective and nonselective implementations of the decorrelator and MMSE for a fully loaded system with K = 5, M = 4, and N = 20. Note that the nonselective decorrelator and MMSE curves compare well with those of [6, Figure 3(a)]. Next, we increase both the processing gain N as well as M, the number of messages per user. Figures 5a and 5b show the relative performance of the detectors for a moderately loaded and a fully loaded system, respectively. It can be seen that the selective detectors consistently outperform the nonselective detectors at all values of SNR. Among the selective detectors, the serial soft-IC is better able to cancel interferers at higher powers, hence the crossover in Figure 5b. 
Figure 6a shows the NSE (36) of the blind adaptive selective (35) and nonselective MMSE detectors [6] averaged over 10 runs for different step sizes µ. The limiting MSE of the detector is proportional to the value of NSE to which the filter coefficients converge. The step size impacts both the rate of convergence and the limiting MSE, and the tradeoff between the two is apparent from the figure. It can be seen that a larger µ brings about faster convergence but at the cost of a higher limiting MSE. The selective detector converges to a lower value of NSE compared to its nonselective counterpart at µ = 0.0001 and vice versa at µ = 0.001. The performance of the blind selective MMSE detector is illustrated in Figure 6b and compared with that of the blind nonselective MMSE [6, Figure 7 probability curves for the blind and the nonblind implementations. Figure 7 illustrates the performance gained by using the SIS scheme with the selective decorrelator. Since the nonselective decorrelator is near-far resistant [4], the probability of symbol error for the desired user remains unchanged with interference power. The selective decorrelator also exhibits a similar behavior since it projects the received signal onto a space orthogonal to the interferers' subspace which remains unaffected by a change in the interferers' SNR. With SIS, however, the situation is different. When the interferers' SNRs are lower than the desired user's SNR, the desired user is decoded first and it does not benefit from the SIS scheme. Hence, in the low SNR regions, its performance is similar to that of the selective decorrelator without SIS. In the high SNR regime, the desired user is decoded last with very high probability and hence it benefits the most from the SIS scheme (due to a reduction in the space of possible interfering signals) yielding an improvement in symbol error rate of around two orders of magnitude over the selective decorrelator without SIS.
The SIS curve flattens out in the high SNR region because the dimensionality of the interference subspace remains unaffected by a change in the interferers' SNRs.
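The near-far behavior of the decorrelator described above can be checked with a small numerical sketch. The example is a deliberate simplification and not the detector of this paper: it assumes two users, real-valued (coherent) signals, a single signature per user, and no noise, and the signatures `s1` and `s2` are hypothetical values chosen only for illustration. It projects the desired user's signature onto the orthogonal complement of the interferer's signature and compares the resulting output with a plain matched-filter output as the interferer's amplitude grows.

```python
def dot(a, b):
    """Inner product of two equal-length real vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical, correlated signatures (illustrative values only).
s1 = [1.0, 1.0, 1.0, -1.0]   # desired user
s2 = [1.0, 1.0, 1.0, 1.0]    # interferer

def received(a_desired, a_interferer):
    """Noise-free superposition of the two users' signals."""
    return [a_desired * u + a_interferer * v for u, v in zip(s1, s2)]

def matched_filter_output(a_desired, a_interferer):
    """Correlate the received signal with the desired signature only."""
    return dot(s1, received(a_desired, a_interferer))

def decorrelator_output(a_desired, a_interferer):
    """Correlate with the part of s1 that is orthogonal to s2."""
    rho = dot(s1, s2) / dot(s2, s2)
    d = [u - rho * v for u, v in zip(s1, s2)]  # d is orthogonal to s2
    return dot(d, received(a_desired, a_interferer))

for a2 in (1.0, 10.0, 1000.0):
    print(a2, matched_filter_output(1.0, a2), decorrelator_output(1.0, a2))
```

In this toy setting the decorrelator output stays fixed while the matched-filter output is dominated by the interferer, which mirrors the flat symbol-error curves of the decorrelative detectors in Figure 7.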
For comparison, the SIC of [27] is also included in Figure 7. Due to the high correlation between the signatures of the users in our case, the multiple-access interference (MAI) residual terms for the subsequent users remain significant even after the high-energy users are canceled out. Thus, the performance of the desired user deteriorates as the SNR of the interferers is increased; when the SNR of the interferers is low, the desired user is the first to be decoded and hence it does not suffer from the excessive MAI. Note that, for some other instance of the cross-correlation matrix, the SIC may yield results that are qualitatively similar to those of the SIS scheme.
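The step-size tradeoff observed in Figure 6a is typical of stochastic-gradient adaptation in general. The sketch below is not the blind selective MMSE update (35); it is a generic scalar LMS filter, with a hypothetical target weight, noise level, and pair of step sizes, intended only to illustrate why a larger µ converges faster but settles at a higher limiting error.

```python
import random

def lms_run(mu, n_iter=20000, seed=0):
    """Run a scalar LMS filter and return (early, late) mean squared weight error."""
    rng = random.Random(seed)       # fixed seed for reproducibility
    w_true = 1.0                    # hypothetical weight to be identified
    w = 0.0                         # initial estimate
    sq_err = []
    for _ in range(n_iter):
        u = rng.gauss(0.0, 1.0)                   # input sample
        d = w_true * u + rng.gauss(0.0, 0.5)      # noisy desired response
        e = d - w * u                             # a priori error
        w += mu * e * u                           # stochastic-gradient update
        sq_err.append((w - w_true) ** 2)
    early = sum(sq_err[:200]) / 200               # transient behavior
    late = sum(sq_err[-5000:]) / 5000             # steady-state behavior
    return early, late

early_small, late_small = lms_run(0.001)
early_large, late_large = lms_run(0.05)
print("small step:", early_small, late_small)
print("large step:", early_large, late_large)
```

With these illustrative settings, the larger step size gives the smaller early error (faster convergence) and the larger late error (higher limiting error).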

CONCLUSION
We showed that judicious use of a priori knowledge of the users' selective transmission mechanism can yield improved performance over the noncoherent multiuser detectors proposed in the literature. To this end, we proposed and investigated three categories of detectors. First, a nonlinear programming approach to noncoherent multiuser detection was explored, where the structure of the multiuser signal was reflected in the various constraint sets analyzed. Using this technique, the joint detector was derived to provide a benchmark (a lower bound) for evaluating the performance of different detectors. The global and local detectors were also derived as relaxations of the ML detector but with different constraint sets. These two detectors were shown to resemble the solution to the generalized MMSE detector, as previously observed for linear modulation and coherent detection.
Second, motivated by the ability of the soft-IC-based coherent detectors to perform well in near-far scenarios, the serial, clipped, and parallel implementations of noncoherent soft-ICs were suggested and investigated. The three detectors mainly differ in the manner in which they incorporate the a priori information regarding the structure of the signal. It was observed that the serial soft-IC not only outperforms the MMSE and the decorrelative detectors in near-far scenarios but it does so in equal-received-powers situations as well.
Third, we proposed and implemented a class of detectors that employ selective filtering. Unlike their nonselective counterparts, these detectors make use of the a priori information that, of the M signals available to a user, only one is transmitted. The decorrelative, MMSE, and soft-IC selective detectors were shown to outperform their nonselective counterparts in all cases. To illustrate the feasibility of the selective detectors in scenarios where limited information regarding the interferers is available, for example, a CDMA downlink, a blind adaptive implementation of the selective MMSE detector was presented. Finally, an approach to improve the performance of the selective detectors based on decision-directed successive user suppression was presented.
Our results indicate that incorporating the information regarding the signal structure offers performance improvements. In particular, detectors employing selective filtering have excellent performance and emerge as viable solutions in a variety of system conditions.

A. DERIVATION OF THE LOCAL CONSTRAINED DETECTOR
Here, we derive the solution of the optimization problem in (17). The objective function in (17) can be expanded in terms of $y = S^H r$ as (A.1), where $T = ARA$. Since (17) involves the minimization of a convex function over a convex set, it has a unique minimum over this constraint set, which can be found using a variety of iterative algorithms, for example, the gradient descent algorithm [28]. In addition, the convex duality theorem [28] ensures that no duality gap exists, so we can solve the dual problem instead. Since (17) has $K$ constraints, there are $K$ dual variables. In terms of $T_{kj} = A_k S_k^H S_j A_j$, the Lagrangian dual function of (A.1) can be expressed as which is to be maximized over $x$ and $\lambda \geq 0$, where $\lambda = [\lambda_1 \cdots \lambda_K]$. The gradient vector associated with $\mathcal{L}(x, \lambda)$ is $\nabla \mathcal{L}(x, \lambda) = [\nabla_{x_1} \mathcal{L}(x, \lambda) \cdots \nabla_{x_K} \mathcal{L}(x, \lambda)]$ where Simple unconstrained gradient descent algorithms can be used to iteratively determine each element of $\lambda$ as follows: which converges to $\lambda_k$. The maximizer of (A.8) is given by $\lambda^* = [\lambda_1^* \cdots \lambda_K^*]$ where $\lambda_k^* = \max(0, \lambda_k)$. Then, from (A.7), the unique minimizer of (17) can be rewritten as (A.10)
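The dual approach can be illustrated with a minimal numerical sketch. Everything specific in it is an assumption made for illustration: the problem is taken to be real-valued with the form min xᵀTx − 2yᵀx subject to x_k² ≤ 1 for each user k, each user contributes a single scalar, and the values of `T` and `y` are hypothetical. The sketch also uses projected dual ascent (clipping each λ_k at zero after every step) rather than the unconstrained iteration followed by a final max(0, ·) described above.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]

def dual_ascent(T, y, step=0.2, n_iter=500):
    """Projected dual ascent for: min x'Tx - 2y'x  s.t.  x_k^2 <= 1 (assumed form)."""
    lam = [0.0, 0.0]                 # one dual variable per constraint
    x = solve_2x2(T, y)              # unconstrained minimizer as a start
    for _ in range(n_iter):
        # Minimizing the Lagrangian over x gives (T + diag(lam)) x = y.
        shifted = [[T[0][0] + lam[0], T[0][1]],
                   [T[1][0], T[1][1] + lam[1]]]
        x = solve_2x2(shifted, y)
        # Gradient of the dual function w.r.t. lam_k is x_k^2 - 1;
        # take an ascent step and project onto lam_k >= 0.
        lam = [max(0.0, l + step * (xk * xk - 1.0)) for l, xk in zip(lam, x)]
    return x, lam

# Hypothetical problem data (illustrative values only).
T = [[1.0, 0.5], [0.5, 1.0]]
y = [2.0, 0.2]
x_hat, lam_hat = dual_ascent(T, y)
print(x_hat, lam_hat)
```

For this toy instance the first constraint is active and the second is not, so the iteration settles at x ≈ [1, −0.3] with λ ≈ [1.15, 0], which can be verified by substituting into (T + diag(λ))x = y.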