Rate-optimal and reduced-complexity sequential sensing algorithms for cognitive OFDM radios

Sequential sensing algorithms are developed for OFDM-based hierarchical cognitive radio (CR) systems. Secondary users sense multiple sub-bands simultaneously for possible spectrum availabilities under hard miss-detection constraints to prevent interference to the primary users. Accounting for the fact that the sensing time overhead can often be significant, a performance metric is developed based on the effective achievable data rate. An optimization problem is formulated in the framework of optimal stopping problems to maximize the average effective data rate by determining the best time to stop taking samples for sensing, as well as the best set of channels to use for data transmission. A basis expansion-based sub-optimal algorithm is derived to reduce the prohibitive complexity of the optimal solution.


I. INTRODUCTION
The cognitive radio (CR) paradigm aims to design intelligent radios that can sense the environment and adapt the transceiver parameters as well as the resource allocation decisions in order to exploit the spectrum availability aggressively [1]. To exploit the spectrum holes opportunistically, orthogonal frequency-division multiplexing (OFDM) transceivers are often employed at the physical layer of the CR due to their flexibility in handling a wide range of spectrum [2].
A key component of CR transceivers is the sensing module, which monitors the spectrum occupancy of the primary users (PUs) in real time. Since the PUs must remain oblivious to the presence of CR links, hard miss-detection constraints need to be imposed on the design of the detector in the sensing module. However, this inevitably leads to increased sensing time, which, in turn, leaves less time for the actual data transmission before the PUs return. Thus, it is important to factor in the sensing overhead in the design of the sensing module, especially in scenarios where the PU occupancy changes dynamically in time as well as in frequency. The sensing task itself is often challenging. First of all, as the licensees may have invested heavily in the spectrum to run commercial services in their bands, it is likely that very strict interference conditions are placed on the CRs. For example, in IEEE 802.22, the sensing threshold for the digital TV (DTV) signal is set as low as −116 dBm over 6 MHz bandwidth [3]. Secondly, in some situations, coherent detectors such as matched filters cannot be used, either because the required a priori knowledge of the PU signal characteristics is not available or simply because hardware complexity is to be kept minimal. Thus, the sensing module often needs to employ inferior non-coherent detectors such as the energy detector [4] or feature-based detectors [5].
This work was supported by NSF grants CCF 0830480 and CON 014658; and also through collaborative participation in the Communications and Networks Consortium sponsored by the U.S. Army Research Laboratory under the Collaborative Technology Alliance Program, Cooperative Agreement DAAD19-01-2-0011. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.
The wide-band sensing problem for OFDM-based CRs was considered in [6], where the detection thresholds for a bank of energy detectors were optimized jointly to maximize throughput performance. A cooperative sensing strategy was also pursued in [7], where a linear-quadratic fusion rule was developed based on the deflection criterion to process correlated observations. These developments, however, assume batch (or fixed sample size (FSS)) detection strategies, where the number of samples collected for detection is a predetermined design parameter, which does not depend on the actual values of the received samples.
Sequential detection schemes, on the other hand, exploit the fact that the number of samples required to achieve a given reliability level may well depend on the actual realization of the observed samples. For example, in a simple binary hypothesis testing context, Wald's sequential probability ratio test (SPRT) compares the likelihood ratio with two thresholds, and the decision is made as soon as the test statistic crosses either of the thresholds. It is known that the SPRT minimizes the average sample number (ASN) among all tests with the same false alarm and miss detection probabilities [8, p. 21]. However, it is not clear how to apply the SPRT approach to the wide-band sensing problem, where a bank of detectors needs to be run simultaneously. Moreover, the relevant optimization criterion in this case might not be as simple as the ASN. In [9], two layers of SPRTs were employed at individual CRs and the fusion center (FC) to reduce the overall detection delay for a single-channel sensing problem. However, no claims on optimality were provided.
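As an illustration of the two-threshold mechanism described above, the following is a minimal sketch of Wald's SPRT for a single binary test. The function name `sprt`, its arguments, and the use of Wald's approximate thresholds are assumptions made for this example; it is not the wide-band scheme developed in this paper.

```python
import math

def sprt(samples, llr, alpha, beta):
    """Sketch of Wald's SPRT for H0 versus H1 (illustrative only).

    samples: iterable of observations
    llr:     function returning log(p1(y) / p0(y)) for one observation
    alpha:   target false-alarm probability
    beta:    target miss-detection probability
    Returns (decision, n), where decision is "H1", "H0", or None if the
    samples run out before either threshold is crossed.
    """
    upper = math.log((1.0 - beta) / alpha)   # Wald's approximate thresholds
    lower = math.log(beta / (1.0 - alpha))
    stat = 0.0
    n = 0
    for n, y in enumerate(samples, start=1):
        stat += llr(y)            # accumulate the log-likelihood ratio
        if stat >= upper:
            return "H1", n
        if stat <= lower:
            return "H0", n
    return None, n

# Hypothetical usage: unit-variance Gaussian with mean 0 (H0) or 1 (H1),
# for which the per-sample log-likelihood ratio is y - 0.5.
decision, used = sprt([1.0] * 50, lambda y: y - 0.5, alpha=0.01, beta=0.01)
```

The number of samples consumed depends on the realization, which is the property the sequential schemes in this work exploit.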
In this work, rate-optimal wide-band sequential sensing algorithms are developed in the framework of the optimal stopping problem. The optimal stopping problem is formulated as determining the time to stop taking sequential observations, as well as identifying the best set of channels to use for data transmission, so as to maximize the expected payoff based on the accumulated observations [10]. The payoff in our CR sensing problem is defined as the total rate achieved by using all the available subchannels, where the availability is determined under hard interference constraints. The sensing overhead is captured by explicitly accounting for the sensing time, which consumes a portion of the frame duration.

A. Signal Model
Consider a CR that shares M orthogonal bands opportunistically with PUs in its network. In order not to interfere with the on-going transmissions of the PUs, the CR must identify the bands that are not occupied by the PUs before transmitting its own data.
The n-th received signal sample at the CR on band m ∈ {1, 2, . . . , M}, when a PU is transmitting on that band, can be modeled by where {h As in e.g., [4], each CR receiver relies on energy detection to decide the occupancy of each band. Thus, the observation at each time step n is defined as y where $I_0(\cdot)$ denotes the zeroth-order modified Bessel function of the first kind, and $\mathbb{1}_{\{\cdot\}}$ the indicator function that yields 1 if the condition {·} is true and 0 otherwise.
In the context of spectrum sensing for CRs, the miss detection probabilities are the probabilities of failing to detect the presence of the PUs, which could cause interference to the PU transmission. For this reason, the sensing algorithms must be designed to guarantee very small miss detection probabilities. On the other hand, small false alarm probabilities are desired to increase the usage of the available channels. Both goals can be accomplished by increasing the sample size n. However, increasing the sample size leads to larger sensing overhead, which effectively reduces the time left for actual data transmission. In the next subsection, a sequential sensing problem is formulated to optimize the overall transmission rate by taking into account the overhead due to sensing.

B. Optimal Stopping Problem

1) Average Throughput Criterion:
The sequential detection problem can be formulated in the framework of optimal stopping problems [12, Sec. 4.4], [10]. Define y n [y If the CR stops sensing at time n and proceeds to data transmission on the bands that are sensed idle, the overall throughput is given by where $H$ is an $M \times 1$ random vector with the m-th entry taking values from $\{H_0^{(m)}, H_1^{(m)}\}$ to describe the spectrum occupancy of the M bands; $T$ is the frame duration with $T \ge N\tau$; $\tau$ the sampling interval; and $R^{(m)}$ the rate that can be achieved in band m.
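The throughput expression itself did not survive in the text above. The sketch below shows one form that is consistent with the quantities just defined, and should be read as an assumption rather than the paper's exact formula: rate $R^{(m)}$ is earned only on bands that are both accessed and truly idle, scaled by the fraction $(T - n\tau)/T$ of the frame left after sensing. The function and argument names are hypothetical.

```python
def effective_throughput(idle, access, rates, n, T, tau):
    """Assumed per-frame throughput when sensing stops after n samples.

    idle:   list of booleans, True if band m is truly unoccupied
    access: list of 0/1 access decisions for each band
    rates:  list of achievable rates R^(m)
    n:      number of sensing samples taken before stopping
    T:      frame duration
    tau:    sampling interval
    """
    if not (len(idle) == len(access) == len(rates)):
        raise ValueError("idle, access and rates must have the same length")
    remaining = (T - n * tau) / T    # fraction of the frame left for data
    # Assumption: only bands that are accessed AND idle contribute rate.
    gained = sum(r for h, d, r in zip(idle, access, rates) if h and d)
    return remaining * gained

# Hypothetical usage: three bands, stop after 20 of 100 sampling intervals.
value = effective_throughput([True, False, True], [1, 1, 0], [1, 2, 3],
                             n=20, T=100, tau=1)
```

In this example band 1 is accessed and idle, band 2 is accessed but occupied (a collision that yields no rate), and band 3 is idle but not accessed.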
It is noted that the per-step reward $f_n(\cdot)$ is a function of the underlying true spectrum occupancy $H$, which is not directly observable. It can be shown that the optimal stopping problem based on the per-step reward $f_n(\cdot)$ is equivalent to the one based on the per-step reward $E\{f_n(H, \delta_n) \mid y_1, \ldots, y_n\} \triangleq f_n(\pi_n, \delta_n)$, $n = 1, 2, \ldots, N$ [12, Sec. 5.1]. Clearly, the latter can be expressed as ] T denotes the belief vector with the entries The m-th element of the belief vector evolves according to the Bayes rule as Then the evolution of the belief vector can be compactly expressed as $\pi_{n+1} = \Phi(\pi_n, y_{n+1})$.
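The Bayes-rule recursion for the belief vector can be sketched as follows. Since the definition of the belief entries is not legible above, the sketch assumes that each entry is the posterior probability that the band is idle, that bands are updated independently, and that the observation densities under the idle and occupied hypotheses are supplied by the caller. All names are hypothetical.

```python
import math

def update_belief(pi, y, p_idle, p_busy):
    """One Bayes update of the (assumed) idle probability of a single band.

    pi:     current probability that the band is idle
    y:      new energy observation on the band
    p_idle: density of y when the band is idle
    p_busy: density of y when the band is occupied
    """
    num = pi * p_idle(y)
    den = num + (1.0 - pi) * p_busy(y)
    if den == 0.0:
        return pi                 # observation carries no information
    return num / den

def update_belief_vector(pis, ys, p_idles, p_busys):
    """Apply the per-band update to all bands (independence assumed)."""
    return [update_belief(p, y, f0, f1)
            for p, y, f0, f1 in zip(pis, ys, p_idles, p_busys)]

# Hypothetical densities: exponential energy with mean 1 (idle) or 2 (busy).
f0 = lambda y: math.exp(-y)
f1 = lambda y: 0.5 * math.exp(-0.5 * y)
posterior = update_belief(0.7, 0.0, f0, f1)
```

A small observed energy raises the idle belief, while a large one lowers it, because the occupied-band density places more mass on large energies.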
The goal is to obtain the average throughput-optimal stopping policy that determines whether to stop or not at each time $n \in \{1, 2, \ldots, N-1\}$, and the access policy that determines whether or not to access each band $m$ upon stopping at time $n \in \{1, 2, \ldots, N\}$, given the observations $Y_n \triangleq [y_1, \ldots, y_n]$. To be precise, let $u_n \in \{\text{stop}, \text{continue}\}$ be the stopping control at time $n$. With the introduction of an auxiliary state variable $x_n \in \{S, \bar{S}\}$, where $x_n = S$ indicates that the system is in the "stop" state and $x_n = \bar{S}$ that it is in the "non-stop" state, the system evolution is characterized by
$$x_{n+1} = \begin{cases} S, & \text{if } x_n = S, \text{ or } x_n = \bar{S} \text{ and } u_n = \text{stop} \\ \bar{S}, & \text{otherwise} \end{cases} \quad \text{for } n = 1, 2, \ldots, N-1 \qquad (10)$$
with $x_1 = \bar{S}$. The reward function at the final stage $N$ is given by
$$\hat{f}_N(\pi_N, x_N, \delta_N) = f_N(\pi_N, \delta_N)\, \mathbb{1}_{\{x_N = \bar{S}\}}$$
and at stage $n \in \{1, 2, \ldots, N-1\}$ by
$$\hat{f}_n(\pi_n, x_n, u_n, \delta_n) = f_n(\pi_n, \delta_n)\, \mathbb{1}_{\{x_n = \bar{S},\, u_n = \text{stop}\}}.$$
The problem of interest is to find the control policies $u_n(\pi_n)$ and $\delta_n(\pi_n)$ such that the average reward given by
$$E\Big\{ \hat{f}_N(\pi_N, x_N, \delta_N) + \sum_{n=1}^{N-1} \hat{f}_n(\pi_n, x_n, u_n, \delta_n) \Big\}$$
is maximized over the finite horizon comprising $N$ sampling intervals.

2) Constrained Dynamic Programming Formulation:
The CR access policy has to ensure low "collision" probability, which is the probability that the CR interferes with the PU transmission due to miss detection of spectrum occupancy. The "collision" probability P In the following, a constrained dynamic programming (DP) formulation is described along with a solution approach based on the Lagrange relaxation technique. First, note that the collision probability on band m given in (14) can be re-written as follows.
The optimal access decision at stage $n \in \{1, 2, \ldots, N\}$ is obtained by $\delta_n^*(\pi_n) = \arg\max_{\delta_n \in \{1,0\}^M} g_n(\pi_n, \delta_n; \lambda)$, which is equivalent to Note that the access decision of each band is based on a likelihood ratio test whose threshold depends on the Lagrange multiplier associated with the band. It is also noted from (34) that the access policy becomes less aggressive as $n$ grows, since the effective rate $\frac{T - n\tau}{T} R^{(m)}$ diminishes and hence so does the incentive to access the channel. Let us define $g_n^*(\pi_n; \lambda) \triangleq g_n(\pi_n, \delta_n^*(\pi_n); \lambda)$, which is given by The optimal stopping rule at stage $n \in \{1, 2, \ldots, N-1\}$ is thus obtained as
$$u_n^*(\pi_n) = \begin{cases} \text{stop}, & \text{if } V_n(\pi_n; \lambda) = g_n^*(\pi_n; \lambda) \\ \text{continue}, & \text{otherwise.} \end{cases} \qquad (36)$$
Since the observation space $(\mathbb{R}_+)^M$ and the state space $[0, 1]^M \times (\mathbb{R}_+)^M$ are infinite, the optimal backward induction must be implemented approximately via discretization at each step [15]. Thus, if one uses a given number of grid points for each channel $m$, the complexity of the discretized algorithm grows exponentially in the number of subchannels $M$. Therefore, even with the reduced-dimension sufficient statistic, the implementation of the optimal backward induction can be prohibitively complex even for a moderate number of OFDM subchannels.
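To give a feel for the exponential growth noted above, the short sketch below counts the grid points needed when the belief of each channel is discretized independently. The choice of 20 grid points per channel is purely hypothetical; the count covers the belief space only and ignores the additional observation dimensions.

```python
def grid_size(points_per_channel, num_channels):
    """Number of joint grid points when every channel uses the same grid."""
    return points_per_channel ** num_channels

# Hypothetical resolution of 20 points per channel.
for M in (2, 5, 10):
    print(M, grid_size(20, M))
```

With M = 10 channels, as in the scenario of the numerical section, this already amounts to about 10^13 grid points per stage.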

A. Basis Expansion-Based Solution
To alleviate the "curse of dimensionality" associated with the optimal solution, sub-optimal policies that approximate the optimal policy closely with reduced complexity are desired. Here, a regression-based method that has been applied to problems in quantitative finance [16], [17], [18] is adapted to the novel sequential CR sensing scenario. Let us define
$$\tilde{V}_n(\pi_n) \triangleq E\{V_{n+1}(\Phi(\pi_n, y_{n+1})) \mid Y_n\}, \quad n = 1, \ldots, N-1. \qquad (37)$$
The idea is to approximate $\tilde{V}_n(\pi_n)$ by $\hat{V}_n(\pi_n) \triangleq \sum_{k=1}^{K} a_{n,k}\, \phi_{n,k}(\pi_n)$, where $\phi_{n,k}(\pi_n)$, $n = 1, \ldots, N-1$, $k = 1, \ldots, K$, are a set of basis functions and $a_n \triangleq [a_{n,1}\ a_{n,2}\ \ldots\ a_{n,K}]^T$ is the coefficient vector. To facilitate numerical computation, a finite set of sample trajectories of the state vector is simulated, and the coefficients $a_n$ are obtained via least-squares regression of the resulting sample paths of the V-values [17].
Specifically, one first generates $J$ independent sample paths $\{y_n[j],\ n = 1, 2, \ldots, N\}$, $j = 1, 2, \ldots, J$, by sampling according to }] T . Given the initial $a_{N-1}$, the regression coefficients $a_n$ for $n = N-2, N-3, \ldots, 1$ can be obtained recursively by a n = arg min and the initial $a_{N-1}$ is computed by where the dependence of $g_n^*(\cdot)$ on $\lambda$ has been suppressed for simplicity of notation.
Once the regression coefficients are computed, the resulting stopping rule at step $n \in \{1, 2, \ldots, N-1\}$ is to stop if $g_n^*(\pi_n) > \hat{V}_n(\pi_n)$, and to continue otherwise.
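The backward regression can be sketched in pure Python as follows. Because the regression formulas are not legible above, the sketch makes its own assumptions: the regression target at stage n is the next-stage value, taken as the larger of the stopping payoff and the fitted continuation value, and the same basis is used at every stage. This is one common variant of regression-based optimal stopping and may differ in detail from the recursion used in this paper. The state here is a generic object, and all names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]    # fails if the basis is degenerate
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def least_squares(features, targets):
    """Least-squares coefficients via the normal equations."""
    K = len(features[0])
    A = [[sum(f[i] * f[j] for f in features) for j in range(K)]
         for i in range(K)]
    b = [sum(f[i] * t for f, t in zip(features, targets)) for i in range(K)]
    return solve_linear(A, b)

def fit_continuation(paths, payoff, basis):
    """Fit continuation-value coefficients backward in time.

    paths[j][n]: simulated state on path j at stage n (0-based)
    payoff(n, s): reward for stopping at stage n in state s
    basis(s):     list of K basis-function values for state s
    Returns coeffs[n] for stages n = 0, ..., N-2.
    """
    N = len(paths[0])
    coeffs = [None] * (N - 1)
    values = [payoff(N - 1, p[N - 1]) for p in paths]   # forced stop at the end
    for n in range(N - 2, -1, -1):
        feats = [basis(p[n]) for p in paths]
        coeffs[n] = least_squares(feats, values)
        cont = [sum(a * f for a, f in zip(coeffs[n], ft)) for ft in feats]
        values = [max(payoff(n, p[n]), c) for p, c in zip(paths, cont)]
    return coeffs

def should_stop(payoff_now, coeffs_n, feats):
    """Stop if the immediate payoff exceeds the fitted continuation value."""
    return payoff_now > sum(a * f for a, f in zip(coeffs_n, feats))
```

The cost of fitting is linear in the number of paths J and stages N and polynomial in the number of basis functions K, which is what removes the exponential dependence on M of the gridded backward induction.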

B. One-Step Look-Ahead
The k-step look-ahead policy decides to stop or continue based on the optimal rules truncated at k steps ahead of the current time [12, Sec. 6.3]. In the case of k = 1, the one-step look-ahead rule decides to stop if $g_n^*(\pi_n) > E\{g_{n+1}^*(\Phi(\pi_n, y_{n+1})) \mid Y_n\}$ and to continue otherwise at stage $n \in \{1, 2, \ldots, N-1\}$.
Since the likelihood ratioΓ (m) (y) is monotone in y, the r.h.s. of (49) is given by where F (·|H . (51)
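The one-step look-ahead comparison can also be sketched numerically. The version below replaces the closed-form expectation referred to above with a plain Monte Carlo average, which is a substitution made for illustration only. The callables `g_star`, `transition`, and `sample_obs` are hypothetical placeholders for $g_n^*$, $\Phi$, and a sampler of $y_{n+1}$ given the current belief.

```python
import random

def one_step_lookahead_stop(pi, n, g_star, transition, sample_obs,
                            num_draws=1000, rng=None):
    """Return True if stopping now beats the estimated value of one more sample.

    pi:         current belief (any object understood by the callables)
    n:          current stage
    g_star:     g_star(n, pi) -> payoff of stopping at stage n
    transition: transition(pi, y) -> next belief after observing y
    sample_obs: sample_obs(pi, rng) -> draw of the next observation
    """
    rng = rng or random.Random(0)
    stop_now = g_star(n, pi)
    total = 0.0
    for _ in range(num_draws):
        y = sample_obs(pi, rng)                      # simulate next observation
        total += g_star(n + 1, transition(pi, y))    # payoff if stopping next
    return stop_now > total / num_draws
```

Each decision costs `num_draws` evaluations of the stopping payoff, so the closed-form expression is preferable when it is available.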

C. Optimized Fixed Sample Size (FSS) Test
To compare the performance of the sequential regression-based sensing policy to that of the batch scheme, the FSS test is designed and optimized in the following way. The test statistic t where $\bar{\gamma}_n^{(m)}$ is the detection threshold. Invoking the central limit theorem, one can show that for large enough $n$ the probability of false alarm $\alpha_n^{(m)}$ for band $m$ with the sample size $n$ is given by [6] where $Q(\cdot)$ denotes the Gaussian tail function. The detection threshold $\bar{\gamma}_n^{(m)}$ is determined by constraining the miss detection probability to be at most $\bar{\beta}$; i.e., Pr{t } ≤ $\bar{\beta}$, and is given by Then, the average throughput due to the FSS test with the sample size $n$ can be computed as

V. NUMERICAL RESULTS
Scenarios with M = 10, N = T = 100, and τ = 1 were tested. Rates $R^{(m)} = m$ for m = 1, . . . , M were used, and the channel gains $\{G^{(m)}\}$ were generated from the $\chi^2$-distribution. The observation noise variance $\sigma^2$ was set to $10^{-6}$. The prior probabilities were set to $\Pr\{H_0^{(m)}\} = 0.7$ for all m, and the miss detection probability constraint $\bar{\beta} = 10^{-2}$ was used. To estimate the coefficients $a_n$ involved in the regression-based sub-optimal policy, J = 1,000 sample paths were generated. The left panel of Fig. 2 plots the achieved average total rates of the proposed regression-based scheme when the mean SNR of the PU-to-CR channels is varied. For comparison, the average throughputs of the genie-aided and the one-step look-ahead schemes are also shown. The genie-aided scheme chooses the stopping time n that maximizes $g_n^*(\pi_n; \lambda)$ over all n ∈ {1, 2, . . . , N}, assuming non-causal knowledge of $\{\pi_n\}$, and thus can be used to obtain performance upper bounds. Averaging was performed over 20,000 Monte Carlo realizations. It can be seen that the regression-based scheme attains throughput close to the genie-aided upper bound over a wide range of SNR values. The one-step look-ahead scheme is clearly sub-optimal, especially in the moderate-to-high SNR range. Also, it can be seen that the proposed regression-based sequential sensing scheme outperforms the optimized FSS test over all SNR levels tested.
The right panel of Fig. 2 plots the ratio of the average throughput of the various schemes to that of the genie-aided scheme. It is seen that the regression-based policy achieves a significant portion of the genie-aided throughput, whereas the optimized FSS test degrades as the SNR decreases. In fact, at SNR as low as −12 dB, even the one-step look-ahead policy outperforms the optimized FSS test, which corroborates the value of sequential CR sensing.

VI. CONCLUSIONS
Sequential sensing algorithms for OFDM-based wide-band CR systems have been developed. The trade-off between the sensing time and the chance of identifying more unoccupied sub-channels was captured in the effective rate achieved by the CR system. Optimal stopping problems were formulated to maximize the effective rate given the past and current observations. A basis expansion-based reduced-complexity solution was derived, whose performance was shown to be close to the genie-aided upper bounds, and hence close to that of the optimal solution.