Noise reduction for radio map crowdsourcing building in WLAN indoor localization system
EURASIP Journal on Advances in Signal Processing volume 2021, Article number: 40 (2021)
Abstract
Recent years have witnessed growing interest in WLAN fingerprint-based indoor localization because of its cost-effectiveness and availability compared with other localization systems. To deploy a WLAN indoor localization system rapidly, crowdsourcing is used in place of the traditional deployment method. In this paper, we propose a fast radio map building method that uses the sensors inside the mobile device and the multidimensional scaling (MDS) method. The crowdsourcing method collects RSS and sensor data while the user walks along a straight line and computes the position information from the sensor data. To reduce the noise in the location space of the radio map, the short-time Fourier transform (STFT) is used to detect usage mode switching and thereby improve the step determination accuracy. When building a radio map, the crowdsourcing method needs far fewer RSS values than conventional methods, which gives noise and erroneous measurements in the RSS values greater influence. Accordingly, an imprecise radio map is built from these imprecise RSS values. To obtain a smoother radio map and improve the localization accuracy, the MDS method is used to infer an optimal RSS value at each location by exploiting the correlation of RSS values at nearby locations. Experimental results show that the expected goal is achieved by the proposed method.
Introduction
Mobile devices play an increasingly important role in daily life with the advent of location-based services (LBS). In general, LBS depend heavily on localization accuracy to provide various services, such as convenient smart city services and ubiquitous Internet-of-Things implementations [1–3]. After years of research and development, wireless local area networks (WLANs) are widely deployed in indoor environments. The pervasiveness of WLAN provides a good opportunity to estimate the mobile device position for indoor LBS. The common WLAN indoor localization approach makes use of access points (APs) based on the received signal strength (RSS) [4]. RSS is used for position estimation both because APs are widely available in indoor environments and because mobile devices are able to measure and report the RSS from the detected APs.
Generally, fingerprinting, first proposed in [5], is the most popular method for WLAN indoor localization. Because of its high positioning accuracy in indoor multipath environments, it has attracted considerable research attention. Typically, the fingerprint method works in two phases: an offline phase and an online phase [5]. In the offline phase, RSS values are sampled in the WLAN indoor environment, and a radio map is then built by labeling these RSS values with their sampling locations. Each position at which RSS values are recorded is called a reference point (RP). Hence, the radio map contains many fingerprints, each composed of an RP location and its associated RSS values. In the online phase, a query RSS value is reported and compared with the RSS values in the radio map to estimate the sampling location among all RPs [6, 7]. The radio map is therefore the key part of a WLAN localization system, because it builds the bridge between the signal and the location.
To obtain high localization accuracy, it is common practice to enrich the radio map with a large number of RPs. However, recording RSS values at many RPs is labor-intensive and time-consuming. Obtaining good positioning accuracy while reducing the RSS sampling workload is thus a challenging task in the offline phase. Over the last decade, many algorithms have been proposed for enriching the radio map when only a few RPs are available, such as graph-based semi-supervised learning (GSSL) [8] and compressive sensing (CS) [9]. But these approaches still require at least some effort for RSS sampling. In recent years, the crowdsourcing approach has attracted wide attention because it constructs the radio map with almost zero effort [10–12]. In this approach, the radio map is constructed in the background while mobile devices roam the indoor WLAN environment. Crowdsourcing data are collected automatically from each participating mobile device without interfering with its normal usage. The availability of crowdsourcing data for both signal and location greatly reduces the burden of radio map building in the offline phase and makes the WLAN indoor localization system more practical.
Usually, crowdsourcing data are collected from participating individuals while they carry their mobile devices during daily activities. The mobile device collects two kinds of data for radio map building in the background. For the signal space, RSS values can be recorded directly by the WLAN adapter. For the location space, inertial sensors, such as the accelerometer, gyroscope, and magnetometer, are the most popular devices for recording the RP location. The accelerometer is used to determine each individual step, whose location can be considered as the RP position. Once each step is determined, the total displacement can be calculated by multiplying the average stride length by the step count. The gyroscope and magnetometer are used to find the moving direction. Usually, changes in movement direction are assumed to occur only at step locations, which means that the trajectory between any two adjacent steps is a straight line and the entire trajectory is a polyline. With the help of these inertial sensors, an individual trajectory in the indoor environment can be determined using the pedestrian dead reckoning (PDR) method [13, 14]. However, because no absolute reference is available, each step in the trajectory is a relative location with respect to the beginning or end of the trajectory. The real positions in the floor plan are still unknown. The trajectory must therefore be carefully matched with the floor plan to convert the relative step locations into real locations [15]. When all real step locations are determined, they can be labeled with the associated RSS values to build the radio map in a crowdsourcing way [16]. As more and more mobile devices provide crowdsourcing data in both signal space and location space, the radio map becomes too large to be practical. Hence, the indoor environment is divided into small grids, and all the RPs within the same grid are merged. The RP location is set to the grid center, and the RP RSS value is the average of the merged values. The grid size is decided by both the localization accuracy requirement and the radio map storage limitation.
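The dead-reckoning and grid-merging ideas described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pdr_track` and `merge_into_grid`, the fixed stride length, and the square grid indexed from the origin are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict

def pdr_track(start, headings_rad, stride_m):
    """Dead-reckon one position per detected step from a known start point."""
    x, y = start
    track = []
    for theta in headings_rad:
        # The heading is assumed constant between two adjacent steps,
        # so the trajectory is a polyline.
        x += stride_m * math.cos(theta)
        y += stride_m * math.sin(theta)
        track.append((x, y))
    return track

def merge_into_grid(samples, cell_m):
    """Merge (position, rss_vector) samples that fall in the same grid cell.

    Returns a dict mapping each grid center to the per-AP average RSS.
    """
    cells = defaultdict(list)
    for (x, y), rss in samples:
        key = (math.floor(x / cell_m), math.floor(y / cell_m))
        cells[key].append(rss)
    radio_map = {}
    for (cx, cy), rss_list in cells.items():
        center = ((cx + 0.5) * cell_m, (cy + 0.5) * cell_m)
        # Average each AP's RSS over all samples in the cell.
        radio_map[center] = [sum(col) / len(col) for col in zip(*rss_list)]
    return radio_map
```

A smaller `cell_m` keeps more spatial detail at the cost of a larger radio map, which mirrors the trade-off between accuracy and storage noted above.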
In summary, the RP location and RSS value are so important that they directly affect the localization performance of the radio map. Unfortunately, several problems remain in crowdsourced radio map building. First, the PDR method suffers from serious step determination errors because the mobile device is frequently switched between different usage modes, such as being held in the hand or put in a pocket. Little of the literature addresses how to determine steps while the mobile device is switching usage modes. In fact, the acceleration during usage mode switching is very different from the acceleration when the mobile device stays in a given usage mode. If usage mode switching is not detected, false step determination occurs and leads to poor trajectory estimation. This seriously affects the accuracy of converting relative step locations into real ones, which in turn degrades the radio map in location space. Second, crowdsourcing mobile devices may differ in hardware design, so they report different RSS values even when sampled at the same place. If the WLAN adapters do not share the same benchmark for the RSS space, the RSS samples introduce serious noise into the radio map in the offline phase. It also makes the RSS values incomparable between the online phase and the offline phase. Fortunately, this problem has been solved in our previous work [17, 18]. Third, there may be only one RSS sample at each step. A single sample cannot describe the normal signal environment well, because of individual usage habits and signal propagation in the indoor multipath environment. The latter two problems inevitably degrade the RSS space of the radio map.
Therefore, the main goal of this paper is to improve the quality of crowdsourcing data in both location space and signal space. Theoretically, a radio map is a matrix whose element values are obtained from the crowdsourcing data. All the adverse factors can be considered as noise sources that make the radio map inaccurate for fingerprint localization. We therefore propose corresponding methods to reduce the noise arising from the aforementioned problems in crowdsourced radio map building.
The main contributions of this paper are as follows:

1. Detect the usage mode switching to improve the step determination accuracy. We propose to use the short-time Fourier transform (STFT) to accurately find the occurrence of usage mode switching in both the time and frequency domains, which describes the step well while the mobile device is switching from one usage mode to another. A dynamic threshold calculated from the previous peak-valley difference is then employed to achieve better step determination, which further reduces the noise in the location space of the radio map.

2. Smoothen the RSS values so that they better describe the RSS in a real indoor environment. We propose to use the multidimensional scaling (MDS) method to infer an optimal RSS value at each step location by exploiting the correlation of RSS values at nearby locations. Compared with conventional methods, the proposed method requires fewer RSS samples for crowdsourced radio map building, and it further reduces the noise in the signal space of the radio map.
The rest of the paper is organized as follows. Related works are discussed in Section 2. In Section 3, we formulate the problem of WLAN indoor localization. The step determination method for the radio map location space and the radio map smoothing method in signal space are analyzed in Section 4. The simulation and experiment results and discussions are presented in Section 5. Finally, Section 6 concludes the paper.
Background and related works
To reduce the collection workload in the offline phase, various methods have been proposed over the last decade to realize rapid deployment of WLAN indoor localization systems. In recent years, owing to the pervasive use of inertial sensors in mobile devices, motion-assisted WLAN localization has received extensive attention and made quick progress. Using the sensor data collected by accelerometers, gyroscopes, and magnetometers, the walking direction, distance, and gesture information can be captured [19]. Accelerometers measure the 3D linear acceleration (m/s²) or g-force (g) of the device. Gyroscopes give the angular velocity (rad/s) and measure orientation based on the principle of angular momentum. Magnetometers provide the strength and direction of magnetic fields [20].
For pedestrian localization, the pedometer is a common way to calculate the user’s movement distance. To calculate the movement distance accurately, a number of step counting algorithms have been proposed. Peak detection and zero crossing of acceleration readings are simple and effective algorithms [21]. In more advanced methods, more accurate step counting results can be obtained by further mining pattern information [22, 23]. Through walk detection and step counting, the mobile device can measure the walking distance by multiplying the stride length by the step count. The stride length depends on the step frequency, user height, and other factors. Many methods, such as [24–26], have been proposed to find the relationship between step length and the step frequency, user height, and other factors.
With motion-assisted technology, crowdsourcing-based rapid deployment is widely applied in indoor WLAN localization systems [27]. During the offline phase, RSS values and sensor data are collected simultaneously while the user is walking. The distance between two RSS records can then be calculated from the sensor data. The novelty is that the site survey can be conducted transparently even while the user is going about routine business and walking in the office [28, 29]. Hence, there is no need for professional surveyors to conduct dense fingerprinting. In [30], an automatic radio map construction method based on crowdsourced PDR traces is proposed; the PDR traces are used to generate indoor road paths, and the radio map is constructed by combining them with the WiFi fingerprints. Zou et al. proposed an adversarial learning-enabled automatic WiFi indoor radio map construction and adaptation method with a mobile robot, in which LiDAR SLAM (Simultaneous Localization and Mapping) and Generative Adversarial Networks (GAN) are used to construct the spatial map and the radio map simultaneously [31]. In [32], a graph-based SLAM is used to provide the dead reckoning data from multiple users; the users’ trajectories can then be aligned and a crowdsourced WiFi-based radio map can be established for the WLAN indoor localization service.
The major concern in offline survey reduction is how to balance localization accuracy against survey cost. RSS measurements collected by users moving in the environment are potentially more erroneous than those collected by experts at the exact locations of reference points. Therefore, to filter the noise or errors, post-processing methods such as the Sliding Correlation Time Window (SCTW) [33], the particle filter [34], and the path-loss model [35] have been proposed. Owing to the low-rank characteristics of the radio map, a variety of low-rank matrix completion methods have been proposed to reduce radio map noise. In [36], the Inexact Augmented Lagrange Multiplier (IALM) algorithm is proposed to precisely recover the missing RSS values in the radio map in the offline phase. By solving a nuclear norm minimization problem, the IALM algorithm not only recovers the missing received signal strength but also reduces the noise effectively. In [37], a radio map noise reduction method using the Hankel matrix is proposed to separate the noise from the signal. An empirical model of RSSI is proposed in [38] to reconstruct the radio map in both the geometric space and the signal space. In [39], a radio map update method based on sparse representation and low-rank matrix recovery is proposed to handle missing fingerprints and sparse noise, so that the radio map can be updated quickly and accurately.
Problem formulation
Step determination problem
When building a radio map, we first need the user’s real-time location information. Using the PDR algorithm and the data from the smartphone’s built-in sensors, we can estimate the user’s walking steps, step size, and movement direction, and hence calculate the user’s real-time position. The user’s steps are mainly calculated from the acceleration data. The acceleration of human walking is nearly sinusoidal; that is, there is a peak acceleration and a valley acceleration in each step. Backward or side steps are generally not considered in the step counting model. Current research usually divides the mobile phone usage mode into four types: Pocketing (P), Swing (S), Texting (T), and Calling (C). Figure 1 shows the three-axis acceleration curves corresponding to the four usage modes when simulating the user’s walking.
As can be seen from Fig. 1, the characteristics of the three-axis acceleration differ between mobile phone placement modes. Therefore, some researchers train on the characteristics of acceleration samples to judge the placement mode of the mobile phone for direction estimation. However, considering the diversity of people and of mobile phone placement patterns, pattern recognition that relies on a large number of samples is not widely applicable. Because the acceleration on all three axes contains walking information, the modulus of the three-axis acceleration is usually used for step counting:
$$ a = \sqrt{a_{x}^{2} + a_{y}^{2} + a_{z}^{2}}, $$
where a_{x}, a_{y}, and a_{z} are the acceleration readings on the three axes.
At present, the commonly used step detection methods are based on the characteristics of human walking acceleration, such as peak-valley detection, zero crossing detection, and autocorrelation detection. Taking valley detection as an example, the blue circles in Fig. 2 are the step counting results for a given set of acceleration samples. As can be seen from Fig. 2, the collected acceleration data fluctuate violently because of noise, which causes wrong step detections. To eliminate the interference caused by noise, researchers often add further detection criteria, such as the time interval, peak-valley difference, and acceleration slope. The red boxes in Fig. 2 show the step counting results after adding these criteria. However, the user may switch the mobile phone placement mode during walking, and the change in peak-valley difference before and after the switch may cause steps to be missed. Therefore, a step counting algorithm that takes the usage mode switching state into account is proposed in this paper.
RSS fluctuation problem
For the typical WLAN indoor positioning system, a particular mobile device with a WLAN adapter is used to record the RSS values. In the offline phase, the radio map is first constructed. Suppose there are m APs and n RPs, so that there are n fingerprints in the radio map. For the ith fingerprint (S_{i},R_{i}), the RP location is S_{i}=(x_{i},y_{i}), and the RSS value \(\mathbf {R}_{i}\in \mathbb {R}^{1\times m}\) is the average of several RSS samples. The signal space \(\mathbf {RSS} \in \mathbb {R}^{n\times m} \) of the radio map can be written in matrix form as
$$ \mathbf{RSS} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ r_{21} & r_{22} & \cdots & r_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nm} \end{bmatrix}, $$(2)
where each element r_{ij} is the RSS recorded at the ith RP from the jth AP.
In the online phase, the mobile device collects an RSS value R_{j} at an unknown position S_{j}. Then, R_{j} is compared with each row of RSS in Eq. (2). The several most similar R_{i} are identified, and their associated locations S_{i} are obtained. The unknown position S_{j} is finally estimated as the average location of these S_{i}.
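The online matching step can be illustrated with a short nearest-neighbor sketch. It assumes Euclidean distance in signal space as the similarity measure and a caller-chosen number of neighbors `k`; the text does not fix either choice, and the name `knn_locate` is hypothetical.

```python
import math

def knn_locate(radio_map, query_rss, k=3):
    """Estimate a position from a query RSS vector.

    radio_map: list of ((x, y), rss_vector) fingerprints, one per RP.
    The k fingerprints closest to the query in signal space are selected,
    and the estimate is the average of their locations.
    """
    def signal_distance(rss):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(rss, query_rss)))

    nearest = sorted(radio_map, key=lambda fp: signal_distance(fp[1]))[:k]
    xs = [loc[0] for loc, _ in nearest]
    ys = [loc[1] for loc, _ in nearest]
    return (sum(xs) / len(nearest), sum(ys) / len(nearest))
```

Because the estimate is an average of RP locations, any noise in either the stored RSS rows or the stored RP coordinates carries directly into the position estimate.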
For the crowdsourcing WLAN indoor positioning system, one problem is that numerous mobile devices, rather than one particular device, participate in both the offline phase for radio map building and the online phase for localization. Because of different hardware designs, the RSS values collected by diverse mobile devices depend on the WLAN adapter. As a result, different data collection devices may have different signal sensing capabilities and yield different RSS distribution characteristics. Numerous studies show that the RSS differences between devices can exceed 25 dB because of hardware differences [40, 41]. The localization accuracy is therefore degraded significantly by the RSS variations across devices. Fortunately, the device diversity problem is solved by the linear regression (LR) method we proposed in [17, 18], which yields unified RSS values in both the offline training phase and the online phase.
Another problem is that the RSS values collected by a mobile device in an indoor environment are subject to multiple sources of noise, such as path loss, multipath, and shadowing. Moreover, the mobile device may not be able to scan the whole spectrum to capture the RSS values of all available APs. As a result, the RSS values used for radio map building may contain environmental noise and measurement error. For instance, we collected 100 RSS samples from a single AP at one location in our lab, which is located in the 2A Building of Harbin Institute of Technology. All these samples are plotted as a histogram in Fig. 3. We assume that −110 dBm corresponds to the case in which the mobile device is unable to receive any signal from an AP. It can be seen that the recorded RSS values are spread over a wide range, from −70 to −50 dBm. The distribution of the average RSS readings on the 12th floor of the 2A Building, with the anomalous −110 dBm values excluded, is shown in Fig. 4. Although the average signal distribution is broadly consistent with the signal propagation model, the RSS values fluctuate sharply. Including such fluctuations may result in erroneous location estimates.
As described above, one of the key challenges is how to process the measured RSS values collected by crowdsourcing to make them closer to the nominal value. In this paper, we apply the multidimensional scaling (MDS) method to reduce the fluctuation of RSS values and smooth out the radio map.
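We define \(\mathbf {D}\in \mathbb {R}^{n\times m} \) to be the matrix of relative distances between the APs and the RPs as
$$ \mathbf{D} = \begin{bmatrix} d(S_{1},AP_{1}) & \cdots & d(S_{1},AP_{m}) \\ \vdots & \ddots & \vdots \\ d(S_{n},AP_{1}) & \cdots & d(S_{n},AP_{m}) \end{bmatrix}, $$
where d(S_{i},AP_{k}) is the Euclidean distance between the reference point S_{i} and the access point AP_{k},
$$ d(S_{i},AP_{k}) = \left\| S_{i} - S_{APk} \right\|_{2}, $$
where S_{APk} is the position of the kth AP.
We also define \(\mathbf {R} \in \mathbb {R}^{n\times n}\) to be the similarity matrix between the RSS readings of corresponding reference points,
$$ \mathbf{R} = \begin{bmatrix} r(S_{1},S_{1}) & \cdots & r(S_{1},S_{n}) \\ \vdots & \ddots & \vdots \\ r(S_{n},S_{1}) & \cdots & r(S_{n},S_{n}) \end{bmatrix}. $$
It is worth noting that the relative RSS value between two RPs S_{i} and S_{j} is related to the Euclidean distance as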
where \(\mathcal {F}(.)\) is a function that models the WLAN signal propagating in the indoor environment.
Methods
Step determination algorithm
Usage mode switching detection
Usually, there are four usage modes when a user carries the mobile device around the indoor environment for crowdsourcing data collection: pocketing (P), swing (S), texting (T), and calling (C). During the period in which the user switches the usage mode, the mobile device is in a further mode, which we define as usage mode switching (M). Since the user’s foot striking the ground produces a drastic change in acceleration, different acceleration thresholds can be set to determine the step while the mobile device stays in one usage mode. However, it is not easy to distinguish different usage modes in the traditional way.
Taking a walk that includes the four usage modes as an example, the acceleration data in the time domain are shown in Fig. 5. It is hard to use a constant threshold for step determination, and it is not reliable to judge usage mode switching only from the acceleration amplitude in the time domain. We need a dynamic threshold for each usage mode and for mode switching to determine the step.
Now, we introduce our method to detect usage mode switching, which further helps to separate different usage modes so that steps can be determined with a dynamic acceleration threshold. It is well known that the walking step frequency is about 2 Hz, so more energy is generated around this frequency. To make the acceleration data more intuitive and easier to handle, we propose to employ the short-time Fourier transform (STFT) to transform the acceleration data into the time-frequency domain as follows:
We apply the STFT to the above data and plot the result in Fig. 6. It can be seen from the frequency domain that when the user is walking, the energy is concentrated near the walking frequency. When the user changes the mobile phone usage mode, a larger value occurs in the frequency band below the walking frequency. Comparing Figs. 5 and 6, more acceleration information can be extracted by using both the time and frequency domains.
Therefore, based on the energy distribution characteristics in the frequency domain, mode switching can be detected. The detailed detection steps are as follows:
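For reference, one standard discrete form of the STFT is given here. The notation is assumed for illustration only: a[k] is the acceleration modulus sampled at rate f_{s}, w[·] is a window of length N, n is the starting sample of the frame, and f is the analysis frequency.

$$ X(n,f) = \sum_{k=n}^{n+N-1} a[k]\, w[k-n]\, e^{-j 2\pi f k / f_{s}} $$

The magnitude |X(n,f)|, viewed as a function of frame position and frequency, gives the time-frequency picture of the acceleration that the detection relies on.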

Step 1: Calculate the average amplitude of the spectrum at each moment and compare it with a threshold to determine whether the user is walking. Based on practice, we set the threshold to 3 m/s².

Step 2: Calculate the mean value of the peak spectral energy over the moments of walking, and use two-thirds of it as the threshold for mode switching detection.

Step 3: Select the 1–2 Hz part of the spectrum and calculate the length of the frequency band in which the energy is greater than the threshold in Step 2. This gives the range of higher energy in the frequency band below the walking frequency.

Step 4: Compare the frequency range in Step 3 with the threshold in Step 2. If it is greater than the threshold, it is determined that the user is switching the usage mode.
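The four steps above can be sketched in simplified form as follows. This is one possible reading under explicit assumptions, not the authors' implementation: frames are non-overlapping with a rectangular window, amplitudes are scaled as 2|X|/N, the walking threshold `walk_threshold` is left to the caller because its value depends on that scaling, and Step 4 is interpreted as "at least `min_bins` frequency bins in the band exceed the Step 2 threshold". All function and parameter names are hypothetical.

```python
import cmath
import math

def frame_spectrum(frame, fs):
    """Return (freqs, amplitudes) of one frame via a plain DFT, mean removed."""
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]          # remove gravity / DC offset
    freqs, amps = [], []
    for k in range(1, n // 2):             # skip the DC and Nyquist bins
        s = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        freqs.append(k * fs / n)
        amps.append(2 * abs(s) / n)
    return freqs, amps

def detect_mode_switching(acc, fs, frame_len, walk_threshold,
                          band=(1.0, 2.0), min_bins=1):
    """Return one boolean per frame: True if mode switching is suspected."""
    frames = [acc[i:i + frame_len]
              for i in range(0, len(acc) - frame_len + 1, frame_len)]
    spectra = [frame_spectrum(f, fs) for f in frames]
    # Step 1: a frame counts as walking if its mean amplitude is large enough.
    walking = [sum(a) / len(a) > walk_threshold for _, a in spectra]
    peaks = [max(a) for (_, a), w in zip(spectra, walking) if w]
    if not peaks:
        return [False] * len(frames)
    # Step 2: two-thirds of the mean peak amplitude over walking frames.
    switch_threshold = (2.0 / 3.0) * sum(peaks) / len(peaks)
    flags = []
    for (freqs, amps), w in zip(spectra, walking):
        # Step 3: count bins inside the band that exceed the threshold.
        strong = sum(1 for f, a in zip(freqs, amps)
                     if band[0] <= f < band[1] and a > switch_threshold)
        # Step 4 (assumed reading): enough strong bins means switching.
        flags.append(w and strong >= min_bins)
    return flags
```

With non-overlapping frames the time resolution equals the frame length; an overlapping, tapered window would localize the switching instant better at the cost of more computation.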
According to the above steps, the result of usage mode switching detection is shown in Fig. 7. Three occurrences of usage mode switching can be clearly identified, which is more obvious than in Fig. 5. The reason for selecting 1–2 Hz as the usage mode switching detection range in Step 3 is as follows. The frequency of human walking is generally higher than 2 Hz, which means that the spectral energy of walking lies above 2 Hz. High spectral energy below 2 Hz can therefore be regarded as an indication of usage mode switching.
Using the above detection algorithm, four people were selected for the experiment, and the mode switching events in multiple walks were detected and counted. The results are shown in Table 1. It can be seen from Table 1 that the frequency-domain detection algorithm has a large false detection error. The main reason for missed detection is that when the user switches modes continuously and quickly, the algorithm can only judge multiple switches as one. The main reason for false detection is that different users have different walking habits, and a larger swing amplitude during walking leads to higher energy in the 1–2 Hz band. Although the probability of false detection is high, the algorithm still meets the requirements of the step counting algorithm.
Dynamic threshold setting
Once usage mode switching has been detected in the acceleration data, a dynamic threshold can be set for each mode. Because human walking is approximately sinusoidal, we choose valley detection for step determination. To obtain accurate valley detection, we combine the extremum interval judgement, mode switching detection, and a dynamic threshold calculated from the previous peak-valley differences to correct the step determination result. The extremum interval judgement is used to eliminate false step determinations caused by jitter. The dynamic threshold means that the previous peak-valley difference is used as the current threshold for each usage mode. The step determination method is as follows:

Step 1: Take an acceleration value in time domain and compare it with 3 values before and after this value to determine whether it is a local extremum. If so, go to Step 2; otherwise, return to Step 1.

Step 2: Determine whether the current extremum is a peak or a valley. Subtract the current extremum from the previous neighboring extremum to get the peak-valley difference a_{diff}(n). If the current extremum is a valley, go to Step 3; otherwise, return to Step 1.

Step 3: If a_{diff}(n) is greater than the dynamic threshold (initial value 1 m/s²) and the current valley is separated from the previous valley by more than 0.3 s, the valley is treated as a candidate belonging to either a usage mode or mode switching; go to Step 4. If the minimum peak-valley difference or the valley interval condition is not satisfied, return to Step 1.

Step 4: Utilize the proposed usage mode switching detection method to label the valley with usage mode or mode switching. If the label is usage mode, go to Step 5; otherwise, go to Step 6.

Step 5: Determine whether the valley in one of the four usage modes is a valid step. If a_{diff}(n) satisfies Eq. (8), the valley is counted as a step; go to Step 7. Otherwise, return to Step 1.
$$ \frac{2}{3}{a_{\text{diff}}}(n - 1) < {a_{\text{diff}}}(n) < 3{a_{\text{diff}}}(n - 1), $$(8)
Step 6: Determine whether the valley during mode switching is a valid step. If a_{diff}(n) satisfies Eq. (9), the valley is counted as a step; go to Step 7. Otherwise, return to Step 1.
$$ \frac{1}{3}{a_{\text{diff}}}(n - 1) < {a_{\text{diff}}}(n) < 4{a_{\text{diff}}}(n - 1), $$(9)
Step 7: Save the current peak-valley difference a_{diff}(n) of the valid step as the dynamic threshold. Return to Step 1 for the next judgement, until all the acceleration values are processed.
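Steps 1 to 7 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the 1 m/s² value is treated as a fixed minimum peak-valley difference, a_{diff}(n−1) is the peak-valley difference of the last accepted step, the very first candidate is accepted without the ratio test, the 0.3 s interval is measured from the last accepted step, and `switching` is a per-sample boolean list supplied by a separate mode switching detector. The name `detect_steps` and its parameters are hypothetical.

```python
def detect_steps(acc, fs, switching, half_window=3,
                 min_diff=1.0, min_interval_s=0.3):
    """Return the sample indices of valleys accepted as steps.

    acc: acceleration modulus samples; fs: sampling rate in Hz;
    switching: per-sample booleans, True while mode switching is detected.
    """
    steps = []
    prev_ext = None    # value of the previous local extremum
    prev_diff = None   # a_diff(n-1): peak-valley difference of the last valid step
    for i in range(half_window, len(acc) - half_window):
        # Step 1: local extremum over 3 samples before and after.
        window = acc[i - half_window:i + half_window + 1]
        is_peak = acc[i] == max(window)
        is_valley = acc[i] == min(window)
        if is_peak == is_valley:           # not an extremum, or a flat window
            continue
        if prev_ext is None:
            prev_ext = acc[i]
            continue
        # Step 2: peak-valley difference; only valleys can become steps.
        a_diff = abs(prev_ext - acc[i])
        prev_ext = acc[i]
        if is_peak:
            continue
        # Step 3: minimum difference and minimum interval.
        far_enough = not steps or (i - steps[-1]) / fs > min_interval_s
        if a_diff <= min_diff or not far_enough:
            continue
        # Steps 4-6: Eq. (9) bounds during switching, Eq. (8) bounds otherwise.
        lo, hi = (1 / 3, 4.0) if switching[i] else (2 / 3, 3.0)
        if prev_diff is None or lo * prev_diff < a_diff < hi * prev_diff:
            steps.append(i)
            prev_diff = a_diff             # Step 7: update the dynamic threshold
    return steps
```

The wider bounds during switching are what allow the detector to follow a sudden change in acceleration amplitude; with the narrower bounds alone, every valley after such a change would be rejected because the stored difference is never updated.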
To avoid false step determination, both Eqs. (8) and (9) are used to reduce the noise in the acceleration values caused by jitter. Figure 8 compares the proposed step determination method with the traditional method.
During this experiment, mode switching occurs twice. As can be seen from Fig. 8, the proposed method determines steps more accurately than the traditional method. The determination error of the traditional method mainly comes from missed detections after mode switching. Because the traditional method cannot recognize mode switching, the acceleration after the switch is misjudged as a glitch and filtered out. The proposed method avoids this problem by detecting usage mode switching. Table 2 further compares the step determination results of the two methods.
As can be seen from Table 2, the proposed method takes usage mode switching into account and thus has better detection performance than the traditional method. Its main advantage is that when usage mode switching occurs, it does not filter out valid steps as glitches after the switch, which means it has a smaller missed detection probability than the traditional method. More accurate step determination is thus achieved. This greatly helps to improve the PDR trajectory location and further enhances the location space of a radio map built in the crowdsourcing way.
Multidimensional scaling-based RSS smoothing method
Classical multidimensional scaling algorithm
As noted earlier, in crowdsourcing the signal strengths are measured by a simple walk through the environment. The resulting RSS radio map is a noisy measurement of the signal strengths. In this section, we use MDS to smoothen the RSS values based on the relative distance between collection points, the known AP locations and the signal propagation model.
When a radio map is constructed in the offline training phase, the squared similarity distance r^{2}(S_{i},S_{j}) (\(i,j=1,2,\dots,n\)) between all pairs of points in data space is calculated by
$$ r^{2}(S_{i},S_{j}) = \sum_{k=1}^{m} \left( r_{ik} - r_{jk} \right)^{2}. $$
Then the squared similarity matrix \( \mathbf {R}^{2}\in \mathbb {R}^{n\times n}\) in data space is obtained as
$$ \mathbf{R}^{2} = \begin{bmatrix} r^{2}(S_{1},S_{1}) & \cdots & r^{2}(S_{1},S_{n}) \\ \vdots & \ddots & \vdots \\ r^{2}(S_{n},S_{1}) & \cdots & r^{2}(S_{n},S_{n}) \end{bmatrix}. $$
In the MDS algorithm, the relative points RSS^{′} are computed from R^{2} as follows:
First, double centering is applied to the squared similarity matrix:
$$ \mathbf{B} = -\frac{1}{2}\mathbf{J}\mathbf{R}^{2}\mathbf{J}. $$
The centering matrix J is given by
$$ \mathbf{J} = \mathbf{I} - \frac{1}{n}\mathbf{e}\mathbf{e}^{T}, $$
where \(\mathbf {I} \in \mathbb {R}^{n\times n}\) is an identity matrix, and \(\mathbf {e} \in \mathbb {R}^{n\times 1}\) is a column vector with e=(1,1,...,1)^{T}.
Decompose B by using the singular value decomposition (SVD) as:
$$ \mathbf{B} = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^{T}, $$
where Λ=diag(λ_{1},λ_{2},...,λ_{n}) is a diagonal eigenvalue matrix of B with \(\lambda _{1}\geq \lambda _{2}\geq \dots \geq \lambda _{n}\geq 0\). U=[u_{1},u_{2},…,u_{n}] is an orthogonal matrix whose columns are the corresponding eigenvectors.
Suppose we want an m-dimensional solution. We denote the diagonal matrix of the m largest eigenvalues by Λ_{m} and the first m columns of U by U_{m}. The coordinate matrix of classical scaling is:
$$ \mathbf{U}_{m}\mathbf{\Lambda}_{m}^{1/2}. $$
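The classical scaling procedure (double centering, eigendecomposition, and scaling of the leading eigenvectors) can be sketched with NumPy as follows. The function name `classical_mds` is hypothetical, and `numpy.linalg.eigh` is used in place of an SVD because the double-centered matrix is symmetric; small negative eigenvalues caused by noise are clipped to zero, which is an assumption of this sketch.

```python
import numpy as np

def classical_mds(sq_dist, dim):
    """Classical MDS: embed n points in `dim` dimensions.

    sq_dist: (n, n) symmetric matrix of squared dissimilarities.
    Returns an (n, dim) coordinate matrix.
    """
    n = sq_dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ sq_dist @ J                 # double centering
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dim]    # indices of the largest `dim`
    lam = np.clip(eigvals[order], 0.0, None)   # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)    # U_m * Lambda_m^(1/2)
```

The recovered coordinates are unique only up to rotation, reflection, and translation, so they preserve the pairwise distances rather than the original coordinate values.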
Estimation of r ^{2}(S _{i},S _{j})
To smooth the RSS values in the radio map, a precise similarity distance matrix is needed. We use a signal propagation model to compute the similarity distance r(S_{i},S_{j}) when the RSS values collected at RP S_{i} are noisy. Consider the kth (k=1,2,...,m) AP installed in the indoor area as shown in Fig. 9.
The location of AP, c_{AP}, can be estimated with good accuracy using the CS method in [42]; hence, we treat its location as known. We use the indoor signal propagation model in [43] to model the wave propagation in the environment. The RSS value received from AP k(k=1,2,...,m) at location S_{i} can be calculated by
where d_{ik} denotes the distance between the kth AP and the location at which the measurement is done, P is the transmission power of the AP, α_{i} is the propagation loss exponent in the environment, and h_{jk} is the combined effect of path loss, fading, and shadowing.
Using this model, and using an approximation as in [43], we get
Let us also define r(S_{i},S_{j}) to be the vector of RSS differences for all APs between the measurements at S_{i} and S_{j}. The kth element of r(S_{i},S_{j}) is the difference of the RSS measurement from the kth AP between S_{i} and S_{j}, which can be represented by
The square Euclidean norm of r(S_{i},S_{j}) can be written as
By replacing the imprecise similarity distance r^{2}(S_{i},S_{j}) in \(\mathbf {R}^{2}_{n\times n}\) with ∥r(S_{i},S_{j})∥^{2}, a more accurate similarity matrix is obtained, which is then used to smooth the RSS values in the radio map.
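As a rough illustration of a model-based similarity distance, the sketch below assumes a plain log-distance path-loss model, r_{ik} = P − 10α log10(d_{ik}), with the fading and shadowing term dropped and a single loss exponent shared by all locations. These simplifications, the function name `model_sq_distance`, and the default `alpha` are assumptions made here; the exact model and approximation of [43] may differ. Under this assumption the transmission power P cancels in the difference between two locations.

```python
import numpy as np

def model_sq_distance(rp_i, rp_j, ap_xy, alpha=3.0):
    """Squared norm of the model-predicted RSS difference between two RPs.

    ASSUMPTION: log-distance path loss without shadowing, so the k-th
    element of the difference is 10 * alpha * log10(d_jk / d_ik).
    """
    d_i = np.linalg.norm(ap_xy - rp_i, axis=1)   # distances S_i -> each AP
    d_j = np.linalg.norm(ap_xy - rp_j, axis=1)   # distances S_j -> each AP
    d_i = np.maximum(d_i, 1e-3)                  # avoid log of zero
    d_j = np.maximum(d_j, 1e-3)
    diff = 10.0 * alpha * np.log10(d_j / d_i)    # one element per AP
    return float(np.sum(diff ** 2))

# Hypothetical layout: two APs and two reference points, in metres
aps = np.array([[0.0, 0.0], [20.0, 0.0]])
d2 = model_sq_distance(np.array([5.0, 1.0]), np.array([8.0, 1.0]), aps)
```

Because only distance ratios enter the result, the value depends on the AP and RP geometry and on `alpha`, not on any measured RSS, which is what allows it to stand in for a distance computed from noisy measurements.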
The application of MDS algorithm
As described above, RSS values are collected from m APs at n RPs in the indoor area, and the radio map is the n×m matrix RSS of these values. Since it is not known in advance which values in RSS are noisy, we assume in turn that the RSS values r_{i} at RP S_{i} are noisy and smooth them with the MDS method, using the coordinate matrix C and the RSS values at all RPs other than S_{i}. The RSS values at the other RPs are then smoothed one by one in the same way until the RSS values at all RPs have been smoothed. The radio map is smoothed by the following steps:

Step 1: Compute the similarity distances in data space between RP S_{i} and the other RPs using formula (19). The remaining similarity distances are computed from the RSS values, and the distance matrix for MDS is constructed by combining these two sets of similarity distances.

Step 2: Apply MDS to the distance matrix, retaining the m largest eigenvalues and their eigenvectors to construct an m-dimensional relative RSS matrix.

Step 3: Transform the relative RSS matrix into an absolute RSS matrix by a linear transform fitted on the RSS values other than r_{i}; the transform may include scaling, rotation, and reflection.
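The three steps above can be sketched for a single RP as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the names `smooth_one_rp` and `model_sq_row` are hypothetical, the model-based distances are passed in precomputed, and the linear transform of Step 3 is fitted as a least-squares affine map (which also absorbs the translation removed by double centering).

```python
import numpy as np

def classical_mds(sq_dist, m):
    """Classical MDS embedding from an n x n squared-distance matrix."""
    n = sq_dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ sq_dist @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:m]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def smooth_one_rp(rss, i, model_sq_row, m):
    """Return a smoothed RSS vector for RP i (rss is n x m_ap)."""
    n = rss.shape[0]
    # Step 1: distances from measured RSS, with row/column i replaced
    # by the model-based squared distances
    sq = ((rss[:, None, :] - rss[None, :, :]) ** 2).sum(axis=-1)
    sq[i, :] = model_sq_row
    sq[:, i] = model_sq_row
    sq[i, i] = 0.0
    # Step 2: m-dimensional relative RSS matrix
    rel = classical_mds(sq, m)
    # Step 3 (ASSUMPTION: affine least-squares fit on the trusted rows)
    A = np.hstack([rel, np.ones((n, 1))])
    trusted = np.arange(n) != i
    W, *_ = np.linalg.lstsq(A[trusted], rss[trusted], rcond=None)
    return A[i] @ W

# Synthetic check: 8 RPs, 3 APs, RP 2 corrupted (values are made up)
rng = np.random.default_rng(0)
clean = rng.uniform(-90.0, -40.0, size=(8, 3))
noisy = clean.copy()
noisy[2] += np.array([12.0, -9.0, 7.0])
# Stand-in for the model distances: exact distances of the clean data
model_row = ((clean - clean[2]) ** 2).sum(axis=1)
smoothed = smooth_one_rp(noisy, 2, model_row, m=3)
```

In this idealized setting the distances supplied for RP 2 are exact, so `smoothed` recovers the clean row; with distances from a real propagation model the correction would only be approximate.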
Results and discussion
This section details the experimental evaluation of the proposed MDS method using both simulations and implementations. The experimental environment is on the 12th floor of Building 2A at Harbin Institute of Technology. As shown in Fig. 10, the localization area is the corridor, 49.4 m in length and 14.1 m in width, which is highlighted in yellow. In the offline phase, we deployed 27 access points (Linksys WRT54G) in IEEE 802.11b/g mode. The radio map was constructed using a step-counter-assisted RSS measurement method. In the localization area, we set 5 walking routes and collected RSS values and sensor data while walking. Figure 11 shows the walking track of the staff and the locations of the RPs when the radio map was created. A fingerprint is synthesized from the RSS values and the corresponding coordinate calculated from the sensor data, and the radio map is then built from these fingerprints. Using this system, only a few hours are needed to build a radio map with 823 fingerprints. However, since the resultant radio map has only a few RSS values at each RP, the influence of outliers and signal fluctuation is more pronounced than for traditionally generated radio maps, which have hundreds of RSS values at each RP.
For comparison, we use the traditional method to collect 100 RSS values at each RP and take their average to build the radio map. The signal distribution of AP7 in this radio map is displayed in Fig. 12. It can be seen from the figure that the signal distribution of the radio map established by the traditional method is very smooth, with very small fluctuation. Therefore, this radio map can be expected to position the user precisely.
The signal distribution of the radio map built by the step-counter-assisted RSS measurement method is shown in Fig. 13. Although the distribution of RSS values from a single AP in this radio map is generally consistent with the signal propagation model, the original radio map contains considerable noise and many measurement errors, which lead to wild fluctuation at some RPs. Figure 14 illustrates the difference between the radio map built by the traditional method and the one built by the step-counter-assisted method. From Fig. 14, we can draw two conclusions: the error range is very large (from −35 to 10 dB), and the error fluctuates greatly from RP to RP. If this radio map is used to estimate the user's location, it will produce a large number of positioning errors.
To improve the positioning accuracy, the proposed MDS algorithm is applied to smooth the received RSS values in the offline training phase. Figure 15 presents the smoothed signal distribution of the radio map; the MDS algorithm has removed much of the noise and many of the measurement errors from the original radio map, resulting in better adherence to the signal propagation model. Figure 16 presents the difference between the smoothed radio map and the traditional radio map. Compared with Figs. 13 and 14, the smoothed radio map is more similar to the radio map created by the traditional method, and the RSS fluctuations at each reference point are smaller.
The CDF curve of the positioning error for the MDS algorithm is given in Fig. 17. For comparison, the sliding correlation time window (SCTW) filter method in [33] and the original data are also simulated in this paper. The SCTW method uses only a small number of the RSS values in the radio map and discards most of the information in it; therefore, it cannot achieve the optimal radio map. Because the proposed MDS algorithm derives a more precise radio map, it achieves higher positioning accuracy than the other methods. Notably, the maximum localization error is reduced from 10 m to 6 m, and the probability that the positioning error is within 2 m increases by more than 30%.
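For readers who want to reproduce this kind of comparison on their own data, an empirical CDF of positioning errors can be computed as sketched below. The function name `error_cdf` and the sample error values are illustrative assumptions and are not taken from the experiments reported here.

```python
import numpy as np

def error_cdf(errors, thresholds):
    """Fraction of positioning errors that are <= each threshold."""
    errors = np.sort(np.asarray(errors, dtype=float))
    thresholds = np.asarray(thresholds, dtype=float)
    # side="right" counts errors equal to the threshold as within it
    counts = np.searchsorted(errors, thresholds, side="right")
    return counts / errors.size

# Made-up errors in metres, for illustration only
sample_errors = [0.5, 1.5, 2.0, 3.0]
cdf_at_2m = error_cdf(sample_errors, [2.0])[0]   # 3 of 4 errors are <= 2 m
```

Evaluating `error_cdf` on the same thresholds for each method gives curves that can be compared point by point, for example at the 2 m threshold discussed above.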
Conclusion
In this paper, a noise reduction method for the radio map in a crowdsourcing indoor positioning system is proposed. To count steps precisely, a mode-switching-based step counting algorithm and a dynamic threshold detection method are used in our step-counting-assisted RSS measurement method. Although the step-counting-assisted method can build radio maps quickly, the RSS data in the resulting radio map contain many errors and fluctuate violently, so the accuracy of the radio map cannot meet our requirements. To increase the accuracy of the radio map and reduce RSS fluctuations and errors, the MDS method is proposed to smooth the radio map. The intuition behind this technique is that the RSS values at different RPs have an intrinsic relationship due to the fixed positions of the RPs. With the MDS algorithm, noisy RSS values can be well corrected and a smoother radio map is obtained. We tested the proposed method in a typical office environment at Harbin Institute of Technology, and the experimental results demonstrate that the proposed method leads to significant improvements in localization accuracy.
Availability of data and materials
The radio map data used to support the findings of this study were supplied by Liye Zhang under license and so cannot be made freely available. Requests for access to these data should be made to Liye Zhang (zhangliye@sdut.edu.cn).
Abbreviations
MDS: Multidimensional scaling
LBS: Location-based service
WLAN: Wireless local area network
RSS: Received signal strength
AP: Access point
RP: Reference point
GSSL: Graph-based semi-supervised learning
PDR: Pedestrian dead reckoning
STFT: Short-term Fourier transform
SLAM: Simultaneous Localization and Mapping
GAN: Generative Adversarial Networks
SCTW: Sliding Correlation Time Window
SVD: Singular value decomposition
IAML: Inexact Augmented Lagrange Multiplier
References
[1] M. Zhou, Y. Long, W. Zhang, Q. Pu, W. He, Adaptive genetic algorithm-aided neural network with channel state information tensor decomposition for indoor localization. IEEE Trans. Evol. Comput., 1–1 (2021). https://doi.org/10.1109/TEVC.2021.3085906.
[2] I. Ashraf, S. Hur, Y. Park, Smartphone sensor based indoor positioning: Current status, opportunities, and future challenges. Electronics. 9(6), 891 (2020).
[3] Y. Wang, Y. Shu, X. Jia, M. Zhou, L. Xie, L. Guo, Multifeature fusion based hand gesture sensing and recognition system. IEEE Geosci. Remote Sens. Lett. (2021). https://doi.org/10.1109/LGRS.2021.3086136.
[4] M. Zhou, Y. Li, M. J. Tahir, X. Geng, Y. Wang, W. He, Integrated statistical test of signal distributions and access point contributions for wifi indoor localization. IEEE Trans. Veh. Technol. 70(5), 5057–5070 (2021). https://doi.org/10.1109/TVT.2021.3076269.
[5] P. Bahl, V. N. Padmanabhan, in Proc. IEEE INFOCOM, Tel-Aviv, Israel, March. Radar: an in-building rf-based user location and tracking system, (2000), pp. 775–784.
[6] Y. Zhang, L. Ma, Y. Xu, Y. Sun, in 2019 IEEE Global Communications Conference (GLOBECOM). An RSS pathloss considered distance metric learning for fingerprinting indoor localization, (2019), pp. 1–6.
[7] L. Zhang, Z. Chen, W. Cui, B. Li, C. Chen, Z. Cao, K. Gao, Wifi-based indoor robot positioning using deep fuzzy forests. IEEE Internet Things J. 7(11), 10773–10781 (2020).
[8] J. J. Pan, S. J. Pan, J. Yin, L. M. Ni, Q. Yang, Tracking mobile users in wireless networks via semi-supervised colocalization. IEEE Trans. Pattern Anal. Mach. Intell. 34(3), 587–600 (2012).
[9] A. W. S. Au, C. Feng, S. Valaee, S. Reyes, S. Sorour, S. N. Markowitz, D. Gold, K. Gordon, M. Eizenman, Indoor tracking and navigation using received signal strength and compressive sensing on a mobile device. IEEE Trans. Mob. Comput. 12(10), 2050–2062 (2013).
[10] X. Wang, D. Qin, R. Guo, M. Zhao, L. Ma, T. M. Berhane, The technology of crowdsourcing landmarks-assisted smartphone in indoor localization. IEEE Access. 8, 57036–57048 (2020).
[11] L. Zhang, S. Valaee, L. Zhang, Y. Xu, L. Ma, in 2015 IEEE 26th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC). Signal propagation-based outlier reduction technique (sport) for crowdsourcing in indoor localization using fingerprints, (2015), pp. 2008–2013.
[12] C. Wu, Y. Zheng, Y. Liu, Smartphones based crowdsourcing for indoor localization. IEEE Trans. Mob. Comput. 14(2), 444–457 (2015).
[13] J. Yu, Z. Na, X. Liu, Z. Deng, Wifi/pdr-integrated indoor localization using unconstrained smartphones. EURASIP J. Wirel. Commun. Netw. 2019(1), 41 (2019).
[14] Z. Deng, G. Wang, D. Qin, Z. Na, Y. Cui, Continuous indoor positioning fusing wifi, smartphone sensors and landmarks. Sensors. 16(9), 1427 (2016).
[15] Y. Yu, R. Chen, L. Chen, W. Li, Y. Wu, H. Zhou, Autonomous 3D indoor localization based on crowdsourced wifi fingerprinting and MEMS sensors. IEEE Sensors J., 1–1 (2021). https://doi.org/10.1109/JSEN.2021.3065951.
[16] S. Shahidi, S. Valaee, in IEEE International Conference on Communications. Hidden Markov model based graph matching for calibration of localization maps, (2015), pp. 4606–4611.
[17] L. Zhang, M. Lin, Y. Xu, L. Cheng, in GLOBECOM 2017 - 2017 IEEE Global Communications Conference. Linear regression algorithm against device diversity for indoor wlan localization system, (2017), pp. 1–6.
[18] L. Zhang, X. Meng, C. Fang, Linear regression algorithm against device diversity for the wlan indoor localization system. Wirel. Commun. Mob. Comput. 2021, 1–15 (2021).
[19] N. Bai, Y. Tian, Y. Liu, Z. Yuan, Z. Xiao, J. Zhou, A high-precision and low-cost imu-based indoor pedestrian positioning technique. IEEE Sensors J. 20(12), 6716–6726 (2020).
[20] Z. Mu, M. Dolgov, Y. Liu, Y. Wang, Wifi/pdr indoor integrated positioning system in a multi-floor environment. EAI Endorsed Trans. Cogn. Commun. 4(14), 155075 (2018).
[21] P. Goyal, V. J. Ribeiro, H. Saran, A. Kumar, in International Conference on Indoor Positioning and Indoor Navigation. Strapdown pedestrian dead-reckoning system, (2011), pp. 1–7.
[22] F. Gu, K. Khoshelham, J. Shang, F. Yu, Z. Wei, Robust and accurate smartphone-based step counting for indoor localization. IEEE Sensors J. 17(11), 3453–3460 (2017).
[23] X. Wang, G. Chen, X. Cao, Z. Zhang, M. Yang, S. Jin, Robust and accurate step counting based on motion mode recognition for pedestrian indoor positioning using a smartphone. IEEE Sensors J., 1–1 (2021). https://doi.org/10.1109/JSEN.2021.3058127.
[24] M. Uddin, T. Nadeem, in Proceedings of the 19th Annual International Conference on Mobile Computing and Networking. Spyloc: A light weight localization system for smartphones, (2013).
[25] S. He, S. Chan, Y. Lei, L. Ning, in Acm International Joint Conference. Calibration-free fusion of step counter and wireless fingerprints for indoor localization, (2015).
[26] Y. Jiang, Z. Li, J. Wang, Ptrack: Enhancing the applicability of pedestrian tracking with wearables. IEEE Trans. Mob. Comput. 18(2), 431–443 (2019).
[27] B. Wang, Q. Chen, L. T. Yang, H. C. Chao, Indoor smartphone localization via fingerprint crowdsourcing: challenges and approaches. IEEE Wirel. Commun. 23(3), 82–89 (2016).
[28] P. Zhang, R. Chen, Y. Li, X. Niu, L. Wang, M. Li, Y. Pan, A localization database establishment method based on crowdsourcing inertial sensor data and quality assessment criteria. IEEE Internet Things J. 5(6), 4764–4777 (2018).
[29] S. H. Jung, D. Han, Automated construction and maintenance of wifi radio maps for crowdsourcing-based indoor positioning systems. IEEE Access. 6, 1764–1777 (2018).
[30] Z. Li, X. Zhao, H. Liang, in 2018 IEEE International Conference on Communications (ICC). Automatic construction of radio maps by crowdsourcing pdr traces for indoor positioning, (2018), pp. 1–6.
[31] H. Zou, C. L. Chen, M. Li, J. Yang, Y. Zhou, L. Xie, C. J. Spanos, Adversarial learning-enabled automatic wifi indoor radio map construction and adaptation with mobile robot. IEEE Internet Things J. 7(8), 6946–6954 (2020).
[32] Y. Gu, C. Zhou, A. Wieser, Z. Zhou, Trajectory estimation and crowdsourced radio map establishment from foot-mounted imus, wifi fingerprints, and gps positions. IEEE Sensors J. 19(3), 1104–1113 (2019).
[33] L. XinDi, H. Wei, T. ZengShan, in 2012 International Conference on Computer Science and Service System. The improvement of RSS-based location fingerprint technology for cellular networks, (2012), pp. 1267–1270.
[34] M. Hasani, E. S. Lohan, L. Sydänheimo, L. Ukkonen, in 2014 IEEE RFID Technology and Applications Conference (RFIDTA). Pathloss model of embroidered passive RFID tag on human body for indoor positioning applications, (2014), pp. 170–174.
[35] M. S. R. Sakib, M. A. Quyum, K. Andersson, K. Synnes, U. Körner, in 2014 IEEE Ninth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP). Improving wifi based indoor positioning using particle filter based on signal strength, (2014), pp. 1–6.
[36] M. Lin, L. Jia, Y. Xu, W. Meng, in GLOBECOM 2015 - 2015 IEEE Global Communications Conference. Radio map recovery and noise reduction method for green wifi indoor positioning system based on inexact augmented lagrange multiplier algorithm, (2015).
[37] M. Lin, Z. Wan, Y. Xu, L. Cheng, in GLOBECOM 2017 - IEEE Global Communications Conference. Radio map noise reduction method using hankel matrix for WLAN indoor positioning system, (2017).
[38] W. Xue, Q. Li, X. Hua, K. Yu, B. Zhou, A new algorithm for indoor RSSI radio map reconstruction. IEEE Access. 6, 76118–76125 (2018).
[39] Y. Zhang, L. Ma, Radio map crowdsourcing update method using sparse representation and low rank matrix recovery for WLAN indoor positioning system. IEEE Wirel. Commun. Lett. 10, 1188–1191 (2021).
[40] K. Kaemarungsi, in 2006 1st International Symposium on Wireless Pervasive Computing. Distribution of WLAN received signal strength indication for indoor location determination, (2006), pp. 6–6.
[41] L. Ma, N. Jin, Y. Zhang, Y. Xu, RSRP difference elimination and motion state classification for fingerprint-based cellular network positioning system. Telecommun. Syst. 71(2), 191–203 (2018).
[42] C. Feng, S. Valaee, Z. Tan, in Proceedings of the Global Communications Conference. Multiple target localization using compressive sensing, (2009).
[43] V. Pourahmadi, S. Valaee, in 2012 IEEE Global Communications Conference (GLOBECOM). Indoor positioning and distance-aware graph-based semi-supervised learning method, (2012).
Acknowledgements
This work was supported by Communication Research Center, School of Electronics and Information Engineering, Harbin Institute of Technology and School of Computer Science and Technology, Shandong University of Technology.
Funding
This paper is supported by Shandong Provincial Natural Science Foundation, China (grant number ZR2019BF022), and National Natural Science Foundation of China (grant number 62001272, grant number 61902222).
Author information
Authors’ contributions
The algorithms proposed in this paper have been conceived by L. Zhang, X. Meng, and C. Fang. L. Zhang and Z. Wang made the analysis and experiment and wrote the paper. X. Meng, C. Fang and C. Liu investigated, validated, and revised this paper. The authors approved the final manuscript.
Authors’ information
Liye Zhang received the M.Sc. and Ph.D. degrees in communication engineering from the Harbin Institute of Technology, in 2011 and 2018, respectively. From 2014 to 2015, he was a Visiting Scholar with Department of Electrical and Computer Engineering, University of Toronto, Canada. He is currently a Lecturer with the Shandong University of Technology. His current research interests include Indoor Localization, Computer Vision and Machine Learning.
Zhuang Wang received the bachelor’s degree in computer science and technology from Shandong Youth University for Political Sciences in 2020. He is currently pursuing the M.Sc. degree with the School of Computer Science and Technology, Shandong University of Technology. His current research interests include machine learning and indoor localization.
Xiaoliang Meng is with the School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong, China. He received the B.S., M.S. and Ph.D. degree in Measurement Technology and Instrumentation from Harbin University of Science and Technology, Harbin, Heilongjiang, China, in 2011, 2014 and 2018 respectively. He is a lecturer in Shandong University of Technology. His research interests include vision measurement and image processing.
Chao Fang is with the School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong, China. He received his B.S. in Electronic Information Engineering from Shandong Jianzhu University, Jinan, China, in 2011. He received the Ph.D. degree with the National Key Laboratory of Radar Signal Processing, Xidian University, Xi’an. He is a lecturer in Shandong University of Technology. His major research interests include deep learning, system simulation, and signal processing for high resolution SAR.
Cong Liu received the B.S. and M.S. degrees in computer software and theory from the Shandong University of Science and Technology, Qingdao, China, in 2013 and 2015, respectively, and the Ph.D. degree from the Section of Information Systems (IS), Department of Mathematics and Computer Science, Eindhoven University of Technology. He is currently a Full Professor with the Shandong University of Technology. His research interests are in the areas of business process mining, Petri nets, and software process mining.
Ethics declarations
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, L., Wang, Z., Meng, X. et al. Noise reduction for radio map crowdsourcing building in WLAN indoor localization system. EURASIP J. Adv. Signal Process. 2021, 40 (2021). https://doi.org/10.1186/s1363402100758y
Keywords
 Indoor localization system
 Crowdsourcing method
 Multidimensional scaling
 Short-term Fourier transform
 Received signal strength