Noise prediction of chemical industry park based on multi-station Prophet and multivariate LSTM fitting model
EURASIP Journal on Advances in Signal Processing volume 2021, Article number: 106 (2021)
Abstract
As chemical industry parks gradually become digital and intelligent, the environmental data collected in them have become extremely rich. Mining the patterns in these data and using prediction to provide data support for industrial safety and workers' health has high application value for maintaining a safe production environment. This paper takes the noise data of a chemical industry park as its main research object. It applies the 3σ principle to the zero-value processing of the noise data and builds an LSTM model that integrates multivariate information, combining noise data classified by wind direction with wind speed and vehicle flow information. A Prophet model integrating multi-station noise information is also adopted, and the MultiPL model is constructed by fitting the outputs of these two models to predict the noise. Comparative experiments are designed and implemented against Kalman filter, BP neural network, Prophet, LSTM, and Prophet + LSTM weighted combination prediction models. R^{2} is used to evaluate the fitting effect of the single models in MultiPL, and RMSE and MAE are used to evaluate the prediction effect of MultiPL on the noise time series. The experimental results show that, in the multi-station ordered Prophet method, the RMSE and MAE of the data processed by the 3σ principle are reduced by 32.2% and 23.3%, respectively. Compared with the comparison models, the MultiPL prediction method is more stable and accurate. The MultiPL method proposed in this paper can therefore provide a new idea for noise prediction in digital chemical parks.
1 Methods/experimental
To improve the accuracy of environmental noise prediction in chemical industry parks, this paper proposes a multivariate and multi-station neural network model (MultiPL) based on LSTM and Prophet. First, according to the periodicity of the environmental data in the park, the data are divided into a multivariate data set and a multi-station data set. Second, the structure and implementation of the model are introduced and explained in detail. Finally, the prediction accuracy under different training set proportions is compared experimentally, and experiments are run on different data sets and with different models. Experiments with and without the 3σ criterion compare the single models on different data sets, and MultiPL is compared with other models. The results in Sect. 5.2 show that applying the 3σ criterion and using multivariate and multi-station data improve the prediction performance of the single models. In addition, the experiment in Sect. 5.3 shows that MultiPL outperforms the single models, the traditional prediction methods and the LSTM + Prophet linear combination model.
2 Introduction
With the spread of 5G high-speed transmission technology, chemical industrial complexes are entering the era of the Internet of Things (IoT) through sensors [1]. While chemical parks bring good economic benefits by gathering factories together, pollution problems are gradually being exposed. Exhaust gas and wastewater can be recycled and reused through the Ecological Industrial Park, but noise, a threat that is often overlooked, continues to affect human mental and hearing health. Factory noise may cause mild or moderate noise deafness [2]; noise can also cause headaches, insomnia, unresponsiveness, hearing loss and other symptoms [3,4,5,6]. The chemical park is surrounded by farmland and villages, and noise has a negative impact on villagers' lives, animal breeding and natural ecology [7]. How to predict noise effectively and uncover its patterns, so as to reduce its impact on daily life and physical health, is a problem that needs to be considered and solved.
IoT data contain a lot of useful information; for example, satellite Industrial Internet of Things (IIoT) data can be used to solve service quality problems [8]. Noise prediction is restricted by many conditions. With the development of artificial intelligence technology, existing techniques can address trend learning, big data classification and trend prediction problems by introducing environmental factors [9]. Information transmission in the IIoT is also limited by spectrum resources, so data loss is common [10]. Uncovering the laws of noise and predicting future noise levels in order to mitigate noise hazards is an extremely important research topic [11], and noise prediction research has received increasing attention. For example, [11] proposed a gradient boosting model to predict noise, which combines multiple characteristics to analyze areas with severe noise exposure and performs well under specific frequency sensors. [12] proposed a two-layer long short-term memory (LSTM) network to predict environmental noise from a large amount of data; it can reflect the change of noise level within a day, but only the temporal regularity of noise is considered. [13] showed that the LSTM model is better than the traditional ARIMA time series forecasting model. In [14], an LSTM model is used for airport noise prediction, and metadata on aircraft type, trajectory information and weather data are also integrated into the model, resulting in higher prediction accuracy, but the spatiotemporal characteristics of noise are not considered. [15] proposed an integrated model of airport noise prediction based on space fitting and a BP neural network, which integrates temporal and spatial characteristics to improve the accuracy and fault tolerance of prediction. However, the application area of this model is limited and it is not flexible enough.
[16] established a feature-weighted support vector regression model (FW-SVR) based on time series similarity, which has generalization ability. [17] simulates the noise of a typical road network based on the existing traffic flow model. These two methods are limited to univariate prediction and lack information integrity. [18] uses the improved Federal Highway Administration (FHWA) model to predict the noise level. This method integrates multivariate information, but the information is incomplete in practical application. Environmental noise prediction still faces the following challenges. Noise is superposable and changes abruptly, so how can the noise law of the park be captured? How can the influence of sparse zeros and outliers caused by sensor faults be reduced without distorting the noise law? Beyond noise prediction, Prophet, the Stackelberg model and the extended Kalman filter have also been used by some researchers to achieve good results [19,20,21]. However, a single forecasting method cannot capture the distribution of complex time series patterns. More and more researchers use hybrid forecasting models to capture complex time series distribution patterns and obtain better forecast accuracy and performance [22]. There are three types of hybrid models for time series prediction. Hybrid models based on ARMA and machine learning [23, 24]: [23] combined ARMA, PSO-SVM and a clustering method for wind power generation prediction, and [24] uses the combined EMD-GM-ARMA model for coal mine safety production situation prediction. Hybrid models based on ARIMA and machine learning [25,26,27]: in [25], the mixed SSA-ARIMA-ANN model was used to predict daily rainfall; in [26], the combined ARIMA and ANN model was used to predict daily radiation; and in [27], the mixed ARIMA and SVM model was used to predict the corn futures price.
Hybrid models based on machine learning [28,29,30]: [28] uses CNN and AI-tuned SVM for power consumption prediction, [29] uses a CNN-LSTM hybrid model for price sequence prediction, and [30] uses an LSTM-RNN combined model for low-traffic flow forecasting. The prediction accuracy obtained with the mixed models in the above literature is better than that of the single models, so the mixed model is a key method for solving the time series prediction problem of park noise. The above-mentioned literature focuses mainly on road traffic, airport and urban environmental noise, ignoring the harm of noise in chemical parks. Motivated by these studies, this paper studies noise prediction in chemical industry parks from the perspective of a mixed model, which fills a gap in this research direction.
Based on the existing sensor distribution and traffic data in the chemical park, this paper builds a scene model suited to the distribution characteristics of the park, constructs a multivariate noise data set and a multi-station data set according to the scene, and introduces the 3σ criterion to deal with zero values in the noise data in order to improve the prediction accuracy. A MultiPL model based on the LSTM and Prophet models is proposed. Multivariate features such as wind speed, vehicle flow, and noise data classified by wind direction are used in the multivariate LSTM model to improve the prediction accuracy. The multi-station noise data set is used as additional regression variables for the Prophet model. Fitting the outputs of these two models yields the MultiPL prediction model, which has higher accuracy.
The rest of this article is structured as follows. Section 3 introduces the research background, the data set and its preprocessing. Section 4 introduces the principle and construction of the MultiPL model. Section 5 presents and evaluates the experimental results of the trained models in detail. Section 6 gives the summary and outlook.
3 System model and data set
3.1 System model
The research scene of this paper is an engineering plastics industrial park in Shandong Province, China. Based on the original smart chemical industry park, noise monitoring data are obtained through sensors. The collected data are accurate and effective, which provides an effective data basis for noise prediction.
The park covers an area of 8.97 km^{2} and is equipped with 12 air monitoring stations (Station 11 provides no data due to a failure) and 8 vehicle gate monitoring stations. This paper analyzes the data of monitoring station No. 10 and of the gate marked in Fig. 1a. There are three main sources of noise in the park:

1.
There are a large number of vehicles in the park for the transportation, loading and unloading of chemical raw materials. The volume of vehicles will affect the noise level.

2.
Chemical plants generally operate 24 h a day, and the impact of noise is not only periodic but also persistent.

3.
Natural sounds, such as wind, also affect the overall noise level. Different wind directions will bring different regional sound effects.
According to Fig. 1b, noise affects the hearing health of workers in the park, reduces the growth rate of crops, and causes residents to be irritable and tired. Conversely, hearing loss leads to decreased work efficiency, and residents' behaviors affect the operation of the park.
In the face of many problems in the scene, noise prediction and risk identification can assist the park in planning the operation cycle and reduce the operation of noise source equipment during periods of high noise to avoid the occurrence of the above situations.
3.2 Data set construction
Based on the system model, we constructed the park noise prediction data set as shown in Fig. 2. Part A represents the noise data and natural environment information monitored by the air monitoring station, and part B represents the vehicle information recorded by the gates. The information is uploaded to the gateway and stored in the park database server.
We preprocess the data read from the server. In this paper, all data are organized into two sub-data sets as required: a multivariate data set and a multi-station data set.
3.2.1 Data set preprocessing
The data sets used in this paper come from the scenario in Sect. 3.1 and span from 14:00 on August 22, 2020 to 01:00 on February 2, 2021. As shown in Fig. 2, data preprocessing mainly includes the following three tasks:
Step 1: Data cleaning. Noise changes abruptly, and the irregular 0 dB values in the data have a great influence on the prediction accuracy. The 3σ criterion is introduced to deal with these outlying zero values. Sparse missing data are filled in by KNN (nearest-neighbor) interpolation.
Step 2: Data screening. The original noise data are sampled every 30 s, so one noise sensor yields 470,760 records, which is too dense. Resampling the experimental data at 10-min intervals accelerates the training process.
Step 3: Traffic data parsing. All vehicle records in the park are classified using the vehicle entry and exit status as tags, and counts are computed at 10-min intervals.
After the data set preprocessing is completed, we construct the sub-data sets and verify the correlation between the noise data and the other variables, laying a foundation for the subsequent prediction work.
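Steps 2 and 3 both amount to grouping timestamped records into 10-min bins. The sketch below is only an illustration under stated assumptions: records are taken to be (timestamp, value) pairs, noise is averaged within each bin (the text does not state the aggregation rule), and the function names `resample_mean` and `count_vehicles` as well as the tags "in" and "out" are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BIN = timedelta(minutes=10)

def bin_start(ts, origin):
    """Start of the 10-min bin that contains ts, counted from origin."""
    return origin + ((ts - origin) // BIN) * BIN

def resample_mean(records, origin):
    """Average (timestamp, value) records within each 10-min bin (assumed rule)."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in records:
        acc = sums[bin_start(ts, origin)]
        acc[0] += value
        acc[1] += 1
    return {k: total / n for k, (total, n) in sorted(sums.items())}

def count_vehicles(events, origin):
    """Count gate events per 10-min bin, separately for the 'in' and 'out' tags."""
    counts = defaultdict(lambda: {"in": 0, "out": 0})
    for ts, status in events:
        counts[bin_start(ts, origin)][status] += 1
    return dict(sorted(counts.items()))
```

With 30 s sampling, each 10-min bin holds 20 noise readings, so the resampled series is 20 times shorter than the raw one.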
3.2.2 Multivariate data set and multi-station data set
The multivariate data set includes vehicle flow, the noise characteristics of adjacent stations based on wind direction classification (the construction method is described in Sect. 4.3), wind speed and noise. The multi-station data set contains noise data from 11 monitoring stations.
The noise data and natural environment data are derived from part A of Fig. 2, including information such as temperature, wind speed, wind direction, light, noise, and PM2.5. The traffic flow data come from part B of Fig. 2. In Fig. 3b, c, the X-axis represents the time interval index (in days). In (b), the Y-axis represents the noise decibel value and wind speed, and the blue and red curves represent the noise value and wind speed, respectively. In (c), the Y-axis represents the noise decibel value and the number of vehicles, and the blue, red and green curves indicate the number of vehicles entering the park, the number leaving the park and the decibel level, respectively. In order to analyze the correlation of representative data in the air monitoring stations, the Pearson correlation coefficient ρ is introduced as follows:
\[ \rho_{NW} = \frac{{\text{cov}}\left( N^{(T,Y)} ,W^{(T,Y)} \right)}{\sigma_{N^{(T,Y)}} \sigma_{W^{(T,Y)}}} \tag{1} \]
\[ \rho_{NN} = \frac{{\text{cov}}\left( N^{(T,Y)} ,N^{(T,Y^{\prime})} \right)}{\sigma_{N^{(T,Y)}} \sigma_{N^{(T,Y^{\prime})}}} \tag{2} \]
where \({\text{cov}} ( \cdot )\) is the covariance operator and σ is the standard deviation. \(\rho_{NW}\) is the correlation coefficient between noise and wind speed at the same station and the same time, and \(N^{(T,Y)}\) and \(W^{(T,Y)}\) represent the noise and wind speed values of station \(Y\) at time \(T\), respectively. \(\rho_{NN}\) is the correlation coefficient between the noise values of different stations at the same time, and \(N^{(T,Y)}\) and \(N^{{(T,Y^{\prime})}}\) represent the noise of stations \(Y\) and \(Y^{\prime}\) at time \(T\), respectively. The following conclusions can be drawn from Fig. 3:

1.
Correlation of data. According to Fig. 3a, the correlation coefficient between noise and wind speed is 0.48, which makes wind speed the main influencing factor among the available information. According to Fig. 3d, the noise data of different stations are correlated.

2.
Similarity of data. According to Fig. 3b, the fluctuation trends of noise and wind speed are similar, so wind speed information needs to be included to predict noise more accurately.

3.
Periodicity of data. Traffic flow and noise level have similar periodicity. Vehicle entries and exits peak around midnight, and a second traffic peak occurs in the park around noon.
The multivariate data set captures the influence of traffic flow, wind speed and wind direction on noise changes, and the multi-station data set captures the correlation between the noise of neighboring stations and that of the station to be predicted. The MultiPL noise prediction method is proposed according to these data attributes, which are specific to the park.
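The Pearson coefficient used in this section is the covariance of two series divided by the product of their standard deviations. A minimal standard-library sketch follows; the function name `pearson` is illustrative and is not taken from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation: cov(x, y) / (sigma_x * sigma_y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and of both deviations cancel out.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Applied to the noise and wind speed series of one station, this gives \(\rho_{NW}\); applied to the noise series of two stations, it gives \(\rho_{NN}\).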
4 MultiPL model based on Prophet and LSTM
4.1 Multi-element LSTM model
The LSTM (long short-term memory) network is an improvement on the RNN (recurrent neural network). Its basic structure contains a part that controls the stored state, which addresses the vanishing gradient problem encountered by RNNs [31]. This paper adopts supervised learning, which does not require hand-crafted time series features. The time series curve is fitted by a deep learning network, and the long-term dependencies in the time sequence are captured for feature learning and prediction. The principle of LSTM is shown in Fig. 4.
The forget gate output \(f_{t} = \sigma (W_{f} [h_{t - 1} ,n_{t} ] + b_{f} )\) determines how much of the previous cell state is kept; when \(f_{t} = 1\), the short-term memory is completely retained. After the noise data are input, whether they are stored in the cell depends on the input gate, and the updated cell state \(C_{t}\) is given by formula (3):
\[ C_{t} = f_{t} \odot C_{t - 1} + i_{t} \odot \tanh (W_{C} [h_{t - 1} ,n_{t} ] + b_{C} ) \tag{3} \]
\(n_{t}\) represents the input noise of the current step, and \(h_{t - 1}\) is the output of the previous step, which serves as the hidden-state input of the current step. Formula (3) gives the new cell state after useless information has been discarded and some new information retained, where \(i_{t} = \sigma (W_{i} [h_{t - 1} ,n_{t} ] + b_{i} )\) represents the probability that new information is retained. The predicted noise output depends on the output gate:
\[ o_{t} = \sigma (W_{o} [h_{t - 1} ,n_{t} ] + b_{o} ) \tag{4} \]
\[ Y_{t} = o_{t} \odot \tanh (C_{t} ) \tag{5} \]
\(o_{t}\) is the output probability. Multiplying \(o_{t}\) by the hyperbolic tangent \(\tanh (C_{t} )\) filters the cell state, and the output \(Y_{t}\) is the hidden state passed on to the next step. In the above expressions, \(W_{f} ,W_{i} ,W_{C} ,W_{o}\) are the weight parameters and \(b_{f} ,b_{i} ,b_{C} ,b_{o}\) are the bias vectors.
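To make the role of each gate concrete, the following sketch evaluates one LSTM step with scalar states, following the gate definitions of this section. It is an illustration only: the dictionaries `W` and `b`, the scalar simplification and the function name `lstm_step` are assumptions, not the trained network used in the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(n_t, h_prev, c_prev, W, b):
    """One scalar LSTM step.

    W[g] = (w_h, w_n) and b[g] hold the weights and bias of gate g,
    with g in {"f", "i", "C", "o"}; all values here are hypothetical.
    """
    def affine(g):
        w_h, w_n = W[g]
        return w_h * h_prev + w_n * n_t + b[g]

    f_t = sigmoid(affine("f"))                         # forget gate
    i_t = sigmoid(affine("i"))                         # input gate
    c_t = f_t * c_prev + i_t * math.tanh(affine("C"))  # new cell state
    o_t = sigmoid(affine("o"))                         # output gate
    y_t = o_t * math.tanh(c_t)                         # hidden state / output
    return y_t, c_t
```

With a large forget-gate bias and a very negative input-gate bias, `f_t` approaches 1 and `i_t` approaches 0, so the cell state is carried over almost unchanged, which is the "completely retained" case described above.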
The essence of the multivariate approach is to form samples whose dimensions carry several kinds of information and to transform the task into a supervised learning problem with multiple inputs and a single output. There are 32 neurons in the first hidden layer and 1 neuron in the output layer, which predicts the noise. The input variables are four-dimensional: wind speed, noise of the neighboring station selected by wind direction, traffic flow information and noise of the prediction station. The output is the predicted noise of the prediction station, with 2 prediction steps and a time interval of 10 min. The model was trained for 100 epochs with a batch size of 128, and the training and test losses were tracked during training by setting the validation_data parameter of the fit() function.
Multi-factor features were extracted by the LSTM model for noise prediction. The prediction error was large during periods of abrupt change: in January, the noise dropped by about 4.5 dB, and the prediction error reached about 2 dB. Therefore, the Prophet model was introduced to fuse multi-station information and improve the prediction accuracy.
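The conversion to a supervised learning problem can be sketched as a sliding window over the multivariate series. This is a hedged illustration: the paper specifies four input features and 2 prediction steps, but the number of lagged steps `n_lags` and the function name `make_supervised` are assumptions introduced here.

```python
def make_supervised(rows, target_index, n_lags, n_steps):
    """Turn a multivariate series into (inputs, targets) samples.

    rows         : list of feature vectors, one per 10-min time step
    target_index : column of the noise value to be predicted
    n_lags       : number of past steps used as input (assumed, not given in the text)
    n_steps      : number of future steps to predict (2 in the paper)
    """
    samples = []
    for start in range(len(rows) - n_lags - n_steps + 1):
        window = rows[start:start + n_lags]
        future = rows[start + n_lags:start + n_lags + n_steps]
        samples.append((window, [r[target_index] for r in future]))
    return samples
```

Each sample pairs a window of four-dimensional vectors with the next two noise values of the prediction station, which matches the multiple-input, single-variable-output setting described above.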
4.2 Prophet model based on spatial multi-station regression
The Prophet prediction model has great advantages in processing periodic data with abnormal values and trend changes, and the noise of chemical parks is strongly abrupt at small scales and regular at large scales. Therefore, the Prophet model is introduced for noise prediction in this paper. The Prophet model decomposes the time series according to the following formula:
\[ y(t) = g(t) + s(t) + h(t) + \varepsilon (t) \tag{6} \]
In formula (6), \(g\left( t \right)\) represents the noise trend term, which is mainly used to fit aperiodic changes in the time series. We use a trend term model based on piecewise linear functions:
\[ g(t) = \left( k + a(t)^{T} \delta \right)t + \left( m + a(t)^{T} \gamma \right) \tag{7} \]
In formula (7), \(m\) is the offset, \(k\) represents the growth rate, and \(\delta\) represents the change in the growth rate. The indicator function is \(a(t) = (a_{1} (t),...,a_{S} (t))^{T}\), with \(a(t) \in \left\{ {0,1} \right\}^{S} ,a_{j} (t) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {{\text{if}}\;t \ge S_{j} } \hfill \\ {0,} \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right.\) and \(\gamma = (\gamma_{1} ,...,\gamma_{S} )^{T} ,\gamma_{j} = - S_{j} \delta_{j}\), where \(S\) is the number of mutation points (changepoints) and \(S_{j}\) is the time of the \(j\)-th mutation point.
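The piecewise linear trend of formula (7) can be evaluated directly from these definitions. The sketch below is illustrative; the function name `trend` and its argument layout are assumptions and do not correspond to the Prophet library's internal API.

```python
def trend(t, k, m, changepoints, deltas):
    """Piecewise linear trend g(t) = (k + a(t)^T delta) * t + (m + a(t)^T gamma).

    changepoints : times S_j of the mutation points
    deltas       : rate changes delta_j, with gamma_j = -S_j * delta_j
    """
    rate, offset = k, m
    for s_j, d_j in zip(changepoints, deltas):
        if t >= s_j:              # indicator a_j(t) = 1
            rate += d_j
            offset -= s_j * d_j   # gamma_j keeps g(t) continuous at S_j
    return rate * t + offset
```

The offset correction \(\gamma_{j}\) is what keeps the trend continuous: just before and exactly at a mutation point the function returns the same value, and only the slope changes afterwards.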
\(s\left( t \right)\) is a periodic term modeled by a Fourier series:
\[ s(t) = \sum\limits_{n = 1}^{N} \left( a_{n} \cos \left( \frac{2\pi nt}{P} \right) + b_{n} \sin \left( \frac{2\pi nt}{P} \right) \right) \tag{8} \]
In formula (8), \(t\) is the time, \(N\) is the number of Fourier terms used in the model, so that \(2N\) coefficients \(a_{n} ,b_{n}\) are estimated, and \(P\) represents the period of the time series; with time measured in days, P = 7 represents a weekly period.
\(h\left( t \right)\) is a holiday term that treats the influence of each holiday at different times as an independent model. \(\varepsilon \left( t \right)\) represents the error or interference term, which accounts for random and unpredictable fluctuations. The Prophet algorithm sums the trend term, the seasonal term and the holiday term to obtain the predicted value of the time series.
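The periodic term can be evaluated as a truncated Fourier series with period \(P\). The sketch below assumes the standard form with cosine coefficients `a` and sine coefficients `b`; these names and the function name `seasonality` are illustrative.

```python
import math

def seasonality(t, period, a, b):
    """s(t) = sum over n = 1..N of a_n*cos(2*pi*n*t/P) + b_n*sin(2*pi*n*t/P)."""
    return sum(
        a_n * math.cos(2 * math.pi * n * t / period)
        + b_n * math.sin(2 * math.pi * n * t / period)
        for n, (a_n, b_n) in enumerate(zip(a, b), start=1)
    )
```

With `period=7` and `t` in days, the function repeats every week; a larger number of terms lets the curve follow faster changes within the week at the cost of more coefficients to fit.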
In this paper, the add_regressor() method was used to add data from multiple stations as regression variables for fitting. First, the noise time series of each of the other stations was added to Prophet on its own for prediction. Then, the stations were sorted by the RMSE of these prediction results and added to the Prophet model in that order to improve the prediction accuracy. Although the Prophet model is flexible, it cannot take into account the influence of multidimensional factors. Therefore, accurate prediction requires a more complete prediction scheme.
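The ordering procedure can be sketched independently of the forecasting library. In the sketch below, `score` is a hypothetical callable that fits a model with the given list of stations as extra regressors (in practice, one add_regressor() call per station) and returns the RMSE; it is an assumption of this illustration, not part of Prophet.

```python
def order_stations(stations, score):
    """Rank stations by the RMSE obtained when each is used alone as a regressor."""
    return sorted(stations, key=lambda s: score([s]))

def add_in_order(stations, score):
    """Add the ranked stations one at a time, from smallest to largest RMSE.

    Returns a list of (regressor list, RMSE) pairs, one per addition.
    """
    history, chosen = [], []
    for station in order_stations(stations, score):
        chosen.append(station)
        history.append((list(chosen), score(chosen)))
    return history
```

Keeping the RMSE after every addition makes it possible to compare ordered and unordered insertion, or to stop adding stations once the error no longer decreases.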
4.3 MultiPL model based on Prophet and LSTM combination
Based on the characteristics of the Prophet and LSTM models, we propose the MultiPL model, which makes up for the limitations of a single model and effectively uses the park information and the advantages of both models to achieve higher-precision noise prediction.
First, the noise feature sequence of adjacent stations based on wind direction is constructed. The wind direction is classified into direction labels that form a time series feature. According to the label, the noise value of the corresponding station at each time is extracted, and the extracted values are stitched into a new time series feature, which is the noise feature in Fig. 5. A multi-element LSTM model is then constructed by combining this feature with the time series features of traffic flow and wind speed. The above work is based on the multivariate data set \({\text{Train}}\;{\text{Set}}\;1\). The data of each station in the multi-station data set \({\text{Train}}\;{\text{Set}}\;2\) are first used separately in the Prophet model; the stations are then sorted by RMSE and added to the Prophet model in order of increasing RMSE.
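The construction of the wind-direction noise feature can be sketched as a lookup per time step. The mapping from direction label to neighboring station (`upwind_station`) and the station names used in the usage note are hypothetical; the paper does not list them.

```python
def wind_direction_noise(directions, station_noise, upwind_station):
    """Stitch one noise feature series from several neighboring stations.

    directions     : wind direction label at each time step
    station_noise  : dict station -> noise series aligned with directions
    upwind_station : dict direction label -> station to read (assumed mapping)
    """
    feature = []
    for t, direction in enumerate(directions):
        station = upwind_station[direction]
        feature.append(station_noise[station][t])
    return feature
```

For example, with labels ["N", "N", "E"] and a mapping that assigns "N" to one station and "E" to another, the first two values come from the first station and the third from the second, giving a single series aligned with the wind speed and traffic flow features.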
The cftool (Curve Fitting Tool) toolbox in MATLAB is used to fit the prediction results of the two models to the real noise values in the training set, which yields the relationship (9) between the actual noise value and the two model predictions.
Obtaining the relationship between the actual value and the predicted values by fitting gives results closer to the true value than linear weighting of the two model predictions. The fitted relationship also has the property of constant compensation, which prevents a training result of one model that is too high or too low from biasing the forecast.
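As an illustration of fitting with constant compensation, the sketch below fits a plane \(f = aL + bP + c\) to the training predictions by ordinary least squares. The planar form is an assumption made for this example (the exact fitted expression is not reproduced here), and the code stands in for, rather than replicates, the MATLAB cftool workflow.

```python
def solve3(A, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [v] for row, v in zip(A, y)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def fit_plane(lstm_pred, prophet_pred, actual):
    """Least-squares coefficients (a, b, c) of actual ~ a*L + b*P + c."""
    cols = [lstm_pred, prophet_pred, [1.0] * len(actual)]
    # Normal equations: (X^T X) w = X^T y, with X built from the three columns.
    A = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    y = [sum(u * v for u, v in zip(ci, actual)) for ci in cols]
    return solve3(A, y)
```

Unlike a weighted average with weights summing to 1, the constant term `c` can absorb a systematic offset shared by both models.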
5 Experiment and result analysis
Firstly, the proportion and evaluation indexes used in the training set are described, and then, the 3σ criterion and multivariate multistation prediction results analysis are introduced. Finally, the MultiPL proposed in this paper is compared with Prophet + LSTM linear weighted combination model, LSTM, Prophet, BP neural network model, traditional Kalman filter prediction model and other prediction models, to verify that the proposed method has better accuracy and prediction ability.
5.1 Train set proportion and evaluation index
The proportions of the training and test sets in the multivariate and multi-station data sets are determined by experimental comparison. The LSTM deep neural network is prone to overfitting, whereas the Prophet model is stable: taking single-site prediction as an example, the RMSE differs by no more than 0.5 across the data set ratios tested. The data set division is therefore based on the LSTM model, choosing the ratio that gives the best fit. According to Table 1, 72% of the data are used as the training set and the rest as the test set.
In order to verify the validity of the MultiPL prediction model, this paper uses three evaluation indexes: root mean square error (RMSE), mean absolute error (MAE) and coefficient of determination (\(R^{2}\)). They are calculated as follows:
\[ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {(x_{i} - \tilde{x}_{i} )^{2} } } \tag{10} \]
\[ {\text{MAE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {x_{i} - \tilde{x}_{i} } \right|} \tag{11} \]
\[ R^{2} = 1 - \frac{{\sum\nolimits_{i = 1}^{n} {(x_{i} - \tilde{x}_{i} )^{2} } }}{{\sum\nolimits_{i = 1}^{n} {(x_{i} - \overline{x})^{2} } }} \tag{12} \]
\(\overline{x}\) is the mean of the true noise values, \(x = (x_{1} ,x_{2} , \ldots ,x_{n} ) \in R^{n}\) is the vector of true noise values, \(\tilde{x} = (\tilde{x}_{1} ,\tilde{x}_{2} , \ldots ,\tilde{x}_{n} ) \in R^{n}\) is the vector of predicted noise values in Eqs. (10) and (11) and the fitted value of the two models' predictions in Eq. (12), and \(n\) is the number of time series values. The smaller the RMSE and MAE, the better the predictive ability of the model. The closer \(R^{2}\) is to 1, the better the fit of the fitted model.
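The three evaluation indexes can be computed with the standard library alone. The sketch below uses the usual definitions of RMSE, MAE and \(R^{2}\); the function names are illustrative.

```python
import math

def rmse(x, x_hat):
    """Root mean square error between true values x and predictions x_hat."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x))

def mae(x, x_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(x, x_hat)) / len(x)

def r2(x, x_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(x) / len(x)
    ss_res = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    ss_tot = sum((a - mean) ** 2 for a in x)
    return 1.0 - ss_res / ss_tot
```

RMSE and MAE are expressed in the unit of the series (dB for noise), while \(R^{2}\) is dimensionless; RMSE penalizes large individual errors more strongly than MAE.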
5.2 Analysis of forecast results
The 3σ criterion assumes that a set of data contains only random errors, so that about 99.73% of the noise values fall in the interval \({\text{noise}} \in (u - 3\sigma ,u + 3\sigma )\). Any error exceeding this interval is regarded not as a random error but as a gross error, and the data containing it should be removed or replaced. Here u is the mean value of the noise, σ is the noise standard deviation, and noise is the noise value.
This article replaces noise values in the range \(0 \le {\text{noise}} < u - 3\sigma\) (dB) with the mean value. Taking the noise data of Station 10 in Fig. 6 as an example, part A is the original noise series, which contains sparse zero-value mutation points and has an unbiased sample standard deviation of 4.45. Part B is the noise series after the 3σ treatment, with an unbiased sample standard deviation of 4.28.
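The replacement rule can be sketched in a few lines. Two points are assumptions of this illustration: the mean and the standard deviation are computed from the raw series including its zeros, and the unbiased (n - 1) sample standard deviation is used for the threshold, as it is for the values reported above.

```python
import statistics

def replace_low_outliers(noise):
    """Replace values in [0, u - 3*sigma) with the mean u (3-sigma rule, lower side)."""
    u = statistics.mean(noise)
    sigma = statistics.stdev(noise)  # unbiased sample standard deviation
    lower = u - 3 * sigma
    return [u if 0 <= v < lower else v for v in noise]
```

Only the lower side is treated, because the gross errors of interest are the 0 dB readings produced by sensor faults; values above \(u + 3\sigma\) are left untouched by this rule.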
According to Table 2, with the 3σ criterion RMSE decreases by at least 0.1 dB and MAE also decreases for both single-station and multi-station predictions. Compared with single-station data, the RMSE and MAE of the Prophet model using the multi-station data set are reduced by 5.3% and 7.3%, respectively. Each station is used in turn for noise prediction of Station 10; the resulting RMSE and MAE are shown in panels 1 and 3 of Fig. 7, and in panels 2 and 4 after the stations are sorted by RMSE. When the multi-station data are added to the Prophet model as regression variables in this order, RMSE and MAE are reduced by 26.3% and 22.8%, respectively, compared with the unordered prediction. In the multi-station ordered Prophet method, the RMSE and MAE of the data processed by the 3σ principle are reduced by 32.2% and 23.3%, respectively. Compared with single-station data, the RMSE and MAE of the LSTM model using the multivariate data set are reduced by 9.3% and 15.9%, respectively.
Table 2 shows that the predictions obtained with the 3σ criterion are more accurate. The Prophet model achieves its highest accuracy with multi-station ordered data, and the LSTM model is more accurate with the multivariate data set than with the original single-station data. On this basis, the training set predictions of LSTM and Prophet were fitted, and the relationship between the real noise value of the training set and the two predicted values was obtained as shown in Eq. (13), where \(L(t),P(t)\) are the training set predictions of LSTM and Prophet, respectively, and \(f(t)\) is the fitted predicted value.
The RMSE of \(f(t)\) obtained by fitting and the true value is 0.54. The data points in Fig. 8 are basically fitted to the same plane and the coefficient of determination \(R^{2} = 0.962\). The fitting effect is good. After the test set was fed into LSTM and Prophet, the predicted value was put into the verification Eq. (13), and the prediction result \(f_{{{\text{test}}}} (t)\) of the MultiPL model was obtained as shown in Fig. 9.
Among them, \({\text{Test}}\;{\text{Set}}\;1\) is from multivariate data set and \({\text{Test}}\;{\text{Set}}\;2\) is from multistation data set. Figure 10 shows the true value, LSTM and Prophet noise predicted value. Compared with the predicted value fitted by the MultiPL model in Fig. 11, MultiPL makes up for the prediction deviation of the two models and improves the prediction accuracy of outliers contained in the noise. The RMSE and MAE of \(f_{{{\text{test}}}} (t)\) and the true value were 0.53 and 0.46 dB, respectively. The prediction result of MultiPL model is obviously better than that of single LSTM and Prophet model.
5.3 Comparison results of different prediction models
In order to verify the prediction performance of the MultiPL model, the two evaluation indexes RMSE and MAE from Sect. 5.1 are used to evaluate Kalman filter prediction, BP neural network, LSTM, Prophet, and the Prophet + LSTM linear weighted model (optimal weights: \(\omega_{{{\text{LSTM}}}}\) = 0.5, \(\omega_{{{\text{Prophet}}}}\) = 0.5). According to Table 3, the prediction results of MultiPL are better than those of the other prediction methods: compared with the linear weighted model, RMSE and MAE are reduced by 45.9% and 25.9%, respectively. The MultiPL model can therefore be used as an effective prediction model for chemical industry parks.
6 Conclusions
Analyzing the noise law and its influencing factors in chemical industry parks and improving the accuracy of noise prediction are of great significance for planning working hours and protecting workers' hearing. Based on the patterns in time series data such as noise, traffic flow, wind direction and wind speed in a chemical park, this paper uses the 3σ criterion to replace the zero values of noise and proposes a MultiPL model based on multivariate and multi-station information. Comparative experiments were designed and implemented against the Prophet + LSTM weighted model under each weight coefficient, the single models, the Kalman filter prediction model and the traditional BP neural network model. The results show that the park noise time series processed by the 3σ criterion perform better in the prediction models, and that the prediction errors of the multi-station Prophet and multivariate LSTM models are lower than those of the traditional Kalman filter prediction model and the BP neural network model. Moreover, the Prophet + LSTM linear weighted combination model has slightly higher prediction accuracy than the above models, and the MultiPL model, which effectively uses park data and has the constant compensation property, performs best. Compared with the linear weighted combination model, RMSE and MAE are reduced by 0.45 dB and 0.36 dB, respectively. MultiPL can be used as an effective noise prediction model in chemical industry parks. As intelligent parks become widely deployed, this study can provide a new idea for noise prediction in parks.
This paper only constructs a prediction model fitted from two multi-factor models. In the future, traditional prediction models based on statistical methods can be introduced to make up for the disadvantages of neural networks and obtain more accurate noise prediction results. In addition, transfer learning or reinforcement learning can be used to predict the overall noise level of the park.
Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
Abbreviations
3σ criterion: A method of outlier discrimination based on normal distribution
RNN: Recurrent neural network
LSTM: Long short-term memory
Prophet: An open source time series prediction algorithm provided by Facebook
MultiPL: Noise prediction model based on LSTM and Prophet
Kalman: An algorithm for optimal estimation of system state by using linear system state equation
BP: Back propagation
RMSE: Root mean square error
MAE: Mean absolute error
R^{2}: R squared (coefficient of determination)
Acknowledgements
Not applicable.
Funding
This work was supported by the National Natural Science Foundation of China under Grant Nos. 61701284, U1931207 and 61702306, the Innovative Research Foundation of Qingdao under Grant No. 19621cg, the Application Research Project for Postdoctoral Researchers of Qingdao, the Sci. & Tech. Development Fund of Shandong Province of China under Grant Nos. ZR202102230289, ZR202102250695 and ZR2019LZH001, the Humanities and Social Science Research Project of the Ministry of Education under Grant No. 18YJAZH017, the Taishan Scholar Program of Shandong Province, the Shandong Chongqing Science and Technology Cooperation Project under Grant No. cstc2020jscxlyjsAX0008, the Sci. & Tech. Development Fund of Qingdao under Grant No. 2115zlyj1zc, the SDUST Research Fund under Grant No. 2015TDJH102, and the Science and Technology Support Plan of Youth Innovation Team of Shandong Higher School under Grant No. 2019KJN024.
Author information
Contributions
QTZ, YL, GC and CGL conceived and designed the experiments. QTZ and YL performed the experiments. GC and HD analyzed the data. YL, GC and QTZ wrote the paper. All authors contributed to this research work and read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zeng, Q., Liang, Y., Chen, G. et al. Noise prediction of chemical industry park based on multistation Prophet and multivariate LSTM fitting model. EURASIP J. Adv. Signal Process. 2021, 106 (2021). https://doi.org/10.1186/s13634021008156
DOI: https://doi.org/10.1186/s13634021008156
Keywords
 Chemical industry park
 Noise prediction
 Prophet
 LSTM
 MultiPL