The impact of feature extraction and selection for the classification of gait patterns between ACL deficient and intact knees based on different classification models
EURASIP Journal on Advances in Signal Processing volume 2021, Article number: 95 (2021)
Abstract
The anterior cruciate ligament (ACL) plays an important role in stabilizing translation and rotation of the tibia relative to the femur. Individuals with ACL deficiency usually demonstrate alterations in gait characteristics. Evidence indicates that walking speed, alterations in kinetics and kinematics on the ACL deficient limb, and interlimb asymmetries between deficient and intact knees may contribute to poor long-term outcomes following ACL deficiency. They impair the function of the knee joint and put it at higher risk of degeneration. To develop an automatic and highly accurate system for detecting ACL deficiency, this study investigated the classification capability of different dynamical features extracted from gait kinematic and kinetic signals and evaluated their impact on different classification models. A general feature extraction framework was proposed and various dynamical features, such as recurrence rate, determinism and entropy from the recurrence quantification analysis, fuzzy entropy, the Teager-Kaiser energy feature and statistical analysis, were included. Different classification models, including support vector machine (SVM), K-nearest neighbor (KNN), naive Bayes (NB) classifier, decision tree (DT) classifier and ensemble learning based Adaboost (ELA) classifier, derived for discriminant analysis of multiple dynamical gait features were evaluated for a comparative study. The effectiveness of this strategy was verified using a dataset of knee, hip and ankle kinematic and kinetic waveforms from 43 patients with unilateral ACL deficiency. When evaluated with 2-fold, 10-fold and leave-one-out cross-validation styles, the highest classification accuracy for discriminating between groups of ACL deficient and contralateral ACL intact knees was reported to be 91.22\(\%\), 95.12\(\%\) and 96.34\(\%\), respectively, by using the SVM classifier and the optimal feature set. For the other four classifiers, KNN achieved accuracies of 78.05\(\%\), 85.37\(\%\) and 87.80\(\%\), respectively. NB achieved accuracies of 57.56\(\%\), 60.98\(\%\) and 61.22\(\%\), respectively. DT achieved accuracies of 77.56\(\%\), 80.49\(\%\) and 83.66\(\%\), respectively. ELA achieved accuracies of 73.66\(\%\), 78.05\(\%\) and 79.27\(\%\), respectively. Compared with other state-of-the-art methods, the results demonstrate superior performance and support the validity of the proposed method.
1 Introduction
The anterior cruciate ligament (ACL) contributes mainly to knee joint stability by stabilizing translation and rotation of the tibia relative to the femur [1, 2]. ACL injury is one of the most common musculoskeletal pathologies. It causes pain and reduced performance in activities of daily living, and is also linked to altered joint kinematics, kinetics and load partitioning in gait because of the loss of stability [3,4,5]. Currently, diagnosis of ACL injury mainly relies on clinical exam [6], arthroscopy [7] or imaging such as X-rays [8] and magnetic resonance imaging (MRI) [9]. However, these tools have some limitations. For example, the clinical exam is subjective because it depends on the experience of the physician. Arthroscopy is invasive [7], while imaging is demanding in terms of cost, radiation and equipment requirements [10]. In addition, the obtained images do not provide any functional or dynamical information concerning the association between the ACL and daily activities [10]. Because of these constraints, subjects are not recommended to undergo X-ray or MRI examinations frequently, which makes it difficult to monitor the progression of ACL injury over time.
Therefore, an alternative diagnosis method, such as gait analysis, which can offer quick, dynamic, noninvasive, objective and low-cost measurement, is required in clinical applications. It has been reported in the literature that ACL-deficient (ACLD) patients may demonstrate abnormalities in their gait patterns several years after the injury [11,12,13,14,15,16,17]. It has been revealed that patients with ACL deficiency tend to adopt an asymmetrical gait pattern which includes reduced knee flexion and internal knee extensor moment, thereby reproducing the abnormality of the injured leg in the contralateral intact leg as well [18, 19]. Some studies have deduced that degenerative changes might result from altered gait or functional mechanics of the ACL deficiency [20, 21]. All these findings indicate that gait analysis might act as an alternative or assistant tool for the diagnosis of ACL deficiency in addition to the traditional techniques. How to extract valuable and effective information from gait for the diagnosis of ACL injury still remains an open question.
In the ACL literature, clinical and biomechanical studies typically rely on discrete measures to characterize movement disorders [22]. However, singular measures are limited in their ability to capture all the variability and complexity of human gait. Hence, statistical parametric mapping [23] and nonlinear dynamics [24, 25] have been used as alternative methods to provide additional insights. Hebert-Losier et al. [26] proposed a functional analysis of variance (ANOVA) method based on the interval testing procedure to examine knee-kinematic curves. It helped detect precise time intervals where statistical differences occurred between ACLD and ACL-intact (ACLI) groups. Many nonlinear parameters linked to the variability of knee motion have been extracted to quantify and classify gait patterns between ACLD and ACLI knees. Among these parameters, the Lyapunov exponent can assess the knee joint stability [27], the fractal dimension and entropy can measure the complexity or the degree of disorder of the knee motion [28], and the sample entropy (SampEn) [29] and detrended fluctuation analysis (DFA) can quantify the regularity of the knee joint signals [30, 31]. Stergiou et al. [32] and Moraiti et al. [33] used nonlinear measures, including the Lyapunov exponent, to compute the local stability of the ACLD knee in comparison to the contralateral intact knee. However, extraction of these parameters relies on long time series of gait signals (i.e., hundreds to thousands of gait cycles). This may not be easy to achieve in clinical practice, since three-dimensional (3D) gait kinematics and kinetics of patients with ACL deficiency usually have to be measured in a short period.
In addition, the lower extremities act as links of a chain [34, 35]. The position of each link in space will influence the adjoining links. Forces applied at one link can propagate up and down the entire chain [34]. For example, if one link is injured which results in a limitation of motion between two links, then in order to achieve fully normal motion, the collection of healthy connections in the chain must necessarily increase their motion to make up for the loss in one connection [34]. There exist conditions of the ankle and hip that will compel the knee to be subjected to pathological forces through the kinetic chain. Movement patterns in the hip and ankle joints of the injured limb have been found to be altered after ACL injury [35,36,37]. It is also necessary to focus on the kinematic and kinetic variation of knee, hip and ankle joints between ACLD and ACLI groups. Related gait parameters are recommended to be extracted for analysis.
Another potential diagnosis tool combines dynamical and nonlinear features with machine learning algorithms [38,39,40,41]. Christian et al. [38] proposed a machine learning method based on SVM for the discrimination of kinematic gait patterns in patients with a ruptured ACL. Features were extracted from 3D marker trajectories of the knees with principal component analysis (PCA) and a recursive feature elimination method. Seven patients were involved and \(100\%\) classification accuracy was achieved. Nonetheless, the experiments were based on a small database and the effectiveness is questionable. Berruto et al. [39] employed tibial accelerometers to measure the variation of knee pivot-shift in patients with unilateral ACL injuries. Magnitudes of accelerations were used as features for the classification of ACLD and ACLI knees and the achieved accuracy was roughly \(90\%\). Kopf et al. [40] carried out a study with a method similar to that of Berruto et al. [39], in which inertial sensor modules fastened to the tibia and femur were used to grade 20 patients with unilateral ACL deficiency. Acceleration differences between ACLD and ACLI knees were used for classification and the reported accuracy was 95\(\%\) (19 of 20). Almosnino et al. [41] used strength curve features to measure the difference between injured and uninjured knees with the PCA method. Forty-three patients with unilateral ACL deficiency were involved and the reported specificity, sensitivity and accuracy were 60.5\(\%\), 60.5\(\%\) and 62.20\(\%\), respectively. Nonlinear analysis has been widely applied to assess human locomotion during normal and pathological gait. Considering the nonstationary and recurrent nature of gait signals [42], Fourier analysis is not suitable for long biological signals. Therefore, in the present study, we adopted a nonlinear data analysis technique, namely Recurrence Quantification Analysis (RQA) [43], to analyse the gait signals [44]. This is due to its ability to analyze both linear and nonlinear time signals [45]. RQA is suited to quantifying the dynamics of gait data; its working principle is to explore the recurrent nature of the gait signal in the reconstructed phase space [46]. Phase space reconstruction (PSR) of the gait signals facilitates the understanding of gait dynamics with more observables [47,48,49]. PSR further employs Recurrence Plots (RP) to visualize the recurrence of the gait signal in phase space and depict structures including single dots, horizontal lines, diagonal lines and vertical lines. Quantifying these structures, which is called RQA, yields several parameters based on different aspects of quantification [46]. In comparison to RQA, entropy is used to measure the uncertainty of a nonlinear dynamical system and equals the rate of information production. Calculation of entropy is usually based upon long data sets. Fuzzy entropy is a method suited to measuring the complexity of short time series. Its key aspects are the use of a fuzzy membership function to quantify the similarity between a pair of vectors and of fuzzy probability to determine the disorder or uncertainty [50]. The nonlinear Teager-Kaiser energy operator is able to localize instantaneous amplitude changes of signals. Its usefulness has been proven in several biomedical signal processing fields [51,52,53]. It generates a time series that can represent the instantaneous energy of the original gait system dynamics, which can be used as a characteristic feature for the classification of pathological and normal gait patterns.
The main purpose of the current study is to evaluate the effectiveness of different dynamical features for the discrimination between ACLD and contralateral ACLI knees based on different classification models. Thereby we can assess the capabilities of optimal features representing gait characteristics. In addition, we can develop an automatic and noninvasive pattern recognition system to detect the presence of ACL injury. Since the lower extremities act as a kinetic chain during dynamic tasks, control of the hip and ankle joints will interact with knee motion. Related gait kinematic and kinetic parameters from the three joints are therefore extracted. A general feature extraction framework is proposed and various dynamical features, such as recurrence rate, determinism and entropy from the recurrence quantification analysis (RQA), fuzzy entropy (FuzzyEn), the Teager-Kaiser energy (TKE) feature and statistical analysis, are included. Different classification models, including support vector machine (SVM), K-nearest neighbor (KNN), naive Bayes (NB) classifier, decision tree and ensemble learning based Adaboost (ELA) classifier, derived for discriminant analysis of multiple dynamical gait features, are evaluated and combined with the optimal feature set, and their classification accuracy is compared.
2 Methods
2.1 Design
In this section, we propose a pattern recognition method to differentiate gait patterns between ACLD knees and contralateral intact knees using dynamical features obtained from kinematic and kinetic gait signals. Figure 1 illustrates the block diagram of the proposed method for the binary classification problem. The method includes the feature extraction and classification stages and proceeds as follows. In the first step, nonlinear and statistical features (including Mean, Standard Deviation, Skewness and Kurtosis) are extracted by using different methods, including RQA, fuzzy entropy, Teager-Kaiser energy and statistical analysis. In the second step, feature vectors are fed into different classification models to discriminate between ACLD and ACLI gait patterns. Finally, different performance parameters are used to evaluate the classification results.
2.2 Dataset description
We conducted a cross-sectional, observational study of individuals with chronic, unilateral ACLD knees. The contralateral unaffected knee was considered as intact. Potential participants were identified from an orthopaedic clinic database. Those who met the inclusion criteria (e.g. diagnosed with a full tear of their ACL by magnetic resonance imaging) were contacted by either telephone or email. Subjects were excluded if they had accompanying damage to the posterior cruciate or collateral ligaments, had injuries on the contralateral limb, or had difficulty or pain in performing activities of daily living including walking. Forty-three participants were recruited from November 2015 to July 2016. Subjects' characteristics are summarized in Table 1. The study was approved by the ethical review board (2014/547), The University of Sydney, Australia. Written informed consent was obtained from each participant before data collection began.
2.3 Measurement
All measurements and assessments were conducted in a single session at a laboratory with one assessor. Before undergoing the gait analysis procedure, the participants completed a questionnaire that collected demographic information (age, dominant leg and time since injury) and three patient-reported outcome measures (visual analogue scale, Tegner activity scale and knee injury and osteoarthritis outcome score). The participants also had their height and weight measured.
A 16-camera 3D motion capture system (Motion Analysis Corporation, Santa Rosa, USA) and force plates (model 9281, Kistler, Winterthur, Switzerland) were used to collect the data. The camera sampling rate was 200 Hz and it was synchronized with the kinetic data sampled at 1000 Hz. Thirty passive markers (18 mm in diameter) were attached bilaterally on the head of the second metatarsal, navicular tuberosity, calcaneal tuberosity, medial and lateral malleolus, lateral tibia distally, lateral mid-tibia, tibia tuberosity, medial and lateral femoral epicondyle, anterior mid-thigh, greater trochanter and anterior superior iliac spine, and single markers were attached on the spinous process of sacrum level 2, thoracic spine level 10 and cervical spine level 7 and on the manubrium, using double-sided tape. Each segment was defined using three markers (six degrees of freedom) and idealized as a rigid body with a local coordinate system defined to coincide with a set of anatomical axes. The 3D positions of the markers were used to calculate the location of the joint centres. A static trial was collected as a reference to determine body mass and positions of joint centres of rotation. Segment angles relative to the laboratory and relative joint angles were calculated using joint coordinate systems. Three-dimensional moments were calculated using inverse dynamics via Kintrak™ version 7.0 (The University of Calgary, Canada) and were normalized to the individual's body weight to compensate for anatomical differences between the participants.
Subjects walked barefoot along a 10-m walkway at their self-selected habitual (normal) and fast (walking at a speed fast enough to catch a bus without breaking into a jog) speeds. Figure 2 demonstrates the setting used in this study. We investigated whether walking speed influences the classification accuracy of the proposed classification models and evaluated the robustness of the extracted features and classification models to the walking speed. Fast speed trials occurred after normal speed trials, and subjects rested for 2–3 min between the trials and 5–10 min between the walking conditions. The average values from repeated trials at both velocities were calculated for comparisons between ACLD and ACLI knees. Kinematics and kinetics of the knee, hip and ankle joints were obtained from the motion capture system and used for the following feature extraction and selection.
2.4 Feature extraction
In order to obtain more efficient features, this paper considers parameters of recurrence quantification analysis (RQA), fuzzy entropy and Teager-Kaiser energy along with statistical features of knee, hip and ankle joint gait data.
2.4.1 Recurrence quantification analysis (RQA)
RQA is utilized to help understand the nature of gait signals and quantify gait with disorders without relaxing the real-time constraints [46]. In the present study, RQA parameters, which are various measures of the complexity of the gait signals, are extracted from the recurrence plots (RP) of the knee, hip and ankle kinematic and kinetic data. The RP describes the recurrent property of a dynamical system, i.e. it visualizes the time-dependent behavior of a gait signal \(x_i\) in a phase space [54], and is defined as follows.
\[ R_{i,j} = \Theta \left( \epsilon - \Vert x_i - x_j \Vert \right), \quad i,j = 1,\ldots ,N, \]
where \(\epsilon\) is a predefined cutoff distance, N is the total number of considered states, \(\Vert \cdot \Vert\) is the Euclidean norm, and \(\Theta\) is the Heaviside function. The binary values of \(R_{i,j}\) can be easily visualized using the colours black (1) and white (0), which indicates the time evolution of a signal trajectory. In practical applications, the RP alone is not a good choice since it is difficult to recognize the small-scale patterns by visual inspection. Hence several measures of complexity which quantify the small-scale structures in the RP, namely RQA, have been proposed. The present study adopted only three measures: recurrence rate (RR), determinism (DET) and entropy (ENTR). For more details please refer to [43].
RR measures the density of recurrent points in a recurrence plot and is given by
\[ RR = \frac{1}{N^2} \sum _{i,j=1}^{N} R_{i,j}. \]
DET measures the ratio of recurrence points forming diagonal lines which represent epochs of similar time evolution of the system state. Longer diagonal lines are usually discovered in periodic signals while shorter diagonal lines appear in chaotic signals. Hence long diagonal lines can be more often visualized in subjects with pathological gait than in normal subjects. DET is calculated as follows.
\[ DET = \frac{\sum _{\ell =\ell _{min}}^{N} \ell \, p(\ell )}{\sum _{\ell =1}^{N} \ell \, p(\ell )}, \]
where \(\ell _{min}\) is the length of the minimal diagonal line, \(p(\ell )\) is the histogram of these diagonal lines.
ENTR measures the complexity of the recurrence structure, which is given by
\[ ENTR = -\sum _{\ell =\ell _{min}}^{N} q(\ell ) \ln q(\ell ), \quad q(\ell ) = \frac{p(\ell )}{\sum _{\ell ' \ge \ell _{min}} p(\ell ')}, \]
where \(q(\ell )\) is the probability that a diagonal line has length \(\ell\).
The more complex the recurrence structure is, the larger the value of ENTR is. For ACLD knees, the pathological gait shows more recurrence and lower complexity.
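As an illustration of how RR, DET and ENTR can be obtained from a recurrence plot, the following is a minimal pure-Python sketch, not the authors' implementation. The function names, the time-delay embedding helper `embed`, the default \(\ell _{min}=2\) and the exclusion of the main diagonal when counting diagonal lines are all assumptions made for this example.

```python
import math

def embed(x, dim, delay):
    """Time-delay embedding (assumed helper): state i = (x[i], x[i+delay], ...)."""
    n = len(x) - (dim - 1) * delay
    return [tuple(x[i + k * delay] for k in range(dim)) for i in range(n)]

def recurrence_matrix(states, eps):
    """R[i][j] = 1 if the Euclidean distance between states i and j is <= eps."""
    n = len(states)
    return [[1 if math.dist(states[i], states[j]) <= eps else 0
             for j in range(n)] for i in range(n)]

def diagonal_histogram(R):
    """Histogram {length: count} of diagonal lines, main diagonal excluded (assumption)."""
    n = len(R)
    hist = {}
    for k in range(-(n - 1), n):
        if k == 0:
            continue
        run = 0
        for i in range(max(0, -k), min(n, n - k)):
            if R[i][i + k]:
                run += 1
            else:
                if run:
                    hist[run] = hist.get(run, 0) + 1
                run = 0
        if run:  # close a line that reaches the border of the plot
            hist[run] = hist.get(run, 0) + 1
    return hist

def rqa_measures(states, eps, l_min=2):
    """Return (RR, DET, ENTR) for a list of phase-space states."""
    R = recurrence_matrix(states, eps)
    n = len(R)
    rr = sum(map(sum, R)) / n ** 2
    hist = diagonal_histogram(R)
    total = sum(length * count for length, count in hist.items())
    long_lines = {l: c for l, c in hist.items() if l >= l_min}
    det = sum(l * c for l, c in long_lines.items()) / total if total else 0.0
    n_lines = sum(long_lines.values())
    entr = 0.0
    for count in long_lines.values():
        q = count / n_lines  # probability of this line length
        entr -= q * math.log(q)
    return rr, det, entr

# Toy periodic signal (illustrative only, not gait data)
signal = [0, 1, 0, 1, 0, 1, 0, 1]
rr, det, entr = rqa_measures(embed(signal, dim=1, delay=1), eps=0.1)
```

For a strictly periodic toy signal such as this one, every off-diagonal recurrence point lies on a diagonal line, so DET equals 1; real gait signals would need an embedding dimension, delay and \(\epsilon\) chosen for the data.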
2.4.2 Fuzzy entropy
Fuzzy entropy is used to measure the variability or irregularity of nonlinear time series based on the concept of approximate entropy and sample entropy. Compared to the other two kinds of entropy, it is suitable for short time series and is described as follows [55, 56].
Given a time series \(\{x_1,x_2,...,x_N\}\) with N samples, one can construct the following vector sequence
\[ X_i^m = \{x_i, x_{i+1}, \ldots , x_{i+m-1}\} - \bar{x}_i, \quad i = 1, \ldots , N-m+1, \]
where \(X_i^m\) represents m consecutive x values commencing with the ith point, m is the embedding dimension, and \(\bar{x}_i\) is the average of these m values, given by
\[ \bar{x}_i = \frac{1}{m} \sum _{j=0}^{m-1} x_{i+j}. \]
Define the distance \(d_{ij}^m\) between \(X_i^m\) and its neighbor \(X_j^m\) as the maximum absolute difference of corresponding scalar components:
\[ d_{ij}^m = \max _{k \in \{1,\ldots ,m\}} \left| X_i^m(k) - X_j^m(k) \right| , \quad j \ne i, \]
where \(X_i^m(k)\) and \(X_j^m(k)\) are the kth elements of \(X_i^m\) and \(X_j^m\), respectively.
Given n and r, calculate the degree of fuzzy similarity \(S_{ij}^m\) between \(X_i^m\) and \(X_j^m\) by using the exponential function
\[ S_{ij}^m = \exp \left( -\frac{(d_{ij}^m)^n}{r} \right) . \]
For each vector \(X_i^m\), averaging all the degrees of fuzzy similarity to its neighboring vectors \(X_j^m\) leads to the average degree of fuzzy similarity
\[ S_r^m(i) = \frac{1}{N-m} \sum _{j=1, j \ne i}^{N-m+1} S_{ij}^m. \]
The fuzzy probability \(p_r^m\) (defined in Buckley [57]) that two vector sequences match for all m-dimensional points within tolerance r is calculated by
\[ p_r^m = \frac{1}{N-m+1} \sum _{i=1}^{N-m+1} S_r^m(i). \]
Similarly, for the vector sequence \(X_i^{m+1}\), we can also define the fuzzy similarity \(S_{ij}^{m+1}\) between \(X_i^{m+1}\) and \(X_j^{m+1}\), and the average degrees of fuzzy similarity \(S_r^{m+1}(i)\). The fuzzy probability \(p_r^{m+1}\) is defined as
\[ p_r^{m+1} = \frac{1}{N-m} \sum _{i=1}^{N-m} S_r^{m+1}(i). \]
Fuzzy entropy FuzzyEn(m, r) of sequence \(\{x_1,x_2,...,x_N\}\) is defined as the negative natural logarithm of the conditional fuzzy probability
\[ FuzzyEn(m,r) = \lim _{N \rightarrow \infty } \left[ \ln p_r^m - \ln p_r^{m+1} \right] . \]
For a finite-length time series \(x_i\) (\(1 \le i \le N\)), the fuzzy entropy can be estimated as
\[ FuzzyEn(m,r,N) = \ln p_r^m - \ln p_r^{m+1}. \]
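The procedure described above (mean-removed vectors, maximum-component distance, exponential similarity, averaged fuzzy probability and a difference of logarithms) can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: the function names, the default values `m=2`, `n=2`, `r=0.2` and the choice to average over all ordered pairs \(i \ne j\) are assumptions, and in practice the tolerance r is often scaled by the standard deviation of the signal.

```python
import math

def _fuzzy_probability(x, m, n, r):
    """Mean fuzzy similarity over all ordered pairs (i != j) of m-length vectors."""
    count = len(x) - m + 1
    vectors = []
    for i in range(count):
        window = x[i:i + m]
        mean = sum(window) / m
        vectors.append([v - mean for v in window])  # remove the local baseline
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i == j:
                continue
            # maximum absolute difference of corresponding components
            d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
            total += math.exp(-(d ** n) / r)  # exponential fuzzy membership
    return total / (count * (count - 1))

def fuzzy_entropy(x, m=2, n=2, r=0.2):
    """Finite-length estimate: ln p(m) - ln p(m+1)."""
    return (math.log(_fuzzy_probability(x, m, n, r))
            - math.log(_fuzzy_probability(x, m + 1, n, r)))

# A perfectly regular ramp has identical mean-removed vectors, so its entropy is 0.
ramp_entropy = fuzzy_entropy([float(v) for v in range(10)])
```

The double loop costs on the order of \(N^2\) similarity evaluations, which is acceptable for the short gait series discussed here but would call for a vectorized implementation on long recordings.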
2.4.3 Teager-Kaiser energy (TKE) feature
The nonlinear Teager-Kaiser energy operator (TKEO) provides an unconventional perspective on the instantaneous energy of a signal [58]. It relates energy to the square of the signal amplitude and the square of its frequency. The TKEO is defined for a discrete-time signal x(n) as follows [59]
\[ \psi [x(n)] = x^2(n) - x(n-1)\,x(n+1). \]
One of the advantages of the TKEO is that it is nearly instantaneous, since only three samples are required for the energy computation at each time instant. In addition, high time resolution combined with a simple operator provides the ability to capture the energy fluctuations of the original gait system dynamics and allows an efficient implementation [60]. The TKEO generates a time series which can represent the instantaneous energy of the original gait system dynamics. To measure the varying energy sequence, the average value of the nonlinear energy in the time domain is calculated as [61]
\[ TKE = \frac{1}{N} \sum _{n=1}^{N} \psi [x(n)], \]
where N is the number of samples in the time series of gait signals. TKE is used as a feature of the original time series.
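A short sketch of the operator and the averaged feature is given below. It is an illustration under stated assumptions: the function names are hypothetical, and the average is taken over the interior samples only (where both neighbours exist), which differs slightly from a division by N.

```python
import math

def tkeo(x):
    """Teager-Kaiser energy psi[x(n)] = x(n)^2 - x(n-1) * x(n+1) at interior samples."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def mean_tke(x):
    """Average Teager-Kaiser energy, used as a single scalar feature."""
    psi = tkeo(x)
    return sum(psi) / len(psi)

# For a sampled sinusoid A*cos(w*n), the operator returns A^2 * sin(w)^2 at every sample,
# which shows how it couples amplitude and frequency.
A, w = 2.0, 0.3
sinusoid = [A * math.cos(w * n) for n in range(50)]
feature = mean_tke(sinusoid)
```

Because each output sample needs only three input samples, the operator responds almost immediately to changes in amplitude or frequency.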
2.5 Feature selection
In order to improve the classification accuracy, this work considers four statistical features (Mean, Standard Deviation (Std), Skewness, and Kurtosis) of gait signals in addition to the parameters of RQA, Fuzzy entropy and TKE. All the 243 features calculated from the kinematic and kinetic data of the ankle, knee and hip joints are demonstrated in Table 2.
In addition, the Mann-Whitney test is utilized to retain the features that differ significantly between ACLD and ACLI legs. Features with a p value less than 0.05 are considered to be statistically significant and used for classification. It is seen from Tables 2 and 3 that there exist significant differences in 38 features, which are highlighted in red and with a '*' marker.
In order to obtain more efficient features, the hill climbing feature selection method [62] is utilized to find the optimal feature subset from the 38 statistically significant features, which relieves the computational burden of performing a complete search over the different feature combinations. It performs a step-by-step search by considering one feature after the other. The set of features that gives the best accuracy is considered to be the optimal feature set. In the present study, the optimal feature set contains the following features: F1, F2, F10, F19, F24, F29, F30, F61, F67, F96, F149, F176, F191, F207, F217, F227, which are also summarized in Table 2.
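The two-stage selection can be sketched as a significance filter followed by a simple one-pass hill climb. This is a simplified illustration and not necessarily the exact procedure of [62]: the function names are hypothetical, the p values are assumed to be precomputed (for example with a Mann-Whitney test), and `score` stands for any user-supplied function that returns a cross-validated accuracy for a candidate feature subset.

```python
def significant_features(p_values, alpha=0.05):
    """Keep features whose precomputed p value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]

def hill_climb_select(candidates, score):
    """Visit features one after the other; keep a feature only if the score improves."""
    selected, best = [], float("-inf")
    for feature in candidates:
        trial = score(selected + [feature])
        if trial > best:
            selected.append(feature)
            best = trial
    return selected, best

# Toy example with made-up p values and a made-up scoring function (illustrative only).
toy_p = {"F1": 0.01, "F2": 0.03, "F3": 0.40, "F10": 0.002}
useful = {"F1", "F10"}

def toy_score(features):
    # Reward useful features and slightly penalise subset size.
    return len(useful & set(features)) - 0.1 * len(features)

kept = significant_features(toy_p)
subset, accuracy = hill_climb_select(kept, toy_score)
```

A single pass evaluates one subset per candidate feature, so 38 candidates need 38 evaluations instead of the \(2^{38}\) subsets of an exhaustive search; the price is that the result depends on the visiting order and may be a local optimum.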
2.6 Classification models
To carry out a comparative study, five popular machine learning methods, i.e., the support vector machine (SVM), K-nearest neighbor (KNN), naive Bayes (NB) classifier, decision tree, and ensemble learning based Adaboost (ELA) classifier, were evaluated because they are commonly utilized to solve classification problems in nonlinear feature spaces and are suitable for small datasets, which is the case in the present study. For detailed introductions of these models, please refer to references [63,64,65,66,67,68].
2.6.1 Support vector machine (SVM)
SVM is a prevalent machine learning and pattern classification technique which transforms data points into a high-dimensional feature space and identifies an optimum hyperplane separating the classes present in the data [63]. In the present study we adopted the popular radial basis function (RBF) kernel.
2.6.2 K-nearest neighbor (KNN)
KNN is an effective nonparametric classifier which performs the classification by searching for the test data's k nearest training samples in the feature space [64]. It utilizes the Euclidean or Manhattan distance as a distance metric for the similarity measurement.
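The nearest-neighbour rule is compact enough to sketch directly. The following example is illustrative only: the function name, the default `k=3`, the majority-vote tie handling and the toy two-dimensional feature vectors are assumptions, not values taken from the study.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3, metric="euclidean"):
    """Label a query point by majority vote among its k nearest training samples."""
    if metric == "euclidean":
        def dist(a, b):
            return math.dist(a, b)
    elif metric == "manhattan":
        def dist(a, b):
            return sum(abs(u - v) for u, v in zip(a, b))
    else:
        raise ValueError("metric must be 'euclidean' or 'manhattan'")
    ranked = sorted(zip(train_X, train_y), key=lambda sample: dist(sample[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy feature vectors (made up for illustration)
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["ACLI", "ACLI", "ACLI", "ACLD", "ACLD", "ACLD"]
label = knn_predict(X, y, (0.2, 0.1), k=3)
```

Because the vote depends directly on distances, features on very different scales are usually standardized before a KNN classifier is applied.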
2.6.3 Naive Bayes (NB) classifier
NB classifier is a probabilistic method relying on the assumption that every pair of features involved are independent of each other whose weights are of equal importance [65]. The main advantages of NB are the conditional independence assumption, which leads to a quick classification and the probabilistic hypotheses (results obtained as probabilities of belonging of each class).
2.6.4 Decision tree (DT)
In DT, features are used as input to construct a tree structure in which several rules are extracted to recognize the class of the test data [66].
2.6.5 Ensemble learning based Adaboost (ELA) classifier
Ensemble learning techniques combine the outputs of several base classification techniques to form an integrated output and enhance classification accuracy. Compared to other machine learning methods that try to learn one hypothesis from the training data, ensemble learning relies on constructing a set of hypotheses and combining them for use [67]. From the popular boosting family of ensemble methods, we adopted the adaptive boosting (Adaboost) algorithm [68] in this study.
3 Experimental results
We evaluate the classification performance of ACLD knees against ACLI knees using dynamical features on different classification models. Several experiments are carried out to verify the effectiveness of the proposed method. Each participant walked five trials under each of the normal and fast walking speeds. Ten and fourteen trials under normal and fast speeds, respectively, were discarded because of a malfunction of the motion capture system during the experimental procedure. Hence the training size for each of the ACLD and ACLI knees under normal speed is \(43\times 5-10=205\), and the total training size is \(205+205=410\). The training size for each of the ACLD and ACLI knees under fast speed is \(43\times 5-14=201\), and the total training size is \(201+201=402\). The number of gait patterns/trials for the training and classification of ACLI and ACLD knees in different walking conditions is shown in Table 4, which is used to test the robustness of our proposed method to the variation of walking speed.
Experiments are conducted to assess the effectiveness of the proposed features on different classifiers. For the purpose of evaluation, six performance parameters are utilized, including the Sensitivity (SEN), the Specificity (SPF), the Accuracy (ACC), the Positive Predictive Value (PPV), the Negative Predictive Value (NPV) and the Matthews Correlation Coefficient (MCC). These measurements are defined as follows [69]:
\[ SEN = \frac{TP}{TP+FN}, \quad SPF = \frac{TN}{TN+FP}, \quad ACC = \frac{TP+TN}{TP+FN+TN+FP}, \]
\[ PPV = \frac{TP}{TP+FP}, \quad NPV = \frac{TN}{TN+FN}, \quad MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}, \]
where TP is the number of true positives, FN is the number of false negatives, TN is the number of true negatives and FP is the number of false positives. The sensitivity and specificity correspond to the probabilities that ACLD knees and ACLI knees, respectively, are correctly classified. To be accurate, a classifier must have a high classification accuracy, a high sensitivity, as well as a high specificity [70]. The larger the value of MCC, the better the classifier performance [69, 71].
The binary classification problem is solved using five classification models: SVM, KNN, NB, Decision Tree and ELA. Two-fold, ten-fold and leave-one-out cross-validation techniques are used, and performance outcomes such as SEN, SPF, ACC, PPV, NPV and MCC are calculated to obtain a reliable and stable evaluation of the performance of the proposed method. Instead of using all the 243 features (as listed in Table 3) for classification, the 38 statistically significant features that demonstrate a significant difference between ACLD and ACLI knees are employed. In addition, in order to improve the performance of the five classifiers and reduce the computational burden, the optimal feature set containing 16 features (shown in Table 2) was derived using the feature selection method [62]. The classification performance outcome for the five classifier models under normal and fast walking speeds is illustrated in Tables 5, 6, 7, 8, 9 and 10. Among the five classifier models, the SVM classifier achieves the best classification performance in all of the 2-fold, 10-fold and leave-one-out cross-validation styles. It also possesses the best robustness to the variation of walking speed. In contrast, the NB classifier did not work well under either walking speed and its classification performance is inferior to that of the other four classifiers.
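The three cross-validation styles differ only in the number of folds, which the following sketch makes explicit. It is a minimal illustration with a hypothetical function name; it splits indices in their given order, whereas in practice the samples would normally be shuffled (or stratified by class) first.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds; return (train, test) index lists."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds

n = 410  # total number of normal-speed gait patterns reported in the text
two_fold = k_fold_indices(n, 2)       # each fold tests on 205 patterns
ten_fold = k_fold_indices(n, 10)      # each fold tests on 41 patterns
leave_one_out = k_fold_indices(n, n)  # each fold tests on a single pattern
```

Leave-one-out is simply k-fold with k equal to the number of samples, so it trains on the largest possible subsets at the cost of fitting the classifier 410 times in this setting.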
4 Discussion
Experimental results of this study illustrate that the established pattern recognition system can detect the gait disparity between chronic ACLD and contralateral ACLI knees, and differentiate between them, with high efficiency and accuracy. The impact of feature extraction and selection on five different classification models has also been demonstrated.
The present study not only revealed that the ACLD leg demonstrates altered gait patterns in comparison to the contralateral ACLI leg, but also provided effective and objective feature extraction and classification methods to discriminate between the two groups. A comparison of the classification performance between ACLD and ACLI knees with other state-of-the-art methods is shown in Fig. 3. Overall, our classification approach achieves the greatest accuracy considering the size of the databases. Different from the methods in the above-mentioned literature, our method extracted several linear and nonlinear dynamical features to represent the disparity of gait patterns between ACLD and ACLI legs for the discrimination task.
Currently, to the authors' knowledge, RQA, fuzzy entropy and Teager-Kaiser energy have never been considered for the classification of gait patterns between ACLD and ACLI knees in previous literature. The gait signals were recorded for short durations of about 3 min, which is usually required in clinical practice. The present study demonstrated improved accuracy because RQA, fuzzy entropy and Teager-Kaiser energy work well irrespective of data length. Another possible reason is that RQA can depict the hidden structure of the gait signal (for example, its periodic or chaotic nature) without assuming the signal to be stationary, linear and noiseless, and can thus extract a meaningful set of features.
The proposed pattern recognition system may serve not only as a measure of kinematic variability and discrimination between the two groups of ACLD and ACLI knees, but also as a noninvasive, objective technical means of assisting other diagnostic approaches such as X-rays, MRI and arthroscopy.
5 Conclusion
This study investigated the performance of different gait features on five classification models for discriminating between ACLD and contralateral ACLI knees. The results of this study indicate that the pattern classification of lower extremity kinematic and kinetic data can offer an objective and noninvasive method to assess the gait disparity between ACLD and ACLI knees. These results demonstrate the potential of the proposed technique for detecting pathological gait patterns caused by ACL deficiency by analysing and measuring the gait difference using RQA, fuzzy entropy, TeagerKaiser energy and statistical features on different classification models. Utilizing RQA on gait signals assist in understanding the nature of gait signals and quantify gait with disorders. It does not rely on assumptions like nonlinearity and nonstationarity, and is suited for short length gait time series. Fuzzy entropy measures the variability of gait signals with short length. The main objectives of this study include understanding the dynamics of human gait, quantitatively analyzing the gait pattern of ACLD and ACLI knees and improving the accuracy of binary classification problems through different machine learning classifiers.
The present study has two limitations. (1) The method was evaluated on a small database. In addition, the discriminative model constructed in this study offered only limited clinical usefulness in discerning between ACLD and contralateral healthy knees. Discrimination capability may be improved by including additional control groups. Future work will include a clinical validation of the proposed technique with a larger number of patients with ACL deficiency and age-matched healthy controls. (2) Only limited types of gait signals were extracted from the participants, namely knee joint angles and translations in six degrees of freedom (6DOF). Other gait signals, such as knee joint angular velocity and acceleration and knee kinetic parameters (force, moment, etc.), may also be considered in future work to reflect more comprehensively the characteristics of pathological and normal gait patterns in ACLD and ACLI knees.
Abbreviations
ACL: Anterior cruciate ligament
ACLD: ACL deficient
ACLI: ACL intact
MRI: Magnetic resonance imaging
ANOVA: Analysis of variance
SampEn: Sample entropy
DFA: Detrended fluctuation analysis
EEG: Electroencephalography
ECG: Electrocardiography
RQA: Recurrence quantification analysis
RP: Recurrence plots
PSR: Phase space reconstruction
FuzzyEn: Fuzzy entropy
TKE: Teager–Kaiser energy
SVM: Support vector machine
KNN: K-nearest neighbor
NB: Naive Bayes
ELA: Ensemble learning based Adaboost
RR: Recurrence rate
DET: Determinism
ENTR: Entropy
SEN: Sensitivity
SPF: Specificity
ACC: Accuracy
PPV: Positive predictive value
NPV: Negative predictive value
MCC: Matthews correlation coefficient
PCA: Principal component analysis
RBF: Radial basis function
Adaboost: Adaptive boosting
3D: Three-dimensional
Acknowledgements
Not applicable.
Funding
This work was supported by the National Natural Science Foundation of China (Grant nos. 61773194, 61304084), by the Natural Science Foundation of Fujian Province of China (Grant no. 2018J01542) and by the Program for New Century Excellent Talents in Fujian Province University. The funders were not involved in the study design, data collection, analysis, decision to publish, or production of this manuscript.
Author information
Contributions
Study concept and design (WZ); drafting of the manuscript (WZ); critical revision of the manuscript for important intellectual content (WZ); obtained funding (WZ); administrative, technical, and material support (SI and EP); study supervision (WZ and EP). All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
This study was approved by the ethical review board (2014/547) of The University of Sydney, Australia. Written informed consent was obtained from each participant before data collection began.
Consent for publication
Not applicable.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zeng, W., Ismail, S.A. & Pappas, E. The impact of feature extraction and selection for the classification of gait patterns between ACL deficient and intact knees based on different classification models. EURASIP J. Adv. Signal Process. 2021, 95 (2021). https://doi.org/10.1186/s13634021007966
DOI: https://doi.org/10.1186/s13634021007966
Keywords
 Gait analysis
 Anterior cruciate ligament
 Dynamical features
 Gait classification
 Feature extraction