
Efficiency of deep neural networks for joint angle modeling in digital gait assessment

Abstract

Reliability and user compliance of the applied sensor system are two key issues of digital healthcare and biomedical informatics. For gait assessment applications, accurate joint angle measurements are important. Inertial measurement units (IMUs) have been used in a variety of applications and can also provide significant information on gait kinematics. However, the nonlinear mechanism of human locomotion results in moderate estimation accuracy of the gait kinematics and thus joint angles. To develop “digital twins” as a digital counterpart of body lower limb joint angles, three-dimensional gait kinematic data were collected. This work investigates the estimation accuracy of different neural networks in modeling lower body joint angles in the sagittal plane using the kinematic records of a single IMU attached to the foot. The evaluation results based on the root mean square error (RMSE) show that long short-term memory (LSTM) networks deliver superior performance in nonlinear modeling of the lower limb joint angles compared to other machine learning (ML) approaches. Accordingly, deep learning based on the LSTM architecture is a promising approach in modeling of gait kinematics using a single IMU, and thus can reduce the required physical IMUs attached on the subject and improve the practical application of the sensor system.

1 Introduction

Wearable sensor systems (WSSs) allow extensive data acquisition in a simple and convenient way owing to their portability and flexible attachment to any part of the body (e.g., lower limbs, upper limbs, torso) [1]. Among them, inertial measurement units (IMUs) are of particular interest to scientists and engineers in diverse application fields due to their small size, low cost, light weight, good precision, and non-invasive characteristics. An inertial sensor performs multiparameter sensing, such as 3D linear acceleration, 3D angular velocity, and 3D magnetic field, and thus can capture a wide range of locomotor activities [2, 3]. The main challenge here is to analyze, extract, and translate the relevant information on normal and pathological gait behavior into effective and affordable interventions. This challenge arises from the high dimensionality and great heterogeneity of the gait data as well as the time and effort involved in sensor placement and configuration [4]. The high dimensionality of the gait data can be addressed by applying conventional dimension reduction techniques such as principal component analysis (PCA) and linear discriminant analysis (LDA). In [5], PCA is applied to obtain a smaller set of features for the classification of gait data recorded by a multi-sensor wearable system. A similar approach was used in [6] to classify subjects using one IMU placed at different body locations.

To keep the system complexity as low as possible, one solution approach is to reduce the number of sensors in the WSS via sensor virtualization or “digital twins.” The digital equivalent can replace the physical counterpart (see Fig. 1) and requires no special knowledge of the physical structure of the human biomechanics, which in the case of the human movement is complex [7]. In other words, by applying signal and statistical processing methods to data from a smaller number of sensors, the remaining sensor signals can be estimated rather than being directly measured. The authors in [8] proposed a novel memory polynomial model for the estimation of the lower limb joint angles based only on the magnitude of the acceleration signal of one IMU located at the ankle. In [9], an extended Kalman filter (KF) was used to estimate the vertical hip acceleration and sagittal trunk posture applying a heuristically modified Fourier series model based on the vertical acceleration and sagittal angular velocity from one IMU placed at the ankle. A novel double-pendulum model was proposed in [10], in which a small number of sensors attached on both sides of the shank were used to estimate the movements of the thighs. The feasibility of the estimation of gait kinematics with a reduced number of sensors has been demonstrated in previous studies. This serves as the basis for the realization of high-precision, robust, and customizable digital twins capable of reducing hardware sensors, which is to the great advantage of digital healthcare and bioinformatics applications.

Fig. 1

Wearable system concept: the gait kinematic data x(n) are collected and processed with machine learning methods in the Android application for digital and biomedical healthcare systems. On the left side, the traditional sensor fusion algorithm based on KF estimates the lower limb joint signals using the information from four IMUs. On the right side, the novel machine learning approach estimates the lower limb joint angles based on the information of only one IMU placed on the foot. The dashed line represents the reference data y(n) for the training and test phases of the different machine learning (ML) approaches

Nevertheless, the most promising way to utilize the vast amount of gait data generated by modern wearable sensors is the use of ML due to its capability of integrating both stochastics and computer science for identifying patterns in large datasets [11, 12]. In the field of gait analysis, healthcare, and bioinformatics, ML is generally applied to either classification or estimation tasks. Regarding the first, different studies have been conducted in the field of gait analysis, such as activity or gait phase recognition [13, 14]. In [15], the authors trained a long short-term memory (LSTM) network on linear acceleration and angular velocities from inertial sensors to automatically classify human activities. A similar approach was proposed in [16], where the data from five different sensors placed on the body were collected and a convolutional neural network (CNN) was trained for subject identification. Regarding estimation, different studies apply ML and deep learning to simulated gait data obtained from the markers of camera systems to assess lower limb kinematics [17, 18]. In [19], a generalized regression neural network (GRNN) was trained to estimate foot, lower leg, and thigh kinematics in the sagittal plane from emulated 2D foot acceleration signals from a complex camera system and four IMUs during walking. In [20], a multilayer perceptron (MLP) was used to simulate the complexities of lower limb motions together with a camera system as input data for the neural network (NN). Few studies have applied deep learning techniques for estimation tasks with real kinematic data. A deep CNN was proposed in [21] for gait parameter extraction based on one IMU attached to the shoe. The authors in [22] obtained an RMSE of 7 in the estimation of the knee joint angle with mechanomyography signals and a CNN.
The potential of NNs and deep learning for modeling the nonlinear relationship of lower body joint angles from foot movement and its applicability as “digital twin” for gait kinematic analysis has not yet been explored. Therefore, different from the aforementioned studies, this work evaluates the performance of different neural networks in modeling lower body joint angles using nonlinear methods to extract significant information from the gait kinematics records of a single IMU attached to the foot.

The rest of the paper is organized as follows: Section 2 presents the system concept, methodology, and the framework applied for the gait segmentation followed by a description of the networks investigated in this work. In Sections 3 and 4, the evaluation results are presented and discussed, respectively. Finally, the main conclusions of this work are presented in Section 5.

2 Methods

Figure 1 illustrates the overall concept of a WSS. It starts with collecting the 3D linear acceleration \(\boldsymbol{a}(t)\in \mathbb{R}^{3\times T}\) and the 3D angular velocity \(\boldsymbol{g}(t)\in \mathbb{R}^{3\times T}\) and is followed by the gait cycle segmentation, the extension of the kinematic information, and the estimation of the lower limb joint angles based on different input sets and the training of different NNs. A brief description of the procedure and applied techniques is given in this section.

2.1 Wearable sensor system

A digital mobile gait measurement system based on four IMUs was used to collect the reference gait kinematic data for training the networks. The sensor system consists of four IMUs integrated into a sensor platform developed by Shimmer. Specifically, we used the Shimmer 3 sensor platform, which provides real-time motion sensing. The kinematic gait data are transmitted by Bluetooth to an Android application, which we developed for this purpose. The received data are fed to a multi-sensor fusion algorithm, where the lower body joint angles in the sagittal plane are estimated from the kinematic signals of the four IMUs and the gait data are analyzed according to [23]. Figure 2 shows the joint angles estimated with the kinematic gait data of the IMUs. The IMUs were attached using straps to the pelvis, foot, lower leg, and upper leg as shown in Fig. 1. To provide comparable conditions, the same sensors were attached at the same position on each subject. We chose straps due to their flexible structure, lower cost, and easy practical application. The installation is simple and takes only a few minutes. There is no need for special laboratory equipment, and the gait measurements can be done in a corridor, in an open space like a park, or at home. The IMUs include a 3D linear accelerometer a(t) and a 3D gyroscope g(t). The sensor specifications are shown in Table 1. The kinematic gait data were sampled synchronously at a sampling rate of 60 Hz.

Fig. 2

Reference lower body joint angles in the sagittal plane during one gait cycle using the kinematic gait data from the IMUs

Table 1 Sensor specification for data acquisition

2.2 Data pre-processing

2.2.1 Gait cycle segmentation

Human walking can be described as a cyclic process. A stride is the distance between the initial contact (IC) of one foot and the next IC of the same foot. In other words, a gait cycle is made up of two steps. Each stride contains a stance and a swing phase, which are the minimum number of phases into which a gait cycle can be divided. A more complex phase model with eight sub-phases is shown in Fig. 3 [24]. The IC and toe off (TOE) can be extracted from the foot angular velocity in the sagittal plane by exploiting the fact that, within each gait cycle, the foot alternately rotates clockwise and counterclockwise about the ankle joint [25].

Fig. 3

Gait events and gait phases in one gait cycle. The gait cycle is divided into a stance and a swing phase. The stance phase lasts from IC to terminal contact (TOE) and represents about 60% of the gait cycle. The swing phase begins as soon as the toe leaves the ground, ends just prior to IC, and occupies the remaining 40% of the gait cycle

The local maxima of the foot angular velocity are detected and associated with the mid-swing (MS) phase [26]. As shown in Fig. 4, within each pair of MS peaks, the first negative peak of the foot angular velocity is associated with the IC and the second one with the TOE. The peak detection parameters for which the detected peaks coincide with the true IC and TOE events were set as follows: For the MS, the minimal peak distance and height were set to 50 samples and 1.7 rad/s, respectively. For the IC/TOE, the minimal peak distance and height were set to 30 samples and 0.5 rad/s, respectively. The value of the minimal peak distance for MS events was determined using the autocorrelation function of the foot angular velocity and calculating the mean and variance of the distance (in samples) between MS peaks of the autocorrelation. From the knowledge of the IC and TOE, it is possible to define the duration of the stride, stance, and swing phases and accordingly other gait parameters (cadence, step length, gait speed, etc.). The detected IC events were used to segment the kinematic gait data into gait cycles. Each gait cycle was resampled to a length of 100 samples so that all segments have the same length.
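The segmentation described above can be illustrated with a minimal sketch. It assumes NumPy, a hypothetical helper `find_peaks_simple` that enforces a minimal peak height and distance, and linear interpolation for the resampling to 100 samples; the source does not state which peak detector or interpolation was actually used. For brevity, the synthetic example segments at the detected positive peaks; negative peaks such as IC and TOE could be found by applying the same helper to the negated signal.

```python
import numpy as np

def find_peaks_simple(x, min_height, min_distance):
    """Indices of local maxima >= min_height, at least min_distance samples apart."""
    x = np.asarray(x, dtype=float)
    # Candidates: larger than the left neighbour and not smaller than the right one.
    cand = np.where((x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:])
                    & (x[1:-1] >= min_height))[0] + 1
    kept = []
    # Visit candidates from highest to lowest; keep those far enough from kept peaks.
    for idx in cand[np.argsort(x[cand])[::-1]]:
        if all(abs(idx - k) >= min_distance for k in kept):
            kept.append(int(idx))
    return sorted(kept)

def resample_cycle(cycle, n_samples=100):
    """Linearly resample one gait cycle to a fixed number of samples."""
    cycle = np.asarray(cycle, dtype=float)
    old_axis = np.linspace(0.0, 1.0, len(cycle))
    new_axis = np.linspace(0.0, 1.0, n_samples)
    return np.interp(new_axis, old_axis, cycle)

def segment_cycles(signal, event_indices, n_samples=100):
    """Cut the signal between consecutive events and resample each piece."""
    signal = np.asarray(signal, dtype=float)
    return np.array([resample_cycle(signal[a:b], n_samples)
                     for a, b in zip(event_indices[:-1], event_indices[1:])])

# Synthetic stand-in for a foot angular velocity: 1 Hz oscillation sampled at 60 Hz.
t = np.arange(600) / 60.0
gyro = 3.0 * np.sin(2.0 * np.pi * t)
peaks = find_peaks_simple(gyro, min_height=1.7, min_distance=50)
cycles = segment_cycles(gyro, peaks)
```

Each row of `cycles` then holds one segment of equal length, which is the form needed to stack gait cycles from different strides.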

Fig. 4

Illustration of the angular velocity in the sagittal plane from different locations. The IC and TOE events are obtained from the foot angular velocity. The red triangle, green circle, and black square markers represent the mid-swing (MS), IC, and TOE, respectively

2.2.2 Hilbert-Huang transformation

In [27, 28], nonlinear and non-stationary methods are proposed to analyze gait signals due to their potential to extract complex relationships that cannot be found with linear methods. Therefore, the Hilbert-Huang transformation (HHT) is used in this work to extend the input data of the NN, exploiting the nonlinear relation between the foot signals and the lower limb joint angles [29]. The HHT applies the empirical mode decomposition (EMD) and the Hilbert transform (HT) [28]. The most important part of the HHT is the EMD method, which allows the decomposition of any data into a finite, small number of intrinsic mode functions (IMFs). An IMF fulfills two conditions: First, the number of extrema and the number of zero crossings must be equal or differ at most by one. Second, the mean value of the envelopes defined by the local maxima and the local minima must be zero. The EMD offers a possibility to exploit the information hidden in the gait signals, and can be calculated from each gait cycle s(t) using the following steps:

  • Detection of all extrema (minima and maxima) of s(t).

  • Interpolation and cubic spline curve fitting of the maxima and minima to obtain the upper envelope u(t) and the lower envelope l(t), respectively.

  • Calculation of the mean \(m_{1}(t)=\frac {u(t)+l(t)}{2}\) and the mode \(g_{1}(t)=\frac {u(t)-l(t)}{2}\) function of the envelopes.

  • Calculation of the first component by subtracting the mean envelope function from the segment c1(t)=s(t)−m1(t).

  • If c1(t) satisfies the IMF conditions, set imf1(t)=c1(t) and continue with the next step; otherwise, replace s(t) with c1(t) and repeat the first four steps.

  • Calculation of the residual r1(t)=s(t)−c1(t) and iteration of the previous steps on the residual until it becomes a monotonic function; this final residual is the trend of the segment.
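A single sifting iteration (the first four steps above) can be sketched as follows. This is a simplified illustration, not the implementation used in this work: it assumes NumPy and, to stay dependency-free, replaces the cubic spline envelopes by linear interpolation between the extrema. The function names `local_extrema` and `sift_once` are hypothetical.

```python
import numpy as np

def local_extrema(s):
    """Indices of the interior local maxima and minima of a 1-D signal."""
    d = np.diff(s)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def sift_once(s):
    """One sifting iteration: returns the candidate c1 and the mean envelope m1."""
    s = np.asarray(s, dtype=float)
    n = np.arange(len(s))
    maxima, minima = local_extrema(s)
    # ASSUMPTION: linear interpolation stands in for the cubic spline fit.
    upper = np.interp(n, maxima, s[maxima])   # upper envelope u(t)
    lower = np.interp(n, minima, s[minima])   # lower envelope l(t)
    mean_env = 0.5 * (upper + lower)          # m1(t) = (u(t) + l(t)) / 2
    return s - mean_env, mean_env             # c1(t) = s(t) - m1(t)

# Example: a 5-cycle oscillation riding on a constant offset of 2.
t = np.arange(200) / 200.0
s = np.sin(2.0 * np.pi * 5.0 * t) + 2.0
c1, m1 = sift_once(s)
```

In this example the mean envelope recovers the offset and `c1` is the zero-mean oscillation; a full EMD would repeat the step until the IMF conditions hold and then continue on the residual.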

Once the algorithm ends, the gait cycle s(t) can be expressed as a linear superposition of IMFs by:

$$ s(t)=\sum_{i=1}^{n} \mathrm{{imf}_{i}}(t)+r_{n}(t)\, {,} $$
(1)

where n is the number of IMFs and i=1,…,n. Figure 5 shows the first two IMFs of the foot angular velocity in the sagittal plane. Once the gait cycle representation is obtained as a superposition of zero mean oscillatory modes, the HT can be applied to each IMF as follows:

$$ \mathrm{H}[\mathrm{{imf}_{i}}(t)]=\frac{1}{\pi} PV \int_{-\infty}^{\infty}\frac{\mathrm{{imf}_{i}}(\tau)}{t-\tau}\mathrm{d}\tau. $$
(2)
Fig. 5

IMF signal decomposition for the foot angular velocity in the sagittal plane which is used as input for the networks

PV denotes the Cauchy principal value of the integral. The residue rn(t) should be left out of the Hilbert spectral analysis, since it is a monotonic function or a constant. The analytic signal zi(t) is defined by:

$$ z_{i}(t)=\mathrm{{imf}_{i}}(t)+j\mathrm{H}[\mathrm{{imf}_{i}}(t)]=a_{i}(t)e^{j\theta_{i}(t)}\, {,} $$
(3)

where \(a_{i}(t) = \sqrt {{\mathrm {{imf}_{i}(t)}}^{2}+{\mathrm {H}[\mathrm {{imf}_{i}}(t)]^{2}}}\) is the instantaneous amplitude. To extract the instantaneous frequency (IF) and the instantaneous energy (IE) of each IMF, the derivative of the phase θi(t) and the squared magnitude of ai(t) are computed as below:

$$\begin{array}{*{20}l} f_{i}(t)&=\frac{1}{2\pi}\frac{\mathrm{d}\theta_{i}(t)}{\mathrm{d}t}\, {,} \end{array} $$
(4)
$$\begin{array}{*{20}l} e_{i}(t) &= {|a_{i}(t)|}^{2}. \end{array} $$
(5)
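The quantities of Eqs. (3) to (5) can be computed numerically as sketched below. The sketch assumes NumPy, builds the analytic signal with the standard FFT-based construction (one-sided spectrum), and approximates the phase derivative by finite differences. The scaling by 1/(2π), the usual convention for a frequency in Hz, and the helper names are assumptions of this sketch.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal x + j*H[x], obtained by zeroing the negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC component once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist component once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def hilbert_features(imf, fs):
    """Instantaneous amplitude, frequency (Hz), and energy of one IMF."""
    z = analytic_signal(imf)
    amplitude = np.abs(z)                              # a_i(t)
    phase = np.unwrap(np.angle(z))                     # theta_i(t)
    inst_freq = np.gradient(phase) * fs / (2.0 * np.pi)
    inst_energy = amplitude ** 2                       # e_i(t) = |a_i(t)|^2
    return amplitude, inst_freq, inst_energy

# Example: a 3 Hz cosine sampled at 60 Hz for 2 s (an integer number of periods).
fs = 60.0
t = np.arange(120) / fs
amp, freq, energy = hilbert_features(np.cos(2.0 * np.pi * 3.0 * t), fs)
```

For this pure tone the amplitude and energy are constant at 1 and the instantaneous frequency equals 3 Hz; for real IMFs, end effects and noise make the estimates less clean.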

2.2.3 Norm of the kinematic gait data

One method to extend the information of the raw kinematic gait data is to include the signal norms [16]. The norms of the 3D linear acceleration a(t) and 3D angular velocity g(t) signals are obtained by \(\boldsymbol {A}(t)= \sqrt {\mathbf {a}_{x}(t)^{2}+\mathbf {a}_{y}(t)^{2}+\mathbf {a}_{z}(t)^{2}}\) and \(\boldsymbol {G}(t)= \sqrt {\boldsymbol {g}_{x}(t)^{2}+\boldsymbol {g}_{y}(t)^{2}+\boldsymbol {g}_{z}(t)^{2}}\) for t=1,2,...,T, where T is the length of the kinematic signal.
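As a brief illustration, the norm of a 3 × T signal can be computed per time sample as follows. The sketch assumes NumPy and a signal stored with one row per axis; the function name `signal_norm` is hypothetical.

```python
import numpy as np

def signal_norm(xyz):
    """Euclidean norm over the three axes of a 3 x T signal, one value per sample."""
    return np.linalg.norm(np.asarray(xyz, dtype=float), axis=0)

# Two samples of a 3D acceleration: (3, 4, 0) and (0, 0, 2).
acc = np.array([[3.0, 0.0],
                [4.0, 0.0],
                [0.0, 2.0]])
acc_norm = signal_norm(acc)
```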

2.2.4 Dataset

For this work, 20 healthy subjects (mean age, 28±4 years; height, 181±3.5 cm) were considered. The medical history of all participants showed no pathological findings or surgical intervention in the lower limbs. The data recording was performed via wearable wireless IMUs as described in Section 2.1. Only the three-dimensional linear acceleration and three-dimensional angular velocity signals were recorded for the investigations using the WSS and an Android tablet. The lower limb joint angles in the sagittal plane were calculated using the information from four IMUs and a KF. The IMUs were placed on the right side at the foot, lower leg, upper leg, and pelvis of the participants as shown in Fig. 1. They were secured with tight tape to reduce motion artifacts. Each participant performed a walk test in forward direction of around 20 m at a preferred velocity, and subsequently five walking trials. The kinematic gait data of the subjects were recorded and segmented in gait cycles. In order to extend the information of the kinematic gait signals, for each gait cycle, the norm of the acceleration and the gyroscope and the HHT were calculated. The number of IMFs used in this work is two.

Table 2 shows the different input sets used to train the different networks. The input sets have the dimensions Di × 124,400, where Di is the total number of signals of each input set (i=1,...,5) and 124,400 is the length of the kinematic signals. The first input set comprised 3D linear acceleration and angular velocity signals from the IMU on the foot. The second set extends the information of the kinematic signals using the norm of the acceleration and angular velocity. The third set includes the IMFs of the kinematic signals. The fourth and the fifth sets additionally include the IF and the IE information, respectively. The output of the networks is the lower body joint angles in the sagittal plane with dimensions 3 × 124,400.

Table 2 Input sets of the neural networks

2.3 Neural networks

An NN is a computing model whose layered structure resembles the networked structure of neurons in the brain, with layers of connected nodes [30]. NNs can be trained to recognize patterns, classify data, and estimate future events. An NN breaks down the input into different layers of abstraction. Its behavior is defined by the way in which individual neurons are connected and by the weights of those connections. These weights are automatically adjusted during training according to a specified learning rule until the neural network achieves a desired level of performance. Given the sequential and nonlinear characteristics of the kinematic gait signals, in this work we considered the GRNN, the nonlinear autoregressive network with exogenous inputs (NARX), and the LSTM network for black box modeling of the relation between the kinematic gait data and the lower body joint angles in the sagittal plane. A brief introduction of the network architectures is provided in the following sections.

2.3.1 Generalized regression neural networks

GRNNs are used in different applications related to modeling, system identification, prediction, and control of dynamic systems [31]. It has been shown that GRNNs can also be applied for joint angle estimation using kinematic data [19]; the GRNN is therefore used in this work as a reference to compare the performance of the different networks. The GRNN is a single-pass neural network which uses a Gaussian activation function in the hidden layer [32]. The modeling process is based on kernel density estimation from a set of inputs xi (kinematic data) and outputs yi (joint angles). The GRNN estimate is the conditional expectation of the output \(\hat {y}_{i}(n)\) given the input xi(n):

$$ \hat{y}_{i}(x) = E[y_{i}|x_{i}] = \int y_{i}\, p(y_{i}|x_{i}) \mathrm{d}y_{i}= \frac{\int y_{i}\,p(y_{i},x_{i})\mathrm{d}y_{i}}{\int p(y_{i},x_{i})\mathrm{d}y_{i}}\, {,} $$
(6)

where p(yi|xi) is the conditional probability density function [32]. It is possible to estimate the joint probability density \(\hat {p}(y_{i},x_{i})\) given by:

$$ \hat{p}(y_{i},x_{i}) = \frac{1}{K} \sum_{i=1}^{K}\frac{1}{(2\pi)^{(D_{i}+1)/2}\epsilon^{(D_{i}+1)}}e^{-\left(\frac{(y-y_{i})^{2}+\left\|x-x_{i}\right\|^{2}}{2\epsilon^{2}}\right)}\, {.} $$
(7)

After simplifying the integrals in the numerator and denominator, the final expression of the estimator is given by:

$$ \hat{y}_{i}(x)= \frac{\sum_{i=1}^{K}y_{i} e^{\left(-\left\|x-x_{i}\right\|^{2} / 2\epsilon^{2}\right)}}{\sum_{i=1}^{K}e^{\left(-\left\|x-x_{i}\right\|^{2} / 2\epsilon^{2}\right)}}\, {,} $$
(8)

where the parameter ε is the bandwidth of the Gaussian kernel.
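The estimator of Eq. (8) is a kernel-weighted average of the training targets and can be sketched in a few lines. The sketch assumes NumPy; the function name `grnn_predict` and the toy data are hypothetical, and the bandwidth of 1.3 is used only as an example value.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, eps):
    """GRNN estimate: Gaussian-weighted average of the training targets.

    x_train: (K, D) inputs, y_train: (K, M) targets, x_query: (Q, D) queries.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Squared Euclidean distances between every query and every training input.
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * eps ** 2))                 # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1, keepdims=True)

# Toy data: three 1-D training inputs with one target each.
x_tr = np.array([[0.0], [1.0], [2.0]])
y_tr = np.array([[0.0], [10.0], [20.0]])
y_mid = grnn_predict(x_tr, y_tr, np.array([[1.0]]), eps=1.3)
y_left = grnn_predict(x_tr, y_tr, np.array([[0.0]]), eps=1.3)
```

A small bandwidth makes the estimate follow the nearest training sample, while a large bandwidth pulls it toward the mean of all targets, which is why ε has to be tuned.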

2.3.2 Recurrent neural network

Recurrent neural networks (RNNs) can represent time series, audio, video, and anything else that is presented as a data sequence. In sequence data, the present values depend on their past values, as is the case for the joint angles. RNNs are able to learn arbitrary nonlinear dynamical mappings, such as those commonly found in nonlinear time series prediction [33]. They are of interest not only for the prediction of time series but also, more generally, for the control of dynamical systems. In [34, 35], the authors explored the possibilities of knee and ankle angle prediction using the surface electromyography signal by applying NARX and LSTM networks. They demonstrated the applicability of recurrent neural network-based nonlinear models for predicting human lower limb joint angles. Compared with feedforward neural networks (FNNs), where the data flow occurs only in one direction, RNNs apply a back-coupling which results in an asynchronous data flow between nodes. The architecture of a simple RNN is similar to that of an MLP, except that the output of each neuron in the hidden layer is fed back to itself with a weight and a time delay as depicted in Fig. 6. The feedback of previous hidden values (memory effect) allows the network to learn the temporal dynamics of sequential data. An RNN maps an input sequence \(\boldsymbol {x}_{t}^{l}\) to a sequence of hidden values \(\mathbf {h}_{t}^{l}=(\boldsymbol {h}_{1}^{l},\ldots,\boldsymbol {h}_{T}^{l})\) and outputs a sequence \(\boldsymbol {y}_{t}^{l}\) iteratively using the following equations:

$$\begin{array}{*{20}l} \boldsymbol{h}_{t}^{l}&=\phi\left(W_{xh}^{l}\boldsymbol{x}_{t}+W_{hh}^{l}\boldsymbol{h}_{t-1}^{l}+\boldsymbol{b}_{h}^{l}\right) \, {,} \end{array} $$
(9)
Fig. 6

Architecture of an RNN with an input, a hidden, and an output layer. The network maps the input sequence \(\boldsymbol {x}_{D_{i},t}\) to a hidden sequence \(\boldsymbol {h}_{n,t}\) and to a sequence of outputs \(\boldsymbol {y}_{m,t}\). The parameters \(D_{i}\), n, and m are the number of signals, the number of hidden units, and the number of outputs, respectively. \(W_{xh}\), \(W_{hh}\), and \(W_{hy}\) are the input to hidden, hidden to hidden, and hidden to output matrices, respectively. The bias vectors of the network are represented by \(\boldsymbol {b}_{h}\) for the hidden layer and by \(\boldsymbol {b}_{y}\) for the output layer

$$\begin{array}{*{20}l} \boldsymbol{\hat{y}}^{l+1}_{t}&=W_{hy}^{l}\boldsymbol{h}_{t}^{l}+\boldsymbol{b}_{y}^{l}\, {,} \end{array} $$
(10)

for t=1,2,…,T, where ϕ(.) is the hidden layer activation function, \(W_{xh}^{l}\) is the input to hidden weight matrix, \(W_{hh}^{l}\) is the hidden to hidden weight matrix, \(W_{hy}^{l}\) is the hidden to output weight matrix, \(\boldsymbol {b}_{h}^{l}\) is the hidden bias vector, and \(\boldsymbol {b}_{y}^{l}\) is the bias vector of the output.
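The recursion of Eqs. (9) and (10) for a single layer can be sketched as a plain loop over time. The sketch assumes NumPy, a tanh activation, and a zero initial hidden state; none of these choices is stated in the text, and the dimensions in the example are arbitrary.

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, W_hy, b_h, b_y, phi=np.tanh):
    """Run a single-layer RNN over a sequence x_seq of shape (T, D)."""
    h = np.zeros(W_hh.shape[0])          # ASSUMPTION: zero initial hidden state
    outputs = []
    for x_t in x_seq:
        h = phi(W_xh @ x_t + W_hh @ h + b_h)   # Eq. (9): hidden state update
        outputs.append(W_hy @ h + b_y)         # Eq. (10): linear output layer
    return np.array(outputs)

# Example dimensions: D = 6 input signals, n = 4 hidden units, m = 3 outputs.
rng = np.random.default_rng(0)
D, n, m, T = 6, 4, 3, 100
y_seq = rnn_forward(rng.normal(size=(T, D)),
                    rng.normal(size=(n, D)), rng.normal(size=(n, n)),
                    rng.normal(size=(m, n)), np.zeros(n), np.zeros(m))
```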

2.3.3 Nonlinear autoregressive networks with exogenous inputs

NARX networks are considered good predictors for time series problems and have been used to model various nonlinear dynamic systems [36]. They can incorporate past values of the estimated output y(t) and of the exogenous inputs x(t). This property makes the NARX network well suited to the modeling problem, as the time history of the inputs (kinematic gait data) and the past values of the output (estimated lower body joint angles) carry a significant amount of information. The general mathematical relationship between inputs and outputs for a NARX neural network model is represented as:

$$ \boldsymbol{\hat{y}}(t) = \phi\left(\boldsymbol{x}(t),\ldots,\boldsymbol{x}(t-p),\boldsymbol{y}(t-1),\ldots,\boldsymbol{y}(t-q)\right)\, {,} $$
(11)

where the value of the estimated output signal \(\boldsymbol {\hat {y}}(t)\) depends on q previous output values and p previous input values. The nonlinear mapping function ϕ(.) is approximated by a feedforward neural network. The tapped delay line (TDL) blocks introduce the past values of the input and output signals (p and q, respectively) to the network. Owing to its advantages over the parallel architecture, such as a more accurate input to the feedforward network, a pure feedforward architecture, and the use of static backpropagation for training, the series-parallel architecture shown in Fig. 7 is considered in this work. The hyperparameters used for the training are explained in Section 2.3.5.
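In the series-parallel configuration, the delayed outputs fed to the network are the measured reference values rather than the network's own estimates, so training reduces to fitting a static mapping on lagged regressors. The sketch below shows how such regressors could be assembled according to Eq. (11). It assumes NumPy; the function name `narx_regressors` and the toy data are hypothetical.

```python
import numpy as np

def narx_regressors(x, y, p, q):
    """Build series-parallel NARX training pairs.

    x: (T, D) exogenous inputs, y: (T, M) reference outputs.
    Each feature row is [x(t), ..., x(t-p), y(t-1), ..., y(t-q)]; the target is y(t).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(p, q)                               # first t with a full history
    rows = []
    for t in range(start, len(x)):
        past_x = x[t - p:t + 1][::-1].ravel()       # x(t), x(t-1), ..., x(t-p)
        past_y = y[t - q:t][::-1].ravel()           # y(t-1), ..., y(t-q)
        rows.append(np.concatenate([past_x, past_y]))
    return np.array(rows), y[start:]

# Toy example: T = 10 samples, D = 2 inputs, M = 1 output, delays p = q = 2.
x_demo = np.arange(20.0).reshape(10, 2)
y_demo = 100.0 * np.arange(10.0).reshape(10, 1)
features, targets = narx_regressors(x_demo, y_demo, p=2, q=2)
```

The resulting `features` and `targets` can be passed to any feedforward regressor, which then plays the role of ϕ(.).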

Fig. 7

A NARX network with series-parallel architecture. The tapped delay line (TDL) blocks introduce past values (memory effect) of the input and output signals to the network

2.3.4 Long short-term memory network

LSTM networks are a special type of RNN. The LSTM cell reads the input time series sequentially and transforms the input data into a hidden state at each time step, whereby the current hidden state is a nonlinear function of the current input and the previous hidden state. The advantage of LSTM networks over other types of RNN is that the dependency of the current on the previous hidden state is designed in such a way that the LSTM can keep parts of its hidden state over a larger number of time steps than is possible with other RNN architectures, such as NARX. In [18], this type of network was used to estimate the lower body joint angles with simulated kinematic data obtained from the markers of a camera system. The main cell of an LSTM, shown in Fig. 8, is made of input, output, and forget gates. The concept of gates was introduced to avoid the problems with vanishing or exploding gradients [37]. The LSTM cell remembers values over an arbitrary interval of time, and the gates can be seen as neurons with an activation function based on the current data xt, a hidden state ht−1 from the previous iteration, the weight matrices Wij, and the biases bi associated with the gates i and j. The activation functions are sigmoid (σ) or tanh. The gates regulate the flow of values through the LSTM connections and control which operation is performed by the cell at each iteration. For the sake of clarity, the superscript l has been omitted. For an LSTM cell, the evolution of its parameters is determined at each iteration by:

$$\begin{array}{*{20}l} \boldsymbol{\mathrm{i}}_{t}&=\phi\left(W_{xi}\boldsymbol{x}_{t}+W_{hi}\boldsymbol{h}_{t-1}+\boldsymbol{b}_{i}\right) \end{array} $$
(12)
Fig. 8

LSTM cell. The cell can process data sequentially and keeps its hidden state h over the time

$$\begin{array}{*{20}l} \boldsymbol{\mathrm{f}}_{t}&=\phi\left(W_{xf}\boldsymbol{x}_{t} + W_{hf}\boldsymbol{h}_{t-1} +\boldsymbol{b}_{f}\right) \end{array} $$
(13)
$$\begin{array}{*{20}l} \boldsymbol{\mathrm{o}}_{t}&=\phi\left(W_{xo}\boldsymbol{x}_{t}+W_{ho}\boldsymbol{h}_{t-1}+\boldsymbol{b}_{o}\right) \end{array} $$
(14)
$$\begin{array}{*{20}l} \boldsymbol{\mathrm{g}}_{t}&=\phi\left(W_{xg}\boldsymbol{x}_{t}+W_{hg}\boldsymbol{h}_{t-1}+\boldsymbol{b}_{g}\right) \end{array} $$
(15)
$$\begin{array}{*{20}l} \boldsymbol{\mathrm{c}}_{t}&=\boldsymbol{\mathrm{f}}_{t}\boldsymbol{\mathrm{c}}_{t-1}+\boldsymbol{\mathrm{i}}_{t}\boldsymbol{\mathrm{g}}_{t} \end{array} $$
(16)
$$\begin{array}{*{20}l} \boldsymbol{\mathrm{y}}_{t}&=\boldsymbol{h}_{t}=\boldsymbol{o}_{t}\phi(\boldsymbol{c}_{t})\, {,} \end{array} $$
(17)

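where \(\boldsymbol{\mathrm{i}}_{t}\), \(\boldsymbol{\mathrm{f}}_{t}\), and \(\boldsymbol{\mathrm{o}}_{t}\) are the input, forget, and output gates, \(\boldsymbol{\mathrm{g}}_{t}\) is the cell candidate, and \(\boldsymbol{\mathrm{c}}_{t}\) is the cell state. The weights Wij and biases bi of the gate connections are learned or updated during the network training.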
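One iteration of Eqs. (12) to (17) can be sketched as follows. The sketch assumes NumPy and the common assignment of activations, namely a sigmoid for the three gates and tanh for the cell candidate and the output nonlinearity; the text only states that the activations are sigmoid or tanh, so this assignment is an assumption. The dictionary layout of the weights is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM iteration. W maps keys such as 'xi', 'hi' to matrices, b maps gates to biases."""
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + b["i"])   # input gate, Eq. (12)
    f_t = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + b["f"])   # forget gate, Eq. (13)
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + b["o"])   # output gate, Eq. (14)
    g_t = np.tanh(W["xg"] @ x_t + W["hg"] @ h_prev + b["g"])   # cell candidate, Eq. (15)
    c_t = f_t * c_prev + i_t * g_t                             # cell state, Eq. (16)
    h_t = o_t * np.tanh(c_t)                                   # hidden state, Eq. (17)
    return h_t, c_t

# Example with D = 6 inputs and n = 4 hidden units; all weights and biases are zero.
D, n = 6, 4
W = {k: np.zeros((n, D)) for k in ("xi", "xf", "xo", "xg")}
W.update({k: np.zeros((n, n)) for k in ("hi", "hf", "ho", "hg")})
b = {k: np.zeros(n) for k in ("i", "f", "o", "g")}
h1, c1 = lstm_step(np.ones(D), np.zeros(n), np.ones(n), W, b)
```

With zero weights every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved; this makes the role of the forget gate easy to verify by hand.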

2.3.5 Training

The networks analyzed in this study were implemented in Matlab. They were trained to model the relation between the kinematic gait data from one IMU placed on the foot and the lower body joint angles in the sagittal plane. Five different sets of kinematic signals (see Table 2) were used as input and the reference lower body joint angles in the sagittal plane as output. The input sets were split 80%/20% into training and test data, respectively. The performance metric used to compare the different networks is the RMSE, which is calculated according to:

$$ \text{RMSE}=\sqrt{\frac{1}{L}\sum_{n=1}^{L}\left(y(n)-\hat{y}(n)\right)^{2}}\, {,} $$
(18)

where L is the length of the signals, and y(n) and \(\hat {y}(n)\) are the reference and estimated signals, respectively. The evaluation of the different networks is based on a 10-fold cross-validation scheme to account for random influences in the data. Because the ranges of motion of the lower limbs and the amplitudes of the input signals differ, the training and test signals are normalized separately to the range [−1,1] using the following equation:

$$ y'=2\frac{y-\text{min}(y)}{\text{max}(y)-\text{min}(y)}-1 $$
(19)

where y′ is the normalized reference signal and y is the original one. After the training phase, the signals were scaled back to their original amplitudes. Since ε is the only free parameter of the GRNN, a grid search was used to find its optimal value, which was determined experimentally and amounts to 1.3.
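The RMSE of Eq. (18) and the scaling of Eq. (19), together with its inverse for scaling the signals back, can be sketched as follows. The sketch assumes NumPy; the function names are hypothetical, and the inverse mapping is obtained by solving Eq. (19) for y.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error between reference and estimated signals, Eq. (18)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def normalize(y):
    """Scale a signal to [-1, 1], Eq. (19); also return the bounds for the inverse."""
    y = np.asarray(y, dtype=float)
    lo, hi = y.min(), y.max()
    return 2.0 * (y - lo) / (hi - lo) - 1.0, (lo, hi)

def denormalize(y_norm, bounds):
    """Invert Eq. (19): map values in [-1, 1] back to the original amplitudes."""
    lo, hi = bounds
    return (np.asarray(y_norm, dtype=float) + 1.0) * (hi - lo) / 2.0 + lo

angles = np.array([0.0, 5.0, 10.0])
angles_norm, bounds = normalize(angles)
angles_back = denormalize(angles_norm, bounds)
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```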

The NARX network includes three layers (an input layer, a hidden layer, and an output layer) and a feedback connection enclosing the input layer. In the input layer, a TDL of two samples was experimentally found to achieve the best performance for different numbers of neurons. A hyperbolic tangent function and a linear function were selected as the transfer functions of the hidden and output layers. After an iterative process to choose the number of neurons of the hidden layer, the optimal number of neurons was set to 100. The training function applied was the scaled conjugate gradient [38].

The LSTM network was built with three LSTM layers, a fully connected layer, and a linear layer. To prevent overfitting, dropout layers were used after each LSTM layer. This technique effectively samples a large number of thinned architectures on the hidden layers by randomly dropping nodes during training. The dropout rate was fixed to 0.3. The LSTM network was trained using the Adam optimization method [39], which minimizes an error function that depends on the network parameters. The error measures the difference between the reference and the estimation on the training input set. The backpropagation algorithm changes the weights and biases of all layers with the goal of minimizing the error. In practice, only random subsets of the training data, called mini-batches, are given to the optimization algorithm in one iteration of the training phase to speed up learning [40]. Different mini-batch sizes were analyzed, and the best trade-off between speed and performance was found to be 100. The weights were initialized using the Xavier method [37]. The learning rate was set to 0.01. The number of training epochs was set to 50, which was found to achieve a good trade-off between generalization and estimation accuracy while avoiding overfitting. An early stopping criterion was applied to the training progress by evaluating the validation loss over the validation steps. The training was stopped if there was no improvement in the validation loss during the last 3 validation checks. The configuration of the computer used for training the networks consisted of an Intel®Core 10980XE™, 128 GB RAM and two NVIDIA GeForce RTX 2070 Super.
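The stopping rule described above can be illustrated with a small sketch in plain Python. It assumes that "no improvement" means the validation loss did not fall below the best value seen so far; the exact rule applied by the training framework is not stated in the text, and the function name and loss values are hypothetical.

```python
def early_stopping_check(val_losses, patience=3):
    """Index of the validation check at which training stops, or None if it never stops."""
    best = float("inf")
    checks_without_improvement = 0
    for k, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            checks_without_improvement = 0
        else:
            checks_without_improvement += 1
            if checks_without_improvement >= patience:
                return k
    return None

# Illustrative validation losses: the best value is reached at check 2.
stop_at = early_stopping_check([1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6])
never = early_stopping_check([1.0, 0.9, 0.8, 0.7])
```

In the first sequence the three checks after the minimum bring no improvement, so training would stop at check 5 and the later, lower loss is never reached; this is the usual trade-off of a small patience value.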

3 Results

This section presents the results achieved with the neural networks introduced in Section 2.3. Table 3 shows the performance of the different input sets in the nonlinear estimation of the lower body joint angles in the sagittal plane using the proposed networks. Figures 9, 10, and 11 show the joint angle estimation using the GRNN, the proposed NARX network, and the proposed LSTM network, respectively. The blue solid lines represent the reference joint angles, and the red dashed lines represent the estimated joint angles. As seen in Table 3, the LSTM outperforms the other NNs, with an RMSE up to 1.85 lower than that of the GRNN and up to 0.63 lower than that of the NARX network. According to previous studies [14, 41], extending the information from the kinematic signals through transformations and combining accelerometer and gyroscope signals improve the performance of the networks. In this study, the best results in terms of RMSE were obtained with input set 4, which gave average RMSEs of 1.91, 2.12, and 2.57 for the hip, knee, and ankle joint angles, respectively.
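For reference, the RMSE between a reference and an estimated joint angle trajectory can be computed as in the short sketch below. The sample values are invented, and averaging the per-cycle RMSE values with equal weight is an assumption, since the averaging procedure is not specified here.

```python
import math

def rmse(reference, estimate):
    """Root mean square error between two equally long sequences."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))

def average_rmse(cycles):
    """Equal-weight mean of the per-cycle RMSE values (assumed averaging scheme)."""
    values = [rmse(ref, est) for ref, est in cycles]
    return sum(values) / len(values)

# Invented joint angle samples for two gait cycles.
cycle_a = ([10.0, 20.0, 30.0, 20.0], [12.0, 18.0, 31.0, 19.0])
cycle_b = ([5.0, 15.0, 25.0], [5.0, 15.0, 25.0])
error_a = rmse(*cycle_a)
error_mean = average_rmse([cycle_a, cycle_b])
```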

Fig. 9 Estimated lower limb joint angles in the sagittal plane using the GRNN with input set 4 and the reference lower limb joint angles from the wearable system. Blue solid lines and red dashed lines are the reference and the estimated joint angles, respectively

Fig. 10 Estimated lower limb joint angles in the sagittal plane using the NARX with input set 4 and the reference lower limb joint angles from the wearable system. Blue solid lines and red dashed lines are the reference and the estimated joint angles, respectively

Fig. 11 Estimated lower limb joint angles in the sagittal plane using the LSTM with input set 4 and the reference lower limb joint angles from the wearable system. Blue solid lines and red dashed lines are the reference and the estimated joint angles, respectively

Table 3 Average RMSE performance comparison of the different neural networks and input sets

4 Discussion

The aim of this study was to evaluate the efficiency of nonlinear techniques in predicting lower limb joint angles from one IMU placed on the foot and thus to provide an easy-to-use wearable system. To this end, different neural network structures were investigated and an analysis framework was introduced. The first part of the analysis was the segmentation of the gait signals using the gyroscope information to find the IC and TOE events. Afterward, the gait information of each gait cycle was extended by including the norm and the HHT. Five different input sets were fed successively to train the NNs. The prediction of the lower limb joint angles improved with input sets 2 and 4, and the best results were achieved with input set 4. Owing to this extension of information, the NNs were able to learn the nonlinear relationship between the foot movement and the joint angles and to reduce the estimation error. The LSTM performed better than both the NARX network and the GRNN in terms of RMSE. One possible reason for the better performance of the RNNs compared to the GRNN is that GRNNs are single-pass neural networks with no backpropagation. Another reason is that GRNNs do not incorporate previous values in the estimation of the joint angles. The performance difference between the LSTM and NARX networks can be explained by the LSTM cell structure (see Fig. 8). The cell can forget parts of its previously stored memory and add parts of the new information over a larger number of time steps.
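The forgetting and adding behavior of the LSTM cell can be made concrete with a scalar sketch of one time step. It assumes the standard LSTM gate equations, which may differ in detail from the cell shown in Fig. 8; the parameter names and values are hypothetical and chosen only to saturate the gates.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell using the standard gate equations."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    c = f * c_prev + i * g     # keep part of the old memory, add part of the new
    h = o * math.tanh(c)       # hidden state passed on to the next time step
    return h, c

def params(bf, bi):
    """All weights zero; only the forget and input gate biases vary (illustrative)."""
    p = {k: 0.0 for k in ("wf", "uf", "wi", "ui", "wg", "ug", "wo", "uo", "bg", "bo")}
    p["bf"], p["bi"] = bf, bi
    return p

# Forget gate open, input gate closed: the stored memory of 0.8 is retained.
_, c_keep = lstm_cell_step(x=1.0, h_prev=0.0, c_prev=0.8, p=params(bf=20.0, bi=-20.0))
# Forget gate closed: the stored memory is erased.
_, c_forget = lstm_cell_step(x=1.0, h_prev=0.0, c_prev=0.8, p=params(bf=-20.0, bi=-20.0))
```

In a trained network the gate activations depend on the input and the previous hidden state, so the cell decides at each time step how much of its memory to keep; a NARX network has no such gated memory and sees only the samples held in its delay lines.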

Several studies on the estimation of lower limb joint angles have been conducted in recent years. However, they differ from our work in the type of applied sensors, the number and location of IMUs, the NN structure, the number of datasets, and the type of data (simulated/virtual kinematic). In [20], an artificial NN is applied to simulate the progression of angle values in the lower limbs, where the angles extracted from a camera system are the inputs for the network, and the correlation coefficient between the input and output signals serves as the performance measure. In [17], the lower limb joint angles were estimated using a CNN with a camera system, 23 markers, and 9 strain sensors. The inter-participant RMSE results obtained in the sagittal plane were 5.39, 6.38, and 3.92 for the hip, knee, and ankle, respectively. The authors in [22] obtained an RMSE of 7 in the estimation of the knee joint angle with mechanomyography signals and a CNN. The outcomes reported in [18], which are 1.74, 1.92, and 1.80 for the hip, knee, and ankle joint angles in the sagittal plane, are comparable with those in this work. However, the larger dataset applied in their study and the simulated kinematic data obtained from the markers of the camera system, which do not include the soft tissue movements measured by IMUs, could explain the relatively better performance in the estimation of the lower limb joint angles. According to our studies, the pattern and range of motion of the lower limb joint angles vary from one subject to another, and in particular those of the ankle and knee are less consistent than those of the hip. A larger dataset could further improve the estimation results. In addition, the anthropometric differences and various walking styles of the subjects can lead to individual biomechanical gait parameters, and thus to estimation errors, which can be tolerated to a certain degree.

5 Conclusions

The main focus of this work is to investigate the efficiency of machine learning and deep neural networks in the nonlinear modeling of lower limb joint angles in the sagittal plane using the kinematic records of a single IMU placed on the foot. The original contributions of this work are summarized as follows. First, an overall performance assessment framework is established, in which the estimation accuracy is evaluated thoroughly as a function of the applied network and the number and type of datasets. Second, the use of the magnitude and the HHT of the kinematic gait signals to provide additional information for the network, such as the IMF, IF, and IE, is investigated.
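As a small illustration of the first of these additional features, the sketch below appends the per-sample magnitude to a triaxial signal. It assumes that the magnitude is the Euclidean norm of the three axes; the HHT-based features (IMF, IF, IE) are not reproduced here, and the function names and sample values are hypothetical.

```python
import math

def magnitude(samples):
    """Per-sample Euclidean norm of a triaxial signal given as (x, y, z) tuples."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]

def extend_with_magnitude(samples):
    """Append the magnitude as a fourth channel to every sample."""
    return [[x, y, z, m] for (x, y, z), m in zip(samples, magnitude(samples))]

# Invented angular velocity samples from one gait cycle.
gyro = [(3.0, 4.0, 0.0), (1.0, 2.0, 2.0), (0.0, 0.0, 0.0)]
gyro_extended = extend_with_magnitude(gyro)
```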

Three different neural network approaches with different input combinations were compared, and the RMSE was used to assess the estimation accuracy of the lower limb joint angles. The LSTM outperforms the GRNN and NARX networks, with an RMSE up to 1.85 and 0.63 lower, respectively. It is also shown that including the magnitude, the IMF, and the IF of the kinematic signals improves the RMSE by about 0.7 on average. The best results in terms of RMSE were obtained with input set 4, which gave average RMSEs of 1.91, 2.12, and 2.57 for the hip, knee, and ankle joint angles, respectively. According to the evaluation results, LSTM networks estimate lower limb joint angles accurately and have great potential for building digital twins for gait rehabilitation. Future research could focus on lower limb joint angle estimation using CNNs to reduce the estimation error further and to simplify the pre-processing steps on the kinematic data. The application of the proposed framework to a larger number of subjects and to different walking conditions, with the aim of obtaining a one-size-fits-all modeling approach, will be the subject of future studies as well. Looking ahead, this work supports the use of wearable sensors in combination with machine learning techniques for the estimation of lower limb joint angles in the sagittal plane in digital health and rehabilitation applications. Ultimately, the straightforward and easy use of the proposed wearable system in the form of digital twins has considerable practical implications and opens new possibilities for in-field diagnosis and better prevention strategies.

Availability of data and materials

The data that support the findings of this study are available on request from the corresponding author J.C.A. The data are not publicly available because they contain information that could compromise research participant privacy and consent.

Abbreviations

CNN: Convolutional neural network

EMD: Empirical mode decomposition

FNN: Feedforward neural network

GRNN: Generalized regression neural network

HHT: Hilbert-Huang transform

HT: Hilbert transform

IC: Initial contact

IE: Instantaneous energy

IF: Instantaneous frequency

IMF: Intrinsic mode function

IMU: Inertial measurement unit

KF: Kalman filter

LDA: Linear discriminant analysis

LSTM: Long short-term memory

ML: Machine learning

MLP: Multilayer perceptron

MS: Mid-swing

NARX: Nonlinear autoregressive network with exogenous inputs

NN: Neural network

PCA: Principal component analysis

RMSE: Root mean square error

RNN: Recurrent neural network

TDL: Tapped delay line

TOE: Toe off

WSS: Wearable sensor system

References

  1. J. Mendes Jr, M. Vieira, M. Pires, S. Stevan Jr, Sensor Fusion and Smart Sensor in Sports and Biomedical Applications. Sensors. 16(10), 1569 (2016). https://doi.org/10.3390/s16101569.

  2. M. Rana, V. Mittal, Wearable sensors for real-time kinematics analysis in sports: a review. IEEE Sensors J. 21(2), 1187–1207 (2021). https://doi.org/10.1109/JSEN.2020.3019016.

  3. N. Mohammadian Rad, T. Van Laarhoven, C. Furlanello, E. Marchiori, Novelty detection using deep normative modeling for IMU-based abnormal movement monitoring in Parkinson’s disease and autism spectrum disorders. Sensors. 18(10), 3533 (2018). https://doi.org/10.3390/s18103533.

  4. S. Qiu, L. Liu, H. Zhao, Z. Wang, Y. Jiang, MEMS inertial sensors based gait analysis for rehabilitation assessment via multi-sensor fusion. Micromachines. 9(9), 442 (2018). https://doi.org/10.3390/mi9090442.

  5. D. Kobsar, R. Ferber, Wearable sensor data to track subject-specific movement patterns related to clinical outcomes using a machine learning approach. Sensors. 18(9), 2828 (2018). https://doi.org/10.3390/s18092828.

  6. C. Mao, Y. Li, F. Sun, in 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP). Accelerometer-Based Gait Recognition Using PCA LDA Algorithms, (2018), pp. 1–4.

  7. G. Mayer-Kress, Y. T. Liu, K. M. Newell, Complex systems and human movement. Complexity. 12(2), 40–51 (2006). https://doi.org/10.1002/cplx.20151.

  8. J. Conte Alcaraz, S. Moghaddamnia, M. Fuhrwerk, J. Peissig, in 2019 27th European Signal Processing Conference (EUSIPCO). Efficiency of the Memory Polynomial Model in Realizing Digital Twins for Gait Assessment, (2019), pp. 1–5. https://doi.org/10.23919/EUSIPCO.2019.8903143.

  9. A. Baghdadi, L. A. Cavuoto, J. L. Crassidis, Hip and trunk kinematics estimation in gait through Kalman filter using IMU data at the ankle. IEEE Sensors J. 18(10), 4253–4260 (2018).

  10. A. Salarian, P. R. Burkhard, F. J. G. Vingerhoets, B. M. Jolles, K. Aminian, A novel approach to reducing number of sensing units for wearable gait analysis systems. IEEE Trans. Biomed. Eng. 60(1), 72–77 (2013). https://doi.org/10.1109/TBME.2012.2223465.

  11. A. S. Alharthi, S. U. Yunas, K. B. Ozanyan, Deep learning for monitoring of human gait: a review. IEEE Sensors J. 19(21), 9575–9591 (2019). https://doi.org/10.1109/JSEN.2019.2928777.

  12. R. C. Deo, Machine learning in medicine. Circulation. 132(20), 1920–1930 (2015). https://doi.org/10.1161/CIRCULATIONAHA.115.001593.

  13. T. Zebin, P. J. Scully, K. B. Ozanyan, in 2016 IEEE SENSORS. Human activity recognition with inertial sensors using a deep learning approach, (2016), pp. 1–3. https://doi.org/10.1109/ICSENS.2016.7808590.

  14. F. J. Ordóñez, D. Roggen, Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition. Sensors. 16(1), 115 (2016). https://doi.org/10.3390/s16010115.

  15. T. Zebin, M. Sperrin, N. Peek, A. J. Casson, in 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Human activity recognition from inertial sensor time-series using batch normalized deep LSTM recurrent networks, (2018), pp. 1–4. https://doi.org/10.1109/EMBC.2018.8513115.

  16. O. Dehzangi, M. Taherisadr, R. ChangalVala, IMU-based gait recognition using convolutional neural networks and multi-sensor fusion. Sensors. 17, 2735 (2017). https://doi.org/10.3390/s17122735.

  17. M. Gholami, A. Rezaei, T. J. Cuthbert, C. Napier, C. Menon, Lower Body Kinematics Monitoring in Running Using Fabric-Based Wearable Sensors and Deep Convolutional Neural Networks. Sensors. 19(23), 5325 (2019). https://doi.org/10.3390/s19235325.

  18. M. Mundt, W. Thomsen, T. Witter, A. Koeppe, S. David, F. Bamer, W. Potthast, B. Markert, Prediction of lower limb joint angles and moments during gait using artificial neural networks. Med. Biol. Eng. Comput. 58(1), 211–225 (2020). https://doi.org/10.1007/s11517-019-02061-3.

  19. A. Findlow, J. Y. Goulermas, C. Nester, D. Howard, L. P. J. Kenney, Predicting lower limb joint kinematics using wearable motion sensors. Gait & Posture. 28(1), 120–126 (2008). https://doi.org/10.1016/j.gaitpost.2007.11.001.

  20. M. Błażkiewicz, A. Wit, Artificial neural network simulation of lower limb joint angles in normal and impaired human gait. Acta Bioeng. Biomech. 20, 43–49 (2018). https://doi.org/10.5277/ABB-01129-2018-02.

  21. J. Hannink, T. Kautz, C. F. Pasluosta, K. Gaßmann, J. Klucken, B. M. Eskofier, Sensor-based gait parameter extraction with deep convolutional neural networks. IEEE J. Biomed. Health Inform. 21(1), 85–93 (2017). https://doi.org/10.1109/JBHI.2016.2636456.

  22. H. Wu, Q. Huang, D. Wang, L. Gao, in 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC). A CNN-SVM Combined Regression Model for Continuous Knee Angle Estimation Using Mechanomyography Signals, (2019), pp. 124–131. https://doi.org/10.1109/ITNEC.2019.8729426.

  23. J. Conte Alcaraz, S. Moghaddamnia, J. Peissig, in 2017 22nd International Conference on Digital Signal Processing (DSP). Mobile quantification and therapy course tracking for gait rehabilitation, (2017), pp. 1–5. https://doi.org/10.1109/ICDSP.2017.8096106.

  24. J. Perry, J. M. Burnfield, Gait Analysis: Normal and Pathological Function, 2nd ed. (SLACK, Thorofare, 2010).

  25. G. P. Panebianco, M. C. Bisi, R. Stagni, S. Fantozzi, Analysis of the performance of 17 algorithms from a systematic review: influence of sensor position, analyzed variable and computational approach in gait timing estimation from IMU measurements. Gait & Posture. 66, 76–82 (2018).

  26. A. M. Sabatini, C. Martelloni, S. Scapellato, F. Cavallo, Assessment of walking features from foot inertial sensing. IEEE Trans. Biomed. Eng. 52(3), 486–494 (2005). https://doi.org/10.1109/TBME.2004.840727.

  27. A. Goshvarpour, A. Goshvarpour, Nonlinear Analysis of Human Gait Signals. Int. J. Inf. Eng. Electron. Bus. 4, 15–21 (2012). https://doi.org/10.5815/ijieeb.2012.02.03.

  28. N. Huang, Z. Shen, S. R. Long, M. L. C. Wu, H. H. Shih, Q. Zheng, N. C. Yen, C. C. Tung, H. H. Liu, The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. Ser. A: Math. Phys. Eng. Sci. 454, 903–995 (1998). https://doi.org/10.1098/rspa.1998.0193.

  29. G. Huang, C. Wu, J. Lin, in 2012 International Conference on Computerized Healthcare (ICCH). Gait analysis by using tri-axial accelerometer of smart phones, (2012), pp. 29–34. https://doi.org/10.1109/ICCH.2012.6724466.

  30. F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Report (Cornell Aeronautical Laboratory) (Spartan Books, Washington, 1962).

  31. A. J. Al-mahasneh, S. G. Anavatti, M. Pratama, Applications of General Regression Neural Networks in Dynamic Systems (IntechOpen, Rijeka, 2018). https://doi.org/10.5772/intechopen.80258.

  32. D. Specht, A general regression neural network. IEEE Trans. Neural Netw. 2, 568–576 (1991). https://doi.org/10.1109/72.97934.

  33. K. S. Narendra, K. Parthasarathy, Identification and control of dynamical systems using neural networks. IEEE Trans. Neural Netw. 1(1), 4–27 (1990). https://doi.org/10.1109/72.80202.

  34. R. Gupta, I. S. Dhindsa, R. Agarwal, Continuous angular position estimation of human ankle during unconstrained locomotion. Biomed. Signal Process. Control. 60, 101968 (2020).

  35. X. Ma, Y. Liu, Q. Song, C. Wang, Continuous Estimation of Knee Joint Angle Based on Surface Electromyography Using a Long Short-Term Memory Neural Network and Time-Advanced Feature. Sensors. 20(17), 4966 (2020). https://doi.org/10.3390/s20174966.

  36. H. Liu, X. Song, in 2015 10th Asian Control Conference (ASCC). Nonlinear system identification based on NARX network, (2015), pp. 1–6.

  37. X. Glorot, Y. Bengio, in AISTATS, 9. Understanding the difficulty of training deep feedforward neural networks, (2010), pp. 249–256.

  38. M. F. Møller, A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw. 6(4), 525–533 (1993). https://doi.org/10.1016/S0893-6080(05)80056-5.

  39. D. P. Kingma, J. Ba, in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, ed. by Y Bengio, Y LeCun. Adam: A Method for Stochastic Optimization, (2015), pp. 1–15. http://arxiv.org/abs/1412.6980.

  40. I. Goodfellow, Y. Bengio, A. Courville, Deep Learning (MIT Press, Cambridge, 2016).

  41. S. -M. Lee, S. M. Yoon, H. Cho, in 2017 IEEE International Conference on Big Data and Smart Computing (BigComp). Human activity recognition from accelerometer data using convolutional neural network, (2017), pp. 131–134. https://doi.org/10.1109/BIGCOMP.2017.7881728.

Acknowledgements

The authors would like to acknowledge all the participants for their contributions to this research study.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors’ contributions

Author contributions were as follows: conceptualization, J.C.A.; methodology, J.C.A.; validation, J.C.A.; formal analysis, J.C.A.; investigation, J.C.A.; data acquisition, J.C.A.; writing (original draft preparation), J.C.A. and S.M.; writing (review and editing), J.C.A., S.M. and J.P.; visualization, J.C.A.; supervision, J.P. and S.M. All authors approved the final, submitted version of the manuscript.

Corresponding author

Correspondence to Javier Conte Alcaraz.

Ethics declarations

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Conte Alcaraz, J., Moghaddamnia, S. & Peissig, J. Efficiency of deep neural networks for joint angle modeling in digital gait assessment. EURASIP J. Adv. Signal Process. 2021, 10 (2021). https://doi.org/10.1186/s13634-020-00715-1

  • DOI: https://doi.org/10.1186/s13634-020-00715-1

Keywords