### 2.1 MEMS sensor

A photograph of the MEMS sensor is shown in Fig. 1. The microcontroller used in this article is the STMicroelectronics STM32L151, an ultra-low-power 32-bit MCU built around a high-performance ARM Cortex-M3 RISC core. Its operating frequency ranges from 32 kHz to 32 MHz. It integrates USB connectivity, a memory protection unit (MPU), high-speed embedded memory (512 KB of flash memory and 80 KB of RAM), and enhanced I/O and peripherals connected to two APB buses. The chip offers good real-time performance, high efficiency, and a high level of integration, which makes it suitable for wearable smart devices [5, 6].

Assume that the probability density function at time k − 1 is *p*(*x*_{k − 1}| *Y*_{k − 1}). The predicted density *p*(*x*_{k}| *Y*_{k − 1}) is obtained from *p*(*x*_{k − 1}| *Y*_{k − 1}), and, given *x*_{k − 1}, the state *x*_{k} is independent of *Y*_{k − 1} [7]:

$$ p\left(\left.{x}_k,{x}_{k-1}\right|{Y}_{k-1}\right)=p\left(\left.{x}_k\right|{x}_{k-1},{Y}_{k-1}\right)p\left(\left.{x}_{k-1}\right|{Y}_{k-1}\right)=p\left(\left.{x}_k\right|{x}_{k-1}\right)p\left(\left.{x}_{k-1}\right|{Y}_{k-1}\right) $$

(1)

Integrating over *x*_{k − 1} gives the Chapman-Kolmogorov (CK) equation:

$$ p\left(\left.{x}_k\right|{Y}_{k-1}\right)=\int p\left(\left.{x}_k\right|{x}_{k-1}\right)p\left(\left.{x}_{k-1}\right|{Y}_{k-1}\right){dx}_{k-1} $$

(2)

where *p*(*x*_{k}| *x*_{k − 1}) is the state transition probability, which is determined by the system state transition equation and includes the state noise [8].

Bayes’ formula is then used to update the prior probability density and obtain the posterior probability density [9]:

$$ p\left(\left.{x}_k\right|{Y}_k\right)=\frac{p\left(\left.{y}_k\right|{x}_k,{Y}_{k-1}\right)p\left(\left.{x}_k\right|{Y}_{k-1}\right)}{p\left(\left.{y}_k\right|{Y}_{k-1}\right)} $$

(3)

According to the observation equation, *y*_{k} depends only on *x*_{k} and the noise, so the expression simplifies to [10]:

$$ p\left(\left.{x}_k\right|{Y}_k\right)=\frac{p\left(\left.{y}_k\right|{x}_k\right)p\left(\left.{x}_k\right|{Y}_{k-1}\right)}{p\left(\left.{y}_k\right|{Y}_{k-1}\right)} $$

(4)

In the formula, *p*(*y*_{k}| *Y*_{k − 1}) = ∫ *p*(*y*_{k}| *x*_{k})*p*(*x*_{k}| *Y*_{k − 1})*dx*_{k} is the normalizing constant, and *p*(*y*_{k}| *x*_{k}) is the likelihood, which represents the similarity between the current system state and the actual measured value. It is determined by the observation equation and includes the observation noise [11].
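The prediction and update steps in Eqs. (2) and (4) can be illustrated with a minimal sketch over a finite state space, where the integrals become sums. The two-state model, the transition matrix, and the likelihood values below are hypothetical and chosen only for illustration; they are not taken from this article.

```python
def predict(prior, transition):
    """Eq. (2) on a finite state space.

    prior[j]         = p(x_{k-1} = j | Y_{k-1})
    transition[j][i] = p(x_k = i | x_{k-1} = j)
    Returns p(x_k = i | Y_{k-1}) for every state i.
    """
    n = len(prior)
    return [sum(transition[j][i] * prior[j] for j in range(n)) for i in range(n)]


def update(predicted, likelihood):
    """Eq. (4): weight the prediction by p(y_k | x_k), then normalise."""
    unnormalised = [l * p for l, p in zip(likelihood, predicted)]
    evidence = sum(unnormalised)  # p(y_k | Y_{k-1}), the denominator of Eq. (4)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the prediction")
    return [u / evidence for u in unnormalised]


# Hypothetical example: state 0 = "still", state 1 = "moving".
transition = [[0.9, 0.1],
              [0.2, 0.8]]
posterior_prev = [0.5, 0.5]   # p(x_{k-1} | Y_{k-1})
likelihood = [0.3, 0.7]       # p(y_k | x_k) for the observed y_k

predicted = predict(posterior_prev, transition)   # p(x_k | Y_{k-1})
posterior = update(predicted, likelihood)         # p(x_k | Y_k)
print(predicted, posterior)
```

For continuous states the same two steps apply, with the sums replaced by the integrals in Eqs. (2) and (4).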

When collecting data, factors such as the stability of the sensor and where it is worn affect gait classification and recognition. Different wearing positions yield different acceleration data, which directly affects the effectiveness of recognition and classification. Acceleration sensors are often worn on the arms, wrists, waist, chest, and other positions. Energy consumption also affects the size of the sensor power supply module and the overall module size. In addition, the data acquisition module must have sufficient memory to store the collected data and related software programs [12, 13].

Take three consecutive frames from the human motion image sequence and denote them as frames k + 1, k, and k − 1. The frame difference method is calculated as [14]:

$$ G\left(x,y\right)=\left[{f}_{k+1}\left(x,y\right)-{f}_k\left(x,y\right)\right]+\left[{f}_k\left(x,y\right)-{f}_{k-1}\left(x,y\right)\right] $$

(5)

$$ H\left(x,y\right)=\left\{\begin{array}{ll}1, & \left|G\left(x,y\right)\right|>T\\ {}0, & \left|G\left(x,y\right)\right|\le T\end{array}\right. $$

(6)

$$ H\left(x,y\right)=\left\{\begin{array}{ll}1, & {T}_1\le \left|G\left(x,y\right)\right|\le {T}_2\\ {}0, & \mathrm{otherwise}\end{array}\right. $$

(7)

In the formulas, *G*(*x*, *y*) is the three-frame difference image, *H*(*x*, *y*) is the binarized result, *T*, *T*_{1}, and *T*_{2} are thresholds, *f*_{k}(*x*, *y*) is the gray value of frame k in the human motion image sequence, and (*x*, *y*) is the pixel position [15].
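Eqs. (5) to (7) can be sketched on small grayscale frames stored as nested lists. The frame contents and the threshold values below are hypothetical; a real pipeline would more likely operate on image arrays from a camera library.

```python
def three_frame_difference(f_prev, f_curr, f_next):
    """Eq. (5): G = (f_{k+1} - f_k) + (f_k - f_{k-1}), evaluated per pixel."""
    return [
        [(n - c) + (c - p) for p, c, n in zip(row_p, row_c, row_n)]
        for row_p, row_c, row_n in zip(f_prev, f_curr, f_next)
    ]


def binarize(G, T):
    """Eq. (6): 1 where |G| > T, otherwise 0."""
    return [[1 if abs(g) > T else 0 for g in row] for row in G]


def binarize_band(G, T1, T2):
    """Eq. (7): 1 where T1 <= |G| <= T2, otherwise 0."""
    return [[1 if T1 <= abs(g) <= T2 else 0 for g in row] for row in G]


# Hypothetical 2 x 3 grayscale frames k-1, k, k+1.
f_prev = [[10, 10, 10], [10, 10, 10]]
f_curr = [[10, 30, 10], [10, 10, 10]]
f_next = [[10, 10, 50], [210, 10, 10]]

G = three_frame_difference(f_prev, f_curr, f_next)
H = binarize(G, T=30)
H_band = binarize_band(G, T1=20, T2=100)
print(G, H, H_band)
```

The sketch follows Eq. (5) literally. As written, the two bracketed differences cancel the middle frame, so *G* equals the difference between frames k + 1 and k − 1. The band threshold of Eq. (7) additionally rejects very large changes, such as the pixel that jumps by 200 in this example.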

Since the magnetic interference field of the carrier has a large impact on the accuracy of the magnetometer output, this error is modeled as follows [16]:

$$ \Delta \psi =A+B\ \sin\ {\psi}_m+C\ \cos\ {\psi}_m+D\ \sin\ 2{\psi}_m+E\ \cos\ 2{\psi}_m $$

(8)

where *ψ*_{m} is the heading output by the magnetometer, *A* is the circular (constant) deviation, *B* sin *ψ*_{m} + *C* cos *ψ*_{m} is the semicircular deviation, and *D* sin 2*ψ*_{m} + *E* cos 2*ψ*_{m} is the quadrantal deviation [7].
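As a rough illustration of how such a deviation model could be applied, the sketch below evaluates the standard five-coefficient form Δψ = A + B sin ψ_m + C cos ψ_m + D sin 2ψ_m + E cos 2ψ_m and subtracts it from the measured heading. The coefficient values are hypothetical, all angles are assumed to be in degrees, and the sign convention (corrected heading = measured heading minus Δψ) is an assumption rather than something stated in this article.

```python
import math


def heading_deviation(psi_m_deg, A, B, C, D, E):
    """Heading deviation in degrees for a measured heading psi_m (degrees)."""
    psi = math.radians(psi_m_deg)
    return (A
            + B * math.sin(psi) + C * math.cos(psi)            # semicircular terms
            + D * math.sin(2 * psi) + E * math.cos(2 * psi))   # quadrantal terms


def corrected_heading(psi_m_deg, coeffs):
    """Assumed convention: subtract the modeled deviation, wrap to [0, 360)."""
    return (psi_m_deg - heading_deviation(psi_m_deg, *coeffs)) % 360.0


# Hypothetical coefficients (degrees); in practice they come from a calibration fit.
coeffs = (1.0, 2.0, -1.5, 0.5, 0.25)

dev_90 = heading_deviation(90.0, *coeffs)
head_90 = corrected_heading(90.0, coeffs)
print(dev_90, head_90)
```

In practice the five coefficients would be estimated from calibration data, for example by least squares over headings collected while rotating the carrier through a full turn.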

### 2.2 Gymnastics performance

Performing gymnastics is a form of sports performance that integrates elements of gymnastics events with the performing arts. It is based on gymnastics event elements, takes performance as its purpose, and uses sports content as performance material, presented through artistic means, to reflect sports culture. It is thus a form of sports culture and art. Beyond mastering correct technique, athletes’ development of difficult movements is also closely related to whether their physical fitness matches the demands of those movements [17].

Physical stamina and special physical quality are important conditions that determine the formation of technical movements. Without improving them, higher-level difficult movements cannot be completed. Every new development and every update of difficult movements builds on the corresponding special physical quality of the athlete. Special physical fitness therefore plays a vital role when athletes complete a series of movements, and it is the basis and guarantee for completing difficult movements [18, 19].

Performing gymnastics is a form of expression that presents sports culture and sports art, using sports performances as artistic material. The technical movements, costumes, props, and music of aerobics, cheerleading, sports dance, group gymnastics, recreational gymnastics, rhythmic gymnastics, and other events constitute its main elements. In this study, performing gymnastics is understood as weakening the competitive nature of these events and paying more attention to performance, making them more watchable and entertaining [20, 21].

Through the means of artistic performance, the content and theme of the performance are reflected. The theme is expressed in depth and presented as a three-dimensional picture, giving the audience an immersive experience of its deeper influence. Cultural connotations are expressed through vivid body language, whereas music and clothing are often neglected by choreographers and serve only to meet the needs of the performance theme [22]. Gymnastics movements are varied, and different levels involve different movements and different degrees of difficulty. It is difficult for teachers to demonstrate every level or every set of movements, the quality of the movements they can perform is not necessarily high, and they cannot remember every point of gymnastics theory, which causes trouble when demonstrating and explaining to students [23].

### 2.3 Motion capture

Different camera equipment and shooting scenes affect the quality of image acquisition. The image sensor also has a considerable impact on the results of moving human body detection, and the choice of light source in the shooting scene directly affects the image preprocessing process [24].

Using the one-dimensional centered template, the gradients in the x and y directions at the pixel position (*x*, *y*) are expressed as follows [25]:

$$ {G}_x\left(x,y\right)=H\left(x+1,y\right)-H\left(x-1,y\right) $$

(9)

$$ {G}_y\left(x,y\right)=H\left(x,y+1\right)-H\left(x,y-1\right) $$

(10)

where H represents an image and H(x, y) represents the gray value of the image at the pixel point (x, y). The gradient value at the pixel (x, y) can be calculated by the following formula [26].

$$ G\left(x,y\right)=\sqrt{G_x{\left(x,y\right)}^2+{G}_y{\left(x,y\right)}^2} $$

(11)

The gradient direction of the pixel (x, y) in the sample is:

$$ \theta \left(x,y\right)={\tan}^{-1}\left(\frac{G_y\left(x,y\right)}{G_x\left(x,y\right)}\right) $$

(12)
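Eqs. (9) to (12) can be sketched for a single interior pixel as follows. The image is stored as a nested list indexed `H[y][x]`, which is an assumption about layout, and the pixel values are hypothetical. The sketch uses `math.atan2` in place of the plain arctangent of Eq. (12): the two agree when G_x > 0, and `atan2` also handles G_x = 0 without a division by zero.

```python
import math


def gradient_at(H, x, y):
    """Central-difference gradient at an interior pixel (x, y) of image H."""
    gx = H[y][x + 1] - H[y][x - 1]     # Eq. (9)
    gy = H[y + 1][x] - H[y - 1][x]     # Eq. (10)
    magnitude = math.hypot(gx, gy)     # Eq. (11): sqrt(gx^2 + gy^2)
    direction = math.atan2(gy, gx)     # Eq. (12), in radians, via atan2
    return gx, gy, magnitude, direction


# Hypothetical 3 x 3 grayscale patch; only the centre pixel is an interior pixel.
H = [[0, 2, 0],
     [1, 9, 4],
     [0, 6, 0]]

gx, gy, magnitude, direction = gradient_at(H, 1, 1)
print(gx, gy, magnitude, direction)
```

Border pixels have no neighbor on one side, so a full implementation would need a boundary rule such as padding or skipping the outermost rows and columns.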

The motion capture system based on MEMS sensors relies entirely on inertial sensors to collect human motion data. Sensors must therefore be installed on the moving limbs that are to be captured. As a result, many data collection and data transmission lines are attached to the limbs during motion capture, which inevitably restricts the movement of the human body [27].

The schematic diagram of human action capture is shown in Fig. 2. Deploying the sensor nodes of the motion capture device on the human body is the basic step in human motion capture. The location and number of sensor nodes deployed on the body directly determine the accuracy of the captured motion [28]. The accuracy required varies with the application, but when motion capture devices are used in medical rehabilitation, film and television production, and similar applications, the higher the accuracy, the better. Higher accuracy requires more sensor nodes, which generate more data and increase the amount of computation on the computer. The data sink node, as the central node, aggregates and forwards data. It reads the data collected by each sensor node in a defined order and packages it in a defined format. The sink node also has a wireless transmission module, which sends the packaged results to the computer terminal in real time. The host computer program processes the sensor data sent by the sink node and synthesizes the human body posture data [29]. It then drives the 3D human body model to perform the same action as the user synchronously, handling both the bulk sensor data computation and the human body action simulation.
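The packaging step at the sink node can be illustrated with a minimal sketch. The article does not specify the packet format, so the layout below (two sync bytes, a node count, a millisecond timestamp, then one fixed-size record per node in ascending node-ID order) is entirely hypothetical, as are the field widths and the six 16-bit values per node.

```python
import struct

# Hypothetical frame layout, little-endian, no padding:
#   header: 2 sync bytes, uint8 node count, uint32 timestamp in ms
#   record: uint8 node id, 3 x int16 acceleration, 3 x int16 angular rate
HEADER = struct.Struct("<2sBI")
NODE = struct.Struct("<B6h")
SYNC = b"\xaa\x55"


def pack_frame(timestamp_ms, readings):
    """Pack {node_id: (ax, ay, az, gx, gy, gz)} in ascending node-id order."""
    ordered = sorted(readings.items())
    payload = b"".join(NODE.pack(node_id, *values) for node_id, values in ordered)
    return HEADER.pack(SYNC, len(ordered), timestamp_ms) + payload


def unpack_frame(frame):
    """Inverse of pack_frame, as the host computer program might apply it."""
    sync, count, timestamp_ms = HEADER.unpack_from(frame, 0)
    if sync != SYNC:
        raise ValueError("bad sync bytes")
    readings = {}
    for i in range(count):
        node_id, *values = NODE.unpack_from(frame, HEADER.size + i * NODE.size)
        readings[node_id] = tuple(values)
    return timestamp_ms, readings


# Hypothetical raw readings from two sensor nodes.
readings = {2: (1, -2, 3, 4, 5, -6), 1: (100, 200, -300, 0, 0, 1)}
frame = pack_frame(1234, readings)
decoded = unpack_frame(frame)
print(len(frame), decoded)
```

A fixed-size record per node keeps the frame length predictable, which makes it simple for the receiver to locate each node's data. A real link would typically also carry a checksum, which is omitted here.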