Recognition of Spiral Bevel Gear Damage Fault: A Comprehensive Analysis of the MSB-CNN Approach

1. Introduction

Spiral bevel gears play a crucial role in mechanical systems such as marine power systems, helicopters, aircraft engines, and automotive transmissions, owing to advantages such as smooth operation, high transmission ratio, large torque transmission, reliability, and compact structure. However, under long-term heavy-load and variable-load operating conditions, they are prone to tooth surface damage faults. The consequences range from equipment vibration and reduced transmission performance to equipment damage and, in the worst case, danger to human life. Accurately identifying the damage degree of spiral bevel gears is therefore of great significance for ensuring their safe and stable operation.

Traditional methods for diagnosing spiral bevel gear faults face two challenges. First, the complex meshing process and working environment of spiral bevel gears produce strong background noise, high nonlinearity, and non-stationarity in fault vibration signals, making fault characteristics difficult to identify. Second, traditional fault pattern recognition methods struggle to establish the complex mapping relationships between faults and signals. With the development of deep learning, the convolutional neural network (CNN) has shown great potential in pattern recognition. Combining it with the modulation signal bispectrum (MSB), which can effectively handle nonlinear data and suppress noise, gives a new approach to spiral bevel gear damage degree recognition.

2. Theoretical Background

2.1 MSB Theory

MSB is an improved bispectrum method derived from the second-order power spectrum. It has strong demodulation ability and can extract the fault characteristics of rotating machinery even in a strong noise environment. For a discrete-time vibration signal \(x(t)\), its Fourier transform \(X(f)\) is defined as \(X(f) = \sum_{t=-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\).
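
For reference, the MSB is usually written in terms of the Fourier transform of the signal as shown below. This is the form commonly given in the MSB literature, not a formula recovered from this article; the symbols \(f_c\) (carrier frequency), \(f_x\) (modulating frequency), \(X^{*}\) (complex conjugate) and \(E[\cdot]\) (average over signal segments) are assumed notation.

\[
B_{MS}(f_c, f_x) = E\left[ X(f_c + f_x)\, X(f_c - f_x)\, X^{*}(f_c)\, X^{*}(f_c) \right]
\]

The phase of each term is \(\varphi(f_c + f_x) + \varphi(f_c - f_x) - 2\varphi(f_c)\). For sidebands produced by modulation this phase stays constant from segment to segment, so the terms add up, whereas for uncoupled noise the phase is random and the average tends toward zero. This is one way to read the noise-suppression property described above.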

2.2 CNN Introduction

A typical CNN consists of an input layer, convolutional layers, pooling layers, fully-connected layers, and an output layer. The functions of each layer are as follows:

Layer Name	Function
Input Layer	Pre-processes the original data, mainly by mean subtraction and normalization, to make the data more suitable for training.
Convolutional Layer	The core layer of the CNN. It extracts local features of the data through convolution operations. Each neuron in the convolutional layer acts as a filter that slides over the data matrix with a set step size and computes a feature value for each window.
Pooling Layer	Usually placed after a convolutional layer. By taking the maximum or average value within each window, it compresses the data and reduces dimensionality while retaining the main features, which also helps limit overfitting.
Fully-Connected Layer	Usually the last one or two layers of the CNN. Every neuron in one fully-connected layer is connected to every neuron in the next. It is used for classifying and labeling samples and for solving nonlinear problems.
Output Layer	Gives the final classification result of the model.
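
The convolution and pooling rows of the table can be made concrete with a small pure-Python sketch. It is only an illustration: the 4×4 input, the 2×2 filter and the function names `conv2d` and `max_pool` are made up for this example, no padding is assumed, and the filter is applied without flipping (cross-correlation), as is usual in CNN frameworks.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(0, len(image) - kh + 1, stride):
        row = []
        for c in range(0, len(image[0]) - kw + 1, stride):
            # One feature value per window: element-wise product, then sum.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


def max_pool(fmap, size=2, stride=2):
    """Keep the maximum of each `size` x `size` window of the feature map."""
    out = []
    for r in range(0, len(fmap) - size + 1, stride):
        out.append([max(fmap[r + i][c + j]
                        for i in range(size) for j in range(size))
                    for c in range(0, len(fmap[0]) - size + 1, stride)])
    return out


# Hypothetical 4x4 input and 2x2 filter, chosen only for illustration.
image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 0],
    [1, 0, 1, 2],
]
kernel = [[1, 0], [0, -1]]

features = conv2d(image, kernel)                 # 3x3 feature map
pooled = max_pool(features, size=2, stride=1)    # 2x2 after pooling
print(features)
print(pooled)
```

The same filter weights are reused at every window position, which is the weight sharing that keeps the number of trainable parameters in a convolutional layer small.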

3. The MSB-CNN-Based Spiral Bevel Gear Damage Degree Recognition Method

3.1 Method Flow

The recognition process of the spiral bevel gear damage degree based on MSB and CNN is as follows:

Data Collection: Conduct a spiral bevel gear fault vibration test. Use acceleration sensors to collect vibration signals of spiral bevel gears. Set an appropriate sampling frequency and collect a sufficient number of vibration signal segments.
Data Pre-processing and Feature Extraction: Divide the collected vibration signals into equal-length data segments, then pre-process and normalize them. Next, perform modulation signal bispectrum analysis on each data segment, extract the top-view image of the modulation signal bispectrum to construct a feature map sample set, and divide the set into a training set and a test set.
CNN Model Parameter Initialization: Initialize the CNN parameters: the learning rate k, pooling layer sampling size s, minimum training amount n (the mini-batch size), convolutional kernel size o, and number of iterations m. These parameters form a parameter set P. Set iteration round thresholds \(x_{1}\) to \(x_{5}\) and adjustment step sizes \(y_{1}\) to \(y_{5}\) for the learning rate, number of iterations, minimum training amount, convolutional kernel size, and pooling layer sampling size, respectively.
Model Training: Input the training set into the CNN model for training. Optimize key parameters such as the number of iterations and learning rate, select the parameter values that give the minimum model error rate to form the parameter set P, and complete the model training.
Model Testing and Fault Recognition: Input the test sample set into the trained model for identification to verify the effectiveness of the model and achieve the recognition of the spiral bevel gear damage degree.
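
The segmentation and normalization step of this flow can be sketched as follows. The article does not specify the normalization, so mean removal followed by scaling to unit peak amplitude is an assumption, as are the function names and the synthetic record; the segment length of 1024 is the one used in the experiment section.

```python
def split_segments(signal, length):
    """Cut `signal` into consecutive, non-overlapping segments of `length` samples."""
    count = len(signal) // length          # a trailing remainder is discarded
    return [signal[i * length:(i + 1) * length] for i in range(count)]


def normalize(segment):
    """Remove the mean and scale to unit peak amplitude (assumed normalization)."""
    mean = sum(segment) / len(segment)
    centered = [v - mean for v in segment]
    peak = max(abs(v) for v in centered) or 1.0   # avoid dividing by zero
    return [v / peak for v in centered]


# Synthetic stand-in for one measured vibration record.
signal = [float(i % 7) for i in range(2500)]
segments = [normalize(s) for s in split_segments(signal, 1024)]
print(len(segments), len(segments[0]))
```

Each normalized segment would then be passed to the modulation signal bispectrum analysis to produce one feature image.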

3.2 Key Points of the Method

Combination of MSB and CNN: MSB can effectively extract the fault features of spiral bevel gears from vibration signals, especially in dealing with nonlinear and noisy signals. CNN, on the other hand, has a strong ability to learn and classify features automatically. By using the modulation signal bispectrum map of the spiral bevel gear vibration signal as the input sample of CNN, the advantages of both methods are combined to improve the recognition accuracy of the damage degree.
Parameter Optimization: Appropriate parameter selection is crucial for the performance of the CNN model. In this method, the hierarchical optimization method is used to select key parameters such as the number of iterations, learning rate, minimum training amount, convolutional kernel size, and pooling layer sampling size. This ensures that the model can achieve high-accuracy recognition.
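
The article does not spell out the hierarchical optimization procedure. One plausible reading is to tune one parameter at a time, holding the others fixed and keeping the value with the lowest error rate, and the sketch below illustrates that reading only. The function `toy_error_rate` is a hypothetical surrogate whose minimum is placed arbitrarily so the example runs without training a network; in practice it would train the CNN with the trial parameters and return the measured error rate.

```python
def tune_one_at_a_time(params, candidates, error_rate):
    """Tune each parameter in turn, keeping the value with the lowest error."""
    best = dict(params)
    for name, values in candidates.items():
        scores = {}
        for value in values:
            trial = dict(best, **{name: value})   # change one parameter only
            scores[value] = error_rate(trial)
        best[name] = min(scores, key=scores.get)
    return best


def toy_error_rate(p):
    # Hypothetical surrogate, NOT a measured result: a real implementation
    # would train the CNN with parameters `p` and return its error rate.
    return (abs(p["iterations"] - 90) / 100
            + abs(p["learning_rate"] - 1e-4) * 1000
            + abs(p["batch_size"] - 50) / 100)


start = {"iterations": 30, "learning_rate": 1e-2, "batch_size": 10}
grid = {
    "iterations": [30, 60, 90, 120],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [10, 50, 100],
}
best = tune_one_at_a_time(start, grid, toy_error_rate)
print(best)
```

Tuning parameters one after another needs far fewer training runs than a full grid search (10 trials here instead of 36), at the cost of possibly missing interactions between parameters.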

4. Spiral Bevel Gear Damage Fault Experiment

4.1 Experiment Data Collection

To verify the effectiveness of the proposed method, a spiral bevel gear damage fault vibration simulation experiment was carried out on a comprehensive fault simulation platform for spiral bevel gear systems. The spiral bevel gear reduction gearbox is shown in Figure 1. Three types of spiral bevel gears were set: normal gears, and gears with two different degrees of damage (mild and moderate). Since the fault gears were installed on the input shaft, acceleration sensors were mounted on the input shaft. The BK testing system was used to collect the vibration signals of spiral bevel gears in normal, mild-damage, and moderate-damage states. The sampling frequency was set to 3.2 kHz, and the rotational speed was 900 r/min.

Each type of vibration signal was divided into data segments with a length of 1024 samples. Modulation signal bispectrum analysis was performed on each data segment to obtain the modulation signal bispectrum diagram in the default isometric (equal-angle) view. To show all feature components and avoid feature loss caused by occlusion, the diagram was adjusted to the top view and then converted into an RGB image as the training sample for the subsequent model.
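
A minimal sketch of the bispectrum step is given below, assuming the commonly used MSB form, the average over segments of X(fc+fx)·X(fc−fx)·conj(X(fc))², where fc is the carrier bin and fx the modulating bin. The function names, the naive DFT and the synthetic amplitude-modulated test signal are illustrative choices, not the article's implementation; rendering the result as a top-view image is not covered.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(N^2); adequate for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]


def msb(segments):
    """Average X(fc+fx) * X(fc-fx) * conj(X(fc))^2 over all segments.

    Returns {(fc, fx): complex value} for 1 <= fx < fc and fc + fx < N/2.
    """
    n = len(segments[0])
    half = n // 2
    acc = {}
    for seg in segments:
        mean = sum(seg) / n
        spec = dft([v - mean for v in seg])       # remove the DC offset first
        for fc in range(2, half):
            for fx in range(1, fc):
                if fc + fx >= half:
                    break
                term = (spec[fc + fx] * spec[fc - fx]
                        * spec[fc].conjugate() ** 2)
                acc[(fc, fx)] = acc.get((fc, fx), 0j) + term
    return {k: v / len(segments) for k, v in acc.items()}


# Synthetic test signal: carrier at bin 16, amplitude-modulated at bin 3.
n = 64
signal = [(1 + 0.5 * math.cos(2 * math.pi * 3 * t / n))
          * math.cos(2 * math.pi * 16 * t / n) for t in range(n)]

bispectrum = msb([signal])
peak = max(bispectrum, key=lambda k: abs(bispectrum[k]))
print(peak)   # (carrier bin, modulating bin) of the largest MSB magnitude
```

For this signal the largest magnitude lies at carrier bin 16 and modulating bin 3, which is how sidebands caused by a gear fault would appear in the bispectrum diagram. For 1024-sample segments an FFT routine would replace the naive `dft`.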

Gear State	Number of Data Segments	Sampling Frequency	Rotational Speed
Normal	500	3.2 kHz	900 r/min
Mild-damage	500	3.2 kHz	900 r/min
Moderate-damage	500	3.2 kHz	900 r/min
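
The acquisition settings fix a few derived quantities that are useful when reading the bispectrum images. The short calculation below uses only the sampling frequency, segment length and rotational speed stated above; it assumes that 900 r/min is the speed of the shaft carrying the test gear.

```python
sampling_rate_hz = 3200.0   # 3.2 kHz
segment_length = 1024       # samples per data segment
shaft_speed_rpm = 900.0     # assumed to be the speed of the test-gear shaft

segment_duration_s = segment_length / sampling_rate_hz        # 0.32 s
frequency_resolution_hz = sampling_rate_hz / segment_length   # 3.125 Hz
nyquist_hz = sampling_rate_hz / 2                              # 1600 Hz
shaft_frequency_hz = shaft_speed_rpm / 60                      # 15 Hz
revolutions_per_segment = segment_duration_s * shaft_frequency_hz  # about 4.8

print(segment_duration_s, frequency_resolution_hz, nyquist_hz,
      shaft_frequency_hz, revolutions_per_segment)
```

Each segment therefore covers a little under five shaft revolutions, and spectral components can be resolved up to 1600 Hz in steps of 3.125 Hz.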

4.2 CNN Model Construction and Parameter Selection

The constructed CNN structure includes 1 input layer, 3 convolutional layers, 2 max-pooling layers, 1 flattening layer, 2 fully-connected layers, and 1 output layer. The specific operations are as follows:

Replace the activation function with ReLU.
Adjust the size of the input feature image to 128×128×1.
Use the max-pooling method to accelerate network training.
Add the BatchNorm operation after the convolutional layer to accelerate network convergence and improve stability.
Add the Dropout operation in the fully-connected layer to prevent overfitting.

The specific parameters of the CNN are shown in Table 2.

Layer	Number of Feature Maps	Convolutional Kernel (Pooling Window) Size	Step Size	Activation Function
Input Layer	1	–	–	–
Convolutional Layer 1	8	3×3	1×1	ReLU
Pooling Layer 1	8	2×2	1×1	–
Convolutional Layer 2	16	3×3	1×1	ReLU
Pooling Layer 2	16	2×2	1×1	–
Convolutional Layer 3	32	3×3	1×1	ReLU
Flattening Layer	1	–	–	–
Fully-Connected Layer 1	1	–	–	ReLU
Fully-Connected Layer 2	1	–	–	ReLU
Output Layer	1	–	–	Softmax
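
As a rough check of Table 2, the feature-map sizes and convolutional parameter counts can be traced layer by layer. The sketch assumes a 128×128×1 input, no padding, and one bias per feature map; the article does not state the padding scheme, and BatchNorm and fully-connected parameters are left out because their sizes are not given.

```python
def out_size(size, kernel, stride):
    """Output width/height for a `kernel` window moved by `stride`, no padding."""
    return (size - kernel) // stride + 1


# (name, kernel or window size, step size, number of feature maps) from Table 2.
layers = [
    ("conv1", 3, 1, 8),
    ("pool1", 2, 1, 8),
    ("conv2", 3, 1, 16),
    ("pool2", 2, 1, 16),
    ("conv3", 3, 1, 32),
]

size, channels = 128, 1      # input feature image 128 x 128 x 1
shapes, conv_params = [], {}
for name, kernel, stride, maps in layers:
    if name.startswith("conv"):
        # Shared weights: kernel * kernel * input channels per map, plus a bias.
        conv_params[name] = kernel * kernel * channels * maps + maps
    size, channels = out_size(size, kernel, stride), maps
    shapes.append((name, size, size, channels))

flattened = size * size * channels
print(shapes)
print(conv_params, flattened)
```

Under these assumptions the three convolutional layers hold fewer than 6,000 trainable weights in total, while the flattened vector passed to the first fully-connected layer has 460,800 elements. With "same" padding the spatial sizes would differ.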

The hierarchical optimization method was used to select the main parameters of the CNN: the number of iterations was 90, the learning rate was 0.0001, the minimum training amount (mini-batch size) was 50, the convolutional kernel size was 3×3, and the pooling layer sampling size was 2×2. The modulation signal bispectrum diagrams of the three states of spiral bevel gears were used as the input samples of the CNN to construct the MSB-CNN fault diagnosis system. Each state contained 500 samples, of which 150 were randomly selected as training samples and the remaining 350 were used as test samples, giving 450 training samples and 1050 test samples in total.
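
The per-state random split described above can be sketched as follows. The file-name strings, the fixed seed and the function name are placeholders; the labels 0, 1 and 2 stand for the normal, mild-damage and moderate-damage states.

```python
import random


def split_per_class(samples_by_class, n_train, seed=0):
    """Randomly pick `n_train` samples per class for training, rest for testing."""
    rng = random.Random(seed)            # fixed seed only for reproducibility
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        train += [(s, label) for s in shuffled[:n_train]]
        test += [(s, label) for s in shuffled[n_train:]]
    return train, test


# Placeholder sample names: 500 bispectrum images for each of the three states.
data = {label: [f"state{label}_img{i}" for i in range(500)]
        for label in (0, 1, 2)}
train_set, test_set = split_per_class(data, n_train=150)
print(len(train_set), len(test_set))
```

Splitting within each state keeps the three classes balanced in both sets, so the 1050 test samples contain 350 of each state.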

4.3 Spiral Bevel Gear Fault Recognition Results and Analysis

To verify the advantages and effectiveness of the modulation signal bispectrum, the methods “vibration signal + CNN” and “modulation signal bispectrum + CNN” were compared experimentally. The classification confusion matrices for the two kinds of input samples are shown in Figure 2. In the figure, 0, 1, and 2 represent normal gears, mild-damage gears, and moderate-damage gears respectively.

To reduce the influence of variation between individual runs, each group of experiments was repeated 100 times, and the average over the repetitions was taken as the fault state recognition result. The recognition results and model training times of the two methods are shown in Table 3.
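
For readers unfamiliar with confusion matrices, the sketch below shows how one is built from true and predicted labels and how the recognition accuracy follows from its diagonal. The label lists are made up for illustration and are not the experimental results.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes=3):
    """Rows are true states, columns are predicted states."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly recognized."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels: 0 = normal, 1 = mild damage, 2 = moderate damage.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm, accuracy(cm))
```

Off-diagonal entries show which damage degrees are confused with each other, which a single accuracy figure cannot reveal.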

Fault Recognition Method	Number of Test Samples	Average Recognition Accuracy (%)	Training Time (s)
Modulation Signal Bispectrum + CNN	1050	99.91	67
Vibration Signal + CNN	1050	93.53	107
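
Read against the 1050 test samples, the average accuracies in Table 3 correspond to roughly the following numbers of misclassified samples per run. This is simple arithmetic on the reported figures, not an additional result.

\[
1050 \times (1 - 0.9991) \approx 0.9, \qquad 1050 \times (1 - 0.9353) \approx 67.9
\]

In other words, the bispectrum input leaves about one test sample misrecognized per run on average, against about 68 for the raw vibration signal.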

As an input sample, the vibration signal is one-dimensional data and contains a large amount of noise, so the average recognition accuracy is relatively low. MSB can not only effectively handle the nonlinear components of the signal but also suppress various noises. Using MSB as the sample input for CNN can extract more accurate features, resulting in a higher average recognition accuracy.

To further verify the advantages and effectiveness of the proposed method compared with other intelligent fault recognition methods, the recognition results of “modulation signal bispectrum + SVM” and “modulation signal bispectrum + BP” were selected for comparison. The SVM used the RBF kernel function, and the parameters were selected as: kernel function parameter \(\delta = 3\), penalty factor \(C = 4\). The number of hidden layer nodes of the neural network was set according to relevant research, and the activation function was selected accordingly. The classification confusion matrix of the test set obtained in the last experiment is shown in Figure 3.

The recognition results and model training times are shown in Table 4.

Fault Recognition Method	Number of Test Samples	Average Recognition Accuracy (%)	Training Time (s)
Modulation Signal Bispectrum + CNN	1050	99.91	97
Modulation Signal Bispectrum + SVM	1050	99.42	2374
Modulation Signal Bispectrum + BP	1050	89.37	621

For multi-class problems, an SVM has to be assembled from several binary classifiers, so the model is complex and both training and classification are slow. Because the modulation signal bispectrum provides discriminative input features, the SVM still reaches a relatively high average recognition accuracy. The BP neural network has a shallow structure and limited ability to handle nonlinear problems, which makes it difficult to capture some image features and restricts its ability to recognize fault information. The CNN has a deep structure and strong nonlinear processing ability, and the weight-sharing mechanism of its convolutional layers reduces the number of trainable parameters, which improves training efficiency and yields a higher average recognition accuracy.

5. Conclusion

The constructed CNN model and the sample construction method using the modulation signal bispectrum combine automatic feature learning with fault classification, overcoming the shortcoming of traditional fault recognition methods that require manual feature extraction and simplifying the diagnosis process.
Compared with using vibration signals as input samples, using the modulation signal bispectrum as input samples can achieve a higher average recognition accuracy.
Compared with traditional intelligent fault recognition methods, the proposed recognition method has advantages in both average recognition accuracy and model training time.

In general, the method based on MSB and CNN provides a reliable and efficient solution for the damage degree recognition of spiral bevel gears, which has important practical application value in the field of mechanical fault diagnosis. Future research can be carried out in directions such as further optimizing the CNN model structure, expanding the types of fault data, and improving the generalization ability of the model.

6. Future Research Directions

Model Structure Optimization: Explore more advanced CNN architectures, such as adding attention mechanisms or using more complex convolutional block designs. These improvements may enhance the model’s ability to focus on relevant fault features, further improving the recognition accuracy.
Data Expansion: Collect more diverse types of spiral bevel gear fault data, including different degrees of damage, different fault locations, and different operating conditions. This can help improve the generalization ability of the model and make it more applicable in real-world scenarios.
Combination with Other Techniques: Combine the MSB-CNN method with other signal processing techniques or machine learning algorithms. For example, integrating deep learning models with traditional signal processing methods like wavelet transform can extract more comprehensive fault features.
Online Monitoring and Real-Time Diagnosis: Develop a system for online monitoring and real-time diagnosis of spiral bevel gears based on the MSB-CNN method. This requires solving problems such as high-speed data processing and communication to ensure timely detection and diagnosis of gear faults during equipment operation.