Wind Turbine Gearbox Edge-Side Fault Diagnosis: A Comprehensive Exploration of Low-Pass Screening Optimized Neural Architecture Search

1. Introduction

1.1 Significance of Wind Turbine Gearbox Fault Diagnosis

Wind turbine gearboxes are crucial components in wind power generation systems. Their reliable operation is vital for the overall efficiency and stability of wind farms. However, due to the harsh working environment, high-speed rotation, and heavy-load conditions, wind turbine gearboxes are prone to various faults. These faults not only lead to increased maintenance costs but also cause power generation interruptions, resulting in significant economic losses. Effective fault diagnosis techniques for wind turbine gearboxes are therefore essential.

1.2 Limitations of Traditional Diagnosis Methods

  • Cloud-based Diagnosis: The traditional cloud-based diagnosis model transmits monitoring data from wind farms to a cloud computing center for analysis and then distributes the results back to the user and wind turbine ends. With the exponential growth of wind turbine equipment and monitoring data, the burden on the cloud computing center becomes overwhelming. Long-distance concurrent data transmission leads to high transmission pressure, low data processing efficiency, and long feedback delays. In the event of a fault, this can cause extended downtime, major economic losses, and even secondary failures and serious accidents.
  • Manual Design of Lightweight Models: For edge-side fault diagnosis, lightweight deep learning models are required because of the limited storage and computing resources of edge hardware. Existing lightweight model design methods rely mainly on manual design by experts, which is time-consuming, labor-intensive, and highly dependent on expert knowledge. Moreover, manually designed models often ignore the configurable resource capacity of edge hardware, so they may fail to meet deployment requirements.

1.3 The Emergence of Neural Architecture Search in Fault Diagnosis

Neural architecture search (NAS) has emerged as a potential solution for automating the model design process. It transforms model design into an optimization problem, leveraging computing power to search for suitable models. However, conventional NAS takes into account neither model complexity nor the configurable resource capacity of edge hardware, so automatically designed models may exceed the resource capacity of the target hardware and cannot be deployed. To address these issues, a low-pass screening optimized neural architecture search algorithm (LSNAS) is proposed for wind turbine gearbox edge-side fault diagnosis.

2. Neural Architecture Search Basics

2.1 The Concept and General Process of NAS

Neural architecture search aims to automatically construct deep neural network models. It has been widely applied in various fields such as image processing, semantic segmentation, and medical image reconstruction, providing a solid foundation for its development in the fault diagnosis field.

In the fault diagnosis area, the general process of NAS is depicted in Figure 1. It converts the design of the fault diagnosis model into an optimization problem. First, the search space is defined based on the model’s structure and hyperparameters, and the optimization goal is set according to the diagnosis task requirements. Then, a specific search strategy is adopted to iteratively explore the search space. Finally, the optimal model that meets the diagnosis task is found.

| Step | Description |
|---|---|
| Define search space | Determine the possible layer types, hyperparameters, and connection rules for the model. |
| Set optimization goal | For example, maximize diagnosis accuracy while considering model complexity. |
| Adopt search strategy | Use methods such as reinforcement learning to explore the search space. |
| Find optimal model | Select the model that best meets the task requirements. |

2.2 Search Strategies in NAS

Among various search strategies, the reinforcement learning-based search strategy is a mainstream choice. An agent samples models from the search space, an evaluation strategy assesses the performance of the sampled models, and, based on the evaluation results, the agent adjusts its search strategy to obtain more satisfactory models. For example, in ε-greedy Q-learning, a typical optimization strategy in NAS, the agent balances exploration and exploitation: it randomly selects an action with probability ε and chooses the action with the maximum Q-value in the current state with probability 1 − ε. This ensures a comprehensive exploration of the search space, as sketched below.
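
The selection rule can be made concrete in a few lines. The following minimal Python sketch implements the ε-greedy choice described above; the function name, the `q_table` layout, and the 0.5 default Q-value are illustrative assumptions (the default matches the Q-value initialization reported later in Section 4.2), not the chapter's own code.

```python
import random

def epsilon_greedy_action(q_table, state, actions, epsilon):
    """Select an action in `state` from candidate `actions` by the ε-greedy rule.

    q_table maps (state, action) pairs to Q-values; unseen pairs default to 0.5.
    """
    if random.random() < epsilon:
        return random.choice(actions)  # explore: uniform random action
    # exploit: action with the maximum Q-value in the current state
    return max(actions, key=lambda a: q_table.get((state, a), 0.5))
```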

3. Low-Pass Screening Optimized Diagnosis Model Search

3.1 Design of the Empirically Inspired Search Space

  • Design Principles: To obtain lightweight, high-accuracy models, several empirical design rules are built into the search space. First, the depthwise separable convolution module is incorporated to reduce the model's parameters and computational load. Second, residual blocks are added to prevent gradient vanishing and gradient explosion. Third, when the convolution kernel size is 5 or larger, the convolution stride is set to 2; when it is smaller than 5, the stride is set to 1, which reduces the model's computational cost.
  • Definition of State and Action Spaces: The state space defines the basic layer types and their corresponding selectable hyperparameter sets. Seven mainstream layer types are chosen: convolution (C), depthwise separable convolution (D), max-pooling (P), residual block (R), spatial pyramid pooling (SPP), fully-connected (F), and the Softmax layer (as the termination state, T). The hyperparameters for each layer type are listed in Table 1.
| Layer Type | Hyperparameters | Parameter Values |
|---|---|---|
| Convolution (C) | Kernel size | {1×1, 3×3, 5×5, 7×7} |
| | Channel depth | {8, 16, 32, 64, 96, 128} |
| Depthwise separable convolution (D) | Kernel size | {1×1, 3×3, 5×5, 7×7} |
| | Channel depth | {8, 16, 32, 64, 96, 128} |
| Max-pooling (P) | Kernel size | {5×5, 3×3, 2×2} |
| | Stride | {3×3, 2×2} |
| Residual block (R) | Kernel size | {3×3, 1×1} |
| | Stride | {1×1} |
| Spatial pyramid pooling (SPP) | SPP level | {3, 4} |
| Fully-connected (F) | Number of neurons | {128, 96, 64, 32, 16} |
| Termination state (T) | Type | Softmax layer |

The action space consists of the connection rules for each layer. For example, the accumulated convolution depth of the model is capped by a threshold $d_{\max}$. The convolution depths of C, D, and R are counted as 1, 2, and 3 respectively, and other layer types are counted as 0. While the accumulated convolution depth does not exceed $d_{\max}$ and the current layer type is not SPP, the current layer can be connected to any selectable parameter state among the C, D, P, R, and SPP layers. These rules ensure the rationality and diversity of the model structures explored during the search; a sketch of one possible encoding follows.
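
To make the state and action spaces concrete, here is one hypothetical Python encoding of Table 1 and the connection rules above. The dictionary layout, the `DEPTH_COST` mapping, and the handling of the fully-connected stage are assumptions for illustration, not the chapter's implementation.

```python
# Hypothetical encoding of the Table 1 search space: each layer type maps to
# its selectable hyperparameter sets.
STATE_SPACE = {
    "C":   {"kernel": [1, 3, 5, 7], "channels": [8, 16, 32, 64, 96, 128]},  # convolution
    "D":   {"kernel": [1, 3, 5, 7], "channels": [8, 16, 32, 64, 96, 128]},  # depthwise separable conv
    "P":   {"kernel": [5, 3, 2], "stride": [3, 2]},                          # max-pooling
    "R":   {"kernel": [3, 1], "stride": [1]},                                # residual block
    "SPP": {"level": [3, 4]},                                                # spatial pyramid pooling
    "F":   {"neurons": [128, 96, 64, 32, 16]},                               # fully-connected
    "T":   {},                                                               # termination (Softmax)
}

# Depth contribution of each layer type (C=1, D=2, R=3, others 0), per the text.
DEPTH_COST = {"C": 1, "D": 2, "R": 3}

def allowed_next_layers(current_type, conv_depth, d_max):
    """Return the layer types reachable from the current layer.

    Assumes: fully-connected layers lead only to F or T; below the depth cap
    d_max, and when the current layer is not SPP, any of C, D, P, R, SPP may
    follow; otherwise the search moves to the fully-connected/termination stage.
    """
    if current_type == "F":
        return ["F", "T"]
    if conv_depth <= d_max and current_type != "SPP":
        return ["C", "D", "P", "R", "SPP"]
    return ["F", "T"]
```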

3.2 Modeling of the Low-Pass Screening Reward Function

  • Consideration of Model Accuracy and Complexity: The reward function is a formal, numerical representation of the search goal. Since the computing resources of edge hardware are limited, the reward function for an edge-side diagnostic model should account for both model accuracy and model complexity. Model accuracy is measured by the accuracy of the sampled model on the test set, $Acc = N_c / N_t$, where $N_c$ is the number of correctly classified test samples and $N_t$ is the total number of test samples.
  • Measurement of Model Complexity: Among the typical metrics for measuring model complexity, such as the number of parameters, inference time, and floating-point operations (FLOPs), FLOPs is chosen in this study because it is easy to calculate, cost-effective, and reliably reflects model complexity. The FLOPs of a model is the sum of the FLOPs of all its layers, $F = \sum_{l=1}^{L} F_l$, where $F_l$ denotes the FLOPs of the $l$-th layer. A hedged sketch of one plausible form of the resulting reward follows this list.
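
The chapter does not reproduce the closed form of the low-pass screening reward here, so the following Python sketch shows only one plausible form consistent with the description: a weighted sum of an accuracy term and a FLOPs term that "passes" models under the hardware budget and penalizes those above it. The function name, the linear FLOPs term, and the penalty shape are all assumptions.

```python
def lowpass_reward(acc, flops, f_max, lam1=1.0, lam2=1.0):
    """One plausible low-pass screening reward (not source-confirmed).

    acc in [0, 1]; flops and f_max in GFLOPs; lam1/lam2 weight the two goals.
    Models under the budget f_max earn a positive FLOPs term that grows as
    they get cheaper; over-budget models are penalized in proportion to the
    amount by which they exceed the budget.
    """
    if flops <= f_max:
        flops_term = 1.0 - flops / f_max        # cheaper models score higher
    else:
        flops_term = -(flops - f_max) / f_max   # over-budget models are penalized
    return lam1 * acc + lam2 * flops_term
```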

3.3 Optimization with the ε-greedy Q-learning Search Strategy

  • Model Sampling: In the ε-greedy Q-learning search strategy, model sampling is a crucial step. Given the state space $S$ and action space $A$ of the empirically inspired search space, the agent selects an action $a_t$ in the current state $s_t$ according to the ε-greedy policy $\pi(s_t)$, which causes a transition from state $s_t$ to state $s_{t+1}$. The process continues until the termination state is reached, forming an action trajectory $\tau = \{s_0, a_0, s_1, a_1, \ldots, s_T\}$ that is then transformed into a diagnostic model. The ε-greedy policy is

$$\pi(s_t) = \begin{cases} \text{random } a \in A(s_t), & \text{with probability } \varepsilon, \\ \arg\max_{a \in A(s_t)} Q(s_t, a), & \text{with probability } 1 - \varepsilon, \end{cases}$$

so the agent explores with probability $\varepsilon$ and exploits with probability $1 - \varepsilon$.
  • Model Evaluation: Model evaluation obtains the two evaluation indicators of a sampled model from the accuracy and FLOPs formulas above, and the reward value $r$ of the sampled model is then computed with the reward function. To improve search efficiency, an early-stopping strategy is adopted: a small training-iteration threshold is set, and training of the sampled diagnostic model stops once the threshold is reached. The test accuracy at that point serves as the model's $Acc$ indicator during the search stage.
  • Iterative Optimization: In the iterative optimization process, the agent's goal is to maximize the total expected reward over the empirical search space, $\max_{\pi} \mathbb{E}_{\pi}\left[\sum_{t} r_t\right]$. To solve this objective, it is transformed into an iterative update of the Q-function, which yields an approximate solution:

$$Q(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') \right],$$

where $\alpha$ is the Q-learning rate and $\gamma$ is the discount factor. A condensed sketch of the resulting search loop follows this list.
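
The sketch below strings the three steps together. It reuses the hypothetical helpers from the earlier sketches (`epsilon_greedy_action`, `allowed_next_layers`, `DEPTH_COST`, `lowpass_reward`, plus an `epsilon_schedule` defined later); `evaluate_model()` stands in for building the sampled architecture and running the early-stopped training and testing, and is assumed, not given. The reward-at-terminal-step convention is also an assumption.

```python
ALPHA, GAMMA = 0.1, 1.0   # Q-learning rate and discount factor (Section 4.2)

def search(num_rounds, q_table, d_max, f_max):
    sampled = []
    for rnd in range(num_rounds):
        eps = epsilon_schedule(rnd)
        # --- model sampling: roll out one trajectory until termination ---
        trajectory, depths = [], []
        state, depth = "INPUT", 0
        while state != "T":
            candidates = (["C", "D", "P", "R", "SPP"] if state == "INPUT"
                          else allowed_next_layers(state, depth, d_max))
            action = epsilon_greedy_action(q_table, state, candidates, eps)
            depth += DEPTH_COST.get(action, 0)
            trajectory.append((state, action))
            depths.append(depth)
            state = action
        # --- model evaluation (early-stopped training, then test accuracy) ---
        acc, flops = evaluate_model(trajectory)            # assumed helper
        reward = lowpass_reward(acc, flops, f_max)
        # --- Q update along the trajectory; reward granted at the final step ---
        for i, (s, a) in enumerate(trajectory):
            if i + 1 < len(trajectory):
                nxt = allowed_next_layers(a, depths[i], d_max)
                q_max = max(q_table.get((a, a2), 0.5) for a2 in nxt)
                r = 0.0
            else:
                q_max, r = 0.0, reward                     # terminal transition
            q_old = q_table.get((s, a), 0.5)
            q_table[(s, a)] = (1 - ALPHA) * q_old + ALPHA * (r + GAMMA * q_max)
        sampled.append((trajectory, acc, flops))
    return sampled
```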

3.4 Model Selection

After the search is completed, the Pareto-dominance rule is used to select a model that balances accuracy and FLOPs. For two sampled diagnostic models $m_1$ and $m_2$, model $m_1$ is said to Pareto-dominate model $m_2$, denoted $m_1 \succ m_2$, if and only if $Acc(m_1) \ge Acc(m_2)$ and $F(m_1) \le F(m_2)$, with at least one of the two inequalities strict.

The Pareto-optimal solutions are the diagnostic models that are not dominated by any other sampled model. The set of all non-dominated solutions is called the Pareto-optimal solution set, and the curve it traces is called the Pareto front. In practical deployment, users can select the best trade-off diagnostic model from the Pareto-optimal set according to the hardware's resource capacity and the diagnostic accuracy requirements.
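
As an illustration, the non-dominated set can be extracted with a simple double loop; the input format (name, accuracy, FLOPs) is an assumption.

```python
def pareto_front(models):
    """Return the non-dominated subset of `models` (higher acc, lower FLOPs).

    models: list of (name, acc, flops) tuples. A model is dominated if some
    other model is at least as accurate and at most as costly, and strictly
    better on at least one of the two objectives.
    """
    front = []
    for name, acc, flops in models:
        dominated = any(
            (a2 >= acc and f2 <= flops) and (a2 > acc or f2 < flops)
            for _, a2, f2 in models
        )
        if not dominated:
            front.append((name, acc, flops))
    return front
```

For example, `pareto_front([("m1", 0.98, 2.0), ("m2", 0.97, 2.5), ("m3", 0.99, 1.8)])` keeps only `m3`, which dominates both other models.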

3.5 Search Process of the Low-Pass Screening Optimized Diagnosis Model

Figure 2 shows the overall process by which the low-pass screening optimized neural architecture search method automatically designs models according to the configurable resource capacity of edge hardware to achieve edge-side fault diagnosis.

  • Dataset Construction: First, the vibration signals of the wind turbine gearbox are acquired with acceleration sensors. Then, the original vibration signals in the samples are transformed into order spectra using order analysis. Finally, the dataset is divided into a training dataset and a test dataset.
  • Model Search: An empirically inspired search space is constructed, and a low-pass screening reward function is modeled. The agent samples models, evaluates them on the wind turbine edge-side diagnosis task to obtain reward values, updates the Q-values with those rewards according to the update formula above, and repeats these steps until the iteration budget is exhausted.
  • Model Selection: The Pareto-dominance method yields the Pareto-optimal solution set. The best trade-off model is selected from this set and fine-tuned to perform edge-side fault diagnosis.

4. Experimental Case Analysis

4.1 Data Description

To verify that the proposed method can automatically design models for wind turbine gearbox edge-side fault diagnosis while respecting the configurable resource capacity of edge hardware, a fault-simulated dataset generated on a power transmission system diagnostic simulation test bench is used. The test bench mainly consists of a motor, a two-stage planetary gearbox, a two-stage fixed-axis gearbox, a torque controller, and a magnetic particle brake, as shown in Figure 3.

In the experiment, an acceleration sensor (model: PCB 352C03) is installed at the input end of the planetary gearbox, with a sampling frequency of 25.6 kHz. Nine gearbox health states are simulated, as listed in Table 3.

| Health State Description | Category Label | Training Samples/Test Samples |
|---|---|---|
| Normal | cls0 | 1000/200 |
| Gear tooth root crack | cls1 | 1000/200 |
| Gear tooth missing | cls2 | 1000/200 |
| Bearing rolling element fault | cls3 | 1000/200 |
| Gear tooth broken | cls4 | 1000/200 |
| Gear tooth pitting | cls5 | 1000/200 |
| Bearing inner ring fault | cls6 | 1000/200 |
| Bearing outer ring fault | cls7 | 1000/200 |
| Bearing composite fault | cls8 | 1000/200 |

To simulate actual working conditions, the rotational speed is linearly increased from 20 Hz to 38.7 Hz four times for each health state. The collected vibration signals are segmented, Gaussian white noise at a signal-to-noise ratio of 0 dB is added, and the segments are transformed into two-dimensional order spectra using order analysis and resampled to an image size of 128×128. Finally, the order spectra of each health state are randomly divided into training and test samples at a ratio of 5:1.
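
The 0 dB noise injection can be written compactly. The helper name and interface below are assumptions; at 0 dB SNR, the added noise power equals the signal power.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db=0.0, rng=None):
    """Add Gaussian white noise so the result has the given SNR in dB.

    SNR(dB) = 10 * log10(P_signal / P_noise); at 0 dB the two powers match.
    """
    rng = rng or np.random.default_rng()
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise
```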

4.2 Parameter Settings of the Proposed Method

  • Training Parameters of the ε-greedy Q-learning Search Strategy: Each Q-value is initialized to 0.5. The Q-learning rate $\alpha$ is set to 0.1, and the discount factor $\gamma$ is set to 1. The weights $\lambda_1$ and $\lambda_2$ in the reward function are both set to 1, giving the two goals equal importance. The value of $\varepsilon$ gradually decreases from 1 to 0.1, and the number of search rounds at each $\varepsilon$ step is set to ensure a smooth transition from the exploration stage to the exploitation stage. The $\varepsilon$ arrangement is shown in Table 4, and an illustrative schedule is sketched below.
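
The per-step round counts of Table 4 are not reproduced here, so the schedule below uses assumed values purely for illustration: ε steps down from 1.0 to 0.1 with a fixed number of search rounds per step, so exploration hands over smoothly to exploitation. It plugs directly into the `search()` sketch of Section 3.3.

```python
# Assumed ε steps and rounds-per-step; the true Table 4 arrangement may differ.
EPS_STEPS = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
ROUNDS_PER_STEP = 50

def epsilon_schedule(rnd):
    """Return the exploration rate ε for search round `rnd`."""
    idx = min(rnd // ROUNDS_PER_STEP, len(EPS_STEPS) - 1)
    return EPS_STEPS[idx]
```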

Evaluation Parameters of Sampled Models: To improve search efficiency, an early-stopping strategy is adopted for evaluating sampled diagnostic models. The number of training iterations for all sampled models is set to 15, the training batch size to 32, and the Adam optimizer is used for network optimization. The model weights are randomly initialized with the Kaiming method. In the comparative experiments that follow, the number of training iterations for the selected diagnostic model during fine-tuning is extended to 50. For a fair comparison, the training settings of the comparative models are the same as those of the selected diagnostic model, and to reduce the influence of random factors, the reported accuracy of every model is the average of 10 trials.
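
A minimal PyTorch sketch of this evaluation routine follows, under stated assumptions: `train_and_test` is a hypothetical name, the batch size of 32 is assumed to be set when constructing `train_loader`, and the loop is the generic early-stopped recipe (15 epochs for search, 50 for fine-tuning) rather than the chapter's exact code.

```python
import torch

def train_and_test(model, train_loader, test_loader, epochs=15, device="cpu"):
    """Early-stopped evaluation: Kaiming init, Adam, `epochs` passes, then
    test accuracy in [0, 1]. Fine-tuning reuses this with epochs=50."""
    for m in model.modules():                       # Kaiming initialization
        if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear)):
            torch.nn.init.kaiming_normal_(m.weight)
    model.to(device)
    opt = torch.optim.Adam(model.parameters())
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in test_loader:
            pred = model(x.to(device)).argmax(dim=1).cpu()
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total
```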

4.3 Result Analysis

In this experimental verification case, the maximum FLOPs of a deployable model on the edge hardware is set to 2.5 GFLOPs. The search results and the resulting Pareto front are shown in Figure 7a. Two diagnostic models selected from the Pareto-optimal solution set, LSNAS-Neta and LSNAS-Netb, have the structures shown in Figures 7b and 7c.

To evaluate the edge-diagnosis performance of the models automatically designed by the low-pass screening neural architecture search algorithm under the set hardware resource conditions, LSNAS-Neta and LSNAS-Netb are compared with advanced manually designed models, including deep models (GoogLeNet-v1, GoogLeNet-v2, GoogLeNet-v3, ResNet-18) and edge-friendly models (MobileNet-v1, MobileNet-v2, ShuffleNet). A brief introduction to the comparative models is given in Table 5.

| Model | Brief Introduction |
|---|---|
| GoogLeNet-v1 | Champion of the 2014 ImageNet Challenge classification task. Constructed by stacking multiple InceptionV1 modules with multi-scale convolutions; 22 layers deep. |
| GoogLeNet-v2 | Builds on GoogLeNet-v1 by adding batch normalization to alleviate gradient vanishing and ease training, and by decomposing large convolution kernels into several small ones to reduce the number of parameters. |
| GoogLeNet-v3 | Builds on GoogLeNet-v2 by further deepening the network through asymmetric decomposition of 2D convolutions, enhancing its non-linear expressive ability. |
| ResNet-18 | Constructed by stacking residual blocks; the 18 counts the weighted layers, namely 17 convolutional layers and 1 fully-connected layer. |
| MobileNet-v1 | Replaces ordinary convolutions with depthwise separable convolutions, reducing computation and accelerating inference; a typical representative of lightweight models. |
| MobileNet-v2 | Improves on MobileNet-v1 with inverted residual blocks built on depthwise separable convolutions, further reducing parameters and computation. |
| ShuffleNet | A typical edge-friendly model whose core is pointwise group convolutions and channel shuffling, greatly reducing computation while preserving accuracy. |

The comparison results are shown in Table 6.

| Model | Type | Accuracy (%) | Parameters (M) | FLOPs (G) |
|---|---|---|---|---|
| GoogLeNet-v1 | Manual | 92.16 | 5.6 | 14.89 |
| GoogLeNet-v2 | Manual | 85.53 | 7.36 | 17.25 |
| GoogLeNet-v3 | Manual | 93.58 | 21.8 | 23.39 |
| ResNet-18 | Manual | 96.03 | 11.17 | 18.18 |
| MobileNet-v1 | Manual | 82.91 | 3.25 | 6.02 |
| MobileNet-v2 | Manual | 91.10 | 2.23 | 3.19 |
| ShuffleNet | Manual | 92.99 | 2.49 | 3.08 |
| LSNAS-Neta | Automatic | 98.36 | 0.48 | 1.95 |
| LSNAS-Netb | Automatic | 98.79 | 0.32 | 2.03 |

As shown in the table, compared with the advanced manually designed networks, the automatically designed models LSNAS-Neta and LSNAS-Netb achieve higher accuracy with fewer parameters and lower FLOPs. In particular, the fault recognition rate of LSNAS-Netb is 98.79%, which is 6.63% and 2.76% higher than the deep models GoogLeNet-v1 and ResNet-18, and 7.69% and 5.80% higher than the edge-friendly models MobileNet-v2 and ShuffleNet. Its parameters and FLOPs are only 0.32 M and 2.03 G, i.e., 1/17.5 and 1/7.33 of GoogLeNet-v1, 1/34.9 and 1/8.96 of ResNet-18, 1/6.97 and 1/1.57 of MobileNet-v2, and 1/7.78 and 1/1.52 of ShuffleNet.

As shown in Figure 8, LSNAS-Neta and LSNAS-Netb sit in the upper-left corner of the trade-off graph, further confirming that they outperform the competing models in recognition accuracy, parameters, and FLOPs. In practical use, they require less memory and fewer computing resources while achieving high recognition accuracy, giving them clear advantages for edge-side fault diagnosis.

In addition, under the 2.5 GFLOPs cap on deployable-model FLOPs, the manually designed competing models, which were not designed for the hardware's configurable resource capacity, all exceed 2.5 GFLOPs and cannot meet the deployment requirements. In contrast, the models automatically designed with the hardware's configurable resource capacity in mind do meet them. These results indicate that the low-pass screening optimized neural architecture search method can automatically design diagnostic models with well-balanced accuracy, parameters, and computational cost according to the configurable capacity of hardware resources, realizing edge-side fault diagnosis of wind turbine gearboxes.

To show that the low-pass screening reward function guides the agent to iteratively screen diagnostic models that meet the deployment requirements, this section compares the performance of Q-learning under two reward functions: $r_1$, which considers both accuracy and FLOPs, and $r_2$, which considers accuracy only. Models with FLOPs below the 2.5 GFLOPs cap are defined as FLOPs-dominant models, and models with accuracy above 90% are defined as accuracy-dominant models. The percentages of FLOPs-dominant and accuracy-dominant models in every 50 search rounds of the ε-greedy Q-learning optimization are counted; their trends over the iterations under the two reward functions are shown in Figure 9, and a small counting sketch follows.
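
The per-window percentages are straightforward to compute; the function name and input layout below are assumptions.

```python
import numpy as np

def dominance_percentages(history, f_max=2.5, acc_min=0.90, window=50):
    """history: list of (acc, flops) pairs, one per search round, in order.

    Returns per-window percentages of FLOPs-dominant (flops < f_max) and
    accuracy-dominant (acc > acc_min) models, mirroring the Figure 9 curves.
    """
    accs = np.array([a for a, _ in history])
    flops = np.array([f for _, f in history])
    pct_flops, pct_acc = [], []
    for start in range(0, len(history), window):
        pct_flops.append(100.0 * np.mean(flops[start:start + window] < f_max))
        pct_acc.append(100.0 * np.mean(accs[start:start + window] > acc_min))
    return pct_flops, pct_acc
```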

In both search modes, when $\varepsilon = 1$ the percentages of FLOPs-dominant and accuracy-dominant models vary randomly, because at this stage the agent has no prior knowledge of the search space and explores it at random. As the iterations progress, in the search guided by the $r_1$ low-pass screening reward function, the percentages of both FLOPs-dominant and accuracy-dominant models trend upward. In the search guided by $r_2$, only the percentage of accuracy-dominant models trends upward, while the percentage of FLOPs-dominant models shows no clear pattern: when only the accuracy index is considered, the agent samples models with regard to accuracy alone, and the FLOPs index is unconstrained. The low-pass screening reward function directs the agent's attention to both the accuracy and the FLOPs of the sampled models, taking the hardware's configurable resource capacity into account. As the iterations progress, the number of screened models with higher accuracy that also satisfy the FLOPs constraint keeps increasing. The low-pass screening reward function can therefore automatically steer the design toward models that meet the deployment requirements dictated by the hardware's configurable resource capacity, realizing edge-side fault diagnosis.

5. Application Case Analysis

5.1 Data Explanation

The data is collected from the vibration state monitoring systems of multiple wind turbines in a domestic wind farm. The monitored wind turbine gearboxes mainly consist of a main shaft, a one-stage planetary gear system, and two-stage parallel gearboxes. The structure diagram and the wind farm site are shown in Figure 10. An acceleration sensor is installed on the low-speed shaft, with a sampling frequency of 25,600 Hz and a sampling length of 131,072 points.

After long-term fault accumulation, five different health states of the wind turbine gearbox are collected, as shown in Table 7.

| Health State Description | Category Label | Training Samples/Test Samples |
|---|---|---|
| Ball drop | cls0 | 1050/210 |
| Bearing wear | cls1 | 1050/210 |
| Cage cracking | cls2 | 1050/210 |
| Tooth damage | cls3 | 1050/210 |
| Normal | cls4 | 1050/210 |

For each health state, the vibration signals are divided into 1260 segments, each 1.28 s long. Using order analysis and down-sampling, each segment is transformed into a 64×64 sample. Finally, the samples of each health state are randomly divided into training and test samples at a ratio of 5:1.

5.2 Result Analysis

To simulate differences in the configurable resource capacities of different edge hardware, the wind power example sets a different cap on the maximum FLOPs of a deployable model, and models whose FLOPs fall below this cap are defined as FLOPs-dominant models. The search results and the resulting Pareto front for the measured wind power case are shown in Figure 12a. The structures of the two models selected from the Pareto-optimal trade-off solution set are shown in Figures 12b and 12c. The comparison between LSNAS-Neta, LSNAS-Netb, and the competing models is given in Table 8.

| Model | Type | Accuracy (%) | Parameters (M) | FLOPs (G) |
|---|---|---|---|---|
| GoogLeNet-v1 | Manual | 96.08 | 5.6 | 3.72 |
| GoogLeNet-v2 | Manual | 88.75 | 7.34 | 4.31 |
| GoogLeNet-v3 | Manual | 92.62 | 21.02 | 4.75 |
| ResNet-18 | Manual | 97.45 | 11.17 | 4.55 |
| MobileNet-v1 | Manual | 90.80 | 3.25 | 1.51 |
| MobileNet-v2 | Manual | 95.49 | 2.23 | 0.80 |
| ShuffleNet | Manual | 94.74 | 2.49 | 0.77 |
| LSNAS-Neta | Automatic | 98.87 | 0.15 | 0.55 |
| LSNAS-Netb | Automatic | 99.14 | 0.36 | 0.68 |

The results show that the edge-side diagnostic models automatically designed for the measured wind farm data outperform the comparative models in accuracy, FLOPs, and parameters. Compared with the deep models GoogLeNet-v1 and GoogLeNet-v2, the accuracy of LSNAS-Netb is higher by 3.06% and 10.39%, while its parameters and floating-point operations are only 1/15.56 and 1/5.47 of GoogLeNet-v1's, and 1/20.39 and 1/6.34 of GoogLeNet-v2's. Compared with the advanced edge-friendly models MobileNet-v2 and ShuffleNet, the accuracy of LSNAS-Netb is higher by 3.65% and 4.40%, and its parameters and floating-point operations are 1.87 M and 0.12 G less than MobileNet-v2's, and 2.13 M and 0.09 G less than ShuffleNet's.

Under the hardware-deployable FLOPs constraint set for this case, the deep models cannot meet the deployment requirements; the edge-friendly models can, but LSNAS-Neta and LSNAS-Netb achieve higher accuracy with lower FLOPs. They require less memory and fewer computing resources on edge-side devices while achieving high accuracy, making them better suited to deployment on wind turbine edge-side hardware for real-time gearbox diagnosis.

To fully illustrate each model's diagnostic performance, t-SNE is used to project the high-dimensional features of each model's last fully-connected layer into two dimensions; the visualizations are shown in Figure 13. For categories cls0, cls2, and cls3, the features of GoogLeNet-v1, GoogLeNet-v2, MobileNet-v1, and MobileNet-v2 overlap, indicating that these categories are prone to misclassification. By contrast, LSNAS-Neta and LSNAS-Netb separate the fault features of every category with almost no misclassification and clear distribution boundaries, showing that the models automatically designed for the measured wind farm tasks have strong diagnostic capability. These results further verify the effectiveness of the proposed low-pass screening neural architecture search scheme, which can automatically design high-accuracy diagnostic models.
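
A Figure 13-style projection can be produced with scikit-learn's t-SNE; the function name and plotting choices below are illustrative, assuming the last fully-connected-layer activations have already been collected into an (N, D) array.

```python
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_tsne(features, labels):
    """features: (N, D) array of last fully-connected-layer activations;
    labels: (N,) integer class labels (e.g., cls0..cls4)."""
    emb = TSNE(n_components=2, init="pca", random_state=0).fit_transform(features)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=8)
    plt.colorbar(label="class")
    plt.show()
```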

The trends of the percentages of FLOPs-dominant and accuracy-dominant models over the iterations under the two reward functions $r_1$ and $r_2$ are shown in Figure 14. As the iterations progress, these trends are consistent with those observed in the test bench experiment. This further proves that, across different hardware resource environments, the established low-pass screening reward function can guide the agent to automatically design models that meet the deployment requirements dictated by the hardware's configurable resources, verifying the reward function's effectiveness.

6. Conclusion

A low-pass screening optimized neural architecture search algorithm has been developed. The method designs an empirically inspired search space and a low-pass screening reward function, and then uses Q-learning to iteratively solve the automatic design problem for edge-side diagnostic models. The proposed method can automatically design models for edge hardware according to its configurable resource capacity, realizing edge-side fault diagnosis and laying a foundation for shifting diagnosis from the cloud-computing model to the edge side.

In the two cases, compared with advanced manually designed deep learning models and edge-friendly models, the automatically designed edge-side fault diagnosis models not only strike a better balance among accuracy, FLOPs, and parameters, but also account for the configurable resource conditions of edge hardware during model design, making them more suitable for deployment on edge hardware for fault diagnosis.

Comparing the Q-learning performance under the two reward function configurations $r_1$ and $r_2$ proves that the established low-pass screening reward function guides the agent to attend to both the accuracy and the FLOPs of the model and to respect the hardware's configurable resource capacity during the search, continuously seeking diagnostic models with higher accuracy that better meet the deployment requirements.

In addition, by setting different hardware resource capacity environments in the two experimental cases, the consistent trends in the percentages of FLOPs-dominant and accuracy-dominant models further prove that the method can automatically design models meeting the deployment requirements dictated by the configurable resource capacity of edge hardware.
