A Low Carbon Optimization Decision Method for Gear Hobbing Process Parameters Driven by Small Sample Data

Gears are core mechanical components, and gear hobbing is a pivotal process for producing high-precision gears. However, gear hobbing operations, particularly on CNC hobbing machines, contribute significantly to carbon emissions because of their high energy consumption. Traditional optimization methods often rely on empirical models, which may not capture dynamic production environments, or on large historical datasets, which many small-to-medium enterprises lack because of limited digital infrastructure. To address this, we propose a data-driven approach for low-carbon optimization of gear hobbing process parameters using small sample data. The method integrates Box-Behnken experimental design, backpropagation neural networks, an improved multi-objective gray wolf optimization algorithm, and entropy-TOPSIS comprehensive evaluation to minimize carbon emissions and processing time. By leveraging small but well-designed datasets, we aim to provide a practical solution for sustainable gear hobbing without extensive data requirements.

The core of our approach lies in optimizing key process parameters in gear hobbing, specifically the spindle speeds and feed rates for rough and semi-finish cuts. We define these as optimization variables: rough cutting spindle speed $n_1$, rough cutting feed rate $f_1$, semi-finish cutting spindle speed $n_2$, and semi-finish cutting feed rate $f_2$. The optimization objectives are total carbon emissions $C$ and total processing time $T$. Carbon emissions primarily originate from electricity consumption and tool usage, expressed as:

$$C = C_{elec} + C_{tool}$$

where $C_{elec}$ is the carbon emissions from electricity, calculated as $C_{elec} = F_{elec} E$, with $F_{elec}$ being the carbon emission factor for electricity (e.g., 0.8042 kg CO₂/kWh for the Southern China grid) and $E$ the total energy consumption measured from the gear hobbing machine. The tool-related carbon emissions $C_{tool}$ account for the embodied carbon of the hob distributed over its lifespan:

$$C_{tool} = \frac{t_{ct} m_{tool} F_{tool}}{T_{tool}}$$

Here, $t_{ct}$ is the cutting time, $m_{tool}$ is the hob mass, $F_{tool}$ is the tool carbon emission factor (29.6 kg CO₂/kg), and $T_{tool}$ is the tool life estimated by the empirical formula:

$$T_{tool} = k_0 n^{k_1} f^{k_2}$$

where $k_0, k_1, k_2$ are life coefficients, and $n$ and $f$ are the spindle speed and feed rate, respectively. The total processing time $T$ includes standby time $t_{st}$, air-cutting time $t_{airc}$, and cutting time $t_{ct}$:

$$T = t_{st} + t_{airc} + t_{ct}$$

Thus, the multi-objective optimization model is formulated as:

$$F(n_1, f_1, n_2, f_2) = \min(C, T)$$

subject to constraints such as spindle speed limits $n_{min} \leq n \leq n_{max}$, feed rate limits $f_{min} \leq f \leq f_{max}$, and the surface roughness requirement $0.312 f^2 / r \leq R_a$, where $n$ and $f$ stand for the spindle speed and feed rate of each pass, $r$ is the hob tip radius, and $R_a$ is the required (maximum allowable) surface roughness. This model ensures that optimized parameters maintain gear quality while reducing environmental impact and enhancing efficiency.
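The carbon model and the roughness constraint above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, energy is assumed to be supplied in kJ (as in Table 2) and converted to kWh, and the tool-life coefficients and roughness inputs used in the example calls are placeholders rather than values from the study.

```python
F_ELEC = 0.8042  # kg CO2 per kWh, grid factor quoted in the text
F_TOOL = 29.6    # kg CO2 per kg of hob, tool factor quoted in the text


def electricity_carbon(energy_kj):
    """C_elec = F_elec * E, with E converted from kJ to kWh (1 kWh = 3600 kJ)."""
    return F_ELEC * energy_kj / 3600.0


def tool_life(n, f, k0, k1, k2):
    """Empirical tool life T_tool = k0 * n**k1 * f**k2."""
    return k0 * n ** k1 * f ** k2


def tool_carbon(t_ct, m_tool, n, f, k0, k1, k2):
    """C_tool = t_ct * m_tool * F_tool / T_tool (t_ct and T_tool in the same time unit)."""
    return t_ct * m_tool * F_TOOL / tool_life(n, f, k0, k1, k2)


def total_time(t_st, t_airc, t_ct):
    """T = t_st + t_airc + t_ct."""
    return t_st + t_airc + t_ct


def roughness_ok(f, r, ra_max):
    """Surface roughness constraint 0.312 * f**2 / r <= R_a."""
    return 0.312 * f ** 2 / r <= ra_max


# Electricity share for run 1 of Table 2 (E = 16052.4 kJ): about 3.586 kg CO2.
print(round(electricity_carbon(16052.4), 3))

# Placeholder coefficients (k0, k1, k2) and geometry, for illustration only.
print(tool_carbon(t_ct=100.0, m_tool=2.0, n=100.0, f=10.0, k0=1e6, k1=-1.0, k2=-1.0))
print(roughness_ok(f=4.0, r=2.0, ra_max=3.2))
```

Under $C = C_{elec} + C_{tool}$, the difference between the tabulated $C$ for run 1 (4.058946 kg CO₂) and the electricity share computed here (about 3.586 kg CO₂) would be the tool-related contribution, roughly 0.473 kg CO₂.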

To build accurate prediction models for $C$ and $T$ with limited data, we employ Box-Behnken experimental design, a response surface methodology that efficiently samples the parameter space with a small number of experiments. This design is ideal for gear hobbing as it captures nonlinear relationships between process parameters and objectives. We define three levels for each of the four variables (e.g., $n_1$: 280, 320, 360 rpm; $f_1$: 3.5, 4.0, 4.5 mm/min; $n_2$: 360, 390, 420 rpm; $f_2$: 4, 6, 8 mm/min) and conduct experiments accordingly. The data collected includes energy consumption via power analyzers and processing times, yielding a small but representative dataset for modeling.
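As a rough sketch of how such a design is laid out, the snippet below builds a four-factor Box-Behnken plan from the levels listed above: every pair of factors is run at its low/high combinations while the other two factors stay at their mid level, and center runs are appended. The function name `box_behnken` is hypothetical, and the choice of four center runs is an assumption inferred from the 28 runs reported in Table 2 (24 edge points plus 4 centers); the run order produced here is not meant to match the table.

```python
from itertools import combinations, product


def box_behnken(levels, n_center=4):
    """Build a Box-Behnken design.

    levels maps each factor name to a (low, mid, high) tuple.
    Returns a list of runs, each a dict of factor name -> level value.
    """
    names = list(levels)
    center = {name: levels[name][1] for name in names}
    runs = []
    # For each pair of factors, take the 2x2 low/high combinations
    # while holding all remaining factors at their mid level.
    for a, b in combinations(names, 2):
        for ia, ib in product((0, 2), repeat=2):
            run = dict(center)
            run[a] = levels[a][ia]
            run[b] = levels[b][ib]
            runs.append(run)
    # Replicated center points (count assumed, see lead-in).
    runs.extend(dict(center) for _ in range(n_center))
    return runs


levels = {
    "n1": (280, 320, 360),   # rpm
    "f1": (3.5, 4.0, 4.5),   # mm/min
    "n2": (360, 390, 420),   # rpm
    "f2": (4, 6, 8),         # mm/min
}
design = box_behnken(levels)
print(len(design))  # 28
```

A full three-level factorial over the same four factors would need 81 runs, which is the saving the design is chosen for.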

Using this data, we develop a backpropagation neural network (BPNN) as a surrogate model to predict $C$ and $T$ from the input parameters. The BPNN architecture includes an input layer with four nodes (for $n_1, f_1, n_2, f_2$), a hidden layer with five nodes, a size determined by an empirical formula, and an output layer with two nodes (for $C$ and $T$). We normalize the data to the range [-1, 1] using:

$$y = \frac{x - x_{min}}{x_{max} - x_{min}} (y_{max} - y_{min}) + y_{min}$$

and employ Bayesian regularization as the training algorithm to prevent overfitting, which is crucial for small sample scenarios. The network is trained on 23 data points and tested on 5, achieving high prediction accuracy. The performance is evaluated using correlation coefficients $R$, with values close to 1 indicating good fit. For instance, our BPNN achieves $R = 0.99564$ for training and $R = 0.98192$ for testing, with relative errors under 3%, demonstrating robustness despite limited data.
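The normalization formula and the two accuracy measures used in this section can be illustrated as follows. This is a sketch under assumptions: the helper names are hypothetical, $R$ is taken to be the Pearson correlation between measured and predicted values, and the relative error is taken as the absolute deviation divided by the measured value. The network training itself is not reproduced here.

```python
import math


def scale(x, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Map x from [x_min, x_max] to [y_min, y_max] (the formula above)."""
    return (x - x_min) / (x_max - x_min) * (y_max - y_min) + y_min


def unscale(y, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Inverse mapping, used to bring network outputs back to physical units."""
    return (y - y_min) / (y_max - y_min) * (x_max - x_min) + x_min


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def max_relative_error_pct(measured, predicted):
    """Largest |predicted - measured| / |measured|, in percent."""
    return max(abs(p - m) / abs(m) for m, p in zip(measured, predicted)) * 100.0


# The three levels of n1 map to -1, 0 and 1.
print([scale(v, 280, 360) for v in (280, 320, 360)])  # [-1.0, 0.0, 1.0]
```

Scaling bounds should come from the training data only, so that the five test points are transformed with the same `x_min` and `x_max` as the 23 training points.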

Compared to other modeling approaches like support vector regression or random forests with random sampling, our Box-Behnken-driven BPNN shows superior performance in gear hobbing applications, as summarized in Table 1. This validates the effectiveness of experimental design in enhancing data quality for small sample learning.

Table 1: Comparison of Prediction Models for Gear Hobbing
Modeling Method	Maximum Relative Error (%)	Correlation Coefficient R
Box-Behnken BPNN (Our Method)	2.02	0.98192
Standard BPNN	14.03	0.84292
Support Vector Regression	15.40	0.84052
Random Forest	18.50	0.83162

With the BPNN as a fitness function, we perform multi-objective optimization using an improved multi-objective gray wolf optimization (MOGWO) algorithm. Gray wolf optimization mimics the social hierarchy and hunting behavior of wolves, where solutions are represented as wolves, and the best solutions are alpha, beta, and delta wolves guiding the search. To enhance exploration and avoid local optima in gear hobbing parameter optimization, we introduce two improvements: Latin hypercube sampling for initial population generation to ensure diversity, and a modified control parameter strategy where the parameter $a$ decreases non-linearly:

$$a = 2 - \frac{2}{I-1} (I^{i/i_{max}} - 1)$$

Here, $I$ is a control coefficient (set to 100), $i$ is the current iteration, and $i_{max}$ is the maximum iterations. This allows for broader exploration early in the optimization. Additionally, each wolf (candidate solution) is given autonomous exploration ability by randomly perturbing its position in one dimension:

$$X^*(k) = X(k) + r$$

where $X(k)$ is the $k$-th element of the wolf's position vector, $r$ is a random number in (-1, 1), and the new position $X^*$ is adopted only if it Pareto-dominates the old one. The algorithm maintains an external archive of non-dominated Pareto solutions and uses a leader selection strategy to update the alpha, beta, and delta wolves. We set the population size to 100 and run for 100 iterations, optimizing the four gear hobbing parameters to minimize $C$ and $T$.
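The three modifications described above (Latin hypercube initialization, the non-linear schedule for $a$, and the one-dimensional exploration step) can be sketched as below. This is an illustrative fragment, not the full MOGWO loop: function names are hypothetical, clamping the perturbed coordinate to its bounds is an added assumption, and the objective function in the usage lines is a toy stand-in for the trained BPNN.

```python
import random


def control_parameter(i, i_max, big_i=100.0):
    """a = 2 - 2/(I-1) * (I**(i/i_max) - 1); goes from 2 at i=0 to 0 at i=i_max."""
    return 2.0 - 2.0 / (big_i - 1.0) * (big_i ** (i / i_max) - 1.0)


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points with one sample per stratum in every dimension."""
    columns = []
    for lo, hi in bounds:
        cells = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(cells)  # decouple the strata across dimensions
        columns.append([lo + c * (hi - lo) for c in cells])
    return [list(point) for point in zip(*columns)]


def dominates(u, v):
    """True if objective vector u Pareto-dominates v (all objectives minimized)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def explore(x, objectives, bounds, rng):
    """Perturb one random dimension by r in (-1, 1); keep the move only if it dominates."""
    k = rng.randrange(len(x))
    lo, hi = bounds[k]
    trial = list(x)
    trial[k] = min(hi, max(lo, trial[k] + rng.uniform(-1.0, 1.0)))  # clamp: assumption
    return trial if dominates(objectives(trial), objectives(x)) else x


rng = random.Random(0)
bounds = [(280, 360), (3.5, 4.5), (360, 420), (4, 8)]  # n1, f1, n2, f2 levels from the text
wolves = latin_hypercube(100, bounds, rng)

# Halfway through the run, a is still about 1.82 (a linear decay would give 1.0).
print(round(control_parameter(50, 100), 2))  # 1.82
```

Because $a$ stays close to 2 for most of the run and only drops sharply near the end, the wolves keep taking large steps for longer than under the usual linear decay.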

The optimization yields a set of Pareto-optimal solutions, representing trade-offs between carbon emissions and processing time. To select the best compromise solution for gear hobbing, we apply the entropy-weighted TOPSIS method. Entropy weighting objectively assigns weights based on the variability of each objective in the Pareto set, avoiding subjective bias. For each solution, we calculate the Euclidean distances to the positive ideal solution (minimum $C$ and $T$) and negative ideal solution (maximum $C$ and $T$), then compute a comprehensive score:

$$S_i = \frac{D_i^-}{D_i^+ + D_i^-}$$

where $D_i^+$ is the distance to the positive ideal, and $D_i^-$ is the distance to the negative ideal. The solution with the highest score is chosen as optimal. This approach ensures a balanced decision considering both environmental and efficiency goals in gear hobbing.
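A compact sketch of this ranking step is given below. It follows one common variant of the entropy weight method (column proportions, Shannon entropy scaled by $\ln m$) and of TOPSIS (vector normalization, both objectives treated as costs); the source does not specify which normalization it uses, so these choices are assumptions, as are the function names and the three sample points.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for a matrix of positive values (rows: solutions, columns: objectives)."""
    m, n = len(matrix), len(matrix[0])
    divergence = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        p = [v / total for v in column]
        entropy = -sum(q * math.log(q) for q in p if q > 0) / math.log(m)
        divergence.append(max(0.0, 1.0 - entropy))
    s = sum(divergence)
    if s == 0:  # every column carries the same information: fall back to equal weights
        return [1.0 / n] * n
    return [d / s for d in divergence]


def topsis_scores(matrix, weights):
    """Closeness S_i = D_i^- / (D_i^+ + D_i^-), with every objective minimized."""
    n = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [min(row[j] for row in v) for j in range(n)]    # positive ideal: minima
    worst = [max(row[j] for row in v) for j in range(n)]   # negative ideal: maxima
    scores = []
    for row in v:
        d_plus = math.dist(row, best)
        d_minus = math.dist(row, worst)
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Hypothetical Pareto points (C in kg CO2, T in s), for illustration only.
pareto = [(3.47, 1450.0), (3.50, 1412.0), (3.56, 1397.0)]
w = entropy_weights(pareto)
s = topsis_scores(pareto, w)
best_index = max(range(len(s)), key=s.__getitem__)
print(w, s, best_index)
```

The score is undefined when all candidate solutions coincide, since both distances are then zero; a Pareto set with at least two distinct points avoids this.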

We validate our method through experimental gear hobbing on a YS3120CNC6 CNC high-speed hobbing machine, processing a gear made of 45 steel with module 4.5, 34 teeth, and 32 mm width. The process involves rough and semi-finish cuts with a 1 mm allowance. Energy consumption is monitored using a HIOKI PW6001 power analyzer, and carbon emissions are computed with the Southern China grid factor. The experimental design and results are shown in Table 2, which includes 28 runs from Box-Behnken design with measured energy $E$, time $T$, and carbon emissions $C$.

Table 2: Box-Behnken Experimental Design and Results for Gear Hobbing
Run	$n_1$ (rpm)	$f_1$ (mm/min)	$n_2$ (rpm)	$f_2$ (mm/min)	$E$ (kJ)	$T$ (s)	$C$ (kg CO₂)
1	280	3.5	390	6	16052.4	1831	4.058946
2	360	3.5	390	6	16643.2	1832	4.264138
3	280	4.5	390	6	14104.0	1588	3.657571
4	360	4.5	390	6	14165.0	1582	3.759648
5	320	4.0	360	4	17948.8	2026	4.392604
6	320	4.0	420	4	18048.6	2026	4.487854
7	320	4.0	360	8	13687.0	1539	3.634665
8	320	4.0	420	8	13800.0	1540	3.801001
9	280	4.0	390	4	17932.0	2024	4.384952
10	360	4.0	390	4	18184.0	2020	4.521968
11	280	4.0	390	8	13523.0	1526	3.621164
12	360	4.0	390	8	13877.0	1529	3.783049
13	320	3.5	360	6	16918.0	1838	4.240730
14	320	4.5	360	6	14622.4	1591	3.766713
15	320	3.5	420	6	17006.2	1842	4.366966
16	320	4.5	420	6	14894.0	1594	3.934686
17	280	4.0	360	6	15190.0	1697	3.836031
18	360	4.0	360	6	15299.0	1702	3.941555
19	280	4.0	420	6	15235.0	1703	3.952889
20	360	4.0	420	6	15499.4	1689	4.087040
21	320	3.5	390	4	19734.6	2165	4.806601
22	320	4.5	390	4	17393.4	1706	4.278315
23	320	3.5	390	8	15211.2	1659	4.015769
24	320	4.5	390	8	13030.4	1416	3.569705
25	320	4.0	390	6	15098.8	1699	3.904240
26	320	4.0	390	6	15121.7	1703	3.911508
27	320	4.0	390	6	15106.9	1696	3.904467
28	320	4.0	390	6	15122.1	1703	3.911597

The BPNN model trained on this data shows excellent prediction capability, as indicated by linear regression plots and low error percentages. We then apply the improved MOGWO algorithm, which generates a Pareto front of non-dominated solutions after 100 iterations, as visualized in Figure 1. The Pareto front illustrates the trade-off between carbon emissions and processing time for gear hobbing, with solutions ranging from low-carbon but slower to faster but higher-carbon options. Using entropy-TOPSIS, we rank these solutions and select the optimal one: $n_1 = 314$ rpm, $f_1 = 4.5$ mm/min, $n_2 = 360$ rpm, $f_2 = 8$ mm/min. This optimal gear hobbing parameter set yields a total time of 1412.3 s and carbon emissions of 3.496 kg CO₂.

To benchmark our method, we compare it with two approaches: empirical parameter settings (common in practice) and optimization using the NSGA-II algorithm. The empirical setting uses the mid-level values from the design: $n_1 = 320$ rpm, $f_1 = 4.0$ mm/min, $n_2 = 390$ rpm, $f_2 = 6$ mm/min, resulting in $T = 1699.9$ s and $C = 3.879$ kg CO₂. Relative to this baseline, our optimized gear hobbing parameters ($T = 1412.3$ s, $C = 3.496$ kg CO₂) reduce processing time by about 16.9% and carbon emissions by about 9.9%. Moreover, compared to NSGA-II, our improved MOGWO shows faster convergence and better solution diversity, as summarized in Table 3. The improved MOGWO achieves a more evenly distributed Pareto front earlier in the iterations, making it more efficient for gear hobbing optimization.

Table 3: Comparison of Optimization Algorithms for Gear Hobbing (100 Iterations)
Optimization Algorithm	Carbon Emissions $C$ (kg CO₂)	Processing Time $T$ (s)
Improved MOGWO (Our Method)	Optimal: 3.4654, Average: 3.5049	Optimal: 1396.3, Average: 1410.0
NSGA-II	Optimal: 3.4668, Average: 3.5091	Optimal: 1398.5, Average: 1411.3
Empirical Setting	3.879	1699.9
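The improvement over the empirical setting can be recomputed from the reported values. The short check below takes the reduction relative to the empirical baseline; the helper name is hypothetical. Dividing the same differences by the optimized values instead of the baseline gives about 20.4% and 11.0%.

```python
def reduction_pct(baseline, optimized):
    """Percentage reduction of `optimized` relative to `baseline`."""
    return (baseline - optimized) / baseline * 100.0


# Empirical setting vs. the solution selected by entropy-TOPSIS.
time_cut = reduction_pct(1699.9, 1412.3)
carbon_cut = reduction_pct(3.879, 3.496)
print(round(time_cut, 1))    # 16.9
print(round(carbon_cut, 1))  # 9.9
```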

The effectiveness of our small sample data-driven approach is further underscored by its ability to handle limited data while maintaining high accuracy. In gear hobbing, where historical data may be scarce, the combination of Box-Behnken design and BPNN provides a robust predictive model. The improved MOGWO algorithm enhances search capabilities, avoiding local optima that often plague traditional optimization methods. The entropy-TOPSIS decision-making adds objectivity, ensuring that the selected parameters balance both environmental and economic factors. This integrated framework offers a practical tool for manufacturers aiming to adopt sustainable gear hobbing practices without extensive data collection efforts.

In conclusion, we have developed a comprehensive low-carbon optimization decision method for gear hobbing process parameters, leveraging small sample data. Our approach addresses the challenges of data scarcity in manufacturing by using experimental design to gather informative data, neural networks for accurate prediction, and advanced algorithms for efficient multi-objective optimization. The results confirm that optimized gear hobbing parameters can significantly reduce carbon emissions and processing time, contributing to greener production. Future work could explore the integration of tool wear dynamics and real-time adaptive control to further enhance the sustainability of gear hobbing processes. By focusing on small sample solutions, we enable broader adoption in industry, paving the way for data-driven sustainable manufacturing in gear production and beyond.

The mathematical formulations and algorithmic steps detailed here provide a replicable framework for similar manufacturing processes. For instance, the carbon emission model can be adapted to other machining operations, while the optimization strategy can handle additional objectives like tool life or surface quality. In gear hobbing specifically, the emphasis on spindle speed and feed rate optimization underscores the importance of parameter tuning in achieving eco-efficiency. As industries move towards carbon neutrality, such methods will become increasingly valuable, and our work demonstrates that even with small datasets, meaningful improvements are attainable through intelligent data-driven design.

Gear hobbing is a critical process in mechanical engineering, and the method presented here aims to make it more sustainable by tuning practical parameters, namely spindle speeds and feed rates. The tables and equations summarize the key data and relationships needed for implementation. Ultimately, this research contributes to the growing body of knowledge on sustainable manufacturing, offering a viable path for reducing the environmental footprint of gear production through data-informed decisions.