### 1 Introduction

1. Automate feature extraction for time series signal data, eliminating the arduous task of manual feature engineering and minimizing information loss.

2. Enhance the representation of signal data characteristics through robust feature learning, thereby reducing the noise and variation typical of machine health diagnostics.

3. Utilize Bayesian optimization to refine generalization capabilities of the proposed framework, ensuring reliable model performance across different machine states and scenarios.

### 2 Background Theory

### 2.1 1D-Convolutional Autoencoder

The encoder receives inputs *x*_{i}, which are then regenerated as *x̂*_{i} at the output so that the dimensions of these two layers match; this correspondence is what allows the autoencoder to learn to reconstruct the input data effectively. In a subsampling layer, *N* input maps yield *N* output feature maps, with the downsampling described by the equation

$$x_j^{\ell} = f\!\left(\beta_j^{\ell}\,\mathrm{down}\!\left(x_j^{\ell-1}\right) + b_j^{\ell}\right),$$

where *β* and *b* correspond to multiplicative and additive parameters, respectively, and *f* is the activation function. The subsampling function down(·) aggregates each (*n* × 1)-dimensional input feature map into a compressed output of dimensionality reduced by the factor 1/*n*.
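As a minimal illustration of the subsampling operation, the sketch below assumes average pooling for down(·) and scalar *β* and *b* (in practice these are learned per feature map, and an activation *f* would follow); the function names and values are hypothetical:

```python
import numpy as np

def down(x, n):
    """Average-pool an (m*n,)-length 1-D feature map into m values,
    reducing its dimensionality by a factor of 1/n."""
    return x.reshape(-1, n).mean(axis=1)

def subsample_layer(x, beta, b, n):
    """Subsampling layer: multiplicative beta, additive b (scalars here)."""
    return beta * down(x, n) + b

signal = np.arange(8, dtype=float)            # one (8 x 1) input feature map
pooled = subsample_layer(signal, beta=2.0, b=0.5, n=2)
print(pooled)  # [ 1.5  5.5  9.5 13.5] -- dimensionality reduced by 1/2
```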

A residual block computes

$$y = F(x, \{W_i\}) + x,$$

where *x* and *y* are the input and output of the residual block, *F* is the residual function to be learned, and {*W*_{i}} is the set of weights within the block. This residual learning strategy enables networks to be trained by reformulating the layers to learn residual functions with reference to the layer inputs, rather than learning unreferenced functions.

Let *X* and *X̃* denote the input and the output of the autoencoder, respectively. The gradient descent algorithm is used to minimize the reconstruction error between them; in this investigation, we employ the Adaptive Moment Estimation (ADAM) optimization algorithm proposed by Kingma and Ba [17] to efficiently drive this minimization.

### 2.2 Bayesian Optimization

By Bayes' theorem,

$$p(w \mid D) = \frac{p(D \mid w)\,p(w)}{p(D)},$$

where *p*(*w*) denotes the prior probability of an unobserved variable *w*, *p*(*D*|*w*) is the likelihood, and *p*(*w*|*D*) symbolizes the posterior probability.

In Gaussian process regression (GPR), *m*(·) and *k*(·,·) represent the mean function and the covariance function, respectively. A commonly utilized covariance function is the squared exponential, defined as

$$k(x, x') = \exp\!\left(-\frac{\lVert x - x' \rVert^{2}}{2\ell^{2}}\right),$$

where ℓ is a length-scale parameter. At each iteration *t*, the surrogate model *f* is fitted to the preceding (*t* – 1) observed instances.
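A minimal sketch of this surrogate step, assuming a unit length scale, a noise-free GPR posterior mean, and hypothetical observation points (the data here are illustrative, not from the paper):

```python
import numpy as np

def squared_exponential(xi, xj, length_scale=1.0):
    """SE covariance k(xi, xj) = exp(-||xi - xj||^2 / (2 l^2))."""
    d2 = np.sum((np.asarray(xi) - np.asarray(xj)) ** 2)
    return np.exp(-d2 / (2.0 * length_scale ** 2))

# Fit the surrogate to t-1 = 3 previously observed (hyper-parameter, objective) pairs
X = np.array([0.0, 1.0, 2.0])
y = np.sin(X)
K = np.array([[squared_exponential(a, b) for b in X] for a in X])

# GPR posterior mean at a candidate point 0.5: mu = k* K^{-1} y
k_star = np.array([squared_exponential(0.5, b) for b in X])
mu = k_star @ np.linalg.solve(K + 1e-8 * np.eye(3), y)  # jitter for stability
print(float(mu))  # interpolates smoothly between sin(0) and sin(1)
```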

The original hybrid stopping criterion is based on a *Z*-test. The adjusted hybrid stopping criterion refines this by setting a significance level *α* for the hypothesis testing framework

$$H_0: \mu(\lambda) \le f(\lambda^{+}) \quad \text{versus} \quad H_1: \mu(\lambda) > f(\lambda^{+}),$$

where the *p*-value is derived from *γ*(*λ*). Here, *γ*(*λ*) represents the degree of improvement and *PI*(*λ*) = *Φ*(*γ*(*λ*)) represents the probability of improvement, where *γ*(*λ*) is expressed as

$$\gamma(\lambda) = \frac{\mu(\lambda) - f(\lambda^{+}) - \varepsilon}{\sigma(\lambda)},$$

where *σ*(*λ*) is the predictive standard deviation of the surrogate, *λ*^{+} is the set of hyper-parameters that maximizes the objective function so far, and *ɛ* is the minimum amount of improvement required over the best function value observed. Under the adjusted hybrid stopping criterion, the optimization process halts when the *p*-value falls below the significance level *α*, suggesting a significant likelihood of the new sample exceeding the current best observation.
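The stopping rule can be sketched as below; this is a simplified version assuming a maximization objective and using the error function for the standard normal CDF *Φ* (the input values are hypothetical):

```python
import math

def adjusted_hybrid_stop(mu, sigma, f_best, eps, alpha=0.01):
    """Stop when the p-value of H0 falls below alpha.
    gamma = (mu - f_best - eps) / sigma; PI = Phi(gamma); p-value = 1 - PI."""
    gamma = (mu - f_best - eps) / sigma
    pi = 0.5 * (1.0 + math.erf(gamma / math.sqrt(2.0)))  # standard normal CDF
    p_value = 1.0 - pi
    return p_value < alpha, p_value

# Candidate clearly better than the incumbent: gamma = 4.0, p ~ 3e-5 -> stop
stop, p = adjusted_hybrid_stop(mu=1.50, sigma=0.1, f_best=1.0, eps=0.1)
print(stop)  # True
```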

### 3 Results and Discussion

### 3.1 Experimental Setup

The default ADAM settings *α* = 0.001, *β*_{1} = 0.9, and *β*_{2} = 0.999, as defined in Kingma and Ba [17], were tested for comparative purposes. In the third phase, the latent features from phase two served as the input to a multilayer perceptron neural network classifier to assess its ability to accurately classify bearing anomalies after the 710th cycle, as illustrated in the research by Kim et al. [19]. The experiments were run in a computing environment with an Intel Core i7-8750H CPU, 16 GB RAM, and an NVIDIA GeForce GTX 1070 GPU.
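For reference, a single ADAM update with these default hyper-parameters can be sketched as follows (a minimal scalar version of the rule in Kingma and Ba [17]; the toy objective *f*(*θ*) = *θ*² is purely illustrative):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update with the default hyper-parameters used above."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2, so grad = 2 * theta
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta < 5.0)  # True: theta moves steadily toward the minimum at 0
```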

### 3.2 Data Description

### 3.3 Hyper-parameter Optimization for Adjusted Hybrid Stopping Criteria

We examined the minimum improvement threshold, *ɛ*, within the context of the adjusted hybrid stopping criteria, using the objective function value (the loss from the 1D-CAE) as the criterion. Both the original and adjusted hybrid stopping criteria were evaluated with the significance level set at *α* = 0.01 and baseline hyper-parameters for the 1D-CAE. For the original hybrid stopping criteria, the most notable termination occurred at *ɛ* = 1.0, yielding an objective function value of 0.0138. In contrast, the adjusted criteria demonstrated enhanced performance, concluding the experiment at a superior objective function value of 0.0121 at the same *ɛ* level, highlighting the effectiveness of the adjusted method in refining our 1D-CAE model.