MACHINE LEARNING APPROACHES FOR FAILURE PREDICTION IN MECHANICAL COMPONENTS
Abstract
In the realm of industrial automation, predictive maintenance has become a pivotal strategy for ensuring operational efficiency, minimizing equipment downtime, and reducing maintenance costs. This study presents an advanced predictive maintenance framework for the early detection of mechanical faults in industrial equipment using machine learning techniques. The primary objective is to classify the operational status of machinery as either "Normal" or "Fault" based on sensor data and operational features. The process starts with data preprocessing, which includes missing value treatment, label encoding, and visual analytics such as class distribution plots, heatmaps, and count plots. This step provided key insights into feature relevance, class imbalance, and potential anomalies, guiding the modeling process. The existing system employs a Ridge Classifier, which demonstrated strong classification capability, with a confusion matrix showing 70,376 true positives, 188,119 true negatives, 4,636 false negatives, and 3,033 false positives. While the overall accuracy was high, the false negatives were a concern because undetected faults are critical in an industrial setting. To address this shortcoming, a proposed system was developed using the CatBoost Classifier, a gradient boosting algorithm optimized for categorical feature handling and high-performance learning. The CatBoost model showed similar performance, achieving 70,936 true positives, 188,119 true negatives, and the same numbers of false negatives (4,636) and false positives (3,033) as the existing model, indicating robust predictive power and consistent classification reliability. The comparative analysis reveals that although both the Ridge and CatBoost classifiers perform effectively, the Categorical Boosting (CatBoost) classifier offers advantages in handling complex categorical data, model interpretability, and training efficiency, making it a suitable candidate for deployment in real-time industrial environments. Future enhancements may focus on reducing false negatives through hybrid modeling or ensemble techniques for even more reliable fault detection.
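
The workflow summarized above can be illustrated with a minimal sketch, assuming pandas, scikit-learn, and the catboost package. The file name, the "status" target column, the median-based imputation, and the train/test split parameters are hypothetical placeholders rather than the exact settings used in the study.

# Minimal sketch of the preprocessing and model-comparison pipeline, under the
# assumptions stated above (column names and file path are placeholders).
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import confusion_matrix
from catboost import CatBoostClassifier

# Load sensor and operational data (path is a placeholder).
df = pd.read_csv("machine_sensor_data.csv")

# Missing value treatment: fill numeric gaps with each column's median.
df = df.fillna(df.median(numeric_only=True))

# Label encoding: map the "Normal"/"Fault" status column to 0/1.
encoder = LabelEncoder()
df["status"] = encoder.fit_transform(df["status"])

X = df.drop(columns=["status"])
y = df["status"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Existing system: Ridge Classifier baseline.
ridge = RidgeClassifier()
ridge.fit(X_train, y_train)
print("Ridge confusion matrix:\n", confusion_matrix(y_test, ridge.predict(X_test)))

# Proposed system: CatBoost Classifier (gradient boosting).
cat = CatBoostClassifier(verbose=0)
cat.fit(X_train, y_train)
print("CatBoost confusion matrix:\n", confusion_matrix(y_test, cat.predict(X_test)))

In a practical deployment, any categorical sensor or operational columns would be passed to CatBoost through its cat_features argument so that they are handled natively rather than one-hot encoded, which is the categorical-handling advantage referred to in the abstract.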