The application of Machine Learning (“ML”) algorithms requires vast amounts of data to drive decision-making in several industrial processes.
In this regard, novel technological paradigms such as the Internet of Things (IoT) enable the generation of different types of data structures, as has been observed in works focusing on Big Data Analytics.
In general, data follows a life cycle with the following stages: source, collection, storage, processing, visualization, transmission, and application.
However, the data gathered for processing in subsequent steps is, most of the time, mixed with noise generated by the surrounding environment, making it difficult to separate the original data set from that noise.
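As a minimal illustration of the noise problem, the sketch below applies a sliding median filter to a synthetic sensor trace. The filter choice, the function name `moving_median`, the window size, and the synthetic signal are all assumptions made for illustration; the text does not prescribe a specific denoising method.

```python
import random
import statistics


def moving_median(samples, window=5):
    """Return a smoothed copy of `samples` using a centered sliding median."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        # Clamp the window at both ends of the series.
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(statistics.median(samples[lo:hi]))
    return smoothed


# Hypothetical sensor trace: a constant reading of 20.0 with small Gaussian
# noise and two injected spikes standing in for environmental interference.
rng = random.Random(0)
clean = [20.0] * 50
noisy = [x + rng.gauss(0, 0.2) for x in clean]
noisy[10] += 15.0
noisy[30] -= 15.0

filtered = moving_median(noisy, window=5)

# Largest deviation from the clean signal, before and after filtering.
max_err_noisy = max(abs(a - b) for a, b in zip(noisy, clean))
max_err_filtered = max(abs(a - b) for a, b in zip(filtered, clean))
print(f"max error before: {max_err_noisy:.2f}, after: {max_err_filtered:.2f}")
```

A median filter suppresses isolated spikes well but cannot remove noise that overlaps the signal in both amplitude and duration, which is the harder case described above.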
In addition, fast-changing dynamic environments and varying machine working states pose significant challenges to ML-based fault detection.
Overall, the need for reliable and accurate real-time transmission and computation arises, while security issues are becoming increasingly serious due to the growing interconnection among the different subsystems.
ML algorithms can be categorized into supervised learning, unsupervised learning, reinforcement learning (RL), and deep learning (DL) algorithms.
Each category is briefly described below:
- Supervised learning is a method in which experts provide known outputs for specific inputs to train the algorithm; it is widely used for classification and regression. Supervised ML is therefore usually employed in scenarios where labeled data are available. Popular algorithms include artificial neural networks (ANNs) and support vector machines (
- Unsupervised learning is a method in which no external feedback is provided and the algorithm finds patterns in unknown data sets (clustering, association rules, self-organizing maps); unlabeled data are therefore used for training. One of the most popular and well-known unsupervised algorithms is principal component analysis (PCA), mainly used for monitoring purposes.
- Reinforcement learning operates without labeled data, examining whether a chosen action resulted in a reward with respect to a specific performance metric. RL takes sequential actions and observes their outcomes, selecting those that best fit the problem at hand. RL thus departs significantly from the other learning categories, which rely on leveraging historical data: it builds intelligence from its own previous decisions and the rewards they produced.
- Deep learning is a method in which multiple layers are stacked to build an ANN that can make intelligent decisions and handle large amounts of highly complex data without human intervention. Examples of DL algorithms are convolutional neural networks (CNNs), restricted Boltzmann machines (RBMs), and auto-encoders (AEs).
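The reward-driven loop described for reinforcement learning can be sketched with an epsilon-greedy multi-armed bandit, which is the simplest RL setting rather than a full sequential decision problem. The function name `epsilon_greedy_bandit`, the reward probabilities, and the hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Estimate the value of each action purely from observed rewards.

    `reward_probs[a]` is the (hidden) probability that action `a` pays 1.0.
    The agent never reads these directly; it only sees sampled rewards.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions    # how often each action was tried
    values = [0.0] * n_actions  # running mean reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(n_actions), key=lambda a: values[a])

        # The environment returns a reward for the chosen action.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the running mean reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


# Hypothetical environment with three actions of increasing payoff.
values, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", [round(v, 2) for v in values])
print("times chosen:", counts, "best action:", best_action)
```

No labeled examples are supplied here: the estimates come entirely from the agent's own earlier choices and the rewards they produced, which is the distinction drawn above between RL and the categories that learn from historical data.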
As the Industry 4.0 era is upon us, ML algorithms are being adopted at an ever-increasing rate to satisfy the needs of different aspects of industrial settings. These include process monitoring and quality control, fault detection and diagnosis, as well as machine health monitoring and predictive maintenance. Moreover, the capability of ML to process an abundance of data in a timely manner is critical to safeguarding the cyber-security of interconnected manufacturing environments enabled by the Industrial Internet of Things, accurately detecting and mitigating threats