Analysis and Design of Machine Learning to Detect Job Failures in Apache Spark

Abstract

Data accumulates in a database over time, so the longer it is collected, the larger it grows. Because the volume of data being processed is very large, processing in Apache Spark can take tens of hours, and sometimes the Apache Spark application even fails. Therefore, to minimize waiting time that could otherwise be avoided or reduced, artificial intelligence, in the form of Machine Learning, will be used to detect whether an Apache Spark application will fail or run successfully. The factors that determine this failure are called features and are generated through a feature engineering process. The purpose of this research is to design a Machine Learning model that can identify which features determine the success or failure of an Apache Spark application. The research method used is the Prototyping process model.
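As a minimal, illustrative sketch of the kind of approach the abstract describes (not the paper's actual implementation), the snippet below trains a classifier on engineered features of past Spark applications and reports which features most influence the failure prediction. The dataset file name, column names, and choice of a random forest classifier are all assumptions made for illustration.

```python
# Sketch only: predicting Spark application failure from engineered features.
# The CSV layout and feature names below are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical history of Spark applications: one row per application,
# engineered features plus a binary label indicating failure.
df = pd.read_csv("spark_app_history.csv")
X = df[["input_size_gb", "executor_memory_gb", "num_executors", "num_stages"]]
y = df["failed"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# How well do the engineered features predict failure on held-out data?
print(classification_report(y_test, model.predict(X_test)))

# Feature importances suggest which features most determine success or
# failure, which is the question the research aims to answer.
for name, importance in zip(X.columns, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```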