ERIM: An ensemble of rare itemset mining and its application in the automotive industry


Akdas D. N., BİRANT D., Taser P. Y.

EXPERT SYSTEMS, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Publication Date: 2022
  • DOI Number: 10.1111/exsy.13122
  • Journal Name: EXPERT SYSTEMS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Applied Science & Technology Source, Biotechnology Research Abstracts, Business Source Elite, Business Source Premier, Compendex, Computer & Applied Sciences, INSPEC, Library, Information Science & Technology Abstracts (LISTA), PsycINFO
  • Keywords: anomaly detection, artificial intelligence, automotive industry, data mining, ensemble learning, rare itemset mining, association rule, downtime
  • Dokuz Eylül University Affiliated: Yes

Abstract

Discovering previously unknown anomalies that are rare and differ dramatically from the majority of the data is a critical need in the automotive industry. Rare itemset mining (RIM), a pattern-based method, has been used for anomaly detection because it yields successful analysis results. However, several aspects still need to be explored, such as improving the mining process by identifying more targeted, valuable and reliable rare itemsets. Motivated by this, the study proposes a novel approach, named ensemble of rare itemset mining (ERIM), which mines weak rare itemsets (WRIs) using different algorithms and aggregates them to obtain strong rare itemsets (SRIs). The study is also the first to combine four RIM algorithms (Apriori Rare, Apriori Inverse, CORI and RP-Growth) as base learners. ERIM is a general methodology that can be applied to any field; in this study, the automotive industry serves as the case study. In the experiments, ERIM was applied to a real-world gear manufacturing dataset to discover anomalies in machine downtimes. The experimental results were evaluated in terms of the number and length of the itemsets, and sample itemsets are also given. The results showed that ERIM gives more reliable common knowledge by jointly considering the relation between WRIs discovered by the base learners. The findings indicated that ERIM was successful in detecting anomalies whose support values are below 7.12. Furthermore, the experimental results show that ERIM discovered the highest number of SRIs, 1403, each of which is a 3-itemset. Finally, the results showed that the proposed method performed 43.37% better on average than state-of-the-art methods on the same dataset.
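
The ensemble idea in the abstract (several base learners each produce weak rare itemsets, which are then aggregated into strong ones) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not state the aggregation rule, so a simple vote count stands in for it; the brute-force `rare_itemsets` function is a hypothetical stand-in for the real base learners (it is not Apriori Rare, Apriori Inverse, CORI or RP-Growth); and the toy transactions and thresholds are invented, not taken from the gear manufacturing dataset.

```python
from collections import Counter
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rare_itemsets(transactions, max_support, max_len=3):
    """Hypothetical stand-in for one base learner (brute force, toy scale only).

    Returns every itemset of up to `max_len` items that occurs at least once
    and whose support is strictly below `max_support`.
    """
    items = sorted(set().union(*transactions))
    found = set()
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            if 0.0 < support(candidate, transactions) < max_support:
                found.add(candidate)
    return found


def aggregate(weak_sets, min_votes):
    """Assumed aggregation rule: keep itemsets reported by >= min_votes learners."""
    votes = Counter(itemset for weak in weak_sets for itemset in weak)
    return {itemset for itemset, n in votes.items() if n >= min_votes}


# Invented toy transactions; items are single letters for brevity.
transactions = [
    frozenset("ab"),
    frozenset("ab"),
    frozenset("abc"),
    frozenset("ac"),
    frozenset("ad"),
    frozenset("bcd"),
]

# Three stand-in learners that differ only in their settings.
weak = [
    rare_itemsets(transactions, max_support=0.2),
    rare_itemsets(transactions, max_support=0.4),
    rare_itemsets(transactions, max_support=0.4, max_len=2),
]

# Itemsets that all three stand-in learners agree on.
strong = aggregate(weak, min_votes=3)
print(sorted("".join(sorted(s)) for s in strong))
```

In this sketch, raising `min_votes` trades coverage for agreement: with `min_votes=1` the result is the union of all weak itemsets, while requiring every learner to agree keeps only the itemsets they have in common.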