Hiding Sensitive Itemsets Using Sibling Itemset Constraints


Creative Commons License

YILDIZ B., KUT R. A., YILMAZ R.

SYMMETRY-BASEL, vol. 14, no. 7, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 14, Issue: 7
  • Publication Date: 2022
  • Doi Number: 10.3390/sym14071453
  • Journal Name: SYMMETRY-BASEL
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Communication Abstracts, INSPEC, Metadex, zbMATH, Directory of Open Access Journals, Civil Engineering Abstracts
  • Keywords: frequent itemset mining, privacy-preserving data mining, sensitive itemset hiding, FREQUENT, KNOWLEDGE
  • Dokuz Eylül University Affiliated: Yes

Abstract

Advances in data collection and processing have made data mining a popular tool among organizations in recent decades. Sharing information between companies could make this tool more beneficial for each party, but it carries a risk of disclosing sensitive knowledge. Shared data should therefore be modified so that sensitive relationships are hidden. Because frequent itemset discovery is one of the most effective data mining tools that firms use, privacy-preserving techniques are needed so that frequent itemset mining can continue on shared data. Hiding algorithms fall into two types: heuristic and exact. This paper presents an exact itemset hiding approach that uses constraints to obtain a better solution in terms of side effects and minimum distortion of the database. This distortion creates an asymmetric relation between the original and the sanitized database. To lessen the side effects of itemset hiding, we introduce the sibling itemset concept, which is used for generating constraints. Additionally, our approach does not require frequent itemset mining to be executed before the hiding process, which gives it an advantage in total running time. We evaluate our algorithm on benchmark datasets. Our results show that the hiding approach is effective and that eliminating the prior mining of itemsets saves time.
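
The terms used in the abstract (support, hiding, distortion) can be illustrated with a toy sketch. This is not the authors' algorithm: it uses no sibling itemset constraints and no optimization solver, only a brute-force search for the fewest item deletions that push a sensitive itemset below the support threshold. The function names, the example database, and the threshold are all assumptions made for illustration.

```python
from itertools import combinations


def support(db, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def hide_min_distortion(db, sensitive, min_sup):
    """Brute-force search for the fewest item deletions that bring every
    sensitive itemset below `min_sup`. Exponential cost: toy sizes only."""
    # Only deleting an item of a sensitive itemset from a transaction that
    # supports that itemset can lower its support.
    candidates = sorted({(i, item)
                         for i, t in enumerate(db)
                         for s in sensitive if s <= t
                         for item in s})
    # Try deletion sets in increasing size, so the first hit is minimal.
    for k in range(len(candidates) + 1):
        for removal in combinations(candidates, k):
            sanitized = [set(t) for t in db]
            for i, item in removal:
                sanitized[i].discard(item)
            if all(support(sanitized, s) < min_sup for s in sensitive):
                return [frozenset(t) for t in sanitized], k
    return None, None


# Hypothetical five-transaction database; {a, b} is treated as sensitive.
db = [frozenset("abc"), frozenset("ab"), frozenset("abd"),
      frozenset("bc"), frozenset("ac")]
sensitive = [frozenset("ab")]
sanitized, distortion = hide_min_distortion(db, sensitive, min_sup=2)
print(support(db, sensitive[0]), support(sanitized, sensitive[0]), distortion)
```

Here the distortion is simply the number of deleted items. A side effect, in the abstract's sense, would be a non-sensitive itemset that was frequent in `db` but is no longer frequent in `sanitized`; the sketch does nothing to avoid such losses.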