One Size Does Not Fit All: Trade-offs between Misuse Probability and Level of Sanitization for Big Data
D. Radhika1, D. Aruna Kumari2

1D. Radhika, Research Scholar, Department of Computer Science Engineering, K L University, Guntur (A.P), India.
2Dr. D. Aruna Kumari, Professor, Department of CSE, VJIT, Hyderabad (Telangana), India.
Manuscript received on 19 October 2019 | Revised Manuscript received on 25 October 2019 | Manuscript Published on 02 November 2019 | PP: 3606-3611 | Volume-8 Issue-2S11 September 2019 | Retrieval Number: B14510982S1119/2019©BEIESP | DOI: 10.35940/ijrte.B1451.0982S1119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Big data privacy has gained importance as cloud computing has become a widely adopted remote platform for sharing computing resources without geographical or time restrictions. However, privacy concerns about big data outsourced to public cloud storage still exist. Various anonymization and sanitization techniques have been developed to protect big data from privacy attacks. In our prior work, we proposed a misusability probability based metric that estimates the probable percentage of misusability. We also proposed a framework that, based on this misusability probability, suggests a level of sanitization before privacy protection is actually applied to big data. In this paper, we further evaluate our misuse probability based approach to big data sanitization by defining an algorithm that analyses the trade-offs between misuse probability and level of sanitization. The paper describes the proposed framework and the misusability measure, and evaluates the framework through an empirical study. The empirical study is carried out in a public cloud environment using Amazon EC2 (compute engine), S3 (storage service) and EMR (MapReduce framework). The experimental results reveal the dynamics of the trade-offs between misuse probability and level of sanitization. These insights help in making well-informed decisions when sanitizing big data, so that it is protected without losing the required utility.
Keywords: Big data, Privacy of Big Data, Sanitization, Misuse Probability, Utility of Big Data.
Scope of the Article: Big Data Analytics and Business Intelligence