
WA5. Big Data and Privacy

Statement

For this week’s written assignment, answer the following questions:

  • Explain the three main threats to big data security for organizations.
  • Identify and explain three steps organizations need to take to ensure they address these threats with appropriate plans to mitigate in case of a breach.

Answer

Introduction

Data security and privacy are distinct concepts. Data security is concerned with the CIA triad (confidentiality, integrity, and availability), while privacy is concerned with protecting personal information, even from legitimate data owners (Serious Data Solutions, 2020).

Big data security usually refers to the combination of both concepts: protecting all data at every stage of its life cycle, from collection through storage, analysis, and utilization to destruction (Koo & Kim, 2016).

Three Main Threats to Big Data Security for Organizations

During each step of the big data life cycle, data may be leaked, tampered with, or lost due to insider or outsider factors. Outsider factors include hackers, malware, third-party services, and any other external entity that the organization does not control. Insider factors include employees, hardware, internal software, and anything else the organization controls. Each of the three main threats compromises one of the three pillars of the CIA triad.

Data leaks compromise the confidentiality part of the CIA triad: unauthorized entities gain access to data that they should not access. Outsider factors may include backdoor attacks, malware, authentication-related attacks, or any other attack that tries to bypass authentication or authorization measures. Insider factors may include buggy software, misconfigured systems, or employees with malicious intent. The danger of data leaks extends beyond the organization, as stolen data can be used to impersonate users or as a vector for further, stronger attacks.

Data tampering compromises the integrity part of the CIA triad: unauthorized entities modify data without the consent or knowledge of the data owner or manager. Outsider factors usually involve taking control of data sources and generating incorrect data, or intercepting connections and modifying data in transit over the network (a man-in-the-middle attack). Insider factors usually involve employees modifying data for personal gain or through human error. The danger of data tampering is that analysis built on altered data leads to wrong conclusions and decisions.

Data loss compromises the availability part of the CIA triad: data becomes unavailable or is destroyed due to hardware failure, software bugs, or malicious attacks. Outsider factors usually involve DoS (denial of service) attacks, ransomware, or any other attack that aims to take down hardware or corrupt stored data. Insider factors may include data deletion or hardware destruction, whether by mistake or on purpose. The permanent loss of data can cause catastrophic losses on multiple levels, although permanent loss is much less likely when cloud storage with proper backup mechanisms is used.

Mitigation Steps in Case of a Breach

To mitigate data leaks, organizations need to implement fine-grained access control mechanisms that follow the principle of least privilege, where users have access to data only when and if they need it. Organizations should also put extra effort into securing networks and deploying well-configured firewalls; because implementing these measures in-house is expensive, a well-tested cloud solution is often the recommended option. All data should be encrypted at all times using a proper algorithm, with keys not stored in proximity to the data, so that even if data is leaked, it cannot be read.
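The least-privilege idea above can be illustrated with a minimal Python sketch. It is not a production access control system: the `Grant` and `AccessPolicy` names, the dataset and action strings, and the time-to-live approach are all assumptions made for illustration. The sketch denies everything by default and allows an action only while an explicit, unexpired grant exists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical time-bounded permission for one user on one dataset."""
    user: str
    dataset: str
    action: str           # e.g. "read" or "write" (illustrative values)
    expires_at: datetime


class AccessPolicy:
    """Deny by default; allow only explicit, unexpired grants."""

    def __init__(self):
        self._grants = []

    def grant(self, user, dataset, action, ttl, now=None):
        # The grant lapses automatically once `ttl` has passed.
        now = now or datetime.now(timezone.utc)
        self._grants.append(Grant(user, dataset, action, now + ttl))

    def is_allowed(self, user, dataset, action, now=None):
        now = now or datetime.now(timezone.utc)
        return any(
            g.user == user
            and g.dataset == dataset
            and g.action == action
            and now < g.expires_at
            for g in self._grants
        )
```

For example, granting a hypothetical user "alice" read access to a "sales" dataset for one hour lets `is_allowed` return True only for that user, dataset, and action, and only until the hour is over.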

To mitigate data tampering, data sources should sign the data they send to the base system so that any alteration during transit is flagged. Effective surveillance and intrusion detection mechanisms at the base should be used to detect anomalies from data sources, for example when one source is hacked. Proper logging and/or ledger databases should also be used so that no data mutation goes unnoticed.
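The two ideas in this paragraph, signing data in transit and keeping a tamper-evident log, can be sketched with Python's standard library. This is an illustrative sketch under stated assumptions, not a recommended design: it uses HMAC-SHA256 with a shared secret key (one possible signing technique; the source does not name one), and `HashChainLog` is a hypothetical, simplified stand-in for a ledger database.

```python
import hashlib
import hmac
import json


def sign(key: bytes, payload: bytes) -> str:
    """Return an HMAC-SHA256 tag that the data source attaches to a payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key: bytes, payload: bytes, signature: str) -> bool:
    """Recompute the tag at the base system and compare in constant time."""
    return hmac.compare_digest(sign(key, payload), signature)


class HashChainLog:
    """Append-only log in which each digest depends on the previous entry."""

    GENESIS = "0" * 64  # assumed starting value for the chain

    def __init__(self):
        self.entries = []  # list of (record, digest) pairs

    @staticmethod
    def _digest(prev: str, record: dict) -> str:
        body = json.dumps(record, sort_keys=True).encode()
        return hashlib.sha256(prev.encode() + body).hexdigest()

    def append(self, record: dict) -> None:
        prev = self.entries[-1][1] if self.entries else self.GENESIS
        self.entries.append((record, self._digest(prev, record)))

    def is_intact(self) -> bool:
        # Recompute the whole chain; any edited record breaks it.
        prev = self.GENESIS
        for record, digest in self.entries:
            if self._digest(prev, record) != digest:
                return False
            prev = digest
        return True
```

A payload altered in transit fails `verify`, and a record changed after it was logged makes `is_intact` return False. A shared-key HMAC only proves that the sender knew the key; asymmetric signatures would be needed if the base system must not be able to forge messages itself.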

To mitigate data loss, organizations need to implement proper backup mechanisms in which multiple backups are updated regularly and kept physically separate from each other. Cloud solutions also make backups easy to implement. Good rate limiting and load balancing techniques are especially important for keeping hardware running and withstanding distributed DoS (DDoS) attacks.
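Rate limiting can take many forms; one common technique is the token bucket, sketched below in Python. The class name, parameters, and the injectable `clock` argument are assumptions made for illustration, and a real deployment would typically keep one bucket per client and enforce it at a gateway or load balancer rather than in application code.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock            # injectable so the bucket can be tested
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it is throttled."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that has passed, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

Requests beyond the allowed rate are rejected cheaply before they reach storage or compute, which helps keep hardware available under load; on its own this does not stop a large distributed attack, which is why the text pairs it with load balancing.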

Conclusion

Privacy and security of big data are closely related concepts, and it is hard to focus on one without the other. All organizations need to stay informed about the standards, best practices, and laws around the security of big data and apply any required changes as soon as they are due.

Data leakage, tampering, and loss pose a great danger to the organization (data owners and managers), to users, and to society itself. Using good cloud solutions, focusing on network security, applying encryption and proper access control, and deploying good surveillance and monitoring tools can mitigate a large portion of data security threats.

References