JA5. Big data security¶
Statement¶
Big data security encompasses many different items that can effectively make big data vulnerable, and if not properly structured can allow threat actors to gain access to this data, which at times can contain very private customer/client information.
Identify at least two areas affecting big data security and provide a description of those threats. Compare the two areas and provide details on how they affect organizations.
Answer¶
Introduction¶
Data security and privacy are two different concepts. Data security is concerned with the CIA triad: confidentiality, integrity, and availability. Privacy is concerned with protecting personal information, even from parties that legitimately hold the data (Serious Data Solutions, 2020).
When speaking about areas affecting big data security, we may mean either the stages of the data life cycle at which threats arise, or the types of threats themselves, such as data leaks, tampering, or loss (Koo et al., 2020).
This text takes the first view and discusses two areas affecting big data security: data during collection and data during storage.
Data Collection¶
Data collection is the process in which data sources send events to the base system where they will be stored or utilized according to the purpose of the system and/or the organization’s use case. Data collection is the first step in the big data life cycle and it involves senders communicating with the base system through a network.
Threats that arise during data collection include data leakage, tampering, and loss. Data leakage may occur through eavesdropping, where the attacker intercepts the connection and reads the data being sent. Data tampering may occur through man-in-the-middle attacks, where the attacker intercepts the connection and modifies the data before it reaches the base system. Data loss may occur when the data being sent never reaches the base system because of network issues. Insider threats are less of a concern at this stage.
The data collection area can be secured by investing in network security and network monitoring tools, encrypting data during transit, using appropriate signing techniques to ensure data integrity, and implementing proper rate limiting and load balancing techniques to prevent DoS attacks.
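As a minimal sketch of the integrity measure mentioned above, the following Python example attaches an HMAC-SHA256 tag to each event before it is sent, so the base system can detect modification in transit. It uses a shared-key message authentication code as a simple stand-in for signing, not a public-key digital signature, and it does not encrypt the payload. The key value, the function names, and the event fields are all hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; a real deployment would load it from a secrets manager.
SECRET_KEY = b"example-shared-secret"


def sign_event(event: dict, key: bytes = SECRET_KEY) -> dict:
    """Serialize the event and return it together with its HMAC-SHA256 tag."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_event(envelope: dict, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag on the receiving side and compare in constant time."""
    expected = hmac.new(
        key, envelope["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, envelope["tag"])


if __name__ == "__main__":
    envelope = sign_event({"sensor_id": 7, "reading": 21.5})
    print(verify_event(envelope))  # untouched envelope verifies

    # Simulate a man-in-the-middle changing the reading in transit.
    envelope["payload"] = envelope["payload"].replace("21.5", "99.9")
    print(verify_event(envelope))  # modified payload no longer verifies
```

A tag like this only covers integrity and authenticity. Confidentiality against eavesdropping would still depend on encrypting the connection itself, for example with TLS.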
Data Storage¶
Data storage is the next step in the big data life cycle, where data is accumulated and kept at rest until it is used. Data storage is the most vulnerable stage: data remains there for relatively long periods, and the sheer volume stored makes it an attractive target for attackers.
As with data collection, threats that arise during data storage include data leakage, tampering, and loss. Data leakage may occur through unauthorized access to the storage system using backdoor attacks, social engineering, or network attacks. Data tampering may occur when external or internal actors modify the data after gaining access. Data loss may occur due to hardware failure, software bugs, or malicious attacks. Insider threats such as malicious employees, along with misconfigured systems, are a bigger concern at this stage.
The data storage area can be secured by encrypting data at rest while keeping the keys separate from the data, using proper access control mechanisms that follow the principle of least privilege, implementing good monitoring and logging mechanisms to detect unauthorized modifications, and using proper backup mechanisms that are updated regularly and physically separated from each other.
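One way to picture the detection of unauthorized modifications is a checksum manifest, sketched below in Python under simplifying assumptions. Stored records are modeled as an in-memory dictionary, and the function names and record IDs are hypothetical. The manifest would need to be kept apart from the data and under stricter access control. Otherwise an attacker who can change a record could also rewrite its digest.

```python
import hashlib
from typing import Dict, List


def digest(record: bytes) -> str:
    """Return the SHA-256 hex digest of one stored record."""
    return hashlib.sha256(record).hexdigest()


def build_manifest(records: Dict[str, bytes]) -> Dict[str, str]:
    """Record a digest per record ID at a point when the data is trusted."""
    return {record_id: digest(data) for record_id, data in records.items()}


def find_modified(records: Dict[str, bytes], manifest: Dict[str, str]) -> List[str]:
    """Return the IDs whose content changed, or that went missing, since the manifest was built."""
    changed = []
    for record_id, expected in manifest.items():
        data = records.get(record_id)
        if data is None or digest(data) != expected:
            changed.append(record_id)
    return sorted(changed)


if __name__ == "__main__":
    store = {"cust-1": b"alice,100", "cust-2": b"bob,200"}
    manifest = build_manifest(store)

    store["cust-2"] = b"bob,999"  # simulated unauthorized modification
    print(find_modified(store, manifest))  # lists the record that no longer matches
```

A periodic job that runs such a check and writes the result to a log is one simple form of the monitoring described above. It detects changes after the fact but does not prevent them.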
Conclusion¶
Data security and privacy are two closely related concepts, and threats to them arise at every stage of the big data life cycle in the form of data leakage, tampering, or loss. Organizations need to stay informed about the standards, best practices, and laws around the security of big data and apply any changes as soon as they are due.
The effects on organizations may include financial losses, loss of customers, legal consequences, or drawing the wrong conclusions from the data. When a large volume of data held by a large organization is leaked, the result may be a national disaster, especially if the data contains sensitive information about citizens.
Word count: 600.
References¶
- Serious Data Solutions. (2020, March 5). Data security vs. data privacy [Video]. YouTube. https://www.youtube.com/watch?v=rxF99MPnIyk
- Koo, J., Kang, G., & Kim, Y.G. (2020). Security and privacy in big data life cycle: A survey and open challenges. Sustainability, 12(24), 10571. https://doi.org/10.3390/su122410571