5. Big Data and Privacy

Addressing Big Data Security Issues and Challenges 1

  • Of the data generated in the last three years, only 20% is structured; the remaining 80% is unstructured.
  • Like other current information systems, big data faces security risks during storage, processing, and transmission, and it has the same requirements for data security and privacy protection.

Big Data Security Needs In Different Areas

  • Internet Industry:
    • Problems: preventing data from being damaged, tampered with, leaked, or stolen is very difficult, especially with the rise of e-commerce and mobile usage.
    • Big data security requirements: reliable data storage, secure data analysis, strict operational supervision, following security protection standards, laws and regulations, and industry norms for user privacy.
  • Telecommunications Industry:
    • Problems: data confidentiality, user privacy, and business cooperation in the process of using data through external applications.
    • Big data security requirements: to ensure the confidentiality, integrity and availability of core data and resources and at the same time protect users’ interests, experience and privacy.
    • Efficient security protection is key in the telecommunications industry.
  • E-commerce Industry:
    • The E-Commerce industry has higher requirements for network security and stability.
    • Big data security requirements: data access control, processing algorithms, network security, data management and applications, and the usage of big data security technology to strengthen the internal control of commercial institutions and improve the services to prevent and resolve financial risks.
  • Medical Industry:
    • Safe and reliable data storage is related to the survival of the hospital business.
    • Most medical data owners are reluctant to provide data directly to other organizations or individuals for research and utilization.
    • Big data security requirements: data privacy takes priority over security and confidentiality. Alongside secure and reliable data storage, comprehensive data backup and management are required to help doctors and patients with disease diagnosis, drug development, and management decisions, and to improve hospital services, raise patient satisfaction, and reduce patient turnover.

Big Data Security Challenges

  • As security requirements in various fields change, a new complete chain has formed: data collection, data integration, data refinement, data mining, security analysis, security situation judgment, security detection, and threat detection.
  • Anywhere along this chain, data can be lost, leaked, over-authorized, or tampered with, and the data involved may include user privacy and corporate secrets.
  • 3.1. Big Data User Privacy Protection:
    • Privacy protection has a few subdivisions:
      • Location privacy: to prevent the user’s location information from being leaked.
      • Identifier anonymity: to prevent the user’s identity information from being leaked.
      • Connection anonymity: to prevent the user’s connection information from being leaked.
    • Anonymization alone is not enough to achieve good privacy protection.
  • 3.2. The credibility of big data:
    • One threat to the credibility of big data is forged or deliberately fabricated data; wrong data often leads to wrong conclusions.
    • People tamper with data for various reasons, such as to achieve certain goals, to avoid punishment, or to obtain benefits.
    • Example issues: fake reviews, mixed real and fake reviews, etc.
    • Information security technology alone cannot verify the authenticity of every data source.
    • Data may be tampered with during its transmission through the network; however, signing technology can be used to ensure the integrity of the data.
  • 3.3. Mobile data security issues:
    • Mobile phone usage has increased, and the data generated by mobile phones is also increasing.
  • 3.4. Easy attack on big data:
    • In cyberspace, big data is a large target that is easy for attackers to spot.
  • 3.5. User privacy protection problem:
    • The emergence of big data inevitably increases the risk of user privacy data disclosure.
  • 3.6. Safe storage of massive data:
    • The storage of massive data is a big challenge since relational databases are not suitable for storing big data.
    • NoSQL databases suffer from access control and privacy management model issues, technical vulnerabilities and maturity issues, security issues for authorization and verification, data management and confidentiality issues, etc.
  • 3.7. Big data life cycle changes:
    • Traditional data security is often deployed around the data life cycle, that is, the generation, storage, use, and destruction of data.
    • With the increasing use of big data, the data owners and managers are separated, and the original data life cycle is gradually transformed into the generation, transmission, storage and use of data.
  • 3.8. Trust security issues of big data:
    • It is not easy for people to trust the insights obtained through a big data model, and proving the value of big data itself is more difficult than successfully completing a project.
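
The point in 3.2 about detecting tampering in transit can be illustrated with a small sketch. It assumes the sender and receiver already share a secret key, and it uses a keyed hash (HMAC-SHA256 from the Python standard library) as a stand-in for the "signing technology" the notes mention; a public-key signature scheme would play the same role when no shared key exists. The key, the sample record, and the helper names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag computed over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical shared secret and payload, for illustration only.
key = secrets.token_bytes(32)
record = b'{"user_id": 17, "rating": 5}'
tag = sign(key, record)

# The receiver accepts the untouched record ...
intact_ok = verify(key, record, tag)

# ... and rejects a copy that was altered on the way.
tampered = b'{"user_id": 17, "rating": 1}'
tampered_ok = verify(key, tampered, tag)
```

A check like this only shows that the data was not changed after it was tagged. It says nothing about whether the data was truthful to begin with, which is the fake-review problem described in 3.2.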

Big Data Security and Privacy Protection 3

  • The big data life cycle can be divided into four stages: data collection, data storage, data development and mining, and data application.
  • It is worth noting that the data owner and manager may not be the same person, so there are some security risks in the follow-up information processing.
  • There are two types of privacy protection:
    • Privacy protection based on the anonymous model.
    • Privacy protection based on the differential model.
  • In the past, data was protected with the anonymous model; on its own, this is no longer enough to protect the privacy of the data.
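
As a rough illustration of the second approach, the sketch below assumes that the "differential model" refers to differential privacy and shows its best-known building block, the Laplace mechanism, applied to a counting query. The patient records, the `epsilon` value, and the function names are hypothetical and not taken from the cited paper.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match the predicate.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data set, for illustration only.
patients = [
    {"age": 34, "diabetic": True},
    {"age": 51, "diabetic": False},
    {"age": 47, "diabetic": True},
    {"age": 29, "diabetic": False},
    {"age": 63, "diabetic": True},
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
noisy = private_count(patients, lambda r: r["diabetic"], epsilon=0.5, rng=rng)
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives more accurate answers. Unlike anonymization, the protection here does not depend on guessing which columns an attacker might use to re-identify someone.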

Security and Privacy in Big Data Life Cycle: A Survey and Open Challenges 4

Big Data for Qualitative Research 5

Security Challenges of Big Data Computing 6

Big Data: Trade-off Between Data Quality and Data Security 7

Big Data Security and Privacy Protection 8

Data Security vs. Data Privacy 9

  • Security and privacy are two different concepts.
  • Examples:
    • High security, low privacy: A well-protected building with security cameras and guards, but all rooms are open to each other.
    • High privacy, low security: A small boat in the middle of the ocean with no one around.
  • Data security is concerned with the CIA triad: Confidentiality, Integrity, and Availability; it may also be concerned about theft, loss, damage, intrusion detection, physical security, etc.
  • Privacy is concerned with what a legitimate user, who has obtained access to the data by following all data security protocols, is permitted to do with that data.
  • Data privacy is defined as the enterprise keeping its promises of how it will use the data; this includes personal information, license agreements, etc.
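
The distinction above can be sketched as two separate checks: a security check on who is asking, and a privacy check on what the data will be used for. Everything in the sketch is hypothetical (the `User` and `Record` types, the `analyst` role, and the purpose labels); it only shows that passing the first check does not imply passing the second.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    owner: str
    payload: str
    # Purposes the enterprise promised the data owner it would honor.
    allowed_purposes: frozenset


@dataclass
class User:
    name: str
    authenticated: bool
    roles: set = field(default_factory=set)


def may_access(user: User) -> bool:
    """Security question: is this a legitimate, authorized user?"""
    return user.authenticated and "analyst" in user.roles


def may_use(record: Record, purpose: str) -> bool:
    """Privacy question: is this use covered by the promised purposes?"""
    return purpose in record.allowed_purposes


def read(user: User, record: Record, purpose: str) -> str:
    """Release the payload only if both checks pass."""
    if not may_access(user):
        raise PermissionError("security: user is not authorized")
    if not may_use(record, purpose):
        raise PermissionError("privacy: purpose not covered by the promise")
    return record.payload


# Hypothetical user and record, for illustration only.
analyst = User("dana", authenticated=True, roles={"analyst"})
record = Record("patient-1", "blood pressure 120/80", frozenset({"treatment"}))

allowed = read(analyst, record, "treatment")  # passes both checks
try:
    read(analyst, record, "marketing")  # authorized user, unpromised use
    blocked = False
except PermissionError:
    blocked = True
```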

References


  1. Gupta, N. K. (2018). Addressing big data security issues and challenges. International Journal of Computer Engineering & Technology, 9(4), 229-237. https://iaeme.com/MasterAdmin/Journal_uploads/IJCET/VOLUME_9_ISSUE_4/IJCET_09_04_025.pdf 

  2. Hoeren, T., & Kolany‐Raiser, B. (2017). The Importance of Big Data for Jurisprudence and Legal Practice. In Dopke, C (Ed.), Big data in context: Legal, social and technological insights (pp. 13-19). SpringerOpen. https://link.springer.com/content/pdf/10.1007/978-3-319-62461-7.pdf

  3. Huang, K. (2021). Big data security and privacy protection. International Journal of Higher Education Teaching Theory, 1(2), 200-203. http://www.acadpubl.com/Papers/Vol%202,%20No%201%20(IJHETT%202021).pdf#page=208

  4. Koo, J., Kang, G., & Kim, Y.G. (2020). Security and privacy in big data life cycle: A survey and open challenges. Sustainability, 12(24), 10571. https://doi.org/10.3390/su122410571 

  5. Mills, K. A. (2019). Big data for qualitative research. Routledge focus. https://doi.org/10.4324/9780429056413 

  6. Sriram, G. K. (2022). Security challenges of big data computing. International Research Journal of Modernization in Engineering Technology and Science, 4(1), 1164-1171. https://www.irjmets.com/uploadedfiles/paper/issue_1_january_2022/18527/final/fin_irjmets1643004117.pdf 

  7. Talha, M., El Kalam, A. A., & Elmarzouqi, N. (2019). Big data: Trade-off between data quality and data security. Procedia Computer Science, 151, 916-922. https://doi.org/10.1016/j.procs.2019.04.127

  8. Zhang, D. (2018). Big data security and privacy protection. Proceedings of the 8th International Conference on Management and Computer Science (ICMCS 2018), 77, 275-278. https://www.atlantis-press.com/article/25904185.pdf

  9. Serious data solutions. (2020, March 5). Data security vs. Data privacy [Video]. YouTube. https://www.youtube.com/watch?v=rxF99MPnIyk