Convergence of cybersecurity and big data science
Prof Jan Eloff and Johan Smit
The successful convergence of cybersecurity and big data science necessitates a clear understanding of big data, data science and cybersecurity. A global survey by the international Ponemon Institute, in conjunction with IBM, found that companies that leveraged the convergence of cybersecurity and big data science dramatically improved their overall cyber and information security posture.
Research conducted by the Cybersecurity and Big Data Science Research Group at the University of Pretoria examined the Cybersecurity Framework of the National Institute of Standards and Technology (NIST) to obtain an understanding of the convergence benefits of cybersecurity and big data science. This provided the foundation for several projects aimed at improving detection mechanisms by leveraging these convergence benefits.
DESCRIBING BIG DATA
The volume of data is one way to describe big data, and there is far more data today than ever before. On Twitter alone, over 500 million tweets are sent per day, and mobile traffic is expected to grow from 11.5 exabytes per month in 2017 to 77 exabytes per month by 2022. However, big data is defined by more than just the volume of data. It is also described in terms of variety (whether the data is structured or unstructured) and velocity (how fast the data is created and moved).
One of the most important components of data science is machine learning, a field that spans disciplines such as computer science, statistics, mathematics, psychology and brain sciences. Combining machine learning with big data is a powerful development and forms the basis for the convergence of cybersecurity and big data science.
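To put the mobile traffic figures quoted in this section in perspective, the short calculation below derives the compound annual growth rate they imply. It is a sketch only: it assumes both figures are monthly volumes and that growth is smooth over the five years, and the function name `implied_annual_growth` is a hypothetical helper introduced for illustration.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the text, in exabytes per month (assumed comparable units).
traffic_2017 = 11.5
traffic_2022 = 77.0

growth = implied_annual_growth(traffic_2017, traffic_2022, 2022 - 2017)
print(f"Implied growth per year: {growth:.1%}")
```

Under these assumptions the traffic grows by roughly 46% per year, which means the volume more than doubles every two years.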
THE CYBERSECURITY FRAMEWORK
The NIST Cybersecurity Framework was developed in the USA to help companies understand the scope of cybersecurity and minimise their risk exposure. It consists of five functions, which can be used to explain the convergence benefits of cybersecurity and big data science: Identify, Protect, Detect, Respond and Recover.
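As a small illustration of how data analysis can support the Detect function, the sketch below flags observations that deviate strongly from a historical baseline. It is not the research group's method: the z-score rule, the threshold of 3, the function name `flag_anomalies` and the login counts are all assumptions chosen to keep the example minimal.

```python
from statistics import mean, pstdev


def flag_anomalies(baseline, observed, threshold=3.0):
    """Return indices of observed values that lie more than `threshold`
    standard deviations away from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly constant baseline: any different value is unusual.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]


# Hypothetical daily login counts for one user account.
baseline_logins = [4, 5, 6, 5, 4, 6, 5, 5]
todays_logins = [5, 6, 42, 4]

suspicious = flag_anomalies(baseline_logins, todays_logins)
print(suspicious)
```

A production system would typically learn far richer user behaviour models from much larger data sets, but the principle of comparing new activity against a learned baseline is the same.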
[Figure: the five functions of the NIST Cybersecurity Framework (Identify, Protect, Detect, Respond, Recover), annotated with convergence benefits: big data technologies, big data analysis, big data visualisation, machine learning models, automation, user behaviour models, forensics, attack detection, faster response, prediction, improve existing tools and data protection.]
RESEARCH FOCUS · INNOVATE 15 2020