![](https://assets.isu.pub/document-structure/210129142227-c30c3a797be5472482fc7f21391edfdd/v1/0670f12ad69fbbd5f2436f2bb8a387ad.jpg?width=720&quality=85%2C50)
Artificial intelligence in the surveillance sector
By Mats Thulin, Director Core Technologies, Axis Communications
![](https://assets.isu.pub/document-structure/210129142227-c30c3a797be5472482fc7f21391edfdd/v1/52979bb3e0a0f8e13d9cf91632577829.jpg?width=720&quality=85%2C50)
If you’re a bit tired of hearing about the potential for artificial intelligence (AI) in our lives and work, you are not alone! AI has been one of the buzzwords of the past few years, and like all buzzwords, its overuse and misunderstanding can lead people to be skeptical about its potential. While that’s understandable, we shouldn’t let this prevent us from recognizing the real potential of AI in specific applications within video analytics based on machine learning (ML) and deep learning (DL).
Defining AI, ML and DL in surveillance
Artificial intelligence is a branch of computer science that studies and develops methods that allow computers to simulate intelligent behavior. In general terms AI is a very broad concept, but in the specific context of video analytics the principal focus is to increase operational efficiency and add value by automatically processing and analyzing video streams. In this context a subcategory of AI, machine learning, is more specifically relevant. As its name suggests, machine learning allows computers to improve algorithms through ‘learning’ based on real-world examples. The improved algorithms are then used to analyze images or video sequences to generate alarms, metadata or other information.
More recently, attention has turned to a subcategory of ML, deep learning, which describes algorithms based on simulated neural networks. This type of algorithm was inspired by the human vision system, hence the name neural networks. In DL networks, operations are arranged in a hierarchy of increasingly complex and abstract layers, with each layer using the output of the previous one until the final layer draws the conclusion.
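To make the idea of stacked layers concrete, here is a minimal sketch of a forward pass through a small network, assuming plain fully connected layers with random, untrained weights. The layer sizes, the class names and the helper functions are all hypothetical choices for illustration; real video analytics models are far larger and are trained on labeled images.

```python
import math
import random


def linear(inputs, weights, biases):
    """Weighted sum of the previous layer's outputs, one value per neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Simple non-linearity: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_in, n_out, rng):
    """Hypothetical untrained layer: random weights, zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """Feed the input through each layer in turn.

    Every layer only sees the output of the layer before it;
    the last layer's scores are converted to class probabilities.
    """
    activation = features
    for index, (weights, biases) in enumerate(layers):
        activation = linear(activation, weights, biases)
        if index < len(layers) - 1:
            activation = relu(activation)
    return softmax(activation)


rng = random.Random(0)
# Assumed shape: 4 input features -> 8 -> 8 -> 3 object classes.
layers = [random_layer(4, 8, rng), random_layer(8, 8, rng), random_layer(8, 3, rng)]
probabilities = forward([0.2, 0.7, 0.1, 0.9], layers)
print(dict(zip(["person", "vehicle", "other"], probabilities)))
```

Because the weights here are random, the printed probabilities are meaningless; training is what turns this structure into a useful classifier.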
DL models enable more complex analytical algorithms and generally achieve greater precision than traditional ones. In video surveillance systems they are used primarily in the detection, classification and recognition of different types of objects. However, one drawback of DL algorithms is that they require more computational power and more mathematical operations in comparison to traditional algorithms.
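A rough back-of-the-envelope comparison illustrates the difference in mathematical operations. The sketch below counts the multiply-accumulate operations in a single convolutional layer and compares them with simple frame differencing. Frame differencing is used here only as a stand-in for a "traditional" algorithm, and the layer shape and frame size are assumed values, not figures from any particular product.

```python
def conv_layer_macs(out_height, out_width, in_channels, out_channels, kernel_size):
    """Multiply-accumulate operations for one convolutional layer.

    Each output value needs kernel_size * kernel_size * in_channels
    multiply-accumulates, and there are out_height * out_width * out_channels
    output values.
    """
    return (out_height * out_width * out_channels
            * in_channels * kernel_size * kernel_size)


def frame_difference_ops(height, width):
    """Rough count for frame differencing on a grayscale frame:
    one subtraction and one threshold comparison per pixel."""
    return 2 * height * width


# Assumed example: a first layer with 7x7 kernels, 3 input channels,
# 64 output channels and a 112x112 output map, versus a 224x224 frame.
dl_ops = conv_layer_macs(112, 112, 3, 64, 7)
classic_ops = frame_difference_ops(224, 224)
print(f"one conv layer:     {dl_ops:,} multiply-accumulates")
print(f"frame differencing: {classic_ops:,} operations")
print(f"ratio:              {dl_ops / classic_ops:.0f}x")
```

A full DL model stacks many such layers, so the gap per frame is larger still than this single-layer estimate suggests.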
Deep learning’s demand for lots of data
ML and DL require huge amounts of relevant input data for training to achieve good-quality results. If enough relevant data, and computing power, is available for training, ML- and DL-based methods can process it efficiently to produce algorithms with higher precision.
The computer can analyze thousands of images to find details that characterize specific objects in different scenarios. If the data and their descriptions are of high quality, an application based on DL can achieve even greater accuracy. But the availability of high-quality data can be a challenge.
Perhaps countering the general perception of AI, today’s technologies still lack awareness or what might be referred to as general intelligence. In applications where the technology is used, it focuses on very specific problems in limited areas.
For example, for a voice application such as Siri or Alexa to accurately answer our questions, we need to ask very specific and explicit questions. Otherwise, we will get a completely incomprehensible answer. Similarly, in surveillance systems: a poor description of images used for training will result in applications with low accuracy.
Given the current limitations in the accuracy of these technologies, and given that properly and contextually understanding a scene in real detail from a video is still far away, we must be cautious about how and where we use these technologies. The technology today improves efficiency, but the actual decision making in a surveillance scenario must still lie with the security guard or the operator. We must keep a ‘human in the loop’.
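One way to picture a ‘human in the loop’ design is a workflow in which the analytics only filter and prioritize events, while every decision is taken by the operator. The sketch below is a minimal illustration under that assumption; the `Detection` fields, the review threshold and the `operator_confirms` callback are hypothetical names, not part of any real surveillance product.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detection:
    camera_id: str
    label: str
    confidence: float  # score between 0 and 1 reported by the analytics


def triage(detections: List[Detection], review_threshold: float = 0.5) -> List[Detection]:
    """Drop low-confidence detections and sort the rest, most confident first.

    This step only prioritizes; it never triggers an action on its own.
    """
    queue = [d for d in detections if d.confidence >= review_threshold]
    return sorted(queue, key=lambda d: d.confidence, reverse=True)


def handle(detections: List[Detection],
           operator_confirms: Callable[[Detection], bool]) -> List[Detection]:
    """Return only the detections that the human operator has confirmed."""
    return [d for d in triage(detections) if operator_confirms(d)]


events = [
    Detection("cam-1", "person", 0.92),
    Detection("cam-2", "vehicle", 0.35),
    Detection("cam-3", "person", 0.61),
]

# Stand-in for a real operator interface: here the operator only
# confirms events from cam-1.
confirmed = handle(events, operator_confirms=lambda d: d.camera_id == "cam-1")
print(confirmed)
```

The design choice is that the software saves the operator time by narrowing down what to look at, while the final say remains with a person.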
Keeping the surveillance use case in mind
As any new technology matures beyond the initial ‘hype’, its weaknesses and limitations become clear, and only in the areas where it provides real value will we see growth. In surveillance it is important to start with the use-case: what problem are you trying to solve, or what effect are you looking to achieve? With a good understanding of the specific use-case, it is much more feasible to apply ML and DL to achieve a good result.
While we are still at the beginning of the AI journey in surveillance, there are applications and use-cases in which DL analytics is already providing real value for organizations, for instance when browsing large amounts of recorded material in search of specific objects or events, what we often refer to as forensic search.
Analytics applications and the use of DL analytics in surveillance systems will increase, but a cautious approach is needed. Truly understanding the use-cases and the limitations of the technology, together with thorough testing and evaluation to make sure the intended result is achieved, is crucial.
High-quality surveillance images as the foundation
Fundamental to the ability to analyze video is camera and image quality, or what is known as ‘image usability’: image quality is directly reflected in the accuracy of the video analytics. Video cameras in surveillance systems need to operate around the clock, 365 days a year, and cope with temperature fluctuations and different lighting conditions, while still analyzing the image correctly in real time.
One industry trend is that more advanced video analytics are moving to edge devices, with applications running on the cameras themselves. This brings a number of benefits: saving bandwidth, as only the extracted data needs to be transferred from the camera; addressing privacy concerns; saving on expensive server-side hardware; and more accurate analytics, as the video is analyzed before it is compressed and exposed to the risk of quality degradation. Intelligent analytics on the edge will open up numerous opportunities for applications that will further enhance safety and security and deliver additional benefits in operational efficiency.
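The bandwidth argument can be illustrated with simple arithmetic. The sketch below compares streaming full video with sending only per-object metadata from the camera. All figures (a 4 Mbit/s video stream, 5 objects per frame, 100 bytes per object, 10 analyzed frames per second) are assumed values chosen for illustration, not measurements.

```python
def video_bits_per_second(megabits_per_second):
    """Bitrate of a compressed video stream, in bits per second."""
    return megabits_per_second * 1_000_000


def metadata_bits_per_second(objects_per_frame, bytes_per_object, frames_per_second):
    """Bitrate needed to send only the extracted object data."""
    return objects_per_frame * bytes_per_object * 8 * frames_per_second


# Assumed figures for a single camera.
video_rate = video_bits_per_second(4)
metadata_rate = metadata_bits_per_second(
    objects_per_frame=5, bytes_per_object=100, frames_per_second=10
)

print(f"video stream:  {video_rate:,} bit/s")
print(f"metadata only: {metadata_rate:,} bit/s")
print(f"reduction:     {video_rate / metadata_rate:.0f}x")
```

Under these assumptions the metadata needs about one hundredth of the bandwidth of the video stream; the real saving depends on scene activity and on how much video still has to be transferred for recording or verification.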
Patience is a virtue
Those familiar with the Gartner Technology Hype Cycle will know that after initial excitement about innovations in technology, there is an almost inevitable period of frustration when it appears not to be meeting expectations. But rest assured, many people are working behind the scenes to ensure that artificial intelligence – and more specifically machine learning and deep learning – over time will deliver on their potential.
Reference: www.gartner.com/en/research/methodologies/gartner-hype-cycle