GRD Journals- Global Research and Development Journal for Engineering | Volume 5 | Issue 4 | March 2020 ISSN: 2455-5703
Women Abuse Detection in Video Surveillance using Deep Learning
R. Sandhiya, Coimbatore Institute of Technology, Coimbatore, India
A. R. Gokul Prassad, Coimbatore Institute of Technology, Coimbatore, India
D. Gokul Krishnan, Coimbatore Institute of Technology, Coimbatore, India
S. Prajeth Balan, Coimbatore Institute of Technology, Coimbatore, India
Abstract
In this paper, we propose a method for detecting abuse of women in video surveillance systems. The proposed method first applies gender detection using convolutional neural networks (CNNs) to identify whether both a male and a female are present in the location. The abuse action is then detected in three steps: i) detection of the object region by background subtraction, followed by a morphological filter to reduce noise artifacts; ii) estimation of the motion vector using the Combined Local-Global approach with Total Variation (CLG-TV); iii) detection of the abuse action by examining the characteristics of the motion vectors using the Motion Co-occurrence Feature (MCF).
Keywords- Violence detection, Convolutional Neural Networks, Human action analysis, Video surveillance system
I. INTRODUCTION
Women's safety is a major concern throughout the world. Crimes against women occur daily, and concern for the women in our families and society lends urgency to action on this critical and pressing issue. Detecting abuse of women plays a major role in women's safety in public places such as bus stops and streets.
Deep learning is a branch of artificial intelligence that imitates the way the human brain creates patterns and processes data for decision making. It is a subset of machine learning whose networks are capable of learning, even without supervision, from unstructured data; it is also known as deep neural learning or deep neural networks. Deep learning can learn from huge amounts of unlabeled data that would take humans years to interpret and process. Deep learning algorithms resemble the structure of the nervous system, in which each neuron connects to others and passes information along. Deep learning models are organized in layers: each layer accepts information from the previous layer and passes it on to the next, and a typical model has at least three layers. Deep learning lets machines solve complex problems even when the data set is very diverse, unstructured and interconnected. The more deep learning algorithms learn, the better they perform.
Consider a neural network that identifies photos containing a cat. Cats do not all look alike, and photos do not show them in the same light, at the same angle or at the same size. We therefore compile a training set of thousands of images: pictures of cat faces labelled “cat” and pictures of objects that are not cats labelled “not cat”. These images are fed to the neural network. Each image is transformed into data that moves through the network, and the neurons assign weights to different elements; for example, more weight is given to a slightly curved diagonal line than to a perfect 90-degree angle. The final output layer puts together all the pieces of information, such as pointed ears and a nose, and gives the answer “cat”. The neural network compares this answer with the real, human-generated label. If they match, the answer is confirmed; if they do not (the image was of a dog, for instance), the network notes the error and goes back to adjust its neuron weightings. The network then takes another image and repeats the process thousands of times, adjusting its weightings and improving its cat recognition skills.
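The weight-adjustment loop described above can be illustrated with a single artificial neuron. This is only a minimal sketch, not the network used in this paper: the two input features and the toy data are invented for illustration, and a real image classifier would have many layers and learn its features from pixels.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, labels, lr=0.5, epochs=200):
    """Train one sigmoid neuron with gradient descent on labelled samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # difference from the human-provided label
            # Move each weight against the error, scaled by its input.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def predict(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0

# Toy data (invented): two features per image, label 1 = "cat", 0 = "not cat".
samples = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95],
           [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_neuron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

Each pass compares the prediction with the label and nudges the weights in the direction that reduces the error, which is the same idea the cat example describes at a much larger scale.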
II. LITERATURE SURVEY
Dennis Núñez Fernández proposed a method to detect gender using convolutional neural networks (CNNs) [1]. Jinsol Ha, Jinho Park, Heegwang Kim, Hasil Park, and Joonki Paik proposed a method to estimate the motion vector of an object in the image [2]. Sarita Chaudhary, Mohd Aamir Khan, and Charul Bhatnagar proposed a system that automatically detects multiple anomalous activities in videos. It comprises three main steps: moving object detection, object tracking, and behavior understanding for activity recognition [3]. Shiqing Zhang, Xianzhang Pan, Yueli Cui, Xiaoming Zhao, and Limei Liu proposed a framework that uses two individual deep convolutional neural networks (CNNs), a spatial CNN processing static facial images and a temporal CNN processing optical flow images, to separately learn high-level spatial and temporal features on the divided video segments. These two CNNs are fine-tuned on target video facial expression datasets from a pre-trained CNN model [4]. Xiangru Chen, Yue Yu, and Fengxia Li proposed a model that predicts the next several frames of a set of sensor data, which is continuous data but is pre-processed by the proposed embedding method. EMRD extends the earlier Encoder-Recurrent-Decoder (ERD) model and Long Short-Term Memory (LSTM) model, which are used in video-based human body motion prediction [8].
All rights reserved by www.grdjournals.com
Women Abuse Detection in Video Surveillance using Deep Learning (GRDJE/ Volume 5 / Issue 4 / 001)
III. ABUSE EVENT DETECTION METHOD
In this paper, we propose a women abuse detection method that combines the gender detection and motion analysis methods described above. The proposed method first identifies whether both genders are present. It then estimates the motion vector of the object in the image and quantizes the estimated motion vector into 8 directions. The violence event is detected by analyzing the co-occurrence of the directions of the quantized motion vectors.
A. Gender Detection
In deep learning, a convolutional neural network (CNN) is a class of neural network applied to analyzing visual imagery. A CNN takes an image as input, assigns importance (learnable weights and biases) to various objects in the image, and differentiates one from another. The pre-processing required by a CNN is minor compared with other classification algorithms. A CNN is able to capture the spatial and temporal dependencies in an image through relevant filters. The network fits the image dataset better because of the reduced number of parameters and the reuse of weights. In other words, the network can be trained to better understand the composition of the image.
The gender recognition network has eight layers: three convolutional layers, a subsampling layer after each convolutional layer, and two fully connected layers of 512 neurons each. In a CNN, each neuron takes an input, performs a dot product and optionally follows it with a non-linearity. The entire network expresses a single differentiable score function from the raw image pixels at one end to the class scores at the other. A CNN consists of an input layer, an output layer and multiple hidden layers. The hidden layers consist of a series of convolutional layers that convolve their input with a multiplication or other dot product. The activation function is a ReLU layer. It is followed by additional layers such as normalization layers, fully connected layers and pooling layers, which are referred to as hidden layers because their inputs and outputs are masked. The final convolution involves backpropagation in order to weight the final outcome accurately.
Fig. 1: Architecture of the CNN for gender.
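To make the eight-layer structure concrete, the following sketch traces the feature-map size through three convolution and subsampling stages and counts the layers. The 96x96 input size matches the resize step in the code excerpt that follows; the filter counts, the 3x3 kernels, the "same" padding and the 2x2 pooling are assumptions for illustration, since the text does not state them.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window is applied."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(input_size, conv_specs, pool=2):
    """Return (name, height/width, channels) after each conv and pooling layer."""
    shapes = []
    size = input_size
    for i, (filters, kernel) in enumerate(conv_specs, start=1):
        # "Same" padding keeps the spatial size through the convolution.
        size = conv_output_size(size, kernel, padding=kernel // 2)
        shapes.append((f"conv{i}", size, filters))
        # Non-overlapping subsampling halves the spatial size.
        size = conv_output_size(size, pool, stride=pool)
        shapes.append((f"pool{i}", size, filters))
    return shapes

conv_specs = [(32, 3), (64, 3), (128, 3)]  # (filters, kernel size): assumed values
dense_units = [512, 512]                   # two fully connected layers (from the text)

shapes = trace_shapes(96, conv_specs)
flattened = shapes[-1][1] ** 2 * shapes[-1][2]  # inputs to the first dense layer
layer_count = len(shapes) + len(dense_units)

for name, size, channels in shapes:
    print(f"{name}: {size}x{size}x{channels}")
```

Under these assumptions the spatial size halves at each subsampling stage, and three convolutional, three subsampling and two fully connected layers give the eight layers mentioned above.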
    import cv2
    import numpy as np
    # Import path assumed; the source does not show its imports.
    from tensorflow.keras.preprocessing.image import img_to_array

    mal = femal = 0  # set to 1 once a male / female face has been seen

    # Assumed context: `faces` holds the detected face boxes, and `frame`,
    # `model` and `classes` are defined earlier in the program.
    for f in faces:
        # corner points of the face rectangle
        (startX, startY) = f[0], f[1]
        (endX, endY) = f[2], f[3]

        # draw rectangle over face
        cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 255, 0), 2)

        # crop the detected face region
        face_crop = np.copy(frame[startY:endY, startX:endX])
        if face_crop.shape[0] < 10 or face_crop.shape[1] < 10:
            continue

        # preprocessing for gender detection model
        face_crop = cv2.resize(face_crop, (96, 96))
        face_crop = face_crop.astype("float") / 255.0
        face_crop = img_to_array(face_crop)
        face_crop = np.expand_dims(face_crop, axis=0)

        # apply gender detection on face
        conf = model.predict(face_crop)[0]
        print(conf)
        print(classes)

        # get label with the highest confidence
        idx = np.argmax(conf)
        label = classes[idx]
        if label == 'male':
            mal = 1
        if label == 'female':
            femal = 1

    # check whether both male and female are present in the location
    if femal == 1 and mal == 1:
        ...  # body not shown in the source

    class MotionDetection(object):
        ...  # class body not shown at this point in the source
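The presence check can also be written as a small standalone function, which makes the rule easy to test without a camera or a trained model. This is a sketch under assumptions: the function names are hypothetical, and the class order ("male", "female") is assumed rather than taken from the source.

```python
def predicted_label(confidences, classes):
    """Return the class name with the highest confidence score."""
    best = max(range(len(confidences)), key=lambda i: confidences[i])
    return classes[best]

def both_genders_present(face_confidences, classes=("male", "female")):
    """True when at least one face is classified male and one female."""
    labels = {predicted_label(conf, classes) for conf in face_confidences}
    return {"male", "female"} <= labels

# One confidence vector per detected face (invented values).
mixed_scene = [[0.9, 0.1], [0.2, 0.8]]
single_gender_scene = [[0.9, 0.1], [0.7, 0.3]]
```

Here `both_genders_present(mixed_scene)` is true and `both_genders_present(single_gender_scene)` is false, so only the first scene would be passed on to the motion analysis stage.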
B. Object Region Detection
A background subtraction method is used to detect the object region. It is a popular method for isolating the moving parts of a scene by segmenting it into background and foreground. After the adaptive background image is generated, if the difference between the input image and the background image is larger than the preset threshold value, the background image is updated. From the sequence of background-subtracted images shown in Fig. 2, the human's walking action can be easily interpreted. The shape of the human silhouette plays an important role in recognizing human actions, and several methods based on boundary, skeletal, and global descriptors have been proposed to quantify the shape of the silhouette. Global methods such as moments consider the entire shape region to compute the shape descriptor. Boundary methods consider only the shape outline as the defining characteristic of the shape; they include chain codes and landmark-based shape descriptors.
Fig. 2: Background subtracted image.
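The background subtraction and morphological filtering steps can be sketched on a toy image as follows. This is an illustration under assumptions, not the implementation used in the experiments: the running-average background update, the threshold of 30 and the 3x3 structuring element are chosen for the example, and plain Python lists are used so the sketch has no dependencies.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into a running-average background image."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
            for b_row, f_row in zip(background, frame)]

def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose difference from the background exceeds the threshold."""
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
            for b_row, f_row in zip(background, frame)]

def _window(mask, r, c):
    """3x3 neighbourhood of (r, c); pixels outside the image count as background."""
    rows, cols = len(mask), len(mask[0])
    return [mask[i][j] if 0 <= i < rows and 0 <= j < cols else 0
            for i in range(r - 1, r + 2) for j in range(c - 1, c + 2)]

def erode(mask):
    return [[min(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def dilate(mask):
    return [[max(_window(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def open_mask(mask):
    """Morphological opening: erosion followed by dilation removes small specks."""
    return dilate(erode(mask))

# Toy 6x6 grayscale frame: a 3x3 moving object plus one isolated noise pixel.
background = [[0] * 6 for _ in range(6)]
frame = [[0] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        frame[r][c] = 200
frame[5][5] = 200  # noise

raw_mask = foreground_mask(background, frame)
clean_mask = open_mask(raw_mask)
```

In the toy frame the opening keeps the 3x3 object and removes the isolated pixel, which is the noise-reduction role the morphological filter plays after background subtraction.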
    cv2.imshow("Motion Detection", frame)        # show the input frame
    blur = cv2.GaussianBlur(frame, (19, 19), 0)  # smooth the frame to suppress noise
    mask = self.sub.apply(blur)                  # get the background-subtracted mask
    cv2.imshow("sub", mask)                      # show the mask

C. Motion Vector Estimation
In an abuse event, the object's movement occurs irregularly. To exploit this characteristic, we estimate the motion vector in the detected object region using the combined local-global approach with total variation (CLG-TV) [3]. The size of an object in the image is affected by the intrinsic and extrinsic parameters of the camera, so the object's size varies with the location where it is captured. The change in object size also affects the magnitude of the motion vector. To compensate for this, the estimated motion vector (u, v)^T is divided by the size of the object region. Finally, the normalized motion vector m is quantized into 8 directions.
Fig. 3: Motion vector quantization
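The normalization and 8-direction quantization can be sketched as below. The function names are hypothetical, the object size is passed as a single scalar, and the sectors are assumed to be centred on the axis and diagonal directions; the text does not specify these details.

```python
import math

def normalize_motion(u, v, region_size):
    """Divide the motion vector (u, v) by the size of the object region."""
    return u / region_size, v / region_size

def quantize_direction(u, v, bins=8):
    """Map a motion vector to one of `bins` equal angular sectors.

    Sector 0 is centred on the +x axis and indices grow with the angle
    returned by atan2. A zero vector falls into sector 0.
    """
    angle = math.atan2(v, u) % (2 * math.pi)
    sector = 2 * math.pi / bins
    return int((angle + sector / 2) // sector) % bins

m = normalize_motion(4.0, -2.0, 2.0)
codes = [quantize_direction(u, v)
         for u, v in [(1, 0), (1, 1), (0, 1), (-1, 0), (0, -1)]]
```

Because dividing by a positive size does not change the direction of a vector, the normalization affects only the magnitude, while the quantization depends only on the angle.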
D. Motion Feature Extraction
We use the Motion Co-occurrence Feature (MCF) to analyze the characteristics of the motion vectors generated in the object region [4]. The MCF represents the co-occurrence distributions of the motion vectors. It is a global feature that uses local information by considering the temporal and spatial information of the motion vectors.
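The idea of a co-occurrence distribution can be illustrated with a spatial-only sketch. The exact MCF definition is not given in this text, so the following is an assumption: it counts how often each pair of quantized direction codes occurs at horizontally or vertically adjacent pixels, and it ignores the temporal component.

```python
def cooccurrence_counts(directions, bins=8):
    """Count pairs (a, b) of direction codes at adjacent pixels.

    `directions` is a 2-D grid of codes in range(bins); None marks pixels
    outside the object region. Pairs are taken with the right and lower
    neighbour of each pixel.
    """
    counts = [[0] * bins for _ in range(bins)]
    rows, cols = len(directions), len(directions[0])
    for r in range(rows):
        for c in range(cols):
            a = directions[r][c]
            if a is None:
                continue
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and directions[rr][cc] is not None:
                    counts[a][directions[rr][cc]] += 1
    return counts

def nonzero_cells(counts):
    return sum(1 for row in counts for value in row if value > 0)

regular = cooccurrence_counts([[0, 0], [0, 0]])    # everything moves the same way
irregular = cooccurrence_counts([[0, 4], [6, 2]])  # neighbours move differently
```

Regular motion concentrates the counts in a single cell, whereas irregular motion spreads them over many cells, which is the property the detection step relies on.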
Fig. 4: The result of extracting optical flow and MCF.
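    # white image with the same shape as the frame
    img_temp = np.ones(frame.shape, dtype="uint8") * 255
    # keep only the pixels inside the foreground mask
    img_temp_and = cv2.bitwise_and(img_temp, img_temp, mask=mask)
    # convert the masked image to grayscale
    img_temp_and_bgr = cv2.cvtColor(img_temp_and, cv2.COLOR_BGR2GRAY)
    # 256-bin intensity histogram of the masked image (the source labels this
    # step "motion vector estimation and histogram is produced")
    hist, bins = np.histogram(img_temp_and_bgr.ravel(), 256, [0, 256])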
IV. EXPERIMENTAL RESULTS
The proposed algorithm was tested using video that we recorded directly. The performance of the system was evaluated in real time under various conditions, such as different backgrounds, distances, and lighting conditions. The parameter T was set to 10, and an abuse event is detected when the MCF has more than 7 histogram bins that are larger than the threshold value.
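The detection rule can be sketched as a short function. It assumes that T = 10 is the per-bin threshold and that the MCF is available as a flat list of bin values; the text does not state either point explicitly, and the function name is hypothetical.

```python
def is_abuse_event(mcf_histogram, threshold=10, min_bins=7):
    """Flag an event when more than `min_bins` bins exceed `threshold`."""
    active_bins = sum(1 for value in mcf_histogram if value > threshold)
    return active_bins > min_bins

irregular_motion = [12, 15, 11, 30, 14, 11, 18, 25, 3, 0]  # 8 bins above 10
regular_motion = [90, 40, 2, 0, 1, 0, 0, 3, 0, 0]          # 2 bins above 10
```

With these invented histograms, `is_abuse_event(irregular_motion)` is true and `is_abuse_event(regular_motion)` is false: many moderately filled bins indicate irregular motion, while a few dominant bins indicate regular motion.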
a) The result of gender detection
b) The result of abuse detection
Fig. 5: Result of the violence detection with an abuse video input.
The results show that the proposed method effectively detects women abuse events using the convolutional neural network and the generated MCF.
V. CONCLUSION
In this paper, we proposed a method to analyze women abuse events. The proposed method uses a CNN to detect whether both a male and a female are present in the location. To perform classification with high accuracy and fast response time, a straightforward architecture was designed for the CNN. The method then estimates the optical flow in order to use the motion vector characteristics that appear in the object region during a women abuse event. The estimated optical flow is used to extract the MCF, and the abuse event is detected using the irregular motion information in the MCF. The results show that the proposed method effectively detects women abuse events.
ACKNOWLEDGMENT
R. Sandhiya obtained her Bachelor's degree in Information Technology at the CSI College of Engineering, Anna University, in 2011. She obtained her Master's degree in Information Technology at Anna University Regional Centre, Coimbatore, in 2014. She is currently working as a Professor in the Department of Information Technology at Coimbatore Institute of Technology, Coimbatore, India. She is pursuing her Ph.D. in the area of Ontology under the guidance of Dr. M. Sundarambal. A. R. Gokul Prassad, D. Gokula Krishnan, and S. Prajeth Balan are 3rd-year B.Tech. Information Technology students at Coimbatore Institute of Technology who worked on this project.
REFERENCES
[1] Dennis Núñez Fernández, “A Real-Time Recognition System for User Characteristics Based on Deep Learning”, 2018 IEEE XXV International Conference on Electronics, Electrical Engineering and Computing (INTERCON), 2018.
[2] Jinsol Ha, Jinho Park, Heegwang Kim, Hasil Park, and Joonki Paik, “Violence detection for video surveillance system using irregular motion information”, 2018 International Conference on Electronics, Information, and Communication (ICEIC), 2018.
[3] Sarita Chaudhary, Mohd Aamir Khan, Charul Bhatnagar, “Multiple Anomalous Activity Detection in Videos”, 6th International Conference on Smart Computing and Communications (ICSCC), 2017.
[4] Shiqing Zhang, Xianzhang Pan, Yueli Cui, Xiaoming Zhao, and Limei Liu, “Learning Affective Video Features for Facial Expression Recognition via Hybrid Deep Learning”, IEEE Access, Volume 7, 2019.
[5] Devesh K. Jha, Abhishek Srivastav, and Asok Ray, “Temporal Learning in Video Data Using Deep Learning and Gaussian Processes”, International Journal of Prognostics and Health Management, 2016.
[6] Prasad D. Garje, M. S. Nagmode, Kiran C. Davakhar, “Optical flow Based Violence Detection in Video Surveillance”, 2018 International Conference On Advances in Communication and Computing Technology (ICACCT), 2018.
[7] Kewen Yan, Shaohui Huang, Yaoxian Song, Wei Liu, Neng Fan, “Fight recognition for user characteristics using convolution neural networks”, 2017 36th Chinese Control Conference (CCC), 2017.
[8] Xiangru Chen, Yue Yu, Fengxia Li, “Multiple RNN Method to Prediction Human Action with Sensor Data”, 2017 International
[9] Julieta Martinez, Michael J. Black, Javier Romero, “On Human Motion Prediction Using Recurrent Neural Networks”, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[10] Maryam Babaee, Zimu Li, Gerhard Rigoll, “Occlusion Handling in Tracking Multiple People Using RNN”, 2018 25th IEEE International Conference on Image Processing (ICIP), 2018.
[11] Kyoungson Jhang, “Gender Prediction Based on Voting of CNN Models”, 2019 International Conference on Green and Human Information Technology (ICGHIT), 2019.
[12] Kyoungson Jhang, Junsoo Cho, “CNN Training for Face Photo based Gender and Age Group Prediction with Camera”, 2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), 2019.
[13] Ryo Arai, Kazuhito Murakami, “Hierarchical human motion recognition by using motion capture system”, 2018 International Workshop on Advanced Image Technology (IWAIT), 2018.