Unsupervised Machine Learning to Detect Abnormal Activities using CNN & 3D Spatial-Temporal AutoEncoder (3DSTAE)
DOI:
https://doi.org/10.21467/proceedings.7.6.39
Keywords:
camera, long term care, autoencoder
Abstract
Live video feeds are now central to security, traffic monitoring, and industrial process oversight. Deep learning models such as convolutional neural networks (CNNs) and autoencoders are well suited to analysing these streams. This paper presents an approach that combines a CNN with an autoencoder to monitor live video and detect abnormal activity. The CNN is trained to extract the salient features of a scene, and the autoencoder is used to flag events that deviate from normal behaviour. The CNN learns what is normal from a large number of ordinary video clips, and the autoencoder learns to reconstruct those normal clips accurately. At test time, the autoencoder detects anomalies by comparing its reconstructed clips with the originals. The approach was evaluated on the UCSD Pedestrian dataset, which contains many pedestrian walking scenes. The results indicate that the system is accurate and outperforms other methods for detecting anomalies in live video. This could be valuable in security, traffic, and industrial settings where problems must be caught quickly. The study concludes that CNNs and autoencoders work well together for monitoring video and detecting abnormal activity as it happens, and that deep learning can support video analysis across a wide range of settings.
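The reconstruction-based test described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: a PCA-based linear autoencoder stands in for the CNN and 3D spatial-temporal autoencoder, the clips are synthetic rather than UCSD Pedestrian data, and the function names, clip shape, and 99th-percentile threshold are assumptions made only for illustration. The idea it shows is the one stated above: fit on normal clips only, then flag test clips whose reconstruction error is unusually high.

```python
import numpy as np


def fit_linear_autoencoder(normal_clips, n_components):
    """Fit a PCA-based linear autoencoder on flattened normal clips.

    Stand-in (assumption) for the CNN + 3D autoencoder; hypothetical helper.
    """
    X = normal_clips.reshape(len(normal_clips), -1)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centred training data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(clips, mean, components):
    """Mean squared error between each clip and its reconstruction."""
    X = clips.reshape(len(clips), -1) - mean
    recon = (X @ components.T) @ components  # encode, then decode
    return np.mean((X - recon) ** 2, axis=1)


rng = np.random.default_rng(0)
frames, height, width = 8, 16, 16  # assumed clip shape
dim = frames * height * width
basis = rng.normal(size=(4, dim))  # synthetic "normal" patterns


def make_normal_clips(n):
    """Synthetic normal clips: mixtures of a few patterns plus small noise."""
    flat = rng.normal(size=(n, 4)) @ basis + 0.05 * rng.normal(size=(n, dim))
    return flat.reshape(n, frames, height, width)


train_clips = make_normal_clips(200)
test_normal = make_normal_clips(20)
# Synthetic abnormal clips: unstructured content unlike the training data.
test_abnormal = 2.0 * rng.normal(size=(20, frames, height, width))

mean, components = fit_linear_autoencoder(train_clips, n_components=4)

# Threshold taken from errors on normal training clips (assumed rule).
train_err = reconstruction_error(train_clips, mean, components)
threshold = np.percentile(train_err, 99)

err_normal = reconstruction_error(test_normal, mean, components)
err_abnormal = reconstruction_error(test_abnormal, mean, components)
flags_abnormal = err_abnormal > threshold

print("threshold:", threshold)
print("mean error, normal test clips:", err_normal.mean())
print("mean error, abnormal test clips:", err_abnormal.mean())
print("abnormal clips flagged:", int(flags_abnormal.sum()), "of", len(flags_abnormal))
```

In a real system the threshold would normally be tuned on held-out normal footage, since it controls the trade-off between missed events and false alarms.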
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.