International Journal of Scientific & Technology Research


IJSTR >> Volume 10 - Issue 10, October 2021 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

A Near Real-Time Traffic Congestion Monitoring System Using Sentiment Analysis On Twitter Data




Goboitshepo Ororiseng Leroke, Manoj Lall



Social Media; Twitter; Traffic Congestion; Sentiment Analysis; Natural Language Processing; Machine Learning; Topic Modelling.



Traffic congestion is a major challenge facing urban areas around the globe today. Government agencies commonly monitor traffic conditions using CCTV cameras or electronic sensors. These approaches require the maintenance of a large network of sensors and cameras to cover every street in the city, which is impractical and very costly. However, with the advancement of social media in all its forms, including blogs, online forums, Facebook, and Twitter, it is possible to treat social media as a human sensor network. In this article, an alternative traffic monitoring approach that is inexpensive and provides traffic information in near real-time is developed. The proposed approach makes use of Twitter data analytics to report on the prevailing traffic conditions in a particular locality. In addition, the reason behind the traffic congestion is highlighted. Knowing the cause of the congestion is important, as it gives an indication of the severity of the problem. To build the proposed near real-time Twitter-based model, 5 000 tweets were collected over a period of six months for a particular geographical location. The relevant tweets were pre-processed to obtain the applicable features, such as the location from which a post originated and the time at which it was posted. Random Forest, Naïve Bayes, Support Vector Machine and K-Nearest Neighbour were used in the construction of the classification model. The best performing model (Naïve Bayes) was selected for real-time tweet classification. Python’s Natural Language Toolkit (NLTK) and associated libraries were applied to enhance the suitability of the tweets for sentiment analysis and topic modelling. The emotions expressed in the tweets were captured by sentiment analysis, and the reasons behind the traffic congestion were determined by topic modelling. The location, the sentiment and the reasons for the traffic congestion were visualized on a street map. It is envisaged that such a model will assist commuters in making an informed decision on route selection.
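
The classification step described in the abstract can be illustrated with a minimal sketch, assuming a multinomial Naïve Bayes classifier with add-one smoothing written in plain Python. This is not the authors' implementation: the abstract does not state which Naïve Bayes variant, features, or labels were used. The stop-word list, the `preprocess` helper, the `NaiveBayes` class, the label names, and the six training tweets are all hypothetical and serve only to show the idea of separating traffic-related tweets from other tweets.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption); NLTK ships a much larger one.
STOP_WORDS = {"the", "a", "an", "on", "at", "is", "in", "to", "of", "and", "my", "for"}


def preprocess(tweet):
    """Lower-case, drop URLs and @mentions, keep alphabetic tokens, remove stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return [tok for tok in re.findall(r"[a-z]+", text) if tok not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tweets, labels):
        self.class_counts = Counter(labels)       # number of tweets per class
        self.word_counts = defaultdict(Counter)   # token frequencies per class
        for tweet, label in zip(tweets, labels):
            self.word_counts[label].update(preprocess(tweet))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tweet):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in preprocess(tweet):
                if tok in self.vocab:              # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training tweets, invented for this sketch (not the paper's data).
TRAIN = [
    ("Heavy traffic on the highway after an accident", "traffic"),
    ("Stuck in a jam, roadworks blocking two lanes", "traffic"),
    ("Accident on Main Road causing long delays", "traffic"),
    ("Lovely sunny morning for a coffee", "other"),
    ("Watching the match with friends tonight", "other"),
    ("New phone arrived today, really happy", "other"),
]

model = NaiveBayes().fit(*zip(*TRAIN))
print(model.predict("Accident on the highway, traffic delays expected"))
print(model.predict("Coffee with friends this morning"))
```

With this toy data the first tweet is labelled `traffic` and the second `other`. In a setting like the one the abstract describes, the same interface would be trained on the collected tweets and compared against the other classifiers before one is chosen for real-time use.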


