An Efficient Feature Extraction Method For Mining Social Media
AUTHOR(S)
V Mageshwari, Dr. I. Laurence Aroquiaraj
KEYWORDS
Classification, BOW, feature extraction, HIV/AIDS, pre-processing, TF-IDF, Twitter
ABSTRACT
Social media lets users exchange opinions, thoughts, and ideas, and information shared through it spreads quickly. Twitter is one of many social media platforms, and it lets users communicate information briefly. Many real-world issues are discussed on Twitter, and HIV/AIDS ranks among the most discussed topics. With the growth of social media, many users have come forward to discuss this societal topic, and such discussions can help communication campaigns promote better HIV/AIDS education. In this work, tweets were collected using keywords including HIV and AIDS. After pre-processing, feature extraction was carried out. Feature extraction is a crucial step in mining Twitter because the data is unstructured, so improving the efficiency of feature extraction improves the outcome of the classification task. This work proposes an efficient feature extraction method that gives better results than existing methods.
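To make the pipeline named in the abstract concrete, the following is a minimal sketch of pre-processing followed by bag-of-words (BOW) and TF-IDF feature extraction. It is a plain baseline and not the method proposed in the paper, whose details the abstract does not give. The sample tweets, the function names (`preprocess`, `bag_of_words`, `tf_idf`), the cleaning rules, and the unsmoothed weighting tf × log(N/df) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical sample tweets for illustration; not data from the paper.
TWEETS = [
    "Get tested! Free HIV testing today http://example.org #HIV",
    "@user HIV education matters, support AIDS awareness",
    "New AIDS awareness campaign launches today",
]


def preprocess(tweet):
    """Lowercase, drop URLs and @mentions, keep alphabetic tokens only."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return re.findall(r"[a-z]+", text)


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    counts = [Counter(doc) for doc in docs]
    return vocab, [[c[term] for term in vocab] for c in counts]


def tf_idf(docs):
    """Weight each count by term frequency times log(N / document frequency)."""
    vocab, bow = bag_of_words(docs)
    n_docs = len(docs)
    # Number of documents containing each vocabulary term (always >= 1).
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    matrix = []
    for doc, row in zip(docs, bow):
        length = len(doc) or 1  # guard against empty documents
        matrix.append([(cnt / length) * w for cnt, w in zip(row, idf)])
    return vocab, matrix


tokens = [preprocess(t) for t in TWEETS]
vocab, features = tf_idf(tokens)
```

Each row of `features` is a numeric vector that a classifier could consume. Under this unsmoothed weighting, a term that occurs in every tweet gets weight zero, which is one reason practical systems often use smoothed variants.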