International Journal of Scientific & Technology Research



IJSTR >> Volume 5 - Issue 7, July 2016 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

NEWordS: A News Search Engine for English Vocabulary Learning




Xuejing Huang, Sushma Chandra Reddy



Special Purpose Search Engine, Vocabulary Learning, News Retrieval, Data Mining



Vocabulary is the first hurdle for English learners to overcome. Instead of simply showing a word again and again, we develop an English news article search engine based on a user's word-reciting record on Shanbay.com. It is designed to help advanced English learners find suitable reading materials. The search engine consists of a Crawling Module, a Document Normalizing Module, an Indexing Module, a Querying Module and an Interface Module. We propose three sorting and ranking algorithms for the Querying Module. The basic algorithm takes five crucial principles into consideration and uses term frequency, inverse document frequency, familiarity degree and article freshness degree as ranking factors. The improved algorithm targets the scenario in which a user reads multiple articles from the search result list. It adopts an iterative, greedy method: the essential idea is to select English news articles one by one according to the query while dynamically updating the unfamiliarity of the words at each iteration. The advanced algorithm additionally takes article difficulty into account. The Interface Module is implemented as a website and applies data visualization techniques such as word clouds. Furthermore, we conduct both an applicability check and a performance evaluation. Metrics such as search time, word-covering ratio and the minimum number of articles needed to completely cover all the queried vocabulary are measured on random samples and analyzed in depth. The results show that the search engine performs satisfactorily.
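
The iterative, greedy selection described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything concrete here is an assumption made for illustration: the multiplicative combination of term frequency, inverse document frequency and unfamiliarity, the exponential freshness decay with `half_life_days`, the `decay` factor applied to words already covered, and the names `idf`, `score` and `greedy_select`. It is not the authors' implementation.

```python
import math
from collections import Counter


def idf(term, docs):
    """Smoothed inverse document frequency of a term over tokenized articles."""
    df = sum(1 for d in docs if term in d["tokens"])
    return math.log((1 + len(docs)) / (1 + df)) + 1.0


def score(doc, unfamiliarity, docs, half_life_days=7.0):
    """Assumed score: sum of tf * idf * unfamiliarity, scaled by freshness."""
    tf = Counter(doc["tokens"])
    relevance = sum(tf[w] * idf(w, docs) * u for w, u in unfamiliarity.items())
    # Assumption: freshness halves every `half_life_days` days.
    freshness = 0.5 ** (doc["age_days"] / half_life_days)
    return relevance * freshness


def greedy_select(docs, unfamiliarity, k=3, decay=0.5):
    """Pick up to k articles one by one, updating unfamiliarity after each pick."""
    unfamiliarity = dict(unfamiliarity)  # work on a copy, leave the input intact
    remaining = list(docs)
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda d: score(d, unfamiliarity, docs))
        if score(best, unfamiliarity, docs) <= 0:
            break  # no remaining article contains a queried word
        chosen.append(best["id"])
        remaining.remove(best)
        # Words the reader has just seen become less unfamiliar (assumed rule).
        for w in unfamiliarity.keys() & set(best["tokens"]):
            unfamiliarity[w] *= decay
    return chosen
```

Each article is assumed to be a dict with `id`, `tokens` (a normalized token list) and `age_days`, and `unfamiliarity` maps each queried word to a weight derived from the user's word-reciting record. Because the weights shrink after every pick, later picks favor articles that cover words the earlier picks missed, which a single static ranking would not do.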


