Published 2011-06-15
Keywords
- Query expansion
- Rocchio
- relevant term
- IDF
- inverse document frequency
Abstract
Query expansion in the vector space model of document representation has been shown to be a useful technique for improving the relevance, measured by precision, of the results that a retrieval system delivers to users. This paper presents a new algorithm, and a variation of it, for performing query expansion in information retrieval systems. Both are based on a new discrete function that defines the relative importance of a term in a document collection. The algorithm and its variation were evaluated against cosine similarity search and the query expansion algorithm proposed by Rocchio, with excellent results on the CACM data collection (articles published in the Communications of the ACM journal).
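For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of cosine similarity search and Rocchio query expansion over TF-IDF vectors. It is not the algorithm proposed in the paper, whose discrete term-importance function is not given here. The function names, the toy documents, the plain `log(N / df)` form of IDF, and the weights `alpha=1.0`, `beta=0.75`, `gamma=0.15` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Inverse document frequency, assumed here as log(N / df) per term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms absent from the collection are dropped."""
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward the relevant centroid and away from the
    non-relevant centroid; the weights are illustrative assumptions."""
    expanded = Counter()
    for t, w in query.items():
        expanded[t] += alpha * w
    for doc in relevant:
        for t, w in doc.items():
            expanded[t] += beta * w / len(relevant)
    for doc in nonrelevant:
        for t, w in doc.items():
            expanded[t] -= gamma * w / len(nonrelevant)
    # Negative weights are dropped, a common convention.
    return {t: w for t, w in expanded.items() if w > 0}


# Toy collection (hypothetical, already tokenized).
docs = [
    ["query", "expansion", "retrieval"],
    ["retrieval", "precision", "ranking"],
    ["compiler", "parsing", "grammar"],
]
idf = idf_weights(docs)
vectors = [tfidf(d, idf) for d in docs]

query = tfidf(["query", "expansion"], idf)
before = cosine(query, vectors[1])

# Assume the user judged document 0 relevant and document 2 non-relevant.
expanded = rocchio(query, relevant=[vectors[0]], nonrelevant=[vectors[2]])
after = cosine(expanded, vectors[1])
```

In this toy run the original query shares no term with the second document, so their similarity is zero. After expansion the query picks up "retrieval" from the relevant document, and the second document becomes reachable.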