Vol. 11 No. 1 (2012): Revista UIS Ingenierías
Articles

Semantic expansion of queries for web search (MSEC)

Miguel Angel Niño-Zambrano
Universidad del Cauca
Iván Darío López-Gómez
Universidad del Cauca
Carlos Adrian Andrade
Universidad del Cauca
Carlos Alberto Cobos-Lozada
Universidad del Cauca
Ramon Fabregat-Gesa
Universitat de Girona

Published 2012-06-22

Keywords

  • web search
  • query expansion
  • domain ontologies
  • user profiles
  • semantic similarity

How to Cite

Niño-Zambrano, M. A., López-Gómez, I. D., Andrade, C. A., Cobos-Lozada, C. A., & Fabregat-Gesa, R. (2012). Semantic expansion of queries for web search (MSEC). Revista UIS Ingenierías, 11(1), 11–20. Retrieved from https://revistas.uis.edu.co/index.php/revistauisingenierias/article/view/11-20

Abstract

The Internet has become the largest repository of human knowledge, and the amount of stored information increases day by day. This growth lowers the precision that Web search engines achieve on the documents they retrieve for the user. One strategy for addressing this problem is personalized resource retrieval. Several projects currently offer semantic methods for improving the relevance of search results through the use of ontologies, natural language processing, knowledge-based systems, query specification languages, and user profiles, among others. Their results are generally better than those of web search engines that do not use these techniques. However, these gains in precision come at a high cost: the search relies on more complex algorithms that consume more computational resources. This article describes a semantic query expansion model called MSEC, which is based mostly on the concept of semantic similarity, computed over domain ontologies, and on the use of user profiles to customize searches and thereby improve their precision. To evaluate the proposed model, a software prototype was created. Preliminary experimental results show an improvement over the traditional web search approach. Finally, the model was compared against GoPubMed, the best state-of-the-art semantic search engine for the MEDLINE collection.
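
The general idea of ontology-based query expansion weighted by a user profile can be illustrated with a minimal sketch. It is not the MSEC algorithm, whose details the abstract does not give. The toy ontology, the edge-counting similarity 1 / (1 + shortest path length), the default profile weight of 0.5, the threshold of 0.4, and all function names are assumptions made only for illustration.

```python
from collections import deque

# Hypothetical toy ontology: undirected links between related concepts.
ONTOLOGY = {
    "neoplasm": ["carcinoma", "sarcoma"],
    "carcinoma": ["neoplasm", "adenocarcinoma"],
    "sarcoma": ["neoplasm"],
    "adenocarcinoma": ["carcinoma"],
}


def path_length(graph, source, target):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def similarity(graph, a, b):
    """Assumed edge-counting similarity: 1 / (1 + shortest path length)."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def expand_query(terms, graph, profile, threshold=0.4):
    """Append concepts whose profile-weighted similarity reaches the threshold."""
    expanded = list(terms)
    for term in terms:
        for concept in graph:
            if concept in expanded:
                continue
            # Concepts missing from the profile get a neutral weight (assumed 0.5).
            score = similarity(graph, term, concept) * profile.get(concept, 0.5)
            if score >= threshold:
                expanded.append(concept)
    return expanded


# Hypothetical profile: this user is interested in carcinoma, not sarcoma.
profile = {"carcinoma": 1.0, "sarcoma": 0.2}
print(expand_query(["neoplasm"], ONTOLOGY, profile))
```

With this hypothetical profile, the query "neoplasm" is expanded with "carcinoma" only: "sarcoma" is just as close in the ontology but carries a low profile weight, and "adenocarcinoma" is two edges away, so both fall below the threshold.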
