A Review of Automatic Generation of Extractive Summaries

  • Martha Eliana Mendoza-Becerra Universidad del Cauca
  • Elizabeth Leon-Guzmán Universidad Nacional de Colombia

Abstract

Research on automatic text summarization has intensified in recent years because of the large amount of information available in electronic documents. This article presents the most relevant methods of automatic extractive summarization developed for both single and multiple documents. It places special emphasis on methods based on algebraic reduction, clustering, and evolutionary models, which have been the subject of extensive research in recent years because they are language-independent and unsupervised.

Keywords: automatic text summarization, algebraic reduction, clustering, evolutionary models

Author biographies

Martha Eliana Mendoza-Becerra, Universidad del Cauca

Systems Engineer, M.Sc. in Informatics, Ph.D. candidate in Systems and Computing Engineering. Full Professor, Departamento de Sistemas, Facultad de Ingeniería Electrónica y Telecomunicaciones. Member of the Grupo de I+D en Tecnologías de la Información.

Elizabeth Leon-Guzmán, Universidad Nacional de Colombia

Systems Engineer, M.Sc. in Systems Engineering, Ph.D. in Computer Science and Computer Engineering. Assistant Professor, Departamento de Ingeniería de Sistemas e Industrial, Facultad de Ingeniería. Director of the Grupo de Investigación en Minería de Datos.

Published
2013-06-14