Vol. 12 No. 1 (2013): Revista UIS Ingenierías

A review of the extractive text summarization

Martha Eliana Mendoza-Becerra
Universidad del Cauca
Elizabeth Leon-Guzmán
Universidad Nacional de Colombia

Published 2013-06-14

Keywords

  • automatic text summarization
  • algebraic reduction
  • clustering
  • evolutionary models

How to Cite

Mendoza-Becerra, M. E., & Leon-Guzmán, E. (2013). A review of the extractive text summarization. Revista UIS Ingenierías, 12(1), 7–27. Retrieved from https://revistas.uis.edu.co/index.php/revistauisingenierias/article/view/3707

Abstract

Research on automatic text summarization has intensified in recent years because of the large amount of information available in electronic documents. This article presents the most relevant methods for automatic extractive text summarization, developed for both single documents and multiple documents. It places special emphasis on methods based on algebraic reduction, clustering, and evolutionary models, which have attracted a great deal of research in recent years because they are language-independent and unsupervised.
