Vol. 15 No. 43 (2016): Revista GTI
Articles

Bioinformatics sequence alignment on a GRID architecture

Simon Orozco-Arias
Universidad de Caldas
Marcelo Herrera-González
Universidad de Caldas
Leonardo Soto-Agudelo
Universidad de Caldas
Gustavo Isaza-Echeverry
Universidad de Caldas

Published 2017-10-11

Keywords

  • HTC Computing
  • Grid
  • Cluster infrastructure
  • Condor
  • Blast

How to Cite

Orozco-Arias, S., Herrera-González, M., Soto-Agudelo, L., & Isaza-Echeverry, G. (2017). Bioinformatics sequence alignment on a GRID architecture. Revista GTI, 15(43), 37–45. Retrieved from https://revistas.uis.edu.co/index.php/revistagti/article/view/6818

Abstract

High-performance computing technologies have become essential tools for research centers that run real-time analyses, which makes high-performance computing a basic need for any research process. Reducing the time these processes take and increasing the precision with which they run are among the main reasons the technology is used. This article gives a general overview of Grid computing, an architecture that satisfies these needs. It also presents the fundamental factors that influence grid computing and shows how the performance of bioinformatics jobs can be improved with this type of architecture. To that end, NCBI-BLAST is run on computational nodes placed in different physical locations, and the performance obtained after each job is observed.
