Vol. 19 No. 2 (2020): Revista UIS Ingenierías
Articles

Evolution of the maintainability of HPC facilities at CIEMAT headquarters

Antonio Juan Rubio-Montero
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Angelines Alberto-Morillas
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Rosa De Lima Herrera-Insua
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Pablo Colino-Sanguino
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Jorge Blanco-Yagüe
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Manuel Giménez
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Fernando Blanco-Marcilla
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Esther Montes-Prado
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Alicia Acero
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)
Rafael Mayo-García
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT)

Published 2020-03-05

Keywords

  • resilience
  • management practices
  • history of computing

How to Cite

Rubio-Montero, A. J., Alberto-Morillas, A., Herrera-Insua, R. D. L., Colino-Sanguino, P., Blanco-Yagüe, J., Giménez, M., Blanco-Marcilla, F., Montes-Prado, E., Acero, A., & Mayo-García, R. (2020). Evolution of the maintainability of HPC facilities at CIEMAT headquarters. Revista UIS Ingenierías, 19(2), 85–88. https://doi.org/10.18273/revuin.v19n2-2020009

Abstract

Since its establishment in 1951, CIEMAT has continuously promoted computation as a research method and deployed innovative computing facilities. Vector, MPP, NUMA, and distributed architectures have all been managed at CIEMAT, yielding extensive expertise in HPC maintainability and in the computational needs of the community involved in international projects. HPC hardware and software now evolve ever faster, which makes it a continuous challenge to keep these systems available for the growing number of initiatives they support. To address this challenge, the ICT team has been moving toward a flexible management model, with an eye to future acquisitions.
