5 minute read
References
[1] R. H. Dennard, F. H. Gaensslen, H.-N. Yu, V. L. Rideout, E. Bassous and A. R. LeBlanc, “Design of ion-implanted
MOSFET's with very small physical dimensions,” IEEE Journal of Solid-State Circuits, vol. 9, no. 5, pp. 256-268,
Advertisement
October 1974. [2] “Intel discontinues Xeon Phi 7200 series Knights Landing coprocessor cards,” [Online]. Available: https://www.anandtech.com/show/11769/intel-discontinues-xeon-phi-7200-series-knights-landing-coprocessorcards. [3] OpenACC, [Online]. Available: https://www.openacc.org/. [4] OpenCL, [Online]. Available: https://www.khronos.org/registry/cl/specs/opencl-1.1.pdf. [5] OpenMP, [Online]. Available: https://www.openmp.org/. [6] “OmpSs@FPGA,” [Online]. Available: https://pm.bsc.es/ompss-at-fpga. [7] “High Bandwidth Memory (HBM) DRAM, White Paper,” JEDEC Solid State Technology Association, 2013. [8] “JEDEC Publishes DDR4 NVDIMM-P Bus Protocol Standard,” February 2021. [Online]. Available: https://www.jedec.org/news/pressreleases/jedec-publishes-ddr4-nvdimm-p-bus-protocol-standard. [9] K. Wang, K. Angstadt, C. Bo, N. Brunelle, E. Sadredini, T. Tracy, J. Wadden, M. Stan and K. Skadron, “An Overview of
Micron's Automata Processor,” in Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on
Hardware/Software Codesign and System Synthesis (CODES), 2016. [10] Y.-C. Kwon, S. H. Lee, J. Lee, S.-H. Kwon, J. M. Ryu, J.-P. Son, O. Seongil, H.-S. Yu, H. Lee, S. Y. Kim, Y. Cho, J. G. Kim,
J. Choi, H.-S. Shin, J. Kim, B. Phuah, H. Kim, M. J. Song, A. Choi, D. Kim, S. Kim, E.-B. Kim, D. Wang, S. Kang, Y. Ro, S.
Seo, J. Song, J. Youn, K. Sohn and N. S. Kim, “A 20nm 6GB Function-In-Memory DRAM, Based on HBM2 with a 1.2TFLOPS Programmable Computing Unit Using Bank-Level Parallelism, for Machine Learning Applications,” in
Proceedings of IEEE International Solid-State Circuits Conference (ISSCC), 2021. [11] F. Devaux, “The True Processing in Memory Accelerator,” in IEEE Hot Chips Symposium (HCS), 2019. [12] “Intel Optane Memory - Responsive Memory, Accelerated Performance,” [Online]. Available: https://www.intel.com/content/www/us/en/products/details/memory-storage/optane-memory.html. [13] “Intel at last announces Optane memory: DDR4 that never forgets.,” May 2018. [Online]. Available: https://arstechnica.com/gadgets/2018/05/intel-finally-announces-ddr4-memory-made-from-persistent-3dxpoint. [14] “memkind GitHub,” [Online]. Available: http://memkind.github.io/memkind/. [15] “SICM GitHub,” [Online]. Available: https://github.com/lanl/SICM. [16] DEEP-SEA Project. Horizon 2020 grant agreement 955606, [Online]. Available: https://www.deep-projects.eu/. [17] P. Radojković, P. Carpenter, P. Esmaili-Dokht, R. Cimadomo, H.-P. Charles, A. Sebastian and P. Amato, “Processing in Memory: The Tipping Point,” ETP4HPC, 2021. [18] S. Ghose, A. Boroumand, J. S. Kim, J. Gomez-Luna and O. Mutlu, “Processing-in-memory: A workload-driven perspective,” IBM Journal of Research and Development, vol. 63, no. 6, pp. 3:1-3:19, November-December 2019. [19] P. Siegl, R. Buchty and M. Berekovic, “Data-centric computing frontiers: A survey on processing-in-memory,” in
Proceedings of the Second International Symposium on Memory Systems, 2016. [20] S. Narasimhamurthy, N. Danilov, S. Wu, G. Umanesan, S. Markidis, S. Rivas-Gomez, I. B. Peng and S. De Witt, “Sage: percipient storage for exascale data centric computing,” Parallel Computing, no. 81, 2019. [21] IO-SEA project, [Online]. Available: https://iosea-project.eu. [22] Z. Ruan, T. He and J. Cong, “Designing In-Storage Computing System for Emerging High-Performance Drive,” in
Proceedings of the 2019 USENIX Annual Technical Conference, Renton, WA, USA, 2019. [23] “What's a DPU data processing unit,” [Online]. Available: https://blogs.nvidia.com/blog/2020/05/20/whats-a-dpudata-processing-unit/. [24] EuroEXA project. Horizon 2020 grant agreement number 754337, [Online]. Available: https://euroexa.eu/.
[25] E. Suarez and T. Lippert, “Modular Supercomputing Architecture: from idea to production,” in Contemporary High
Performance Computing: from Petascale toward Exascale, vol. 3, Ed. Jeffrey S. Vetter, CRC Press, 2019, pp. 223-251. [26] DEEP projects. Horizon 2020 grant agreements 287530, 610476, 754303 and 955606., [Online]. Available: https://www.deep-projects.eu/. [27] J. Schmidt, “Network Attached Memory, Chapter 4 of the PhD Thesis: "Accelerating Checkpoint/Restart Application
Performance in Large-Scale Systems with Network Attached Memory", Ruprecht-Karls University Heidelberg -
Fakultaet fuer Mathematik und Informatik,” [Online]. Available: http://archiv.ub.uniheidelberg.de/volltextserver/23800/1/dissertation_juri_schmidt_publish.pdf. [28] P. Faraboschi, K. Keeton, T. Marsland and D. Milojicic, “Beyond Processor-centric Operating Systems,” in 15th
Workshop on Hot Topics in Operating Systems (HotOS XV), 2015. [29] Rigo et al., “Paving the way towards a highly energy-efficient and highly integrated compute node for the Exascale revolution: the ExaNoDe approach,” in Euromicro Symposium on Digital System Design, DSD 2017, 2017. [30] S. Schweitzer, “Panel: SmartNIC or DPU, Who Wins?,” in 2020 IEEE Symposium on High-Performance Interconnects (HOTI), see https://technologyevangelist.co/2020/08/25/smartnics-vs-dpus/, 2020. [31] M. Malms, M. Ostasz, M. Gilliot, P. Bernier-Bruna, L. Cargemel, E. Suarez, H. Cornelius, M. Duranton, B. Koren, P.
Rosse-Laurent, M. S. Pérez-Hernández, M. Marazakis, G. Lonsdale, P. Carpenter, G. Antoniu , S. Narasimhamurthy,
A. Brinkman, D. Pleiter, A. Tate, A. Wierse, J. Krueger, H.-C. Hoppe and E. Laure, “ETP4HPC's Strategic Research
Agenda for High-Performance Computing in Europe 4,” Zenodo, 2020. [32] SPIFFE, [Online]. Available: https://spiffe.io/. [33] F. Tessier, M. Martinasso, M. Chesi, M. Klein and M. Gila, “Dynamic Provisioning of Storage Resources: A Case Study with Burst Buffers,” in 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2020. [34] “BeeOND™: BeeGFS On Demand,” [Online]. Available: https://www.beegfs.io/wiki/BeeOND. [35] DAOS, [Online]. Available: https://docs.daos.io/. [36] “Portable Hardware Locality (hwloc),” [Online]. Available: https://www.open-mpi.org/projects/hwloc/. [37] “Portable Network Locality (netloc),” [Online]. Available: https://www.open-mpi.org/projects/netloc/. [38] “CUDA Zone,” [Online]. Available: https://developer.nvidia.com/cuda-zone. [39] “HIP Guide,” [Online]. Available: https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html. [40] S. Memeti, L. Li, S. Pllana, J. Kołodziej and C. Kessler, “Benchmarking OpenCL, OpenACC, OpenMP, and CUDA:
Programming Productivity, Performance, and Energy Consumption,” in ARMS-CC '17. Proceedings of the 2017
Workshop on Adaptive Resource Management and Scheduling for Cloud Computing, July 2017. [41] Kokkos, [Online]. Available: https://kokkos.org/. [42] “RAJA,” [Online]. Available: https://computing.llnl.gov/projects/raja-managing-application-portability-nextgeneration-platforms. [43] ADIOS2, [Online]. Available: https://csmd.ornl.gov/software/adios2. [44] “XIOS wiki page,” [Online]. Available: http://forge.ipsl.jussieu.fr/ioserver/wiki. [45] Maestro project, [Online]. Available: https://www.maestro-data.eu/. [46] O. Aumage, P. Carpenter and S. Benkner, “Task-Based Performance Portability in HPC,” Zenodo, 2021. [47] SINGULARITY, [Online]. Available: https://singularity.hpcng.org/. [48] Docker, [Online]. Available: https://www.docker.com/. [49] Podman, [Online]. Available: https://podman.io/. [50] Sarus, [Online]. Available: https://sarus.readthedocs.io/. [51] ECP, [Online]. Available: https://www.exascaleproject.org/. [52] European Centres of Excellence in High Performance Computing, [Online]. Available: https://www.hpccoe.eu/euhpc-centres-of-excellence2/.
Authors:
Paul Carpenter is leader of the Microservers and System Software team at the Barcelona Supercomputing Center (BSC). Utz-Uwe Haus is Head of HPE HPC/AI EMEA Research Lab, and technical coordinator of the MAESTRO project Erwin Laure is Director of the Max Planck Computing and Data Facility (MPCDF) Sai Narasimhamurthy leads European R&D for Seagate Systems and leads the SAGE & Sage2 storage projects. Estela Suarez is Senior Scientist at the Jülich Supercomputing Centre from Forschungszentrum Jülich, and leads the DEEP projects series.
This work was supported by the European Union’s Horizon 2020 research and innovation programme under grant agreements No. 955606 (DEEP-SEA), No.801101 (MAESTRO), No. 823964 (FocusCoE), No. 754337 (EuroEXA).
Paul Carpenter holds the Ramon y Cajal fellowship under contract RYC2018-025628-I of the Ministry of Economy and Competitiveness of Spain.
Cite as: P. Carpenter et al., « Heterogeneous High-Performance Computing », ETP4HPC White Paper, 2022, doi 10.5281/zenodo. 6090425.
DOI: 10.5281/zenodo.6090425
© ETP4HPC 2022