Empowering Agencies with High-Performance Computing
INDUSTRY PERSPECTIVE
INTRODUCTION
The use of high-performance computing (HPC) infrastructures at federal agencies is growing rapidly. To adopt this technology successfully, the public sector must understand the similarities and differences between HPC and traditional IT infrastructure. For this reason, GovLoop and Red Hat, a leading provider of open source solutions, partnered to create this Industry Perspective.

HPC is enabling cutting-edge research in federal labs conducting pioneering experiments that require processing large and complex collections of data. Whether modeling and forecasting weather patterns in near real time or analyzing financial data instruments to predict market economies, HPC is driving success.

The need to take advantage of HPC was reflected in a 2015 White House initiative aimed at greatly accelerating the nation’s research and deployment of high-performance computing. A 2015 executive order established the National Strategic Computing Initiative (NSCI), a multi-agency effort led by the Defense Department, Energy Department and National Science Foundation to maximize the benefits of high-performance computing for scientific discovery and economic competitiveness.

Despite this backing, HPC environments remain challenging. As most IT administrators would tell you, HPC is not plug-and-play technology. Implementing and running an HPC platform comes with its own set of complex challenges. Resource constraints, training, communication, scale and operational integration with cloud computing are just a few of them.
DEFINING HIGH-PERFORMANCE COMPUTING
Overcoming these challenges starts with understanding what high-performance computing is and what distinguishes it from the other IT infrastructures already in place at federal agencies.

HPC is more than supercomputers, although advances in supercomputing drive HPC. According to a report from the Information Technology and Innovation Foundation (ITIF), “the speed of the world’s fastest supercomputers has increased by a factor of roughly a half million over the past 23 years, an extremely rapid transformation that no other industry has experienced.” HPC includes the full infrastructure needed to take advantage of these advances in computing speed.

What differentiates HPC is its computing architecture. Generally, an enterprise IT infrastructure treats compute power, networking and storage as three separate silos. With HPC, the walls between these silos crumble to reveal a single computational cluster: a huge computing resource given to a single job for a very specific length of time. The benefits include improved performance, better cost management, a reduced data center footprint, agility and flexibility.
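To put the ITIF figure quoted above in perspective, the short calculation below converts it into an average yearly rate. It is a back-of-the-envelope sketch that assumes the growth was smooth and compounding, which the report does not claim; the only inputs are the two numbers in the quote.

```python
import math

# Inputs taken from the ITIF quote; smooth compounding is an assumption.
FACTOR = 500_000  # "a factor of roughly a half million"
YEARS = 23        # "over the past 23 years"

# Average yearly multiplier g such that g ** YEARS == FACTOR.
annual_growth = FACTOR ** (1 / YEARS)

# Time for speed to double at that average rate.
doubling_years = math.log(2) / math.log(annual_growth)

print(f"average annual speedup: {annual_growth:.2f}x")
print(f"implied doubling time: {doubling_years * 12:.1f} months")
```

Under that assumption, the quote works out to top supercomputer speed growing by roughly 1.77 times per year, or doubling about every 14 to 15 months.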
The ITIF report cites key benefits of HPC technology that make it indispensable to federal agencies:
1. Each step-change in HPC represents an order of magnitude change unlocking new applications or the better use of existing ones.
2. HPC is transforming the scientific method itself with the introduction of computational simulation.
3. HPC will be needed to handle the tremendous growth of data.
4. HPC represents an avenue to address the erosion of Moore’s Law, at least for high-performance systems.
5. Declining prices and increasing capabilities are making HPC systems available to more institutional and commercial users, including small to medium-sized enterprises.
Source: The Vital Importance of High-Performance Computing to U.S. Competitiveness

To successfully implement HPC, you must first understand the challenges involved. In the following section, we discuss each specific challenge as it pertains to the federal government.
HPC CHALLENGES
Some of the IT infrastructure challenges facing federal research, engineering and scientific organizations in implementing HPC are centered on administration, automation and scaling.

1. ADMINISTRATION
Implementing an HPC infrastructure often forces administrators to adjust from managing a few computers to managing potentially many thousands. The complexity of this surge in responsibility can become a bottleneck in achieving the performance and cost benefits associated with HPC solutions.

2. AUTOMATION
Automation is needed to administer large computing clusters, handle large data sets and manage multiple discrete HPC facilities. Marshaling the needed resources is a significant problem that can slow an agency’s efforts at scaling.

3. SCALING
Science and research organizations are often unable to accurately forecast when they will need to scale their infrastructure to meet research requirements. This can leave administrators scrambling to set up and merge new storage, compute or networking infrastructure, which then also must be managed.
SOLUTIONS FOCUSED ON THE RESEARCH, NOT JUST DATA
Red Hat Infrastructure Suite for Science and Research, which is made up of Red Hat Enterprise Linux, Red Hat Enterprise Virtualization, OpenStack, CloudForms and Ansible, provides high-performance computing in a platform that improves administrative control, automation and scalability for research and scientific needs.

The OpenStack platform transcends the computation and virtualization environment by including control over networking and storage resources. This allows a single administrator to control, automate and manage the storage, network and computation infrastructure. This level of control and automation lets the administrator model the data and deploy the infrastructure to manage it more efficiently than in any other computing structure.

The OpenStack platform provides a production-ready Infrastructure as a Service (IaaS) solution that’s reliable, available and highly scalable for your fully open HPC cloud infrastructure.
RED HAT INFRASTRUCTURE SUITE FOR SCIENCE AND RESEARCH (RHISSR)
Red Hat Infrastructure Suite for Science and Research (RHISSR) addresses the specific challenges and needs of organizations that rely on high-performance computing infrastructure with a solution that provides performance, management and mission enablement.

Having a backup plan for when resource capacity nears its limits is a major challenge. With RHISSR, agencies can quickly merge with a partner facility or move to a public cloud provider as needed. RHCI provides options so that administrators can use either Red Hat Enterprise Virtualization or Red Hat OpenStack Platform to do this. Inside the virtualized environment, administrators can use Red Hat Enterprise Linux and access Red Hat CloudForms, a management engine that helps organizations manage workloads across multiple HPC facilities. RHCI enables agencies to tailor HPC infrastructure to the needs of their most demanding computing workloads.

In the public sector, Red Hat offers a solution called Scientific and Research Infrastructure as a Service, or SRIS. SRIS is a combination of all those technologies, plus the multi-platform automation and configuration management tool Ansible. SRIS is ideal for the lab or research facility that needs the Red Hat Public Sector benefits along with a substantial amount of autonomy.
RHCI and RHEL, Red Hat Enterprise Linux, Tailored to Perform
Red Hat Enterprise Linux (RHEL) is the foundational operating system for the entire infrastructure. This time-tested operating system has a proven history of compliance with federal regulatory requirements, including the Health Insurance Portability and Accountability Act (HIPAA), Common Criteria and the Federal Information Processing Standards (FIPS).

Red Hat OpenStack Platform runs on top of Red Hat Enterprise Linux. Building an HPC infrastructure on OpenStack assures organizations that their IT is being built in alignment with the security protocols and requirements ingrained within Red Hat Enterprise Linux.

“Whether HIPAA, Common Criteria or even Federal Information Processing Standards (FIPS), there’s a strong degree of oversight and regulatory requirement that influences the federal scientific and research community,” said Adam Clater, Chief Architect for Red Hat. “Part of what we’ve done in the public sector over the last 10 years has been to craft an operating system that conforms to those regulatory requirements by design. The security and the conformity to those regulations are all ingrained in our solutions.”
DEPARTMENT OF ENERGY CASE STUDY
The Department of Energy (DOE) recently worked with Red Hat to enhance the capabilities of an OpenStack component called Ironic. Ironic is the provisioning engine within OpenStack that allows administrators to take an image and deploy it onto bare metal infrastructure.

A traditional HPC data center may have a main instrument and several other compute or storage clusters that stage the data and perform processing as well as input/output (IO) workloads. The provisioning mechanism used by DOE for its IO cluster was becoming outdated. Red Hat was able to stage entire pieces of DOE’s petascale file system into memory so that the HPC cluster could go back and find everything it needed locally.

Red Hat also worked with this DOE lab to provision Network File System (NFS) containers into memory. In doing so, Red Hat contributed code upstream to Ironic so that Red Hat could further support any amount of data provided. This exposed Ironic to a larger community and allowed it to be more widely used. Today, Ironic is the backbone of OpenStack Platform Director, the installer for Red Hat OpenStack Platform.
THE ‘LONG TAIL EFFECT’ ON HPC
A “long-tailed distribution” describes data in which a large number of occurrences fall well outside the statistical norm. The idea can be applied to funding for scientific research, where the overwhelming majority of money is spent in a very small number of places. Locations like CERN and DOE labs receive an overwhelming majority of physics research funding but are far from the only places where physics research occurs, and it is those ‘other’ research locations that make up the long tail.

The same pattern applies to HPC: the largest research centers spend the most on HPC facilities but are far from the only ones who need HPC systems. This “long tail of HPC” implies that the overwhelming majority of scientific users have very little money to spend on expensive, bare-metal HPC systems.
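The shape of such a distribution can be sketched with a few lines of arithmetic. The example below is purely illustrative: the 100 sites, the power-law exponent and the resulting shares are hypothetical values chosen to show the pattern, not funding data from this report or any other source.

```python
# Hypothetical illustration only: 100 made-up research sites whose HPC
# budgets follow a power law (budget proportional to 1 / rank ** 2).
# None of these numbers are real funding data.
SITES = 100
EXPONENT = 2.0
TOP = 3  # the "head" of the distribution

budgets = [1 / rank ** EXPONENT for rank in range(1, SITES + 1)]
total = sum(budgets)

head_share = sum(budgets[:TOP]) / total  # share held by the largest sites
tail_share = 1 - head_share              # share split among everyone else

print(f"top {TOP} sites: {head_share:.0%} of spend")
print(f"other {SITES - TOP} sites: {tail_share:.0%} of spend")
```

With these assumed parameters, a handful of sites hold most of the spend while the remaining sites, which are the great majority of users, share what is left. That is the situation the long tail of HPC describes.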
Dan McGuan, Program Lead for HPC, Cloud and Emerging Technologies at Red Hat, believes that the Long Tail of HPC is essential for research organizations to understand when looking for HPC solutions that help them produce faster results. According to McGuan, “Specific, domain science research groups that need, and can truly benefit from high-performance computing solutions offered at the Nation-State level, often lack the millions of dollars and resources needed to build and deploy those systems as well as lack the need for that level of performance. What OpenStack does is offer an affordable, software-defined HPC environment ‘as-a-Service’ that can accelerate research by allowing the labs to own their workflows rather than rely on their collaborators with those enormous, expensive HPC systems which can take months to access.”
CONCLUSION
In essence, OpenStack empowers the 90 percent of research institutes in the Long Tail that, like the top one to three percent, also need to perform operations that require parallel processing and batch computing. This emphasis on serving the scientific community through high-performance computing solutions further addresses the lifecycle of research while helping scientists and researchers make new discoveries, answer old questions and produce new knowledge. In fact, the accessibility of a well-balanced cloud and HPC solution, like OpenStack, can become the canvas on which research and scientific communities can work and thrive throughout the entire scientific initiative lifecycle.
ABOUT RED HAT
Red Hat® is the world’s leading provider of open source solutions, using a community-powered approach to provide reliable and high-performing cloud, virtualization, storage, Linux® and middleware technologies. Today, Red Hat is at the forefront of open source software development for enterprise IT, with a broad portfolio of products and services for commercial markets. That vision for developing better software is a reality, as CIOs and IT departments around the world rely on Red Hat to deliver solutions that meet their business needs: solutions that provide technology leadership, performance, security and unmatched value to more than 90 percent of Fortune 500 companies. Learn more: http://www.redhat.com/en/technologies/industries/government

ABOUT GOVLOOP
GovLoop’s mission is to “connect government to improve government.” We aim to inspire public-sector professionals by serving as the knowledge network for government. GovLoop connects more than 250,000 members, fostering cross-government collaboration, solving common problems and advancing government careers. GovLoop is headquartered in Washington, D.C., with a team of dedicated professionals who share a commitment to connect and improve government. For more information about this report, please reach out to info@govloop.com.
Thanks to Adam Clater, Office of the Chief Technologist, North America Public Sector, and Dan McGuan, Program Lead for HPC, Cloud and Emerging Technologies, Red Hat for their contributions to this report.
1152 15th Street NW, Suite 800 Washington, DC 20005 Phone: (202) 407-7421 | Fax: (202) 407-7501 www.govloop.com @GovLoop