One Petaflop

In recent discussions around big compute, I was asked what a petaflop (PFlop) would look like on Azure. I had heard the term before and knew it described a considerable amount of compute capacity. So what is a petaflop, and how do we convert it into something we can all relate to?

What is a Petaflop (PFlop)

A petaflop is a measure of a computer’s processing speed and can be expressed as:

  • A quadrillion (thousand trillion) floating point operations per second (FLOPS)
  • A thousand teraflops
  • 10 to the 15th power FLOPS
  • Roughly 2 to the 50th power FLOPS (2^50 is about 1.126 x 10^15)

The Formula

Source: How to calculate peak theoretical performance of a CPU-based HPC system

GFlops =   (CPU speed in GHz)
         x (number of cores per CPU)
         x (CPU instructions per cycle, i.e. floating-point operations per cycle per core)
         x (number of CPUs per node)

TFlops = (GFlops / 1000)

PFlops = (TFlops / 1000)
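
As a minimal sketch, the formula can be written as a few small Python functions. The function names, the parameter names, and the example node (2 CPUs with 8 cores each at 3.0 GHz and 8 operations per cycle) are hypothetical and chosen only for illustration.

```python
def peak_gflops(ghz, cores_per_cpu, flops_per_cycle, cpus_per_node=1):
    """Theoretical peak GFlops of one node, following the formula above."""
    return ghz * cores_per_cpu * flops_per_cycle * cpus_per_node


def to_tflops(gflops):
    """Convert GFlops to TFlops."""
    return gflops / 1000


def to_pflops(gflops):
    """Convert GFlops to PFlops (via TFlops)."""
    return to_tflops(gflops) / 1000


# Hypothetical node, for illustration only.
example_gflops = peak_gflops(ghz=3.0, cores_per_cpu=8, flops_per_cycle=8, cpus_per_node=2)
print(example_gflops, to_tflops(example_gflops), to_pflops(example_gflops))
```

The result is a theoretical ceiling for a single node; multiplying by the number of nodes gives the peak for a cluster.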

Where do we get the number of CPU instructions per cycle?

  • Intel X5600 series CPUs and AMD 6100/6200/6300 series CPUs have 4 instructions per cycle
  • Intel E5-2600v1 and E5-2600v2 series CPUs have 8 instructions per cycle
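  • Intel E5-2600v3 series CPUs are counted here at 8 instructions per cycle (with AVX2 FMA they can reach 16 double-precision operations per cycle, which would double the theoretical figures for v3-based SKUs)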

What Does 1 PFlop Represent on Azure

As of February 2016, Azure has a few machine configurations. Microsoft has created the Azure Compute Unit (ACU) to provide a way of comparing compute (CPU) performance across Azure SKUs. This helps us easily identify which SKU is most likely to satisfy our performance needs.

SKU Family                        ACU/Core
Standard_A0 (Extra Small)         50
Standard_A1-4 (Small – Large)     100
Standard_A5-7                     100
A8-A11                            225 *
D1-14                             160
D1-14v2                           210 – 250 *
DS1-14                            160
G1-5                              180 – 240 *
GS1-5                             180 – 240 *

ACUs marked with a * use Intel® Turbo technology to increase CPU frequency and provide a performance boost. The amount of the boost can vary based on the VM size, workload, and other workloads running on the same host.

Let’s Do the Math


These calculations are the theoretical maximum. Real-world “FLOPS” will vary based on parallelization, network communication, the use of technologies like RDMA, GPUs and other factors.

The details used for this post have been pulled from various sources on the Internet. Some details may not be exact, so please verify them. Furthermore, note that Intel® Turbo technology was not taken into consideration.

For this exercise, let’s focus on SKUs A9, D14 V2, G5 and N21.

An A9 is an Intel Xeon E5-2670 and has 16 cores @ 2.6 GHz (supports RDMA)
A D14 V2 is an Intel Xeon E5-2673v3 and has 16 cores @ 2.4 GHz
A G5 is 2 Intel Xeon E5-2698Bv3 CPUs with 16 cores each @ 2.0 GHz
An N21 is 2 Intel Xeon E5-2690v3 CPUs with 12 cores each @ 2.6 GHz (supports RDMA)

                   A9           D14 V2       G5           N21
GFlops             332.8        307.2        512          665.6
PFlops             0.0003328    0.0003072    0.000512     0.0006656
Approx. 1 PFlop    3,005 VMs    3,256 VMs    1,954 VMs    1,503 VMs
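
The table can be reproduced, at least roughly, from the specifications listed above. The sketch below assumes 8 operations per cycle for every SKU and no Turbo boost, as this post does; the dictionary layout and names are illustrative. Note that with 2 x 12 cores the N21 works out to 499.2 GFlops rather than the 665.6 shown in the table (665.6 would correspond to 2 x 16 cores), so please verify that column against the actual hardware.

```python
import math

FLOPS_PER_CYCLE = 8  # assumption used throughout this post; Turbo is ignored

# SKU: (GHz, cores per CPU, CPUs per node), as listed in this post
SKUS = {
    "A9": (2.6, 16, 1),
    "D14 V2": (2.4, 16, 1),
    "G5": (2.0, 16, 2),
    "N21": (2.6, 12, 2),
}


def peak_gflops(ghz, cores, cpus):
    """Theoretical peak GFlops for one VM."""
    return ghz * cores * FLOPS_PER_CYCLE * cpus


def vms_per_pflop(gflops):
    """1 PFlop = 1,000,000 GFlops; round up to whole VMs."""
    return math.ceil(1_000_000 / gflops)


results = {}
for sku, (ghz, cores, cpus) in SKUS.items():
    gflops = peak_gflops(ghz, cores, cpus)
    results[sku] = (gflops, gflops / 1_000_000, vms_per_pflop(gflops))
    print(f"{sku:7} {gflops:7.1f} GFlops  {results[sku][1]:.7f} PFlops  {results[sku][2]:,} VMs")
```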

Gaining more Perspective

Have a look at the top 10 of the TOP500 Supercomputer Sites list published in June 2015. Once we start measuring compute in petaflops, we’re entering a whole new world of compute.

For the fifth consecutive time, Tianhe-2, a supercomputer developed by China’s National University of Defense Technology, has retained its position as the world’s No. 1 system, according to the 45th edition of the twice-yearly TOP500 list of the world’s most powerful supercomputers. Tianhe-2, which means Milky Way-2, led the list with a performance of 33.86 petaflop/s (quadrillions of calculations per second) on the Linpack benchmark…
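
For a rough sense of scale, the per-VM figures from the table above can be set against Tianhe-2's 33.86 petaflop/s. This is only a back-of-the-envelope sketch: it compares theoretical peak numbers with a measured Linpack result, so real deployments would likely need more VMs than this suggests, and the variable names are illustrative.

```python
import math

TIANHE2_PFLOPS = 33.86  # Linpack result quoted above

# Theoretical peak GFlops per VM, taken as-is from the table in this post
GFLOPS_PER_VM = {"A9": 332.8, "D14 V2": 307.2, "G5": 512.0, "N21": 665.6}


def vms_to_match(target_pflops, gflops_per_vm):
    """Convert PFlops to GFlops, then round up to whole VMs."""
    return math.ceil(target_pflops * 1_000_000 / gflops_per_vm)


vm_counts = {sku: vms_to_match(TIANHE2_PFLOPS, g) for sku, g in GFLOPS_PER_VM.items()}
for sku, count in vm_counts.items():
    print(f"{sku}: about {count:,} VMs")
```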
