Brief History of High Performance Computing

Parallelism has always been present, particularly in IO intensive areas. IBM had dedicated, ``smart'' IO controllers handling a large part of business tasks by the late 1950s.

In the 1960s and 1970s, the primary focus of high-performance computing was in building single high-performance processors (CDC 6600, IBM 360/91, Cray 1). Exotic hardware was the most effective means to performance, as we could see with innovations like the 6600 being the first computer to use silicon (rather than germanium) transistors, and many high-end mainframes (including the Cray-1) using non-saturated logics like ECL, combined with really impressive cooling technologies.

Nobody realized it at the time, but the right road to performance changed in 1971, when the Intel 4004 came out. The key advance here was the amount of functionality that could be crammed onto a chip, and the fact that the market would grow to the point that enormous resources could be devoted to the development of processors, with sales high enough that the per-processor development cost would be very low. As a result, single-chip processor performance increased at a rate of 50% per year from the mid 1980s until the mid 2000s, while ``supercomputers'' were only seeing the same 20% per year improvement they'd had since the beginning.
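To get a feel for how much that difference in growth rates matters, here is a rough sketch that compounds the two rates quoted above. The 20-year span is my assumption for "mid 1980s until the mid 2000s", and the function name is just for illustration.

```python
def compound_growth(rate_per_year, years):
    """Overall speedup after compounding a fixed annual improvement rate."""
    return (1.0 + rate_per_year) ** years


# Assumption: "mid 1980s until the mid 2000s" is taken as roughly 20 years.
YEARS = 20

microprocessor_gain = compound_growth(0.50, YEARS)  # 50% per year
supercomputer_gain = compound_growth(0.20, YEARS)   # 20% per year

print(f"50% per year for {YEARS} years: about {microprocessor_gain:,.0f}x")
print(f"20% per year for {YEARS} years: about {supercomputer_gain:,.0f}x")
print(f"gap between the two: about {microprocessor_gain / supercomputer_gain:,.0f}x")
```

Under that assumption the single-chip processors improve by a factor of roughly 3,300 while the 20% curve improves by a factor of roughly 38, which is why the commodity parts eventually caught up.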

The question becomes, how do we make effective use of silicon with a lot of functionality? Some early answers have included machines like ZMOB (a ``mob'' of Z-80s), the Cosmic Cube (a hypercube of 8086s), and custom designs like MPP or the Connection Machine (both SIMD processors).

The current answer seems to be to use as much COTS (commercial off-the-shelf) hardware as possible, augmented with special-purpose hardware where necessary.

Note on DEC view of CPI: when the Alpha was introduced, they considered the VAX's lifespan (~25 years), and saw a factor of 1000 improvement in that time. Assumption: there will be another factor of 1000 performance in next 25 years. How?
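As a back-of-the-envelope check on what that assumption demands, the sketch below converts "a factor of 1000 in 25 years" into an equivalent steady annual rate and a doubling time. It assumes smooth exponential growth, which is my simplification and not something DEC stated; the function names are illustrative.

```python
import math


def annual_rate(total_factor, years):
    """Steady yearly improvement that compounds to total_factor over years."""
    return total_factor ** (1.0 / years) - 1.0


def doubling_time(rate):
    """Years needed to double performance at a steady yearly rate."""
    return math.log(2.0) / math.log(1.0 + rate)


rate = annual_rate(1000.0, 25.0)
print(f"required improvement: about {rate * 100:.1f}% per year")
print(f"doubling time: about {doubling_time(rate):.2f} years")
```

That works out to a bit under 32% per year, or a doubling roughly every two and a half years.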

Landmark Computers

Here is my completely idiosyncratic view of a few computers which have had a huge impact:

ABC (1942) and ENIAC (1946)
The first two "computers" were the Atanasoff-Berry Computer and ENIAC. ABC was designed to perform Gaussian row elimination to solve linear algebra problems; the ENIAC was capable of executing programs that were entered by setting switches and changing cable connections.

The developers of ENIAC were originally granted a patent on the digital computer; this was overturned in 1973 and ABC is legally recognized as the first digital computer. You can see some highly partisan information on which was really first at http://www.cs.iastate.edu/jva/jva-archive.shtml (the ABC side) and http://ftp.arl.mil/~mike/comphist/eniac-story.html (the ENIAC side).

COLOSSUS
Another early computing engine that deserves mention here is COLOSSUS, built during World War II to break the Lorenz cipher used by the Germans during the war (Lorenz was a more advanced cipher than the better-known Enigma). Alan Turing had a great deal of input into its design; a total of ten of these machines were built during the war.

Due to wartime secrecy, and a long, long delay in relaxing this security, the developers of COLOSSUS never got the recognition they deserved at the time. Like ABC and ENIAC, COLOSSUS has a claim to being the "first computer"; also like those two, it was not a stored-program computer.

EDVAC and EDSAC (1949)
From a modern perspective, none of ABC, ENIAC, or COLOSSUS had what I would regard as the key to a ``real'' computer: the ability to maintain a stored program. ABC was a hard-wired calculator, and ENIAC's programming amounted to rewiring the computer.

John von Neumann receives the credit for the idea of a stored-program computer (the 1945 paper on EDVAC had only his name as author), though it's very likely Eckert and Mauchly (the ENIAC developers) were as responsible for it as he was. Maurice Wilkes began development of a stored-program computer called EDSAC (Electronic Delay Storage Automatic Calculator) in 1946; it was operational in 1949.

EDSAC used mercury delay lines for memory. It also had a subroutine instruction; this was called a ``Wheeler jump'' after David Wheeler (the grad student who invented it).

Manchester/Ferranti Mark I (1951)
This computer was operational in 1951; it used Williams electrostatic storage tubes and a magnetic drum for memory. It was this project (though before the Mark I itself) that first executed a small stored program. This was the first commercially available computer.

Whirlwind (1952)
Originally intended as an analog controller for a flight simulator during WWII, this machine wound up being a digital computer, with no flight simulator, first operational in 1952. Core memory was invented for this computer.

Whirlwind was arguably the world's first supercomputer: a computer developed using the most advanced technology available, with a very cost-is-no-object philosophy, for high performance. It was able to execute a whopping 50,000 operations per second!

IBM 7030 (STRETCH) (1961)
By the mid 1950s, there were quite a few companies building computers. At that time, there was a much clearer distinction between a ``commercial'' processor and a ``scientific'' processor than there is now: a scientific processor would likely not have instructions aimed at doing packed decimal arithmetic, while a commercial processor would not have floating point, for instance. IBM's scientific and commercial computers at the time were the 704 and 705. The STRETCH project aimed at developing a machine using germanium transistor technology instead of tubes, which was to be 100 times faster than the 704 and 705, and would be useable as both a commercial and scientific computer. Its circuitry was 10 times faster than that of the 704 and 705 due to transistors, and improvements in core memory made memory accesses six times faster. It also introduced pipelining.

It didn't fully meet its performance goals, especially on commercial codes, but it was a huge step forward. Another candidate for ``first supercomputer,'' based on degree of improvement in state of the art.

CDC 6600 (1964)
First use of silicon transistors. Advanced packaging technology, with cross-shaped computer. Yet another candidate for first supercomputer, for all the same reasons as Whirlwind and STRETCH (in the CDC's case, it introduced silicon transistors to computers).

CDC 7600 (1969)
Fully pipelined functional units, another step forward in packaging technology (and shaped like the letter C! Not that its designer, Seymour Cray, had an ego or anything. One of CDC's later computers -- I think it was the CDC STAR 100, but haven't been able to verify it -- was shaped like the letter "t". Not that Jim Thornton had an ego or anything. He probably should have steered clear of the temptation; the STAR was a major disappointment to everybody and wound up costing Thornton his job). Four CPUs in the box.

ILLIAC IV (1974)
64 processors all executing in lockstep; first SIMD computer I know of. For a while computers with this approach were a popular way of performing image processing.

Cray-1 (1976)
First successful commercial pipelined computer, 1976. Non-saturated ECL technology gave balanced circuit load (and dissipated huge amounts of power). 12.5 nanosecond cycle time -- IBM didn't catch up until 15 years later. 65 sold. Yes, another candidate for first supercomputer, based on parallel instructions in instruction set. Hot-rodded by Steve Chen as X-MP.

Goodyear MPP (1979)
128x128 grid of one-bit processors (another SIMD design), specifically intended for computer vision applications. Capable of 250 MFlops.

Cray-2 (1985)
Wanted to use GaAs, but technology wasn't mature enough. Fluorocarbon direct immersion cooling. First Cray-2 sold had more memory than all Cray-1s deployed. No, I haven't come across anyone who thought of this as the first supercomputer :)

IBM 801 (1975), RISC (1982), and MIPS (1982)
Three processors, all with instruction sets designed around the twin concepts of making the common case fast and making a pipelined single-chip implementation easy to create. The RISC project coined the term RISC; Patterson and Hennessy (the RISC and MIPS developers, respectively) regard the 801 as the first RISC processor. An argument could be made that the CDC 6600 was really the first RISC machine.

Caltech Cosmic Cube (1985)
Six-dimensional hypercube of 8086s.
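For anyone who hasn't met the topology before, here is a small sketch of what "six-dimensional hypercube" means for the interconnect. It uses the textbook convention of numbering nodes with 6-bit addresses and linking nodes whose addresses differ in exactly one bit; that numbering is an illustrative assumption on my part, not a description of the Cosmic Cube's actual addressing or routing.

```python
DIMENSIONS = 6  # from the text: a six-dimensional hypercube


def neighbors(node, dimensions=DIMENSIONS):
    """Nodes directly linked to `node`: flip each address bit in turn."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hops(a, b):
    """Minimum number of links between two nodes: the count of differing bits."""
    return bin(a ^ b).count("1")


node_count = 1 << DIMENSIONS  # 2**6 nodes in total

print("nodes:", node_count)
print("neighbors of node 0:", neighbors(0))
print("hops from node 0 to node", node_count - 1, ":", hops(0, node_count - 1))
```

The attraction of the design is visible in the numbers: 64 nodes, but each one needs only six links, and no message has to travel more than six hops.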

Connection Machine (1985)
Up to 65,536 one-bit processors arranged in a hypercube, specifically intended for artificial intelligence applications.

Intel 80486 (1990)
Pipelined implementation of Intel IA-32 instruction set. IA-32 had widely been regarded as impossible to pipeline effectively, due to extreme variability in sizes of instructions; this machine showed it could be done, and that IA-32 could be an effective processor for high-performance applications.

Intel Pentium Pro (P6 processor core) (1995)
Introduced out-of-order execution to the IA-32 processor line.

Modern High Performance
A dominant factor in high performance computing today has been the increased die size and decreased feature size of integrated circuits. Systems have become more and more integrated, with first CPUs, then floating point units and cache, and now even multiple processor cores and levels of cache on a single chip.

The current highest-speed computers in the world amount to massive arrays of commodity processors, with very high-speed interconnects -- so-called Pile o' PC or Beowulf clusters (the best-known early project to build a Pile o' PCs was named Beowulf).

Another recent development has been the recognition that ever-higher clock rates and ever-greater instruction level parallelism were reaching a limit, and that putting multiple processor cores on a chip has become a better road to high performance at present.

As of this writing (January, 2009) the fastest computer in the world is Roadrunner, at Los Alamos National Laboratory. It's got 129,600 processor cores, most of them in its PowerXCell 8i processors. Wow.


Last modified: Fri Jan 16 13:01:42 MST 2009
