This is the second edition of a user's guide to the Cray T3E massively parallel supercomputer installed at the Center for Scientific Computing (CSC). It covers using the Cray T3E at CSC, logging in, the components of a Cray T3E node, and the DEC Alpha processor architecture. The Cray T3E is a scalable shared-memory multiprocessor based on the DEC Alpha processor. Section 2 provides a brief overview of the system architecture.


The Cray T3E was Cray Research's second-generation massively parallel supercomputer architecture, launched in late November 1995. Like the previous Cray T3D, it was a fully distributed memory machine using a 3D torus topology interconnection network. A 1480-processor T3E was the first supercomputer to achieve a performance of more than 1 teraflops running a computational science application, in 1998. After Cray Research was acquired by Silicon Graphics in February 1996, development of new Alpha-based systems was stopped.
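To make the 3D torus topology concrete, here is a minimal sketch of how neighbours and hop counts work when every axis wraps around. The function names and the 4 x 4 x 4 example size are assumptions chosen for illustration; they do not describe the actual T3E routing hardware or any particular machine configuration.

```python
def torus_neighbors(coord, dims):
    """Return the six nearest neighbours of a node in a 3D torus.

    coord is an (x, y, z) tuple and dims gives the size of each axis.
    Each axis wraps around, so a node on an edge is linked to the
    node on the opposite edge. Assumes every axis has at least 3 nodes.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # wrap-around link
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, dims):
    """Minimum number of hops between two nodes, going either way round each axis."""
    return sum(min((x - y) % d, (y - x) % d) for x, y, d in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical 64-node torus
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 2, 1), dims))
```

Because of the wrap-around links, the farthest node along an axis of size d is only d // 2 hops away, which is the main advantage of a torus over a plain mesh.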

The T3D's MC models were housed in one or more liquid-cooled cabinets separate from the host. There was also a liquid-cooled MCN model, which had an alternative interconnect wire mat allowing non-power-of-2 numbers of PEs.

The first T3D delivered was a prototype installed at the Pittsburgh Supercomputing Center in early September 1993. The supercomputer was formally introduced on 27 September 1993.

Alpha 21164: The Alpha 21164, also known by its code name, EV5, is a microprocessor developed and fabricated by Digital Equipment Corporation that implemented the Alpha instruction set architecture.

The 21164 was introduced in January 1995, succeeding the Alpha 21064A as Digital's flagship microprocessor, and it was succeeded by the Alpha 21264 in 1998. First silicon of the Alpha 21164 was produced in February 1994; it was sampled in late 1994 and introduced in January 1995 at 266 MHz. A 300 MHz version was introduced in March 1995. The final Alpha 21164, a 333 MHz version, was announced on 2 October 1995, available in sample quantities. The Alpha 21164 was replaced by the Alpha 21164A as Digital's flagship microprocessor in 1996, when a 400 MHz version became available in volume quantities. Digital used the Alpha 21164 operating at various clock frequencies in its AlphaServer servers and AlphaStation workstations.

Third parties such as DeskStation also built systems using the Alpha 21164. The Alpha 21164 is a superscalar microprocessor capable of issuing a maximum of four instructions per clock cycle to four execution units.

The integer pipeline is seven stages long, and the floating-point pipeline is ten stages long. The 21164 implemented a 43-bit virtual address and a 40-bit physical address. It was therefore capable of addressing 8 TB of virtual memory and 1 TB of physical memory. The integer unit consisted of two integer pipelines and the integer register file. The multiply pipeline exclusively executes shift, store, and multiply instructions; the other (add) pipeline exclusively executes branch instructions.
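The link between address width and addressable memory is a power of two, so the quoted 8 TB and 1 TB limits can be checked with a short calculation. This is a minimal sketch that assumes byte addressing and binary terabytes (1 TB = 2^40 bytes); the helper names are made up for the example.

```python
TIB = 2 ** 40  # one binary terabyte, in bytes


def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


def address_bits_needed(size_bytes):
    """Smallest address width that can reach every byte of a memory of this size."""
    return (size_bytes - 1).bit_length()


if __name__ == "__main__":
    # Working back from the limits quoted in the text.
    print(address_bits_needed(8 * TIB))  # virtual address width
    print(address_bits_needed(1 * TIB))  # physical address width
    print(addressable_bytes(43) // TIB, "TB virtual")
    print(addressable_bytes(40) // TIB, "TB physical")
```

Under these assumptions, 8 TB of virtual memory corresponds to a 43-bit virtual address and 1 TB of physical memory to a 40-bit physical address.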

Branch, conditional move, and multiply instructions are handled differently from all other instructions. Branch and conditional move instructions are executed during stage six so that they can be issued together with a compare instruction whose result they depend on.

The integer register file contained forty 64-bit registers, of which thirty-two are specified by the Alpha Architecture. The register file has four read ports and two write ports, evenly divided between the two integer pipelines. The floating-point unit consisted of two floating-point pipelines and the floating-point register file. The two pipelines are not identical: one executed all floating-point instructions except for multiply, and the other executed only multiply instructions.

A non-pipelined floating-point divider is connected to the add pipeline. All floating-point instructions except for divide have a four-cycle latency. Divides have a variable latency that depends on whether the operation is being performed on single or double precision floating-point numbers; including overhead, double precision divides span a different range of cycles than single precision divides.

The 21164 has three levels of cache, two on-die and one external and optional. The caches and the associated logic consisted of more than 7 million transistors.

Microprocessor: A microprocessor is a computer processor which incorporates the functions of a computer's central processing unit on a single integrated circuit, or at most a few integrated circuits. Microprocessors contain both combinational logic and sequential digital logic. Microprocessors operate on numbers and symbols represented in the binary numeral system.

The integration of a whole CPU onto a chip or on a few chips greatly reduced the cost of processing power.

Integrated circuit processors are produced in large numbers by highly automated processes, resulting in a low per-unit cost. Single-chip processors increase reliability because there are far fewer electrical connections to fail.

As microprocessor designs get better, the cost of manufacturing a chip generally stays the same. Before microprocessors, small computers had been built using racks of circuit boards with many medium- and small-scale integrated circuits. Microprocessors combined this into one or a few large-scale ICs. The internal arrangement of a microprocessor varies depending on the age of the design and the intended purposes of the microprocessor.

Advancing technology makes more complex and powerful chips feasible to manufacture. A minimal hypothetical microprocessor might only include an arithmetic logic unit (ALU) and a control logic section. The ALU performs operations such as addition and subtraction, and logical operations such as AND or OR. Each operation of the ALU sets one or more flags in a status register, which indicate the results of the last operation.
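The behaviour of such a minimal ALU can be sketched in a few lines. The version below is a toy 8-bit model written for illustration only: the operation names, the choice of zero, negative and carry flags, and the convention that carry after a subtraction means "no borrow" are all assumptions, not a description of any specific processor.

```python
def alu(op, a, b, width=8):
    """Toy ALU: returns (result, flags) for ADD, SUB, AND and OR.

    Operands are truncated to `width` bits. The flags dictionary plays
    the role of the status register described in the text.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if op == "ADD":
        raw = a + b
    elif op == "SUB":
        raw = a + ((~b) & mask) + 1  # subtract by adding the two's complement
    elif op == "AND":
        raw = a & b
    elif op == "OR":
        raw = a | b
    else:
        raise ValueError(f"unknown operation: {op}")

    result = raw & mask
    flags = {
        "zero": result == 0,                      # result is all zero bits
        "negative": bool(result >> (width - 1)),  # top bit of the result is set
        "carry": raw > mask,                      # a bit was carried out of the top
    }
    return result, flags


if __name__ == "__main__":
    print(alu("ADD", 200, 100))
    print(alu("SUB", 5, 5))
```

A control unit would typically inspect these flags to decide, for example, whether a conditional branch is taken.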


The control logic retrieves instruction codes from memory and initiates the sequence of operations required for the ALU to carry out the instruction. A single operation code might affect many individual data paths, registers, and other elements of the processor.

As integrated circuit technology advanced, it was feasible to manufacture more and more complex processors on a single chip. The size of data objects became larger; allowing more transistors on a chip allowed word sizes to increase from 4- and 8-bit words up to today's 64-bit words. Additional features were added to the processor architecture; more on-chip registers sped up programs. Floating-point arithmetic, for example, was not available on 8-bit microprocessors.

Integration of the floating-point unit, first as a separate integrated circuit and then as part of the same microprocessor chip, sped up floating-point calculations. Occasionally, physical limitations of integrated circuits made such practices as a bit slice approach necessary. Instead of processing all of a long word on one integrated circuit, multiple circuits in parallel processed subsets of each data word. With the ability to put large numbers of transistors on one chip, it becomes feasible to integrate memory on the same die as the processor. This CPU cache has the advantage of faster access than off-chip memory, and increases the processing speed of the system for many applications.
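The bit slice idea can be illustrated with a small sketch in which a 16-bit addition is built from four 4-bit slices, each handling its own part of the word and passing a carry to the next. The slice width, slice count and function name are assumptions for the example, and the loop models the carry chain sequentially rather than the parallel hardware of a real bit-slice design.

```python
def bit_slice_add(a, b, slice_bits=4, slices=4):
    """Add two words using several narrow adders chained by their carries.

    Returns (result, carry_out), where the result is slice_bits * slices wide.
    """
    mask = (1 << slice_bits) - 1
    carry = 0
    result = 0
    for i in range(slices):
        shift = i * slice_bits
        # Each slice sees only its own few bits of the two operands.
        total = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (total & mask) << shift
        carry = total >> slice_bits  # carry into the next, more significant slice
    return result, carry


if __name__ == "__main__":
    result, carry = bit_slice_add(0x1234, 0x0FCD)
    print(hex(result), carry)
```

Widening the data path is then a matter of adding slices, which is why the approach was attractive when a single chip could not hold a full-width processor.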

Processor clock frequency has increased more rapidly than external memory speed, except in the recent past. A microprocessor is a general purpose system. Several specialized processing devices have followed from the technology: a digital signal processor is specialized for signal processing, and graphics processing units are processors designed primarily for realtime rendering of 3D images.

Dynamic random-access memory: Dynamic random-access memory (DRAM) is a type of random-access memory that stores each bit of data in a separate capacitor within an integrated circuit. The capacitor can be charged or discharged; these two states are taken to represent the two values of a bit, conventionally called 0 and 1.

Since even nonconducting transistors always leak a small amount, the capacitors will slowly discharge, and the stored information fades unless the capacitor charge is refreshed periodically. Because of this refresh requirement, it is a dynamic memory as opposed to static random-access memory and other static types of memory. Unlike flash memory, DRAM is volatile memory, since it loses its data quickly when power is removed. However, DRAM does exhibit limited data remanence.
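The effect of leakage and refresh can be shown with a toy model of a single cell. Everything numeric here (a 10% charge loss per tick, a read threshold of half the full charge) is an assumption chosen to make the behaviour visible, not a property of real DRAM, and the class and function names are invented for the sketch.

```python
class DramCell:
    """Toy model of one DRAM cell: a leaky capacitor holding a single bit."""

    def __init__(self, bit, leak=0.9, threshold=0.5):
        self.charge = 1.0 if bit else 0.0
        self.leak = leak            # fraction of charge kept per time step
        self.threshold = threshold  # charge level that still reads as 1

    def tick(self):
        """Let one time step pass; some charge leaks away."""
        self.charge *= self.leak

    def read(self):
        return 1 if self.charge >= self.threshold else 0

    def refresh(self):
        """Read the cell and write the same value back at full strength."""
        self.charge = 1.0 if self.read() else 0.0


def stored_bit_after(ticks, refresh_every=None):
    """Store a 1, wait `ticks` steps, optionally refreshing, then read it back."""
    cell = DramCell(1)
    for t in range(1, ticks + 1):
        cell.tick()
        if refresh_every and t % refresh_every == 0:
            cell.refresh()
    return cell.read()


if __name__ == "__main__":
    print(stored_bit_after(10))                   # no refresh
    print(stored_bit_after(10, refresh_every=4))  # periodic refresh
```

In this model the stored 1 is lost after enough ticks without refresh, while refreshing before the charge drops below the threshold preserves it indefinitely.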

DRAM is widely used in digital electronics where low-cost and high-capacity memory is required. One of the largest applications for DRAM is the main memory in modern computers, and the main memory of components used in these computers, such as graphics cards.

The advantage of DRAM is its structural simplicity: only one transistor and one capacitor are required per bit. This allows DRAM to reach high densities. The transistors and capacitors used are small; billions can fit on a single memory chip. Due to the dynamic nature of its memory cells, DRAM consumes relatively large amounts of power.

In an early example of dynamic storage, paper tape was read and the characters on it were remembered in a dynamic store. The store used a large bank of capacitors, which were either charged or not, a charged capacitor representing cross and an uncharged capacitor dot.

Since the charge gradually leaked away, a pulse was applied to top up those still charged. Arnold Farber and Eugene Schlig, working for IBM, created a hard-wired memory cell using a transistor gate and a latch. They then replaced the latch with two transistors and two resistors, a configuration that became known as the Farber-Schlig cell.

Silicon Graphics: Silicon Graphics, Inc. (SGI) based its early systems on the Geometry Engine that Clark and Marc Hannah had developed at Stanford University. For much of its history, the company focused on 3D imaging and was a major supplier of both hardware and software in this market.

The company reincorporated as a Delaware corporation in January. Through the mid to late 1990s, the rapidly improving performance of commodity Wintel machines began to erode SGI's stronghold in the 3D market.

The porting of Maya to other platforms is one event in this process. Under Belluzzo's leadership a number of initiatives were taken which are considered to have accelerated the corporate decline. One such initiative was trying to sell workstations running Windows NT, called Visual Workstations, instead of just ones that ran IRIX, the company's version of UNIX. This put the company in more direct competition with the likes of Dell.

In an attempt to clarify its market position as more than a graphics company, Silicon Graphics Inc. adopted a new logo. The new logo drew criticism for wasting the professional recognition associated with the previous cube logo. SGI continued to use the Silicon Graphics name for its product line. In November, SGI announced that it had been delisted from the New York Stock Exchange because its common stock had fallen below the minimum share price for listing on the exchange.

In February, SGI noted that it could run out of cash by the end of the year. SGI also hired Alix Partners to advise it on returning to profitability and received a new line of credit. SGI announced it was postponing its scheduled annual December stockholders meeting until March, and it proposed a reverse stock split to deal with the de-listing from the New York Stock Exchange.

History of supercomputing: The CDC 6600, released in 1964, is generally considered the first supercomputer. Progress in the first decade of the 21st century was dramatic, and supercomputers with over 60,000 processors appeared. The term Super Computing was first used in the New York World in 1929 to refer to large custom-built tabulators that IBM had made for Columbia University.


In 1960 Cray completed the CDC 1604, one of the first solid state computers. Around 1960 Cray decided to design a computer that would be the fastest in the world by a large margin. After four years of experimentation along with Jim Thornton and Dean Roush, Cray switched from germanium to silicon transistors, built by Fairchild Semiconductor, that used the planar process.

These did not have the drawbacks of the mesa silicon transistors. He ran them very fast, and the speed of light restriction forced a very compact design with severe overheating problems, which were solved by introducing refrigeration, designed by Dean Roush. Given that the 6600 outran all computers of the time by about 10 times, it was dubbed a supercomputer. The 6600 gained speed by farming out work to peripheral computing elements, freeing the CPU to process actual data.

In 1968 Cray completed the CDC 7600, again the fastest computer in the world. At 36 MHz, the 7600 had about three and a half times the clock speed of the 6600, but ran significantly faster due to other technical innovations. They only sold about 50 of the 7600s, not quite a failure. Cray left CDC in 1972 to form his own company.

It was said that whenever England's Atlas went offline, half of the United Kingdom's computer capacity was lost. Atlas also pioneered the Atlas Supervisor, considered by many to be the first recognizable modern operating system.

All three floating point pipelines on the X-MP could operate simultaneously. The Cray-2, released in 1985, was a 4 processor liquid cooled computer totally immersed in a tank of Fluorinert, which bubbled as it operated. It could perform up to 1.9 gigaflops. The Cray-2 was a new design; it did not use chaining and had a high memory latency. That trend was partly responsible for a move away from in-house operating systems.

Cray X-MP: The Cray X-MP was announced in 1982 as the cleaned up successor to the Cray-1. The principal designer was Steve Chen.

The X-MP's main improvement over the Cray-1 was that it was a parallel vector processor. It housed two CPUs in a mainframe that was identical in outside appearance to the Cray-1. The X-MP initially supported 2 million 64-bit words of memory in 16 banks.

This configuration was first used for Cray Research's UNIX port. Improved models of the X-MP were later announced, consisting of one, two, and four-processor systems with 4 and 8 million word configurations.

Cray-1: The Cray-1 was a supercomputer designed, manufactured and marketed by Cray Research. The first Cray-1 system was installed at Los Alamos National Laboratory in 1976, and it went on to become one of the best known supercomputers. From 1968 to 1972 Seymour Cray of Control Data Corporation worked on the CDC 8600. The 8600 was essentially made up of four 7600s in a box with an additional special mode that allowed them to operate lock-step in a SIMD fashion.

In fact, the main processor of the STAR had less performance than the 7600. By 1972, the 8600 had reached a dead end; the machine was so incredibly complex that it was impossible to get one working properly.

Even a single faulty component would render the machine non-operational. Cray went to William Norris, Control Data's CEO, saying that a redesign from scratch was needed. At the time the company was in financial trouble, with the STAR in the pipeline as well.

The company expected to sell perhaps a dozen of the Cray-1 machines, and set the selling price accordingly. The machine made Seymour Cray a celebrity and his company a success, lasting until the supercomputer crash in the early 1990s.

Based on a recommendation by William Perry's study, the NSA purchased a Cray-1 for theoretical research in cryptanalysis. For comparison, the processor in a typical smartphone performs at roughly 1 GFLOPS. Typical scientific workloads consist of reading in large data sets, transforming them in some way and then writing them back out again. Normally the transformations being applied are identical across all of the data points in the set.
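The kind of workload described here, the same transformation applied to every point of a large data set, can be sketched as follows. The code processes the data in fixed-size chunks, a technique often called strip mining, to mimic how a vector machine works through an array one vector register's worth at a time. The function name, the `a * x + y` transformation, and the chunk length of 64 are assumptions for illustration.

```python
def axpy_strip_mined(a, x, y, vector_length=64):
    """Compute a * x[i] + y[i] for every i, one fixed-size chunk at a time.

    Each chunk stands in for a single vector instruction operating on
    up to `vector_length` elements with the same operation.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    out = []
    for start in range(0, len(x), vector_length):
        xs = x[start:start + vector_length]
        ys = y[start:start + vector_length]
        # The same multiply-add is applied to every element of the chunk.
        out.extend(a * xi + yi for xi, yi in zip(xs, ys))
    return out


if __name__ == "__main__":
    x = list(range(130))
    y = [1] * 130
    result = axpy_strip_mined(2, x, y)
    print(result[:5], len(result))
```

Because every element is treated identically and independently, the work inside each chunk can be done by pipelined or parallel hardware without changing the result.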

Cray-2: The Cray-2 is a supercomputer with four vector processors built with emitter-coupled logic and made by Cray Research starting in 1985. With the successful launch of his famed Cray-1, Seymour Cray turned to the design of its successor.

By then he had become fed up with management interruptions in what was now a large company and, as he had done in the past, decided to resign his management post and move to form a new lab.

Working as an independent consultant at these new Cray Labs, he put together a team. This lab would later close, and a decade later a new facility in Colorado Springs would open. Unfortunately, the density needed to achieve this cycle time led to the machine's downfall. One solution to this problem, one that most computer vendors had already moved to, was to use integrated circuits instead of individual components.