HYPER TRANSPORT TECHNOLOGY
HyperTransport technology is a very fast, low-latency, point-to-point link used for interconnecting integrated circuits on a board. HyperTransport, previously codenamed Lightning Data Transport (LDT), provides the bandwidth and flexibility critical for today's networking and computing platforms while retaining the fundamental programming model of PCI. HyperTransport was invented by AMD and perfected with the help of several partners throughout the industry.
HyperTransport was designed to support both CPU-to-CPU communications and CPU-to-I/O transfers, so it features very low latency. It provides up to 22.4 Gbytes/s of aggregate CPU-to-I/O or CPU-to-CPU bandwidth in a highly efficient chip-to-chip technology that replaces existing complex multi-level buses. Enhanced 1.2-volt LVDS signaling reduces signal noise, non-multiplexed lines cut down on signal activity, and dual-data-rate clocks lower clock rates while increasing data throughput. It employs a packet-based data protocol to eliminate many sideband (control and command) signals and supports asymmetric, variable-width data paths.
New specifications are backward compatible with previous generations of the specification, extending the investment made in one generation of HyperTransport-enabled devices to future generations. HyperTransport devices are PCI software compatible, so they require little or no software overhead. The technology targets networking, telecommunications, computers, embedded systems, and any application where high speed, low latency, and scalability are necessary.
The I/O Bandwidth Problem
While microprocessor performance continues to double every eighteen months, the performance of the I/O bus architecture has lagged, doubling approximately every three years. This I/O bottleneck constrains system performance, so that delivered performance falls short of what the processor could sustain. Over the past 20 years, a number of legacy buses, such as ISA, VL-Bus, AGP, LPC, PCI-32/33, and PCI-X, have emerged that must be bridged together to support a varying array of devices. Servers and workstations require multiple high-speed buses, including PCI-64/66, AGP Pro, and SNA buses like InfiniBand. The hodge-podge of buses increases system complexity and adds many transistors devoted to bus arbitration and bridge logic, while delivering less than optimal performance.
A number of new technologies are responsible for the increasing demand for additional bandwidth.
• High-resolution, texture-mapped 3D graphics and high-definition streaming video are escalating bandwidth needs between CPUs and graphics processors.
• Technologies like high-speed networking (Gigabit Ethernet, InfiniBand, etc.) and wireless communications (Bluetooth) are allowing more devices to exchange growing amounts of data at rapidly increasing speeds.
• Software technologies are evolving, resulting in breakthrough methods of utilizing multiple system processors. As processor speeds rise, so will the need for very fast, high-volume inter-processor data traffic.
While these new technologies quickly exceed the capabilities of today's PCI bus, existing interface functions like MP3 audio, v.90 modems, USB, 1394, and 10/100 Ethernet are left to compete for the remaining bandwidth. These functions are now commonly integrated into core logic products.
Higher integration is increasing the number of pins needed to bring these multiple buses into and out of the chip packages. Nearly all of these existing buses are single-ended, requiring additional power and ground pins to provide sufficient current return paths. Reducing pin count helps system designers to reduce power consumption and meet thermal requirements.
In response to these problems, AMD began developing the HyperTransport™ I/O link architecture in 1997. HyperTransport technology has been designed to provide system architects with significantly more bandwidth, low-latency responses, lower pin counts, compatibility with legacy PC buses, extensibility to new SNA buses, and transparency to operating system software, with little impact on peripheral drivers.
The HyperTransport™ Technology Solution
HyperTransport technology, formerly codenamed Lightning Data Transport (LDT), was developed at AMD with the help of industry partners to provide a high-speed, high-performance, point-to-point link for interconnecting integrated circuits on a board. With a top signaling rate of 1.6 GHz on each wire pair, a HyperTransport technology link can support a peak aggregate bandwidth of 12.8 Gbytes/s. The HyperTransport specification provides both link- and system-level power management capabilities optimized for processors and other system devices. HyperTransport technology is targeted at networking, telecommunications, computer, and high-performance embedded applications, and at any other application in which high speed, low latency, and scalability are necessary.
Original Design Goals
In developing HyperTransport technology, the architects of the technology considered the design goals presented in this section. They wanted to develop a new I/O protocol for in-the-box I/O connectivity that would:
• Improve system performance
   - Provide increased I/O bandwidth
   - Reduce data bottlenecks by moving slower devices out of critical information paths
   - Reduce the number of buses within the system
   - Ensure low latency responses
   - Reduce power consumption
• Simplify system design
   - Use a common protocol for in-chassis connections to I/O and processors
   - Use as few pins as possible to allow smaller packages and to reduce cost
• Increase I/O flexibility
   - Provide a modular bridge architecture
   - Allow for differing upstream and downstream bandwidth requirements
• Maintain compatibility with legacy systems
   - Complement standard external buses
   - Have little or no impact on existing operating systems and drivers
• Ensure extensibility to new system network architecture (SNA) buses
• Provide highly scalable multiprocessing systems
Flexible I/O Architecture
The resulting protocol defines a high-performance and scalable interconnect between CPU, memory, and I/O devices. Conceptually, the architecture of the HyperTransport I/O link can be mapped into five layers, a structure similar to the Open System Interconnection (OSI) reference model.
In HyperTransport technology:
• The physical layer defines the physical and electrical characteristics of the protocol. This layer interfaces to the physical world and includes data, control, and clock lines.
• The data link layer includes the initialization and configuration sequence, periodic cyclic redundancy check (CRC), disconnect or reconnect sequence, information packets for flow control and error management, and doubleword framing for other packets.
• The protocol layer includes the commands, the virtual channels in which they run, and the ordering rules that govern their flow.
• The transaction layer uses the elements provided by the protocol layer to perform actions, such as reads and writes.
• The session layer includes rules for negotiating power management state changes, as well as interrupt and system management activities.
HyperTransport technology creates a packet-based link implemented on two independent, unidirectional sets of signals. It provides a broad range of system topologies built with three generic device types:
• Cave: A single-link device at the end of the chain.
• Tunnel: A dual-link device that is not a bridge.
• Bridge: A device with a primary link upstream, in the direction of the host, and one or more secondary links.
Each HyperTransport link consists of two point-to-point unidirectional data paths, as illustrated in Figure.
• Data path widths of 2, 4, 8, 16, and 32 bits can be implemented either upstream or downstream, depending on the device-specific bandwidth requirements.
• Commands, addresses, and data (CAD) all use the same set of wires for signaling, dramatically reducing pin requirements.
HyperTransport™ Technology Data Paths
All HyperTransport technology commands, addresses, and data travel in packets. All packets are multiples of four bytes (32 bits) in length. If the link uses data paths narrower than 32 bits, successive bit-times are used to complete the packet transfers. The HyperTransport link was specifically designed to deliver a high-performance and scalable interconnect between CPU, memory, and I/O devices, while using as few pins as possible.
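The relationship between packet length, link width, and bit-times can be sketched in a few lines of Python. This is an illustrative calculation only: the function name `bit_times` is hypothetical, and the set of widths is taken from the link widths mentioned in this document.

```python
# Illustrative sketch: how many bit-times a packet occupies on a link.
# A "bit-time" here means one transfer across the full width of the data path.

LINK_WIDTHS_BITS = (2, 4, 8, 16, 32)  # widths mentioned in this document


def bit_times(packet_bytes: int, link_width_bits: int) -> int:
    """Return the number of bit-times needed to move one packet."""
    if packet_bytes <= 0 or packet_bytes % 4 != 0:
        raise ValueError("packets are multiples of four bytes")
    if link_width_bits not in LINK_WIDTHS_BITS:
        raise ValueError("unsupported link width")
    # Every supported width divides 32, so the division is exact.
    return (packet_bytes * 8) // link_width_bits


if __name__ == "__main__":
    for width in LINK_WIDTHS_BITS:
        print(f"{width:2d}-bit link: 4-byte packet = {bit_times(4, width)} bit-times, "
              f"8-byte packet = {bit_times(8, width)} bit-times")
```

On a 32-bit link a four-byte packet fits in a single bit-time, while a 2-bit link needs sixteen successive bit-times for the same packet.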
• To achieve very high data rates, the HyperTransport link uses low-swing differential signaling with on-die differential termination.
• To achieve scalable bandwidth, the HyperTransport link permits seamless scalability of both frequency and data width.
Minimal Pin Count
The designers of HyperTransport technology wanted to use as few pins as possible to enable smaller packages, reduced power consumption, and better thermal characteristics, while reducing total system cost. This goal is accomplished by using separate unidirectional data paths and very low-voltage differential signaling.
The signals used in HyperTransport technology are summarized in the table given below.
• Commands, addresses, and data (CAD) all share the same bits.
• Each data path includes a Control (CTL) signal and one or more Clock (CLK) signals.
   - The CTL signal differentiates commands and addresses from data packets.
   - For every grouping of eight bits or less within the data path, there is a forwarded CLK signal. Clock forwarding reduces clock skew between the reference clock signal and the signals traveling on the link. Multiple forwarded clocks limit the number of signals that must be routed closely in wider HyperTransport links.
• For most signals, there are two pins per bit.
• In addition to CAD, Clock, Control, VLDT power, and ground pins, each HyperTransport device has Power OK (PWROK) and Reset (RESET#) pins. These pins are single-ended because of their low-frequency use.
• Devices that implement HyperTransport technology for use in lower power applications such as notebook computers should also implement Stop (LDTSTOP#) and Request (LDTREQ#). These power management signals are used to enter and exit low-power states.
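The signal rules listed above can be turned into a rough pin-count estimate. The sketch below is an assumption-laden illustration, not a figure from the specification: it counts two pins for each differential CAD, CTL, and CLK signal, one CLK per group of up to eight CAD bits, and the single-ended PWROK and RESET# pins. It deliberately leaves out VLDT power and ground pins, whose number the text does not give, and the function name `link_signal_pins` is hypothetical.

```python
import math


def link_signal_pins(tx_width: int, rx_width: int, power_mgmt: bool = False) -> int:
    """Estimate signal pins for one link (power and ground pins excluded)."""

    def one_direction(width: int) -> int:
        clocks = math.ceil(width / 8)       # one forwarded CLK per 8 CAD bits or less
        signals = width + 1 + clocks        # CAD bits + CTL + CLK signals
        return 2 * signals                  # differential: two pins per signal

    sideband = 2                            # PWROK and RESET#, single-ended
    if power_mgmt:
        sideband += 2                       # LDTSTOP# and LDTREQ#, single-ended
    return one_direction(tx_width) + one_direction(rx_width) + sideband


if __name__ == "__main__":
    for width in (2, 4, 8, 16, 32):
        print(f"{width:2d}-bit symmetric link: {link_signal_pins(width, width)} signal pins")
```

Under these assumptions a symmetric 8-bit link needs 42 signal pins and a symmetric 16-bit link needs 78, and an asymmetric link simply sums the two directions.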
Enhanced Low-Voltage Differential Signaling
The signaling technology used in HyperTransport technology is a type of low-voltage differential signaling (LVDS). However, it is not the conventional IEEE LVDS standard. It is an enhanced LVDS technique developed to evolve with the performance of future process technologies. This is designed to help ensure that the HyperTransport technology standard has a long lifespan. LVDS has been widely used in these types of applications because it requires fewer pins and wires. This is also designed to reduce cost and power requirements because the transceivers are built into the controller chips.
HyperTransport technology uses low-voltage differential signaling with a differential impedance (ZOD) of 100 ohms for CAD, Clock, and Control signals, as illustrated in Figure. Characteristic line impedance is 60 ohms. The driver supply voltage is 1.2 volts, instead of the conventional 2.5 volts for standard LVDS. Differential signaling and the chosen impedance provide a robust signaling system for use on low-cost printed circuit boards. Common four-layer PCB materials with specified dielectric, trace, and space dimensions and tolerances, or controlled-impedance boards, are sufficient to implement a HyperTransport I/O link. The differential signaling permits trace lengths up to 24 inches for 800 Mbit/s operation.
Enhanced Low-Voltage Differential Signaling (LVDS)
At first glance, the signaling used to implement a HyperTransport I/O link would seem to increase pin counts because it requires two pins per bit and uses separate upstream and downstream data paths. However, the increase in signal pins is offset by two factors:
• By using separate data paths, HyperTransport I/O links are designed to operate at much higher frequencies than existing bus architectures. This means that buses delivering equivalent or better bandwidth can be implemented using fewer signals.
• Differential signaling provides a return current path for each signal, greatly reducing the number of power and ground pins required in each package.
Greatly Increased Bandwidth
Commands, addresses, and data traveling on a HyperTransport link are double pumped, where transfers take place on both the rising and falling edges of the clock signal. For example, if the link clock is 800 MHz, the data rate is 1600 MHz.
• An implementation of HyperTransport links with 16 CAD bits in each direction and a 1.6-GHz data rate provides 3.2 Gbytes/s of bandwidth in each direction, for an aggregate peak bandwidth of 6.4 Gbytes/s, or 48 times the peak bandwidth of a 33-MHz PCI bus.
• A low-cost, low-power HyperTransport link using two CAD bits in each direction and clocked at 400 MHz provides 200 Mbytes/s of bandwidth in each direction, for an aggregate of 400 Mbytes/s, or about three times the 133-Mbytes/s peak bandwidth of PCI 32/33.
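The bandwidth figures above follow directly from the link width and the double-pumped clock. The short sketch below reproduces the arithmetic; the function name and the 33.33-MHz value used for the PCI clock are illustrative assumptions, not values taken from the specification.

```python
def link_bandwidth_mbytes(cad_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth of one direction of a link, in Mbytes/s."""
    transfers_per_second_m = 2 * clock_mhz   # double pumped: both clock edges carry data
    return cad_bits * transfers_per_second_m / 8


# Peak bandwidth of a 32-bit, 33-MHz PCI bus (about 133 Mbytes/s).
PCI_32_33_MBYTES = 32 * 33.33 / 8

if __name__ == "__main__":
    for cad_bits, clock_mhz in ((16, 800), (2, 400)):
        one_way = link_bandwidth_mbytes(cad_bits, clock_mhz)
        aggregate = 2 * one_way              # both directions of the link together
        print(f"{cad_bits:2d} CAD bits at {clock_mhz} MHz: "
              f"{one_way:.0f} Mbytes/s per direction, "
              f"{aggregate:.0f} Mbytes/s aggregate, "
              f"{aggregate / PCI_32_33_MBYTES:.1f}x PCI 32/33")
```

With these inputs the 16-bit, 800-MHz link comes out at 3200 Mbytes/s per direction and roughly 48 times PCI 32/33 in aggregate. The 2-bit, 400-MHz link comes out at 200 Mbytes/s per direction, which is about 1.5 times PCI 32/33 per direction and about three times in aggregate.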
Data Link Layer
The data link layer includes the initialization and configuration sequence, periodic cyclic redundancy check (CRC), disconnect/reconnect sequence, information packets for flow control and error management, and double word framing for other packets.
HyperTransport technology-enabled devices with transmitter and receiver links of equal width can be easily and directly connected. Devices with asymmetric data paths can also be linked together easily. Extra receiver pins are tied to logic 0, while extra transmitter pins are left open. During power-up, when RESET# is asserted and the Control signal is at logic 0, each device transmits a bit pattern indicating the width of its receiver. Logic within each device determines the maximum safe width for its transmitter. While this may be narrower than the optimal width, it provides reliable communications between devices until configuration software can optimize the link to the widest common width.
For applications that typically send the bulk of the data in one direction, component vendors can save costs by implementing a wide path for the majority of the traffic and a narrow path in the lesser used direction. Devices are not required to implement equal width upstream and downstream links.
Protocol and Transaction Layers
The protocol layer includes the commands, the virtual channels in which they run, and the ordering rules that govern their flow. The transaction layer uses the elements provided by the protocol layer to perform actions, such as read requests and responses.
All HyperTransport technology commands are either four or eight bytes long and begin with a 6-bit command type field. The most commonly used commands are Read Request, Read Response, and Write. A virtual channel contains requests or responses with the same ordering priority.
When the command requires an address, the last byte of the command is concatenated with an additional four bytes to create a 40-bit address.
A Write command or a Read Response command is followed by data packets. Data packets are four to 64 bytes long in four-byte increments. Transfers of less than four bytes are padded to the four-byte minimum. Byte granularity reads and writes are supported with a four-byte mask field preceding the data. This is useful when transferring data to or from graphics frame buffers where the application should only affect certain bytes that may correspond to one primary color or other characteristics of the displayed pixels. A control bit in the command indicates whether the writes are byte or doubleword granularity.
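The sizing rules for commands and data packets can be illustrated with a small sketch. Two details here are assumptions made for the example and are not stated in the text: that the 6-bit command type occupies the low six bits of the first command byte, and that padding bytes are zero. The function names are hypothetical.

```python
def command_type(command: bytes) -> int:
    """Extract the 6-bit command type from a four- or eight-byte command."""
    if len(command) not in (4, 8):
        raise ValueError("commands are four or eight bytes long")
    # Assumption: the type field sits in the low six bits of the first byte.
    return command[0] & 0x3F


def pad_payload(data: bytes) -> bytes:
    """Pad a payload up to the next four-byte boundary (4 to 64 bytes)."""
    if not 1 <= len(data) <= 64:
        raise ValueError("data packets carry 1 to 64 bytes of payload")
    padded_len = -(-len(data) // 4) * 4      # round up to a multiple of four
    # Assumption: zero is used as the padding value.
    return data.ljust(padded_len, b"\x00")


if __name__ == "__main__":
    print(len(pad_payload(b"abc")))          # a 3-byte transfer is padded to 4 bytes
    print(len(pad_payload(bytes(61))))       # 61 bytes round up to the 64-byte maximum
```

A sender would use the byte mask described above to tell the receiver which of the padded bytes are meaningful; that mask is not modeled here.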
Reads and writes to PCI I/O space are mapped into a separate address range, eliminating the need for separate memory and I/O control lines or control bits in read and write commands.
Additional address ranges are used for in-band signaling of interrupts and system management messages. A device signaling an interrupt performs a byte-granularity write command targeted at the reserved address space. The host bridge is responsible for delivery of the interrupt to the internal target.
I/O Stream Identification
Communications between the HyperTransport host bridge and other HyperTransport technology-enabled devices use the concept of streams. A HyperTransport link can handle multiple streams between devices simultaneously. HyperTransport technology devices are daisy-chained, so that some streams may be passed through one node to the next.
Packets are identified as belonging to a stream by the Unit ID field in the packet header. There can be up to 32 unique IDs within a HyperTransport chain. Nodes within a HyperTransport chain may contain multiple units. It is the responsibility of each node to determine if information sent to it is targeted at a device within it. If not, the information is passed through to the next node. If a device is located at the end of the chain and it is not the target device, an error response is passed back to the host bridge.
Commands and responses sent from the host bridge have a Unit ID of zero. Commands and responses sent from other HyperTransport technology devices on the chain have their own unique ID.
If a bus-mastering HyperTransport technology device like a RAID controller sends a write command to memory above the host bridge, the command will be sent with the Unit ID of the RAID controller. HyperTransport technology permits posted write operations so that these devices do not wait for an acknowledgement before proceeding. This is useful for large data transfers that will be buffered at the receiving end.
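The pass-through behavior of a daisy chain can be modeled with a toy example. This is a deliberately simplified sketch: the text does not spell out how a node decides that a packet is meant for it, so the model below simply gives each node a set of Unit IDs and matches on those. The class and function names are hypothetical.

```python
from dataclasses import dataclass

HOST_UNIT_ID = 0      # the host bridge uses Unit ID zero
MAX_UNIT_IDS = 32     # up to 32 unique IDs within a chain


@dataclass(frozen=True)
class Node:
    name: str
    unit_ids: frozenset   # Unit IDs of the units inside this node


def route_downstream(chain, target_unit_id):
    """Walk the chain from the host outward.

    Returns the name of the node that accepts the packet, or None when the
    end of the chain is reached without a match (the error-response case).
    """
    if not 0 < target_unit_id < MAX_UNIT_IDS:
        raise ValueError("target must be a non-host Unit ID below 32")
    for node in chain:
        if target_unit_id in node.unit_ids:
            return node.name          # this node contains the target unit
        # Otherwise the packet is passed through to the next node.
    return None


if __name__ == "__main__":
    chain = [
        Node("tunnel", frozenset({1, 2})),   # a node may contain multiple units
        Node("cave", frozenset({3})),
    ]
    for unit_id in (1, 3, 7):
        accepted_by = route_downstream(chain, unit_id)
        print(unit_id, accepted_by if accepted_by else "error response to host bridge")
```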
I/O Streams Use Unit IDs
Within streams, the HyperTransport I/O link protocol implements the same basic ordering rules as PCI. Additionally, there are features that allow these ordering rules to be relaxed. A Fence command aligns posted cycles in all streams, and a Flush command flushes the posted write channel in one stream. These features are helpful in handling protocols for bridges to other buses such as PCI, InfiniBand, and AGP.
The session layer includes link width optimization and link frequency optimization along with interrupt and power state capabilities.
Standard Plug 'n Play Conventions
Devices enabled with HyperTransport technology use standard Plug 'n Play conventions for exposing the control registers that enable configuration routines to optimize the width of each data path. AMD registered the HyperTransport Specific Capabilities Block with the PCI SIG. This Capabilities Block, illustrated in Figure, permits devices enabled with HyperTransport technology to be configured by any operating system that supports a PCI architecture.
HyperTransport™ Technology Capabilities Block
Since system enumeration and power-up are implementation-specific, it is assumed that system firmware will recognize the Capabilities Block and use the information within it to configure all HyperTransport host bridges in the system.
Once the host bridges are identified, devices enabled with HyperTransport technology that are connected to the bridges can be enumerated just as they are for PCI devices. Configuration information that is collected and the structures created by this process will look to a Plug 'n Play-aware operating system (OS) just like those of PCI devices. In short, the Plug 'n Play-aware OS does not require any modification to recognize and configure devices enabled with HyperTransport technology.
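Because the Capabilities Block lives in ordinary PCI configuration space, firmware can locate it by walking the standard PCI capabilities list. The sketch below operates on a 256-byte snapshot of configuration space held in memory. The offsets follow the conventional PCI layout (Status register at 0x06, capabilities pointer at 0x34), and the capability ID value 0x08 for HyperTransport is taken from the PCI capability ID assignments rather than from this document, so treat it as an assumption to verify. The function names are hypothetical.

```python
HT_CAPABILITY_ID = 0x08       # assumed PCI capability ID for HyperTransport
STATUS_OFFSET = 0x06          # PCI Status register (16 bits, little-endian)
STATUS_CAP_LIST = 0x10        # Status bit 4: capabilities list present
CAP_POINTER_OFFSET = 0x34     # pointer to the first capability


def find_capability(config: bytes, cap_id: int):
    """Return the offset of the first capability with cap_id, or None."""
    if len(config) < 256:
        raise ValueError("expected a 256-byte configuration space snapshot")
    status = int.from_bytes(config[STATUS_OFFSET:STATUS_OFFSET + 2], "little")
    if not status & STATUS_CAP_LIST:
        return None
    pointer = config[CAP_POINTER_OFFSET] & 0xFC   # low two bits are reserved
    visited = set()
    while pointer and pointer not in visited:     # guard against malformed loops
        visited.add(pointer)
        if config[pointer] == cap_id:             # byte 0: capability ID
            return pointer
        pointer = config[pointer + 1] & 0xFC      # byte 1: next capability pointer
    return None


def make_example_config() -> bytes:
    """Build a made-up configuration space with two chained capabilities."""
    cfg = bytearray(256)
    cfg[STATUS_OFFSET] = STATUS_CAP_LIST
    cfg[CAP_POINTER_OFFSET] = 0x40
    cfg[0x40], cfg[0x41] = 0x01, 0x60              # some other capability, then next
    cfg[0x60], cfg[0x61] = HT_CAPABILITY_ID, 0x00  # HyperTransport block, end of list
    return bytes(cfg)


if __name__ == "__main__":
    offset = find_capability(make_example_config(), HT_CAPABILITY_ID)
    print(hex(offset) if offset is not None else "no HyperTransport capability")
```

The example configuration space is fabricated for illustration. On real hardware the snapshot would come from the platform's configuration-space access mechanism.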
Minimal Device Driver Porting
Drivers for devices enabled with HyperTransport technology are unique to the devices, just as they are for PCI I/O devices, but the similarities are great. Companies that build a PCI I/O device and then create an equivalent device enabled with HyperTransport technology should have no problems porting the driver. To make porting easier, the chain from a host bridge is enumerated like a PCI bus, and devices and functions within a device enabled with HyperTransport technology are enumerated like PCI devices and functions, as shown in Figure.
Link Width Optimization
The initial link-width negotiation sequence may result in links that do not operate at their maximum width potential. All 16-bit, 32-bit, and asymmetrically-sized configurations must be enabled by a software initialization step. At cold reset, all links power-up and synchronize according to the protocol. Firmware (or BIOS) then interrogates all the links in the system, reprograms them to the desired width, and takes the system through a warm reset to change the link widths. Devices that implement the LDTSTOP# signal can disconnect and reconnect rather than enter warm reset to invoke link width changes.
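The difference between the width a link comes up with and the width firmware can later program may be sketched as follows. The model is an interpretation of the text, not the specification's algorithm: since 16-bit, 32-bit, and asymmetric configurations need a software step, the cold-reset width is assumed to be symmetric and capped at 8 bits, while the optimized width of each direction is the smaller of the transmitter width and the facing receiver width. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkPort:
    tx_width: int   # widest transmit path the device implements, in bits
    rx_width: int   # widest receive path the device implements, in bits


def cold_reset_width(a: LinkPort, b: LinkPort) -> int:
    """Width used in both directions before software configuration.

    Assumption: hardware negotiation is symmetric and limited to 8 bits.
    """
    return min(a.tx_width, a.rx_width, b.tx_width, b.rx_width, 8)


def optimized_widths(a: LinkPort, b: LinkPort):
    """Widths firmware could program: (a-to-b width, b-to-a width)."""
    return min(a.tx_width, b.rx_width), min(b.tx_width, a.rx_width)


if __name__ == "__main__":
    host = LinkPort(tx_width=16, rx_width=8)
    device = LinkPort(tx_width=8, rx_width=16)
    print("after cold reset:", cold_reset_width(host, device), "bits each way")
    print("after firmware optimization:", optimized_widths(host, device))
```

In this example the link starts at 8 bits in both directions and firmware can widen the host-to-device path to 16 bits, with the new widths taking effect after the warm reset or LDTSTOP# disconnect described above.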
Link Frequency Initialization
At cold reset, all links power up with 200-MHz clocks. For each link, firmware reads a specific register of each device to determine the supported clock frequencies. The reported frequency capability, combined with system-specific information about the board layout and power requirements, is used to determine the frequency to be used for each link. Firmware then writes the two frequency registers to set the frequency for each link. Once all devices have been configured, firmware initiates an LDTSTOP# disconnect or RESET# of the affected chain to cause the new frequency to take effect.
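The frequency selection step can be summarized as choosing the highest clock that both ends of a link report and that the board can tolerate. The sketch below assumes the supported frequencies have already been read from each device and are passed in as plain sets of MHz values. The function name, the example frequency lists, and the single `board_limit_mhz` parameter standing in for layout and power constraints are all illustrative assumptions.

```python
COLD_RESET_CLOCK_MHZ = 200   # every link powers up at this clock


def choose_link_frequency(dev_a_mhz, dev_b_mhz, board_limit_mhz):
    """Pick the highest clock supported by both devices and allowed by the board."""
    common = set(dev_a_mhz) & set(dev_b_mhz)
    usable = [freq for freq in common if freq <= board_limit_mhz]
    # Fall back to the power-up clock if nothing else qualifies.
    return max(usable, default=COLD_RESET_CLOCK_MHZ)


if __name__ == "__main__":
    host_bridge = {200, 400, 600, 800}   # made-up capability lists
    tunnel = {200, 400, 600}
    print(choose_link_frequency(host_bridge, tunnel, board_limit_mhz=800))
    print(choose_link_frequency(host_bridge, tunnel, board_limit_mhz=500))
```

Firmware would then write the chosen value to the frequency registers at both ends of the link before triggering the LDTSTOP# disconnect or RESET#.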
HyperTransport technology has a daisy-chain topology, giving the opportunity to connect multiple HyperTransport input/output bridges to a single channel. HyperTransport technology is designed to support up to 32 devices per channel and can mix and match components with different link widths and speeds. This capability makes it possible to create HyperTransport technology devices that are building blocks capable of spanning a range of platforms and market segments. For example, a low-cost entry in a mainstream PC product line might be designed with an AMD Duron™ processor. As shown in Figure, this PC design could, with very little redesign work, be upgraded to a high-end workstation by substituting high-end AMD Athlon™ processors and bridges with HyperTransport technology to expand the platform's I/O capabilities. Figure 10 also illustrates the concept of tunnels, in which multiple HyperTransport tunnels can be daisy-chained onto a single I/O link. A tunnel can be viewed as a basic building block for complex system designs.
A number of industry partners are developing HyperTransport switches, allowing engineers to have a great deal of flexibility in their system designs. In this type of configuration, a HyperTransport I/O switch handles multiple HyperTransport I/O data streams and manages the interconnection between the attached HyperTransport devices. For example, a four-port HyperTransport switch could aggregate data from multiple downstream ports into a single high-speed uplink, or it could route port-to-port connections. A switched environment allows multiple high-speed data paths to be linked while simultaneously supporting slower speed buses.
HYPER TRANSPORT TECHNOLOGY CONSORTIUM
The Consortium is a non-profit organization whose membership is open to any commercial or educational organization. It manages the HyperTransport Specification and promotes the technology to the industry at large. Promoter and Contributor members are eligible for membership in Technical and Marketing Task Force groups that manage the specification and direct the marketing outreach programs.
HyperTransport technology is a new high-speed, high-performance, point-to-point link for integrated circuits. It provides a universal connection designed to reduce the number of buses within the system, provide a high-performance link for embedded applications, and enable highly scalable multiprocessing systems. It is designed to enable the chips inside PCs and networking and communications devices to communicate with each other up to 48 times faster than with existing technologies. HyperTransport technology provides an extremely fast connection that complements externally visible bus standards like PCI, as well as emerging technologies like InfiniBand and Gigabit Ethernet. HyperTransport technology is truly the universal solution for in-the-box connectivity. The future will doubtless see further advancement of HyperTransport technology, which promises to bring a revolution in bus architecture.