Latest coherent ASICs set the bar for the optical industry

Feature: Beyond 100G - Part 3

Alcatel-Lucent has detailed its next-generation coherent ASIC, which supports multiple modulation schemes and allows signals to scale to 400 Gigabit-per-second (Gbps).

The announcement follows Ciena's WaveLogic 3 coherent chipset, which also trades capacity against reach by changing the modulation scheme.

"They [Ciena and Alcatel-Lucent] have set the bar for the rest of the industry," says Ron Kline, principal analyst for Ovum’s network infrastructure group.

 

 "We will employ [the PSE] for all new solutions on 100 Gigabit"

Kevin Drury, Alcatel-Lucent

 

 

 

Photonic service engine

Dubbed the photonic service engine (PSE), Alcatel-Lucent's latest ASIC will be used in 100Gbps line cards that will come to market in the second half of 2012.

The PSE comprises coherent transmitter and receiver digital signal processors (DSPs) as well as soft-decision forward error correction (SD-FEC). The transmit DSP generates the various modulation schemes and can perform waveform shaping to improve spectral efficiency. The coherent receiver DSP compensates for fibre distortions and recovers the signal.

The PSE follows Alcatel-Lucent's extended reach (XR) line card announced in December 2011 that extends its 100Gbps reach from 1,500 to 2,000km. "This [PSE] will be the chipset we will employ for all new solutions on 100 Gigabit," says Kevin Drury, director of optical marketing at Alcatel-Lucent. The PSE will extend 100Gbps reach to over 3,000km.

Ciena's WaveLogic 3 is a two-device chipset. Alcatel-Lucent has crammed the functionality onto a single device. But while the device is referred to as the 400 Gigabit PSE, two PSE ASICs are needed to implement a 400Gbps signal.  

 

"They [Ciena and Alcatel-Lucent] have set the bar for the rest of the industry"

Ron Kline, Ovum

 

"There are customers that are curious and interested in trialling 400Gbps but we see equal, if not higher, importance in pushing 100Gbps limits," says Manish Gulyani, vice president, product marketing for Alcatel-Lucent's networks group.

In particular, the equipment maker has improved 100Gbps system density with a card that requires two slots instead of three, and extends reach by 1.5x using the PSE.

 

Performance

Alcatel-Lucent makes several claims about the performance enhancements using the PSE: 

  • Reach: The reach is extended by 1.5x. 
  • Line card density: At 100Gbps the improvement is 1.5x. The current 100Gbps muxponder (10x10Gbps client input) and transponder (100Gbps client) line cards occupy three slots, whereas the PSE designs will occupy only two. Density will be improved 4x by adopting a 400Gbps muxponder that occupies three slots.
  • Power consumption: By going to a more advanced CMOS process and by enhancing the design of the chip architecture, the PSE consumes a third less power per Gigabit of transport: from 650mW/Gbps to 425mW/Gbps. Alcatel-Lucent is not saying what CMOS process technology is used for the PSE. The company's current 100Gbps silicon uses a 65nm process and analysts believe the PSE uses a 40nm process. 
  • System capacity: The channel width occupied by the signal can be reduced by a quarter: a 100Gbps wavelength in a 50GHz channel can be compressed to occupy 37.5GHz. This would improve overall 100Gbps system capacity from 8.8 Terabit-per-second (Tbps) to 11.7Tbps. Moving from 88 100Gbps ports to 44 400Gbps interfaces doubles system capacity to 17.6Tbps. Using waveform shaping, this is improved by a further third, to greater than 23Tbps.
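
The capacity figures above can be cross-checked with a short sketch. The 4,400GHz usable band is an assumption derived from the 88-channel, 50GHz-spaced system the article describes:

```python
# Sketch of the capacity arithmetic implied above. Rates and channel widths
# are from the article; the 4,400GHz usable band is an assumption derived
# from 88 channels at 50GHz spacing.

BAND_GHZ = 88 * 50  # assumed usable spectrum: 88 channels x 50GHz

def system_capacity_tbps(channel_ghz, rate_gbps):
    """Total capacity when the band is filled with equal channels."""
    channels = int(BAND_GHZ // channel_ghz)
    return channels * rate_gbps / 1000.0

print(system_capacity_tbps(50.0, 100))   # 88 x 100G  -> 8.8 Tbps
print(system_capacity_tbps(37.5, 100))   # 117 x 100G -> 11.7 Tbps
print(system_capacity_tbps(100.0, 400))  # 44 x 400G  -> 17.6 Tbps
print(system_capacity_tbps(75.0, 400))   # 58 x 400G  -> 23.2 Tbps (the ">23Tbps" with shaping)
```

The same function reproduces all four figures quoted, which suggests the article's "greater than 23Tbps" is simply a full fill of 75GHz-wide 400Gbps channels.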

"We are not saying we are breaking the 50GHz channel spacing today and going to a flexible grid, super-channel-type construct," says Drury. "But this chip is capable of doing just that." Alcatel-Lucent will at least double network capacity when its system adopts 44 wavelengths, each at 400Gbps. 

 

400 Gigabit

To implement a 400Gbps signal, a dual-carrier, dual-polarisation 16-QAM coherent wavelength is used that occupies 100GHz (two 50GHz channels). Alcatel-Lucent says that should it commercialise 400Gbps using waveform shaping, the channel spacing would reduce to 75GHz. But this more efficient grid spacing only works alongside a flexible-grid, colourless, directionless and contentionless (CDC) ROADM architecture.

 

A 400Gbps PSE card showing four 100 Gigabit Ethernet client signals going out as a 400Gbps wavelength. The three-slot card comprises three daughter boards. Source: Alcatel-Lucent.

 

Alcatel-Lucent is not ready to disclose the reach performance it can achieve with the PSE using the various modulation schemes. But it does say the PSE supports dual-polarisation binary phase-shift keying (DP-BPSK) for the longest-reach spans, as well as quadrature phase-shift keying (DP-QPSK) and 16-QAM (quadrature amplitude modulation).

"[This ability] to go distances or to sacrifice reach to increase bandwidth, to go from 400km metro to trans-Pacific by tuning software, that is a big advantage," says Ovum's Kline. "You don't then need as many line cards and that reduces inventory."

 

Market status

Alcatel-Lucent says that it has 55 customers that have deployed over 1,450 100Gbps transponders.

A software release later this year for Alcatel-Lucent's 1830 Photonic Service Switch will enable the platform to support 100Gbps PSE cards.

A 400Gbps card will also be available this year for operators to trial. 


Ciena: Changing bandwidth on the fly

Ciena has announced its latest coherent chipset that will be the foundation for its future optical transmission offerings. The chipset, dubbed WaveLogic 3, will extend the performance of its 100 Gigabit links while introducing transmission flexibility that will trade capacity with reach.

Feature: Beyond 100 Gigabit - Part 1


"We are going to be deployed, [with WaveLogic 3] running live traffic in many customers’ networks by the end of the year"

Michael Adams, Ciena

 

 

 

"This is changing bandwidth modulation on the fly," says Ron Kline, principal analyst, network infrastructure group at market research firm, Ovum. “The capability will allow users to dynamically optimise wavelengths to match application performance requirements.”

WaveLogic 3 is Ciena's third-generation coherent chipset that introduces several firsts for the company. 

  • The chipset supports single-carrier 100 Gigabit-per-second (Gbps) transmission in a 50GHz channel.
  • The chipset includes a transmit digital signal processor (DSP) that can adapt the modulation scheme as well as shape the pulses to increase spectral efficiency. It is the first coherent transmitter DSP announced in the industry.
  • WaveLogic 3's second chip, the coherent receiver DSP, also includes soft-decision forward error correction (SD-FEC). SD-FEC is important for high-capacity metro and regional, not just long-haul and trans-Pacific routes, says Ciena. 

The two-ASIC chipset is implemented using a 32nm CMOS process. According to Ciena, the receiver DSP chip, which compensates for channel impairments, measures 18 mm sq. and is capable of 75 Tera-operations a second.

Ciena says the chipset supports three modulation formats: dual-polarisation binary phase-shift keying (DP-BPSK), quadrature phase-shift keying (DP-QPSK) and 16-QAM (quadrature amplitude modulation). Using a single carrier, these equate to 50Gbps, 100Gbps and 200Gbps data rates. Going to 16-QAM may increase the data rate to 200Gbps but it comes at a cost: reduced tolerance to noise and hence shorter reach.
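
The three line rates follow directly from the bits carried per symbol. A minimal sketch, assuming a common symbol rate of about 25 Gbaud (payload rate, FEC overhead ignored; the article does not state the baud rate):

```python
# How the 50/100/200Gbps rates fall out of the modulation format.
# The 25 Gbaud symbol rate is an assumption (payload rate, ignoring FEC
# overhead); the article gives only the resulting line rates.

BAUD_GBD = 25       # assumed symbol rate of the single carrier
POLARISATIONS = 2   # dual polarisation doubles the bits per symbol

FORMATS = {"DP-BPSK": 1, "DP-QPSK": 2, "DP-16QAM": 4}  # bits/symbol per polarisation

for name, bits in FORMATS.items():
    rate = BAUD_GBD * bits * POLARISATIONS
    print(f"{name}: {rate} Gbps")  # 50, 100, 200
```

Because only the constellation changes, the optical bandwidth occupied stays the same, which is why the format can be switched in software without touching the 50GHz channel plan.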

"This software programmability is critical for today's dynamic, cloud-centric networks," says Michael Adams, Ciena’s vice president of product & technology marketing.

WaveLogic 3 has also been designed to scale to 400Gbps. "This is the first programmable coherent technology scalable to 400 Gig," says Adams. "For 400 Gig, we would be using a dual-carrier, dual-polarisation 16-QAM that would use multiple [WaveLogic 3] chipsets."

 

Performance

Ciena stresses that this is a technology not a product announcement. But it is willing to detail that in a terrestrial network, a single carrier 100Gbps link using WaveLogic 3 can achieve a reach of 2,500+ km. "These refer to a full-fill [wavelengths in the C-Band] and average fibre," says Adams. "This is not a hero test with one wavelength and special [low-loss] fibre.”

 

Metro to trans-Pacific: The reaches achieved over terrestrial and submarine links using Ciena's WaveLogic 3. SC stands for single carrier. Source: Ciena/Gazettabyte

When the modulation is changed to BPSK, the reach is effectively doubled. And Ciena expects a 9,000-10,000km reach on submarine links.

The same single-carrier 50GHz channel reverting to 16-QAM can transmit a 200Gbps signal over distances of 750-1,000km. "A modulation change [to 16-QAM] and adding a second 100 Gigabit Ethernet transceiver and immediately you get an economic improvement," says Adams.

For 400Gbps, two carriers, each 16-QAM, are needed and the distances achieved are 'metro regional', says Ciena.

The transmit DSP can also implement spectral shaping. According to Ciena, shaping the transmitted signals can yield a 20-30% bandwidth improvement (capacity increase). However, that feature will only be fully exploited once networks deploy flexible-grid ROADMs.

At OFC/NFOEC, Ciena will show a prototype card demonstrating the modulation switching from BPSK to QPSK to 16-QAM. "We are going to be deployed, running live traffic in many customers' networks by the end of the year," says Adams.

 

Analysis

Sterling Perrin, senior analyst, Heavy Reading

Heavy Reading believes Ciena's WaveLogic 3 is an impressive development, compared to its current WaveLogic 2 and to other available coherent chipsets. But Perrin thinks the most significant WaveLogic 3 development is Ciena’s single-carrier 100Gbps debut.

Until now, Ciena has used two carriers within a 50GHz channel, each carrying 50Gbps of data.

"The dual carrier approach gave Ciena a first-to-market advantage at 100Gbps, but we have seen the vendor lose ground as Alcatel-Lucent rolled out its single carrier 100Gbps system," says Perrin in a Heavy Reading research note. "We believe that Alcatel-Lucent was the market leader in 100Gbps transport in 2011." 

Other suppliers, including Cisco Systems and Huawei, have also announced single-carrier 100Gbps, and more single-wavelength 100Gbps announcements will come throughout 2012.

Heavy Reading believes the ability to scale to 400Gbps is important, as is the use of multiple carriers (or super-channels). But 400 Gigabit and 1 Terabit transport are still years away and 100Gbps transport will be the core networking technology for a long time yet.

"The vendors with the best 100G systems will be best-positioned to capture share over the next five years, we believe," says Perrin.

 

Ron Kline, principal analyst for Ovum's network infrastructure group.

For Ron Kline, Ciena's announcement was less of a surprise: Ciena showcased WaveLogic 3 to analysts late last year. The challenge with such a technology announcement is understanding the capabilities and how the chipset will be rolled out and used within a product, he says.

"Ciena's WaveLogic 3 is the basis for 400 Gig," says Kline. "They are not out there saying 'we have 400 Gig'." Instead, what the company is stressing is the degree of added capacity, intelligence and flexibility that WaveLogic 3 will deliver. That said, Ciena does have trials planned for 400 Gig this year, he says.

What is noteworthy, says Ovum, is that 400Gbps is within Ciena's grasp whereas there are still some vendors yet to record revenues for 100Gbps. 

"Product differentiation has changed - it used to be about coherent," says Kline. "But now that nearly all vendors have coherent, differentiation is going to be determined by who has the best coherent technology."


Photonic integration specialist OneChip tackles PON

Briefing: PON

Part 1: Monolithic integrated transceivers

OneChip Photonics is moving to volume production of PON transceivers based on its photonic integrated circuit (PIC) design. The company believes that its transceivers can achieve a 20% price advantage.


"We will be able to sell [our integrated PON transceivers] at a 20% price differential when we reach high volumes"

Andy Weirich, OneChip Photonics

 

OneChip Photonics has already provided transceiver engineering samples to prospective customers and will start the qualification process with some customers this month. It expects to start delivering limited quantities of its optical transceivers in the next quarter.

The company's primary products are Ethernet PON (EPON) and Gigabit PON (GPON) transceivers. But it is also considering selling a bi-directional optical sub-assembly (BOSA), a component of its transceivers, to those system providers that want to attach the BOSA directly to the printed circuit board (PCB) in their optical network units (ONUs).

"The BOSA is the sub-assembly that contains all the optics, usually the TIA [trans-impedance amplifier] and sometimes the laser driver," says Andy Weirich, OneChip Photonics' vice president of product line management. 

The company will roll out its Ethernet PON (EPON) ONU transceivers in the second quarter of 2012, followed by GPON ONU transceivers in the third quarter.    

 

PON Technologies

EPON operates at 1.25 Gigabit-per-second (Gbps) upstream and downstream. OneChip had planned to develop a 2.5Gbps EPON variant which, says OneChip, has been standardised by the China Communications Standards Association (CCSA). But the company has abandoned the design since volumes have been extremely small and there have been no deployments in China.

GPON is a 2.5Gbps downstream/1.25Gbps upstream technology. The main differences between GPON and EPON transceiver optical components are the requirements of the ONU's receiver optics and circuitry, and the laser type, says Weirich. GPON's Class B+ specification, used for nearly all GPON deployments, calls for a 28-29dB optical link budget, a more demanding requirement than EPON's. GPON also calls for a distributed feedback (DFB) laser, whereas an EPON ONU may use either a Fabry-Perot laser or a DFB laser.

OneChip uses the same DFB for GPON and EPON ONUs. Where the PIC designs differ is the receiver assembly where GPON requires amplification. This, says Weirich, is achieved using either an avalanche photodiode (APD) or a semiconductor optical amplifier (SOA). 

OneChip will start with an APD but will progress to an SOA. Once it integrates an SOA as part of the PIC, a simpler, cheaper photo-detector can be used.

Weirich admits that it has taken OneChip longer than it expected to develop its monolithically-integrated design. 

Part of the challenge has been the issue of packaging the PIC. "Because of our integrated approach and non-alignment-requiring assembly, we have had to solve a few more technology problems," he says. "Our suppliers have had a challenge with some of those issues, and it has taken a couple of iterations to solve."

OneChip says that the good news is that the price erosion of EPON transceivers has slowed down in the last two years. So while Weirich admits the market is more competitive now, what is promising is that volumes have continued to grow. 

"There is no sign of saturation happening either in the EPON or GPON markets," he says. And OneChip believes it can compete on price. "What we are saying is that we will be able to sell [our monolithically integrated PON transceivers] at a 20% price differential when we reach high volumes." That is because the monolithic design is simpler and the optical components that make up the design are cheaper, says the company.

 

10G EPON and XGPON

OneChip believes the end of 2012 will be when 10G EPON volumes start to ramp. "10G EPON is a significantly larger market than 10G GPON [XGPON]," says Weirich, pointing out that some of the largest operators such as China Telecom have backed 10G EPON.

With 10G EPON there are two flavours: the asymmetric (10Gbps downstream and 1.25Gbps upstream) and the symmetric (10Gbps bidirectional) versions. 

For an asymmetric 10Gbps ONU transceiver, the laser does not need to change but the optics and electronics at the receiver do, because of the 10Gbps receive signal and because operators want 28-29dB optical link budgets so that 10G EPON can run on the same fibre plant as EPON. "This is an order of magnitude more difficult from a sensitivity perspective than for EPON," says Weirich. 

There is demand for the 10G symmetric EPON but it is much lower than the asymmetric version primarily due to cost. "The ONU transceiver with its 10 Gbps laser and photo-detector is quite a bit more costly," says Weirich, complicating the PON's business case.

OneChip says it has a 10G EPON in its product roadmap, but it has not yet made any announcements or made any demonstrations to customers.

 

Challenges

OneChip is not aware of any other company developing a monolithically integrated design for PON transceivers, in part because of the challenge involved: the design must be made cheaply enough to compete with the traditional TO-can approach. The key is to develop low-cost integration techniques and processes right at the start of the PIC design, says Weirich.

The company says that it is also exploring using its PIC technology to address data centre connectivity.

 

 

OneChip Photonics at a glance

OneChip employs some 80 staff and is headquartered in Ottawa, Canada, where it has a 4,000 sq. ft. cleanroom. The start-up also has a regional office in Shenzhen, China which includes a test lab to serve regional customers. 

The company is primarily a transceiver supplier and its main target customers are the tier-one system vendors that supply OLT and ONU equipment. "When you think of the big three players in China, Huawei, ZTE and Fiberhome would be among those we are targeting," says Steve Bauer, vice president of marketing and communications, as well as players such as Alcatel-Lucent and Motorola. As mentioned, the company is also considering selling its BOSA design to ONU makers.

In May 2011 the company received $18M in its latest round of funding. "We are transitioning from product development to becoming operationally ready to manufacture in volume," says Bauer.

Fabrinet and Sanmina-SCI are two contract manufacturers that the company is using for transceiver testing and assembly while it has partnerships with several other fabs for supply of wafers, wafer fabrication and silicon optical benches.


Next-gen 100 Gigabit optics

Briefing: 100 Gigabit

Part 2: Interview 

Gazettabyte spoke to John D'Ambrosia about 100 Gigabit technology.


John D'Ambrosia, chair of the IEEE 100 Gig backplane and copper cabling task force

John D'Ambrosia laughs when he says he is the 'father of 100 Gig'.

He spent five years as chair of the IEEE 802.3ba group that created the 40 and 100 Gigabit Ethernet (GbE) standards. Now he is the chair of the IEEE task force looking at 100 Gig backplane and copper cabling. D'Ambrosia is also chair of the Ethernet Alliance and chief Ethernet evangelist in the CTO office of Dell's Force10 Networks.

 

“People are also starting to talk about moving data operations around the network based on where electricity is cheapest”


"Part of the reason why 100 Gig backplane technology is important is that I don't know anybody that wants a single 100 Gig port off whatever their card is," says D'Ambrosia. "Whether it is a router, line card, whatever you want to call it, they want multiple 100 Gig [interfaces]: 2, 4, 8 - as many as they can."

Earlier this year, there was a call for interest for next-generation 100 Gig optical interfaces, with the goal of reducing the cost and power consumption of 100 Gig interfaces while increasing their port density. "This [next-generation 100 Gig optical interfaces] is going to become very interesting in relation to what is going on in the industry,” he said.

 

Next-gen 100 Gig

The 10x10 MSA is an industry initiative that is an alternative 100 Gig interface to the IEEE 100 Gigabit Ethernet standards. Members of the 10x10 MSA include Google, Brocade, JDSU, NeoPhotonics (Santur), Enablence, CyOptics, AFOP, MRV, Oplink and Hitachi Cable America.

"Unfortunately, that [10x10 MSA] looks like it could cause potential interop issues," says D'Ambrosia. That is because the 10x10 MSA has a 10-channel, 10 Gigabit-per-second (Gbps) optical interface while the IEEE 100GbE standards use a 4x25Gbps optical interface.

The 10x10 interface has a 2km reach and the MSA has since added a 10km variant as well as 4x10x10Gbps and 8x10x10Gbps versions over 40km.

The advent of the 10x10 MSA has led to an industry discussion about shorter-reach IEEE interfaces. "Do we need something below 10km?” says D’Ambrosia. 

Reach is always a contentious issue, he says. When the IEEE 802.3ba was choosing the 10km 100GBASE-LR4, there was much debate as to whether it should be 3 or 4km. "I won’t be surprised if you have people looking to see what they can do with the current 100GBASE-LR4 spec: There are things you can do to reduce the power and the cost," he says.

One obvious development to reduce size, cost and power is to remove the gearbox chip. The gearbox IC translates between 10x10Gbps and the 4x25Gbps channels. The chip consumes several watts each way (transmit to receive and vice versa). By adopting a 4x25Gbps input electrical interface, the gearbox chip is no longer needed - the electrical and optical channels will then be matched in speed and channel count. The result is that the 100GbE designs can be put into the upcoming, smaller CFP2 and even smaller CFP4 form factors.
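
The gearbox's role can be reduced to simple lane arithmetic. The helper below is illustrative only (not a real API); the lane configurations are those described in the text:

```python
# Illustrative sketch (not a real API): the gearbox re-multiplexes between
# mismatched electrical and optical lane configurations that carry the same
# aggregate rate. Lane counts and rates are from the text.

def aggregate_gbps(lanes, rate_gbps):
    """Total rate of a (lane count, per-lane Gbps) configuration."""
    return lanes * rate_gbps

def gearbox_needed(electrical, optical):
    """True when the electrical and optical (lanes, Gbps) configs differ;
    the gearbox then translates between them, consuming several watts
    in each direction."""
    return electrical != optical

# Original CFP design: 10 x 10G host interface feeding 4 x 25G optics.
assert aggregate_gbps(10, 10) == aggregate_gbps(4, 25) == 100
print(gearbox_needed((10, 10), (4, 25)))  # True: gearbox required
# CFP2/CFP4-style design: 4 x 25G electrical matches 4 x 25G optical.
print(gearbox_needed((4, 25), (4, 25)))   # False: gearbox eliminated
```

The point the sketch makes is that nothing about the optics changes; matching the electrical interface to the optical lanes is what removes the chip and its power draw.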

As for other next-gen 100Gbps developments, these will likely include a 4x25Gbps multi-mode fibre specification and a 100 Gig, 2km serial interface, similar to the 40GBASE-FR.

The industry focus, he says, is to reduce the cost, power and size of 100Gbps interfaces rather than develop multiple 100 Gig link interfaces or expand the reach beyond 40km. "We are going to see new systems introduced over the next few years not based on 10 Gig but designed for 25 Gig," says D'Ambrosia. The ASIC and chip designers are also keen to adopt 25Gbps signalling because they need to increase input-output (I/O) yet have only so many pins on a chip, he says.

D’Ambrosia is also part of an Ethernet bandwidth assessment ad-hoc committee that is part of the IEEE 802.3 work. The group is working with the industry to quantify bandwidth demand. “What you see is a lot of end users talking about needing terabit and a lot of suppliers talking about 400 Gig,” he says. Ultimately, what will determine the next step is what technologies are going to be available and at what cost.

 

Backplane I/O and switching

Many of the systems D'Ambrosia is seeing use a single 100Gbps port per card. "A single port is a cool thing but is not that useful,” he says. “Frankly, four ports is where things start to become interesting.”

This is where 25Gbps electrical interfaces come into play. "It is not just 25 Gig for chip-to-chip, it is 25 Gig chip-to-module and 25 Gig to the backplane."

Moreover, modules, backplane speeds and switching capacity are all interrelated when designing systems. When designing a 10 Terabit switch, for example, the goal is to reduce the number of traces on the board that go through the backplane to the switch fabric and other line cards.

Using 10Gbps electrical signals, between 1,200 and 2,000 signals are needed, depending on the architecture, says D'Ambrosia. With 25Gbps, the count reduces to 500-750. "The electrical signal has an impact on the switch capacity," he says.
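
Those signal counts can be roughly reconstructed. The 1.2x-2.0x overhead range standing in for architecture-dependent factors (duplex paths, fabric speed-up) is an assumption, not a figure from the interview:

```python
# Rough reconstruction of the backplane signal counts for a 10 Terabit
# switch. The 1.2x-2.0x overhead range is an assumption standing in for
# architecture-dependent factors (duplex paths, fabric speed-up).

SWITCH_CAPACITY_GBPS = 10_000  # 10 Terabit

def signal_count(lane_gbps, overhead):
    """Backplane signals needed for a given per-lane rate and overhead."""
    return round(SWITCH_CAPACITY_GBPS / lane_gbps * overhead)

for lane in (10, 25):
    low, high = signal_count(lane, 1.2), signal_count(lane, 2.0)
    print(f"{lane}G lanes: {low}-{high} signals")
# 10G lanes: 1200-2000 signals (matches the 1,200-2,000 quoted)
# 25G lanes: 480-800 signals  (close to the 500-750 quoted)
```

Whatever the exact overhead, the count scales inversely with the lane rate, which is the argument for 25Gbps electrical signalling.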

 

100 Gig in the data centre

D’Ambrosia stresses that care is needed when discussing data centres as the internet data centres (IDC) of a Google or a Facebook differ greatly from those of enterprises. “In the case of IDCs, those people were saying they needed 100 Gig back in 2006,” he says.

Such mega data centres use tens of thousands of servers connected across a flat switching architecture, unlike traditional data centres that use three layers of aggregated switching. According to D'Ambrosia, such flat architectures can justify 100Gbps interfaces even when each server has only a 1 Gigabit Ethernet interface. And now servers are transitioning to 10GbE interfaces.

“You are going to have to worry about the architecture, you are going to have to worry about the style of data centre and also what the server applications are,” says D'Ambrosia. “People are also starting to talk about moving data operations around the network based on where electricity is cheapest.” Such an approach will require a truly wide, flat architecture, he says.

D'Ambrosia cites the Amsterdam Internet Exchange, which announced in May its first customer using a 100 Gig service. "We are starting to see this happen," he says.

One lesson D'Ambrosia has learnt is that there is no clear relationship between what comes in and out of the cloud and what happens within the cloud. Data centres themselves are one such example.

 

100 Gig direct detection

In recent months, ADVA Optical Networking and MultiPhy have announced lower-power, 100Gbps direct-detection interfaces with 200km to 800km reach that are cheaper than coherent transmission. Such interfaces have a role in the network and are of varying interest to telco operators. But these are vendor-specific solutions.

D'Ambrosia stresses the importance of standards such as the IEEE's and the work of the Optical Internetworking Forum (OIF), which has adopted coherent. "I still see customers that want a standards-based solution," says D'Ambrosia, who adds that while the OIF work is not a standard, it is an interoperability agreement. "It allows everyone to develop the same thing," he says.

There are also other considerations regarding 100 Gig direct detection besides cost, power and a pluggable form factor: vendors and operators want to know from how many suppliers they will be able to source it, he says.

D'Ambrosia says that new systems being developed now will likely be deployed in 2013. Vendors must weigh the attractiveness of any alternative technology against where industry-backed technologies, such as coherent and the IEEE standards, will be by then.

The industry will adopt a variety of 100Gbps solutions, he says, with particular decisions based on a customer's cost model, its long-term strategy and its network.

 



100 Gigabit: An operator view

Gazettabyte spoke with BT, Level 3 Communications and Verizon about their 100 Gigabit optical transmission plans and the challenges they see regarding the technology.

 

Briefing: 100 Gigabit

Part 1: Operators 

Operators will use 100 Gigabit-per-second (Gbps) coherent technology for their next-generation core networks. For metro, operators favour coherent but have differing views regarding alternative 100Gbps direct-detection schemes. All the operators agree that 100Gbps interfaces - line-side and client-side - must become cheaper before the technology is more widely deployed.

 

"It is clear that you absolutely need 100 Gig in large parts of the network"

Steve Gringeri, Verizon

 

 

 

100 Gigabit status

Verizon is already deploying 100Gbps wavelengths in its European and US networks, and will complete its US nationwide 100Gbps backbone in the next two years.

"We are at the stage of building a new-generation network because our current network is quite full," says Steve Gringeri, a principal member of the technical staff at Verizon Business. 

The operator first deployed 100Gbps coherent technology in late 2009, linking Paris and Frankfurt. Verizon's focus is on 100Gbps, having deployed a limited amount of 40Gbps technology. "We can also support 40 Gig coherent where it makes sense, based on traffic demands," says Gringeri. 

Level 3 Communications and BT, meanwhile, have yet to deploy 100Gbps technology.

"We have not [made any public statements regarding 100 Gig]," says Monisha Merchant, Level 3’s senior director of product management. "We have had trials but nothing formal for our own development." Level 3 started deploying 40Gbps technology in March 2009.

BT expects to deploy new high-speed line rates before the year end. "The first place we are actively pursuing the deployment of initially 40G, but rapidly moving on to 100G, is in the core,” says Steve Hornung, director, transport, timing and synch at BT.

Operators are looking to deploy 100Gbps to meet growing traffic demands.  

"If I look at cloud applications, video distribution applications and what we are doing for wireless (Long Term Evolution) - the sum of all the traffic - that is what is putting the strain on the network," says Gringeri.

Verizon is also transitioning its legacy networks onto its core IP-MPLS backbone, requiring the operator to grow its base infrastructure significantly. "When we look at demands there, it is clear that you absolutely need 100 Gig in large parts of the network," says Gringeri.

Level 3 points out that its network between any two cities has been running at much greater capacity than 100Gbps, so the demand has been there for years; the issue is the economics of the technology. "Right now, going to 100Gbps is significantly a higher cost than just deploying 10x 10Gbps," says Level 3's Merchant.

BT's core network comprises 106 nodes: 20 in a fully-meshed inner core, surrounded by an outer 86-node core. The core carries the bulk of BT's IP, business and voice traffic.

"We are taking specific steps and have business cases developed to deploy 40G and 100G technology: alternative line cards into the same rack," says Hornung. 

 

Coherent and direct detection

Coherent has become the default optical transmission technology for operators' next-generation core networks.  

BT says it is a 'no-brainer' that 400Gbps and 1 Terabit-per-second light paths will eventually be deployed in the network to accommodate growing traffic. "Rather than keep all your options open, we need to make the assumption that technology will essentially be coherent going forward because it will be the bandwidth that drives it," says Hornung.

Beyond BT's 106-node core is a backhaul network that links 1,000 points-of-presence (PoPs). It is for this part of the network that BT will consider 40Gbps and perhaps 100Gbps direct-detection technology. "If it [such technology] became commercially available, we would look at the price, the demand and use it, or not, as makes sense," says Hornung. "I would not exclude at this stage looking at any technology that becomes available." Such direct-detection 100Gbps solutions are already being promoted by ADVA Optical Networking and MultiPhy.

However, Verizon believes coherent will also be needed for the metro. "If I look at my metro systems, you have even lower quality amplifiers, and generally worse signal-to-noise," says Gringeri. “Based on the performance required, I have no idea how you are going to implement a solution that isn't coherent."

Even for shorter-reach metro systems of 200 or 300km, Verizon believes coherent will be the implementation, including in expanding existing deployments that carry 10Gbps light paths and use dispersion-compensated fibre.

Level 3 says it is not wedded to a technology but rather to a cost point. As a result, it will assess a technology if it believes it addresses the operator's needs and has a cost-performance advantage.

 

100 Gig deployment stages

The cost of 100Gbps technology remains a key challenge impeding wider deployment. This is not surprising since 100Gbps technology is still immature and systems shipping are first-generation designs. 

Operators are willing to pay a premium to deploy 100Gbps light paths at network pinch-points, as it is cheaper than lighting a new fibre.

Metro deployments of new technology such as 100Gbps generally occur once the long-haul network has been upgraded. The technology is by then more mature and better suited to the cost-conscious metro.

Applications that will drive metro 100Gbps include linking data centres and enterprises. But Level 3 expects it will be another five years before enterprises move from requesting 10 Gigabit services to 100 Gigabit ones to meet their telecom needs.

Verizon highlights two 100Gbps priorities: high-performance dense WDM systems and the client-side 'grey' (non-WDM) optics used to connect equipment over distances ranging from as short as 100m on ribbon cable to 2km or 10km over single-mode fibre.

 

"I would not exclude at this stage looking at any technology that becomes available"

Steve Hornung, BT

 

 

 

"Grey optics are very costly, especially if I’m going to stitch the network and have routers and other client devices and potential long-haul and metro networks, all of these interconnect optics come into play," says Gringeri.

Verizon is a strong proponent of a new 100Gbps serial interface for 2km and 10km reaches. At present the options are the standard 100 Gigabit interface and the 10x10 MSA. However, Gringeri says it will be 2-3 years before such a serial interface becomes available. "Getting the price-performance on the grey optics is my number one priority after the DWDM long haul optics," says Gringeri.

Once 100Gbps client-side interfaces do come down in price, operators' PoPs will be used to link other locations in the metro to carry the higher-capacity services, he says.

The final stage of the rollout of 100Gbps will be single point-to-point connections. This is where grey 100Gbps comes in, says Gringeri, based on 40 or 80km optical interfaces.

 

Source: Gazettabyte

 

 

Tackling costs 

Operators are confident regarding the vendors’ cost-reduction roadmaps. "We are talking to our clients about second, third, even fourth generation of coherent," says Gringeri. "There are ways of making extremely significant price reductions." 

Gringeri points to further photonic integration and reducing the sampling rate of the coherent receiver ASIC's analogue-to-digital converters. "With the DSP [ASIC], you can look to lower the sampling rate," says Gringeri. "A lot of the systems do 2x sampling and you don't need 2x sampling." 

The filtering used for dispersion compensation can also be simpler for shorter-reach spans. "The filter can be shorter - you don't need as many [digital filter] taps," says Gringeri. "There are a lot of optimisations and no one has made them yet."
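To put numbers on this, a standard back-of-envelope estimate puts the chromatic dispersion filter length at the dispersion-induced delay spread multiplied by the ADC sampling rate. The dispersion, wavelength and symbol-rate figures below are illustrative assumptions, not vendor data; the `samples_per_symbol` parameter also shows how a reduced sampling rate shortens the filter:

```python
C = 3e8  # speed of light, m/s

def cd_taps(dispersion_ps_nm_km, length_km, wavelength_nm, baud,
            samples_per_symbol=2.0):
    """Approximate FIR taps needed to compensate chromatic dispersion."""
    # Signal spectral width in nm: delta_lambda = lambda^2 / c * B
    delta_lambda_nm = (wavelength_nm * 1e-9) ** 2 / C * baud * 1e9
    # Dispersion-induced delay spread across the span, in seconds
    delay_s = dispersion_ps_nm_km * 1e-12 * length_km * delta_lambda_nm
    # The filter must span the delay spread at the sampling rate
    return delay_s * baud * samples_per_symbol

# Assumed figures: 17 ps/nm/km standard fibre, 1550nm, ~32 Gbaud 100G signal
long_haul = cd_taps(17, 3000, 1550, 32e9)  # hundreds of taps
metro = cd_taps(17, 300, 1550, 32e9)       # a tenth of that
```

The tap count scales linearly with span length, which is why a metro-tuned design can use a much shorter, lower-power filter than a 3,000km long-haul one.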

There is also the move to pluggable CFP modules for the line-side coherent optics and the CFP2 for client-side 100Gbps interfaces. At present the only line-side 100Gbps pluggable is based on direct detection.

"The CFP is a big package," says Gringeri. "That is not the grey optics package we want in the future, we need to go to a much smaller package long term." 

For the line side there is also the issue of the DSP's power consumption. "I think you can fit the optics in but I'm very concerned about the power consumption of the DSP - these DSPs are 50 to 80W in many current designs," says Gringeri.

One obvious solution is to move the DSP out of the module and onto the line card. "Even if they can extend the power number of the CFP, it needs to be 15 to 20W," says Gringeri. "There is an awful lot of work to get [from] where you are today to 15 to 20W."

* Monisha Merchant left Level 3 before the article was published.

 

Further Reading:

100 Gigabit: The coming metro opportunity - a position paper, click here

Click here for Part 2: Next-gen 100 Gig Optics


Boosting high-performance computing with optics

Briefing: Optical Interconnect

Part 2: High-performance computing

IBM has adopted optical interfaces for its latest POWER7-based high-end computer system. Gazettabyte spoke to IBM Fellow, Ed Seminaro, about high-performance computing and the need for optics to address bandwidth and latency requirements.


“At some point when you go a certain distance you have to go to an optical link” 

Ed Seminaro, IBM Fellow 

 

 

 

 

 

IBM has used parallel optics for its latest POWER7 computing systems, the Power 775. The optical interfaces are used to connect computing node drawers that make up the high-end computer. Each node comprises 32 POWER7 chips, with each chip hosting eight processor cores, each capable of running up to four separate programming tasks or threads.  

Using optical engines, each node – a specialised computing card – carries 224 VCSEL-based transmitters, each running at 120 Gigabit-per-second (12x10Gbps), and 224 matching 120Gbps receivers. The interfaces can interconnect up to 2,048 nodes – over half a million POWER7 cores – with a maximum network diameter of only three link hops.

IBM claims that with the development of the Power 775, it has demonstrated the superiority of optics over copper for high-end computing designs.

 

High-performance computing

Not so long ago supercomputers were designed using exotic custom technologies. Each company crafted its own RISC microprocessor that required specialised packaging, interconnect and cooling. Nowadays supercomputers are more likely to be made up of aggregated servers – computing nodes – tied together using a high-performance switching fabric. Software then makes the nodes appear to the user as a single computer.

But clever processor design is still required to meet new computing demands and steal a march on the competition, as are ever-faster links – interconnect bandwidth - to connect the nodes and satisfy their growing data transfer requirements.

High-performance computing (HPC) is another term used for state-of-the-art computing systems, and comes in many flavours and deployments, says Ed Seminaro, IBM Fellow, power systems development in the IBM Systems & Technology Group.

“All it means is that you have a compute-intensive workload – or a workload combining compute and I/O [input-output] intensive aspects," says Seminaro. "These occur in the scientific and technical computing world, and are increasingly being seen in business around large-scale analytics and so called ‘big data’ problem sets.”

Within the platform, the computer’s operating system runs on a processor or a group of processors connected using copper wire on a printed circuit board (PCB), typically a few inches apart, says Seminaro.

The processor hardware is commonly a two-socket server: two processor modules no more than 10 inches apart. The hardware can run a single copy of the operating system – known as an image - or many images.

Running one copy of the operating system, all the memory and all the processing resource are carefully managed, says Seminaro. Alternatively an image can be broken into hundreds of pieces with a copy of the operating system running on each. “That is what virtualisation means,” says Seminaro. The advent of virtualisation has had a significant impact on the design of data centres and is a key enabler of cloud computing.

“The biggest you can build one of these [compute nodes] is 32 sockets – 32 processor chips - which may be as much as 256 processor cores - close enough that you can run them as what we call a single piece of hardware,” says Seminaro. But this is the current extreme, he says, the industry standard is two or four-socket servers.

That part is well understood, adds Seminaro, the challenge is connecting many of these hardware pieces into a tightly-coupled integrated system. This is where system performance metrics of latency and bandwidth come to the fore and why optical interfaces have become a key technology for HPC.

 

Latency and bandwidth

Two data transfer technologies are commonly used for HPC: Ethernet LAN and Infiniband. The two networking technologies are also defined by two important performance parameters: latency and bandwidth.

Using an Ethernet LAN for connectivity, the latency is relatively high when transferring data between two pieces of hardware. Latency is the time it takes before requested data starts to arrive. Normally when a process running on hardware accesses data from its local memory the latency is below 100ns. In contrast, accessing data between nodes can take more than 100x longer or over 10 microseconds.

For Infiniband, the latency between nodes can be under 1 microsecond, still 10x worse than a local transfer but more than 10x better than Ethernet. “Inevitably there is a middle ground somewhere between 1 and 100 microsecond depending on factors such as the [design of the software] IP stack,” says Seminaro.

If the amount of data requested is minor, the transfer itself typically takes nanoseconds. If a large file is requested, then not only is latency important – the time before asked-for data starts arriving – but also the bandwidth dictating overall file transfer times.

To highlight the impact of latency and bandwidth on data transfers, Seminaro cites the example of a node requesting data using a 1 Gigabit Ethernet (GbE) interface, equating to a 100MByte-per-second (MBps) transfer rate. The first bit of requested data arrives after the link latency, but a further second is needed before the full 100MB file arrives.
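Seminaro's example reduces to a simple sum: total transfer time is the link latency plus the file size divided by bandwidth. A quick sketch using the article's illustrative figures (the Infiniband throughput and latency values are assumed round numbers):

```python
def transfer_time_s(file_bytes, bandwidth_bytes_per_s, latency_s):
    """Seconds until the last byte of a requested file arrives."""
    return latency_s + file_bytes / bandwidth_bytes_per_s

# 100MB file over 1GbE (~100 MByte/s effective), ~10 microsecond latency:
# the transfer is utterly bandwidth-bound, latency barely registers
t_1gbe = transfer_time_s(100e6, 100e6, 10e-6)   # ~1 second

# Same file over 4x QDR Infiniband (~4 GByte/s), ~1 microsecond latency
t_qdr = transfer_time_s(100e6, 4e9, 1e-6)       # ~25 milliseconds
```

For small transfers the latency term dominates; for large files the bandwidth term does, which is why both parameters matter for HPC interconnect.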

A state-of-the-art Ethernet interface is 10GbE, says Seminaro: “A 4x QDR [quad data rate] Infiniband link is four times faster again [4x10Gbps].” The cost of 4x QDR Infiniband interconnect is roughly the same as for 10GbE, so most HPC systems either use 1GbE, for lowest cost networking, or 4x QDR Infiniband, when interconnect performance is a more important consideration. Of the fastest 500 computing systems in the world, over 425 use either 1GbE or Infiniband, only 11 use 10GbE.  The remainder use custom or proprietary interconnects, says IBM.

The issue is that going any distance at these speeds using copper interfaces is problematic. “At some point when you go a certain distance you have to go to an optical link,” says Seminaro. “With Gigabit Ethernet there is copper and fibre connectivity; with 10GbE the standard is really fibre connectivity to get any reasonable distance.” 

Copper for 10GbE or QDR Infiniband can go 7m, and using active copper cable the reach can be extended to 15m. Beyond that it is optics.

 

“We have learned that we can do a very large-scale optical configuration cost effectively. We had our doubts about that initially”

Ed Seminaro

 

The need for optics

Copper’s 7m reach places an upper limit on the number of computing units – each with 32 processor chips – that can be reached. “To go beyond that, I’m going to have to go optical,” says Seminaro.

But reach is not the sole issue. The I/O bandwidth associated with each node is also a factor. “If you want an enormous amount of bandwidth out of each of these [node units], it starts to get physically difficult to externalise from each that many copper cables,” says Seminaro.

Many data centre managers would be overjoyed to finally get rid of copper, adds Seminaro, but unfortunately optical costs more. This has meant people have pushed to keep copper alive, especially for smaller computing clusters.

People accept how much bandwidth they can get between nodes using technologies such as QDR linking two-socket servers, and then design the software around such performance. “They get the best technology and then go the next level and do the best with that,” says Seminaro. “But people are always looking how they can increase the bandwidth dramatically coming out of the node and also how they can make the node more computationally powerful.” Not only that, if the nodes are more powerful, fewer are needed to do a given job, he says.

 

What IBM has done

IBM’s Power 775 computer system is a sixth-generation design in a line that started in 2002. The Power 775 is currently being previewed and will be generally available in the second half of 2011, says IBM.

At its core is the POWER7 processor, described by Seminaro as highly flexible: it can tackle problems ranging from commercial applications to high-performance computing, and can scale from a single processing node next to the desk to complete supercomputer configurations.

Applications the POWER7 is used for include large scale data analysis, automobile and aircraft design, weather prediction, and oil exploration, as well as multi-purpose computing systems for national research labs.

In the Power 775, as mentioned, each node has 32 chips comprising 256 cores, and each core can process four [programming] threads. “That is 1,024 threads – a lot of compute power,” says Seminaro, who stresses that the number of cores and the computing capability of each thread are important, as is the clock frequency at which they are run. These threads must access memory and are all tightly coupled.

“That is where it all starts: How much compute power can you cram in one of these units of electronics,” says Seminaro. The node design uses copper interconnect on a PCB and is placed into a water-cooled drawer to ensure a relatively low operating temperature, which improves power utilisation and system reliability.

“We have pulled all the stops out with this drawer,” says Seminaro. “It has the highest bandwidth available in a generally commercially available processor – we have several times the bandwidth of a typical computing platform at all levels of the interconnect hierarchy.”

To connect the computing nodes or drawers, IBM uses optical interfaces to achieve a low latency, high bandwidth interconnect design. Each node uses 224 optical transceivers, with each transceiver consisting of an array of 12 send and 12 receive 10Gbps lanes. This equates to a total bandwidth per 2U-high node of 26.88+26.88 Terabit-per-second.

“That is equivalent to 2,688 10Gig Ethernet connections [each way],” says Seminaro. “Because we have so many links coming out of the drawer it allows us to connect a lot of drawers directly to each other.”  

In a 128-drawer system, IBM has sufficient number of ports and interconnect bandwidth to link each drawer to every one of the other 127. Using the switching capacity within the drawer, the Power 775 can be further scaled to build systems of up to 2,048 node drawers, with up to 524,288 POWER7 cores.
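The scale figures quoted above can be verified with a few lines of arithmetic (all numbers taken from the article):

```python
# Per-node bandwidth each way: 224 transceivers x 12 lanes x 10 Gbps
transceivers_per_node = 224
lanes_per_transceiver = 12
lane_rate_gbps = 10
node_bw_tbps = transceivers_per_node * lanes_per_transceiver * lane_rate_gbps / 1000
# -> 26.88 Tbps each way, matching the figure quoted for the 2U drawer

# A 128-drawer all-to-all system needs a direct link to each of the other drawers
direct_peers = 128 - 1  # 127

# Maximum configuration: 2,048 drawers x 32 POWER7 chips x 8 cores per chip
max_cores = 2048 * 32 * 8  # 524,288 cores
```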

IBM admits one concern about using optics was cost. However, working with Avago Technologies, the supplier of the optical transceivers, it has been able to develop the optical-based systems cost-effectively (see 'Parallel Optics' section within OFC round-up story). “We have learned that we can do a very large-scale optical configuration cost effectively,” says Seminaro. “We had our doubts about that initially.”

IBM also had concerns about the power consumption of optics. “Copper is high-power but so is optics,” says Seminaro. “Again working with Avago we’ve been able to do this at reasonable power levels.” Even for very short 1m links the power consumption is reasonable, says IBM, and for longer reaches such as connecting widely-separated drawers in a large system, optical interconnect has a huge advantage, since the power required for an 80m link is the same as for a 1m link.

Reliability was also a concern given that optics is viewed as being less reliable than copper. “We have built a large amount of hardware now and we have achieved outstanding reliability,” says Seminaro.

IBM uses 10 out of the 12 lanes - two lanes are spare. If one lane should fail, one of the spare lanes is automatically configured to take its place. Such redundancy improves the failure rate metrics greatly and is needed in systems with a large number of optical interconnects, says Seminaro.
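The 10-of-12 sparing scheme amounts to simple lane bookkeeping. A sketch of the failover logic (the real remapping is done in hardware; the function and variable names here are illustrative):

```python
def remap_lanes(active, spares, failed_lane):
    """Replace a failed active lane with the first available spare lane."""
    if failed_lane in active and spares:
        idx = active.index(failed_lane)
        active[idx] = spares.pop(0)  # spare takes over the failed slot
    return active

active = list(range(10))  # lanes 0-9 carry traffic
spares = [10, 11]         # two spare lanes held in reserve

remap_lanes(active, spares, failed_lane=3)
# lane 10 now carries lane 3's traffic; one spare remains for a second failure
```

With 224 transceivers per node, even a small per-lane failure rate would otherwise add up, which is why this redundancy matters at scale.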

IBM has also done much work to produce an integrated design, placing the optical interfaces close to its hub/switch chip and reducing the discrete components used. And in a future design it will use an optical transceiver that integrates the transmit and receive arrays. IBM also believes it can improve the integration of the VCSEL-drive circuitry and overall packaging.

 

What next?

For future systems, IBM is investigating increasing the data rate per channel to 20-26Gbps and has already designed the current system to be able to accommodate such rates.

What about bringing optics within the drawer for chip-to-chip and even on-chip communications?

“There is one disadvantage to using optics which is difficult to overcome and that is latency,” says Seminaro. “You will always have higher latency when you go optics and a longer time-of-flight than you have with copper.” That’s because converting from wider, slower electrical buses to narrower optical links at higher bit rate costs a few cycles on each end of the link.

Also, an optical signal in a fibre takes slightly longer to propagate, leading to a total increase in propagation delay of 1-5ns. “When you are within that drawer, especially when you are in some section of that drawer say between four chips, the added latency and time-of-flight definitely hurts performance,” says Seminaro.

IBM does not rule out such use of optics in the future. However, in the current Power 775 system, using optical links to interconnect the four-chip processor clusters within a node drawer does not deliver any processing performance advantage, it says.

But as application demands rise, and as IBM’s chip and package technologies improve, the need for higher bandwidth interconnect will steadily increase. Optics within the drawer is only a matter of time.

 

Further reading

Part 1: Optical Interconnect: Fibre-to-the-FPGA

Get on the Optical Bus, IEEE Spectrum, October 2010.

IBM Power 775 Supercomputer

 


Fibre-to-the-FPGA

Briefing: Optical Interconnect

Part 1: FPGAs

Programmable logic chip vendor Altera is developing FPGAs with optical interfaces. But is there a need for such technology and how difficult will it be to develop? 

FPGAs with optical interfaces promise to simplify high-speed interfacing between and within telecom and datacom systems. Such fibre-based FPGAs, once available, could also trigger novel system architectures. But not all FPGA vendors believe optically-enabled FPGAs’ time has come, arguing that cost and reliability hurdles must be overcome before system vendors embrace the technology.

 

“One of the advantages of using optics is that you haven’t got to throw your backplanes away as [interface] speeds increase.”

Craig Davis, Altera

 

 

 

 

Altera announced in March that it is developing FPGAs with optical interfaces. The FPGA vendor has yet to detail its technology demonstrator but says it will do so later this year. Altera describes the advent of optically-enabled FPGAs as a turning point, driven by the speed-reach tradeoff of electrical interfaces coupled with the rising cost of elaborate printed circuit board (PCB) materials needed for the highest speed interfaces.

Interface speeds continue to rise. The Interlaken interface has a channel rate of up to 6.375 Gigabit-per-second (Gbps), while the Gen 3.0 PCI Express standard uses 8.0 Gbps lanes. Meanwhile the 16 Gigabit Fibre Channel standard operates at 14.1 Gbps, and 100 Gigabit interfaces for Ethernet and line-side optical transport are moving to a four-channel electrical interface that almost doubles the lane rates to 25-28 Gbps. The CFP2 optical module for 100 Gigabit, to be introduced in 2012, will use the four-channel electrical interface.

Copper supports such channel speeds but at the expense of reach. Craig Davis, senior product marketing engineer at Altera, cites the 10GBASE-KR 10Gbps backplane standard as an example of the reach the latest FPGAs can achieve: 40 inches, including the losses introduced by the two connectors at each end.

 

“Our interactions with our customers are primarily for products that are not going to see the light of day for several years”

Panch Chandrasekaran, Xilinx

 

Work is being undertaken to develop very-short-reach electrical interfaces at 28Gbps for line cards and electrical backplanes. “You are talking 4 to 6 inches of trace to a CFP2 module or a chip-to-chip interface,” says Panch Chandrasekaran, Xilinx’s senior product marketing manager, high-speed serial I/O. “Honestly, this is going to be a challenge but we usually figure out a way how to do things.”

The faster the link, the more energy has to be put into the signals and the more losses you have on the board, says Davis: “Signal integrity aspects also get more difficult, the costs go up as does the power consumption.”

According to Altera, signal losses increase 3.5x going from 10 to 30Gbps. To match the losses at 10Gbps when operating at these higher speeds, advanced PCB materials such as N4000-13 EP SI and Megtron 6 are needed rather than traditional FR4. However, the cost of designing and manufacturing such PCBs can rise five-fold.

In contrast, using an optically-enabled FPGA simplifies PCB design. “For traditional chip-to-chip on a line card, optics does have a benefit because you can trade off the number of layers on a PCB,” says Davis. Such an optics-based design also offers future-proofing. “A lot of the applications we’ll be looking to support are across backplanes and between shelves,” says Davis. “One of the advantages of using optics is that you haven’t got to throw your backplanes away as [interface] speeds increase.”

FPGAs with optical interfaces also promise new ways to design systems. Normally when one line card talks to another on different shelves it is via a switch card on each shelf. Using an FPGA with an optical interface, the cards can talk directly. “People are looking at this,” says Davis. “You could take that to the extreme and go to the next cabinet which makes a much easier system design.”

Altera says vendors are interested in optically-enabled FPGAs for storage systems, where interlinked disk drives require multiple connectors between boards. “There is an argument that it becomes a simpler system design with one FPGA talking directly to another or one chip directly to another,” says Davis. “The more advanced R&D groups within certain companies are investigating the best route forward.”

But while FPGA companies agree that optical interfaces will be needed, there is no consensus on timing. “Xilinx has been looking at this technology for a while now,” says Chandrasekaran. “There is a reason why we haven’t announced it: we have a little while to go before key ecosystem and technology questions are answered.”

The mechanical and reliability issues of systems are stringent and the optical option must prove that it can deliver what is needed, says Chandrasekaran. “It is possible to do at the moment but the cost and reliability equation hasn’t been fully solved.”  

Xilinx also says that while it is discussing the technology with customers, the requirement for such FPGA-based optical interfaces is some way off. “Our interactions with our customers are primarily for products that are not going to see the light of day for several years,” says Chandrasekaran.

“Customers are always excited to hear about integration play,” says Gilles Garcia, director, wired communications business unit at Xilinx. But ultimately end customers care less about the technology as long as the price, power and board real-estate requirements are met. “What we are seeing with this [optically-enabled FPGA] technology is that it is not answering the requirements we are seeing from our large customers that are looking for their next-generation systems,” says Garcia.

FPGA vendor Tabula also questions the near-term need for such technology.  Alain Bismuth, vice president of marketing at Tabula, says nearly all the ports shipped today are at speeds of 10Gbps and below. Even in 2014, the number of 40Gbps ports forecast will only number 650,000, he says.

For Bismuth, two things must happen before optically-enabled FPGAs become commonplace. “You can build them in high volumes reliably and with good yields without incurring higher costs than a separate, discrete [FPGA and optical module] solution,” says Bismuth. “Second, the emergence in interesting volume of networks at 100 Gig and beyond to justify the integration effort.” Such networks are emerging at a “fairly slow pace”, he says.

Meanwhile Altera’s development work continues apace. “We are working with partners to develop the system and we will be demonstrating the optics-on-a-chip in Q4,” says Bob Blake, corporate and product marketing manager, Altera Europe. Altera says its packaged FPGA and optical interface will support short reach links up to 100m and be based on multimode fibre. “All we have announced is that the optical interface will be on the package and it will connect into the FPGA,” says Davis.

The technology will initially use a 10Gbps optical interface, yet the company has detailed that its Stratix V FPGA family supports electrical transceivers at 28Gbps. “The optical interface can go higher than that [10Gbps] so in future we can target 28Gbps and beyond,” says Davis.

 

Optical partners

Optical component and transceiver firms such as Avago Technologies, Finisar and Reflex Photonics all have parallel optical devices – optical engines – that support up to 12 channels at 10Gbps. Avago’s MicroPod 12x10Gbps optical engine measures 8x8mm, for example.

None of the optical vendors would comment on their involvement with Altera’s optically-enabled FPGA.

Avago Technologies says that as FPGA interface speeds move to 10 Gbps and beyond, its customers are finding they need to move from copper to optical interfaces to maintain bandwidth for board, chassis, and system-level interconnect. “In line with this announcement from Altera, we are investing the time to verify Avago optical modules with FPGA SERDES blocks to ensure that FPGA users can design optical interfaces with confidence,” says Victor Krutul, director of marketing for fibre optic products at Avago.

Finisar too only talks about general trends.  “We are seeing many technology leaders moving optics further onto the board and deeper into the system,” says Katharine Schmidtke, director of strategic marketing for Finisar. “This approach offers a number of advantages including improving signal integrity and reducing power consumption on copper traces at higher bandwidths.”

Reflex Photonics says that it has the technology and products to realise optically-enabled IC packages. “We are working with more than one IC company to bring optically-enabled IC packages to market,” says Robert Coenen, vice president, sales and marketing at Reflex.

For Coenen, FPGAs represent the first step in bringing optics to the IC package: “Due to their penetration into niche markets, FPGAs make the most sense to create what will ultimately be a huge market in optically-enabled IC packages.”

Coenen stresses that optics to the IC package is a significant shift in how optical links are used and so it will take time for this application to take hold. However, as the cost per bit decreases, optics will start being used in additional applications including switch ASICs, microprocessors and graphics processors.

“The beauty of an MT-terminated ribbon fiber optical connection at the edge of the package is that this solution allows designers to use the additional high-speed optical connectivity without having to drastically change their design practices,” says Coenen. This is not the case with technologies such as PCB optical waveguides or free-space optical communication. 

“I believe the Altera announcement is just the first in what will be many announcements of optical-to-the-IC-package technology in the coming year or two,” says Coenen.

 



Operators want to cut power by a fifth by 2020

Briefing: Green ICT

Part 2: Operators’ power efficiency strategies

Service providers have set themselves ambitious targets to reduce their energy consumption by a fifth by 2020. The power reduction will coincide with an expected thirty-fold increase in traffic in that period. Given the cost of electricity and operators’ requirements, such targets are not surprising: KPN, with its 12,000 sites in The Netherlands, consumes 1% of the country’s electricity.

 

“We also have to invest in capital expenditure for a big swap of equipment – in mobile and DSLAMs"

Philippe Tuzzolino, France Telecom-Orange

Operators stress that power consumption concerns are not new, but Marga Blom, manager of the energy management group at KPN, highlights that the issue has become pressing due to steep rises in electricity prices. “It is becoming a significant part of our operational expense,” she says.

"We are getting dedicated and allocated funds specifically for energy efficiency,” adds John Schinter, AT&T’s director of energy. “In the past, energy didn’t play anywhere near the role it does today.”

 

Power reduction strategies

Service providers are adopting several approaches to reduce their power requirements.

Upgrading their equipment is one. Newer platforms are denser, with higher-speed interfaces, while also supporting existing technologies more efficiently. Verizon, for example, has deployed 100 Gigabit-per-second (Gbps) interfaces for optical transport and for its IT systems in Europe. The 100Gbps systems are no larger than existing 10Gbps and 40Gbps platforms, and while the higher-speed interfaces consume more power, overall power-per-bit is reduced.
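The power-per-bit arithmetic is straightforward. The per-port wattages below are hypothetical figures chosen only to illustrate the effect, not operator or vendor data:

```python
def watts_per_gbps(port_watts, port_gbps):
    """Power efficiency of an interface, in watts per gigabit-per-second."""
    return port_watts / port_gbps

# Assumed figures: a 10G port at 50W versus a 100G port at 200W.
# The 100G port draws four times the power but carries ten times the traffic.
w_10g = watts_per_gbps(50, 10)     # 5.0 W/Gbps
w_100g = watts_per_gbps(200, 100)  # 2.0 W/Gbps
```

As long as per-port power grows more slowly than the line rate, each speed upgrade lowers the watts spent per bit carried.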

 

 “There is a business case based on total cost of ownership for migrating to newer platforms.”

Marga Blom, KPN

 

 

 

 

Reducing the number of facilities is another approach. BT and Deutsche Telekom are reducing significantly the number of local exchanges they operate. France Telecom is consolidating a dozen data centres in France and Poland to two, filling both with new, more energy-efficient equipment. Such an initiative improves the power usage effectiveness (PUE), an important data centre efficiency measure, halving the energy consumption associated with France Telecom’s data centres’ cooling systems.

“PUE started with data centres but it is relevant in the future central office world,” says Brian Trosper, vice president of global network facilities/ data centers at Verizon. “As you look at the evolution of cloud-based services and virtualisation of applications, you are going to see a blurring of data centres and central offices as they interoperate to provide the service.”

Belgacom plans to upgrade its mobile infrastructure with 20% more energy-efficient equipment over the next two years as it seeks a 25% network energy-efficiency improvement by 2020. France Telecom is committed to a 15% reduction in its global energy consumption by 2020 compared with its 2006 level. Meanwhile KPN has almost halted growth in its energy demands through network upgrades, despite strong traffic growth, and by 2012 it expects to start reducing demand. KPN’s 2020 target is to cut energy consumption by 20% compared with its network demands of 2005.

 

Fewer buildings, better cooling

Philippe Tuzzolino, environment director for France Telecom-Orange, says energy consumption is rising in its core network and data centres due to the ever increasing traffic and data usage but that power is being reduced at sites using such techniques as virtualisation of servers, free-air cooling, and increasing the operating temperature of equipment. “We employ natural ventilation to reduce the energy costs of cooling,” says Tuzzolino.  

“Everything we do is going to be energy efficient.”

Brian Trosper, Verizon

 

 

 

 

 

Verizon uses techniques such as alternating ‘hot’ and ‘cold’ aisles of equipment and real-time smart-building sensing to tackle cooling. “The building senses the environment, where cooling is needed and where it is not, ensuring that the cooling systems are running as efficiently as possible,” says Trosper.

Verizon also points to vendor improvements in back-up power supply equipment such as DC power rectifiers and uninterruptable power supplies. Such equipment, which is always on, has traditionally been only 50% efficient. “If they are losing 50% power before they feed an IP router that is clearly very inefficient,” says Chris Kimm, Verizon's vice president, network field operations, EMEA and Asia-Pacific. Now manufacturers have raised the efficiency of such power equipment to 90-95%.

France Telecom forecasts that its data centre and site energy saving measures will only hold until 2013, with power consumption then rising again. “We also have to invest in capital expenditure for a big swap of equipment – in mobile and DSLAMs [access equipment],” says Tuzzolino.

Newer platforms support advanced networking technologies and more traffic while supporting existing technologies more efficiently. This allows operators to move their customers onto the newer platforms and decommission the older power-hungry kit.  

 

“Technology is changing so rapidly that there is always a balance between installing new, more energy efficient equipment and the effort to reduce the huge energy footprint of existing operations”

John Schinter, AT&T

 

Operators also use networking strategies to achieve efficiencies. Verizon is deploying a mix of equipment in its global private IP network used by enterprise customers. It is deploying optical platforms in new markets to connect to local Ethernet service providers. “We ride their Ethernet clouds to our customers in one market, whereas layer 3 IP routing may be used in an adjacent, next most-upstream major market,” says Kimm. The benefit of the mixed approach is greater efficiencies, he says: “Fewer devices to deploy, less complicated deployments, less capital and ultimately less power to run them.”

Verizon is also reducing the real estate it uses as it retires older equipment. “One trend we are seeing is more, relatively empty-looking facilities,” says Kimm. It is no longer floor space crammed with equipment that constrains sites, he says; rather, it is their power and cooling capacity.

“You have to look at the full picture end-to-end,” says Trosper. “Everything we do is going to be energy efficient.” That includes the system vendors and the energy-saving targets Verizon demands of them, how it designs its network, the facilities where the equipment resides and how they are operated and maintained, he says.

Meanwhile, France Telecom says it is working with 19 operators, including Vodafone, Telefonica, BT, DT, China Telecom and Verizon, as well as organisations such as the ITU and ETSI, to define standards for DSLAMs and base stations to aid operators in meeting their energy targets.

Tuzzolino stresses that France Telecom’s capital expenditure will depend on how energy costs evolve. Energy prices will dictate when, and to what degree, France Telecom will need to invest in equipment to deliver the required return on investment.

The operator has defined capital expenditure spending scenarios - from a partial to a complete equipment swap from 2015 - depending on future energy costs. New services will clearly dictate operators’ equipment deployment plans but energy costs will influence the pace.  

 

“If they [DC power rectifiers and UPSs] are losing 50% power before they feed an IP router that is clearly very inefficient”

Chris Kimm, Verizon

Justifying capital expenditure based on energy and hence operational expense savings is now ‘part of the discussion’, says KPN’s Blom: “There is a business case based on total cost of ownership for migrating to newer platforms.”

 

Challenges

But if operators are generally pleased with the progress they are making, challenges remain.

“Technology is changing so rapidly that there is always a balance between installing new, more energy efficient equipment and the effort to reduce the huge energy footprint of existing operations,” says AT&T’s Schinter.

“The big challenge for us is to plan the capital expenditure effort such that we achieve the return-on-investment based on anticipated energy costs,” says Tuzzolino.

Another aspect is regulation, says Tuzzolino. The EC is considering how ICT can contribute to reducing the energy demands of other industries, he says. “We have to plan to reduce energy consumption because ICT will increasingly be used in [other sectors like] transport and smart grids.”

Verizon highlights the challenge of successfully managing large-scale equipment substitution and other changes that bring benefits while serving existing customers. “You have to keep your focus in the right place,” says Kimm. 

 

Part 1: Standards and best practices 


ICT could reduce global carbon emissions by 15%

Briefing: Green ICT

Part 1: Standards and best practices

Keith Dickerson is chair of the International Telecommunication Union's (ITU) working party on information and communications technology (ICT) and climate change.

In a Q&A with Gazettabyte, he discusses how ICT can help reduce emissions in other industries, where the power hot spots are in the network and what the ITU is doing.


"If you benchmark base stations across different countries and different operators, there is a 5:1 difference in their energy consumption"

Keith Dickerson

Q. Why is the ITU addressing power consumption reduction and will its involvement lead to standards?

KD: We are producing standards and best practices. The reason we are involved is simple: ICT – all IT and telecoms equipment - is generating 2% of [carbon] emissions worldwide. But traffic is doubling every two years and the energy consumption of data centres is doubling every five years. If we don’t watch out we will be part of the problem. We want to reduce emissions in the ICT sector and in other sectors. We can reduce emissions in other sectors by 5x or 6x what we emit in our own sector.

 

Just to understand that figure, you believe ICT can cut emissions in other industries by a factor of six?

KD: We could reduce emissions overall by 15% worldwide, by reducing things like travel and the storage of goods and by increasing recycling. All these measures in conjunction, enabled by ICT, could reduce overall emissions by 15%. These sectors include travel, the forestry sector and waste management. The energy sector is huge and we can reduce emissions here by up to 30% using smarter grids.

 

What are the trends regarding ICT?

KD: ICT accounts for 2% at the moment, maybe 2.5% if you include TV, but it is growing very fast. By 2020 it could be 6% of worldwide emissions if we don’t do something. And you can see why: Broadband access rates are doubling every two years, and although the power-per-bit is coming down, overall power [consumed] is rising. 
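
Dickerson’s projection implies rapid compound growth: for ICT’s share of emissions to triple from 2% to 6% by 2020, that share must grow at around 11-12% a year. A quick check, assuming (as a simplification) roughly a decade to 2020:

```python
def implied_cagr(start_share: float, end_share: float, years: int) -> float:
    """Compound annual growth rate taking start_share to end_share over years."""
    return (end_share / start_share) ** (1 / years) - 1

# ICT at 2% of worldwide emissions now, potentially 6% by 2020
rate = implied_cagr(2.0, 6.0, 10)
print(f"implied growth in ICT's share: {rate:.1%} per year")  # ~11.6%
```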

 

Where are the hot spots in the network?

The areas where energy consumption is rising fastest are at the ends of the network: in home equipment and in data centres. Within the network it is still going up, but it is under control and there are clear ways of reducing it.

For example, all operators are moving to a next-generation network (NGN) – BT is doing this with its 21CN – and this alone leads to a power reduction. It cuts the number of switching centres by a factor of ten. And you can collapse different networks into a single IP network, reducing the energy consumption [associated with running multiple networks]. The equipment in the NGN doesn’t need as much cooling or air conditioning. The use of more advanced access technology such as VDSL2 and PON will by itself lead to a reduction in power-per-bit.

The EU has a broadband code of conduct which sets targets for reducing energy consumption in the access network, and that leads to technologies such as standby modes. My home hub, if I don’t use it for a while, switches to a low-power mode.

The ITU is looking at how to apply these low-power modes to VDSL2. There has also been a very recent proposal to reduce the power levels in PONs. There has been a contribution from the Chinese for a deep-sleep mode for XG-PON. The ITU-T Study Group 13 on future networks is also looking at such techniques, shutting down part of the core network when traffic levels are low, such as at night.

 

What about mobile networks?

If you benchmark them across different countries and different operators, there is a 5:1 difference in the energy consumption of base stations. They are running the same standard but their energy efficiency varies widely; they have been made at different times and by different vendors.

In a base station, about half of the power is lost in the [signal] coupling to the antenna. If you can make amplifiers more efficient and reduce the amount of cooling and air-conditioning required by the base station, you can reduce energy consumption by 70 or 80%. If all operators and all countries used best practices here, energy consumption in the mobile network could be reduced by 50% to 70%.

If you could get overall power consumption of a base station down to 100W, you could power it from renewable energy. That would make a huge difference; it could work without having to worry about the reliability of the electricity grid which in India and Africa is a tricky problem. And at the moment the price of diesel fuel [to power standby generators] is going through the roof.  

I visited Huawei recently and they have examples of 100W base stations powered by renewable energy, making them independent of the electricity network. At the moment a base station consumes more like 1,000W, and base stations account for over half the overall power used by a mobile operator. At 100W, that wouldn’t be the case.
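
The fleet-level arithmetic behind the 100W figure can be sketched as follows. The per-station figures are Dickerson’s; the fleet size, and the assumption that base stations account for exactly half the operator’s power, are illustrative:

```python
def fleet_power_w(stations: int, per_station_w: float, other_w: float) -> float:
    """Total operator power: base station fleet plus everything else."""
    return stations * per_station_w + other_w

# Hypothetical fleet: 10,000 sites at 1,000 W each; the non-radio share is set
# equal so that base stations are half the total, in line with the text.
before = fleet_power_w(10_000, 1_000, 10_000_000)  # 20 MW total
after = fleet_power_w(10_000, 100, 10_000_000)     # 11 MW total
print(f"fleet saving from 100 W stations: {1 - after / before:.0%}")  # 45%
```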

Other power saving activities in mobile include sharing networks among operators such as Orange and T-Mobile in the UK. And BT has signed a contract with four out of the five UK mobile operators to provide their backhaul and core networks in the future.  

 

What is the ITU doing with regard to energy-saving schemes?

The ITU set up the working party on ICT and climate change less than two years ago. We have work in three different areas.

One is increasing energy efficiencies in ICT which we are doing through the widespread introduction of best practices. We are relying on the EC to set targets. The ITU, because it has 193 countries involved, finds it very difficult to agree targets. So we issue best practices which show how targets can be met. This covers data centres, broadband and core networks.

Another of our areas is agreeing a common methodology for how to measure the impact of ICT on carbon emissions. We have been working on this for 18 months and the first recommendations should be consented this summer. Overall this work will be completed in the next two years. This will enable you to measure the emissions of ICT by country, or sector, or an individual product or service, or within a company. If companies don’t meet their targets in future they will be fined so it is very important companies are measured in the same way.

A third area of our activities is things like recycling. We have produced a standard for a universal charger for mobile phones. You won’t have to buy a new charger each time you buy a new phone. At the moment thousands of tonnes of chargers go to landfill [waste sites] every year. The standard introduced by the ITU last year only covers 25% of handsets. The revised standard will raise that to 80%.

At the last meeting the Chinese also proposed a universal battery – or a range of batteries. This would mean you don’t have to throw away your old battery each time you buy a new mobile. It is all about reducing the amount of equipment that goes into landfill.

We are also doing some other activities. Most telecom equipment uses a 50V power supply. We are taking that up to 400V, so a standard power supply for a data centre or a switch would be at 400V. This would mean you lose a lot less power in the wiring as you would be operating at a lower current: power losses vary according to the square of the current.
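
The square-law saving Dickerson describes is easy to verify: at fixed power, moving from 50V to 400V cuts the current by a factor of eight and so cuts the I²R wiring loss by a factor of 64. A sketch (the load and wiring resistance values are illustrative):

```python
def wiring_loss_w(power_w: float, voltage_v: float, resistance_ohm: float) -> float:
    """I^2 * R loss in the feed wiring for a given delivered power and voltage."""
    current = power_w / voltage_v
    return current ** 2 * resistance_ohm

P, R = 5_000, 0.05  # hypothetical 5 kW load, 0.05 ohm of feed wiring
loss_50v = wiring_loss_w(P, 50, R)    # 100 A flowing: 500 W lost
loss_400v = wiring_loss_w(P, 400, R)  # 12.5 A flowing: ~7.8 W lost
print(f"loss reduction: {loss_50v / loss_400v:.0f}x")  # (400/50)^2 = 64x
```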

 

These ITU activities coupled with operators moving to new architectures and adopting new technologies will all help yet traffic is doubling every two years. What will be the overall effect?

It all depends on the targets that are set. The EU is putting in more and more severe targets. If companies have to pay a fine if they don’t meet them, they will introduce new technologies more quickly. Companies won’t pay the extra investment unless they have to, I’m afraid, especially during this difficult economic period.

Every year the EC revises the code of conduct on broadband and sets stiffer targets. They are driving the introduction of new technology into the industry, and everyone wants to sign up to show that they are using best practices.  

What the ITU is doing is providing the best practices and the standards to help them do that. The rate at which they act will depend on how fast those targets are reduced.

Keith Dickerson is a director at Climate Associates

 

Part 2: Operators' power efficiency strategies



Virtualisation set to transform data centre networking

Briefing:  Data centre switching

Part 3: Networking developments

The adoption of virtualisation techniques is causing an upheaval in the data centre. Virtualisation is being used to boost server performance, but its introduction is changing how switching equipment is networked.

“This is the most critical period of data centre transformation seen in decades,” says Raju Rajan, global system networking evangelist at IBM.

 

“We are on a long hard path – it is going to be a really challenging transition”

Stephen Garrison, Force10 Networks

Data centre managers want to accommodate variable workloads, and that requires moving virtualised workloads between servers and even between data centres. That is leading to new protocol developments and network consolidation, all the while making IT management more demanding and hence requiring greater automation.

“We used to share the mainframe, now we share a pool of resources,” says Andy Ingram, vice president of product marketing and business development at the fabric and switching technologies business group at Juniper Networks. “What we are trying to get to is better utilisation of resources for the purpose of driving applications - the data centre is about applications.”

 

Networking challenges

New standards to meet the networking challenges created by virtualisation are close to completion and are already appearing in equipment. In turn, equipment makers are developing switch architectures that will scale to support tens of thousands of 10-Gigabit-per-second (Gbps) Ethernet ports. But industry experts expect these developments will take up to a decade to become commonplace due to the significant hurdles to be overcome.

“We are all marketing to a very distant future where most of the users are still trying to get their arms around eight virtual machines on a server,” says Stephen Garrison, vice president marketing at Force10 Networks. “We are on a long hard path – it is going to be a really challenging transition.”

IBM points out that its customers are used to working in IT silos, selecting subsystems independently. New work practices across divisions will be needed if the networking challenges are to be addressed. “For the first time, you cannot make a networking choice without understanding the server, virtualisation, storage, security and the operations support strategies,” says Rajan.

A lot of the future value of these various developments will be based on enabling IT automation. “That is a big hurdle for IT to get over: allowing systems to manage themselves,” says Zeus Kerravala, senior vice president, global enterprise and consumer research at market research firm, Yankee Group. “Do I think this vision will happen? Sure I do, but it will take a lot longer than people think.” Yankee expects it will be closer to ten years before these developments become commonplace.

Networking provides the foundation for servers and storage and, ultimately, the data centre’s applications. “Fifty percent of the data centre spend is servers, 35% is storage and 15% networking,” says Ingram. “The key resources I want to be efficient are the servers and the storage; what interconnects them is the network.”

Traditionally, applications have resided on dedicated servers, but equipment usage has been low, commonly around 10%. Given the huge numbers of servers deployed in data centres, this is no longer acceptable.

 

“From an IT perspective, the cost of computing should fall quite dramatically; if it doesn’t fall by half we will have failed”

Zeus Kerravala, Yankee Group

Virtualisation splits a server’s processing into time-slots to support 10, 100 and even 1,000 virtual machines, each with its own application. The resulting server usage ranges from 20% to as high as 70%.
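
The consolidation arithmetic follows directly: raising utilisation from 10% to 50% means one-fifth as many physical servers carry the same workload. A sketch with hypothetical workload figures:

```python
import math

def servers_needed(workload_units: float, units_per_server: float,
                   utilisation: float) -> int:
    """Physical servers required when each runs at the given utilisation."""
    return math.ceil(workload_units / (units_per_server * utilisation))

# Hypothetical: 100 units of work; each server can supply 10 units flat out
before = servers_needed(100, 10, 0.10)  # 100 servers at 10% utilisation
after = servers_needed(100, 10, 0.50)   # 20 servers at 50% utilisation
print(f"{before} servers -> {after} servers")
```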

That could result in significant efficiencies when you consider the growth of server virtualisation: in 2010, deployed virtual machines will outnumber physical servers for the first time, claims market research firm IDC. And Gartner has said that half of all workloads will be virtualised by 2012, notes Kash Shaikh, Cisco’s group manager, data center product marketing.

In enterprise and hosting data centres, servers are typically connected using three tiers of switching. The servers are linked to access switches which in turn connect to aggregation switches whose role is to funnel traffic to the large, core switches.

An access switch typically sits on top of the server rack, explaining why it is also known as a top-of-rack switch. Servers are now moving from 1Gbps to 10Gbps interfaces, with a top-of-rack switch typically connecting up to 40 servers.

Broadcom’s latest BCM56845 switch chip has 64x10Gbps ports. The BCM56845 can link 40 10Gbps servers to an aggregation switch via four high-capacity 40Gbps links. Each aggregation switch will likely have six to 12 40Gbps ports per line card and between eight and 16 cards per chassis. In turn, the aggregation switches connect to the core switches. The result is a three-tier architecture that can link thousands, even tens of thousands, of servers in the data centre.
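
One consequence of these figures is oversubscription at the top of the rack: 40 servers at 10Gbps present 400Gbps of downlink capacity against 160Gbps of uplink, a 2.5:1 ratio. The calculation, using the figures in the text:

```python
def oversubscription(server_ports: int, server_gbps: float,
                     uplinks: int, uplink_gbps: float) -> float:
    """Ratio of total server-facing to uplink capacity at a top-of-rack switch."""
    return (server_ports * server_gbps) / (uplinks * uplink_gbps)

# Figures from the text: 40 servers at 10 Gbps, four 40 Gbps uplinks
ratio = oversubscription(40, 10, 4, 40)
print(f"oversubscription: {ratio}:1")  # 400 Gbps down vs 160 Gbps up -> 2.5:1
```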

The rise of virtualisation impacts data centre networking profoundly. Applications are no longer confined to single machines but are shared across multiple servers for scaling. Nor is the predominant traffic ‘north-south’, up and down this three-layer switch hierarchy. Instead, virtualisation promotes greater ‘east-west’ traffic, across the same tiered equipment.

“The network has to support these changes and it can’t be the bottleneck,” says Cindy Borovick, vice president, enterprise communications infrastructure and data centre networks, at IDC. The result is networking change on several fronts.

“The [IT] resource needs to scale to one large pool and it needs to be dynamic, to allow workloads to be moved around,” says Ingram. But that is the challenge: “The inherent complexity of the network prevents it from scaling and prevents it from being dynamic.”

 

Data Center Bridging

Currently, IT staff manage several separate networks: Ethernet for the LAN, Fibre Channel for storage and Infiniband for high-performance computing. To migrate the traffic types onto a common network, the IEEE is developing the Data Center Bridging (DCB) Ethernet standard. A separate Fibre Channel over Ethernet (FCoE) standard, developed by the International Committee for Information Technology Standards, enables Fibre Channel to be encapsulated onto DCB.

 

“No-one is coming to the market with 10-Gigabit [Ethernet ports] without DCB bundled in”

Raju Rajan, IBM

 

DCB is designed to enable the consolidation of many networks to just one within the data centre. A single server typically has multiple networks connected to it, including Fibre Channel and several separate 1Gbps Ethernet networks.

The DCB standard has several components: Priority Flow Control, which provides eight classes of traffic; Enhanced Transmission Selection, which manages the bandwidth allocated to different flows; and Congestion Notification which, if a port begins to fill up, can notify upstream along all the hops to the source to back off from sending traffic.

“These standards – Priority Flow Control, Congestion Notification and Enhanced Transmission Selection – are some 98% complete, waiting for procedural things,” says Nick Ilyadis, chief technical officer for Broadcom’s infrastructure networking group. DCB is sufficiently stable to be implemented in silicon and is being offered on increasing numbers of platforms.

With these components DCB can transport Fibre Channel in a lossless way. Fibre Channel is intolerant of loss and can require minutes to recover from a lost packet. Now with DCB, critical storage traffic such as FCoE, iSCSI and network-attached storage is supported over Ethernet.

Network convergence may be the primary driver for DCB, but its adoption also benefits virtualisation. Since higher server usage results in extra port traffic, virtualisation promotes the transition from 1-Gigabit to 10-Gigabit Ethernet ports. “No-one is coming to the market with 10-Gigabit [Ethernet ports] without DCB bundled in,” says IBM’s Rajan.

The uptake is also being helped by the significant reduction in the cost of 10-Gigabit ports with DCB. “This year we will see 10-Gigabit DCB at about $350 per port, down from over $800 last year,” says Rajan. The upgrade is attractive when the alternative is using several 1-Gigabit Ethernet ports for server virtualisation, each port costing $50-$75.
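
On a per-gigabit basis the comparison Rajan describes is stark, as a quick calculation with the article’s price points shows:

```python
def cost_per_gbps(port_cost_usd: float, port_gbps: float) -> float:
    """Normalise a port price to dollars per gigabit of capacity."""
    return port_cost_usd / port_gbps

ten_gig = cost_per_gbps(350, 10)  # $35 per Gbps for a 10-Gigabit DCB port
one_gig = cost_per_gbps(75, 1)    # $75 per Gbps at the top of the 1-Gigabit range
print(f"10G: ${ten_gig:.0f}/Gbps, 1G: ${one_gig:.0f}/Gbps")
```

Even at the bottom of the 1-Gigabit range ($50 a port), the 10-Gigabit port now costs less per gigabit delivered.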

Yet while DCB may be starting to be deployed, networking convergence remains in its infancy.

“FCoE seems to be lagging behind general industry expectations,” says Rajan. “For many of our data centre owners, virtualisation is the overriding concern.” Network convergence may be a welcome cost-reducing step, but it introduces risk. “So the net gain [of convergence] is not very clear, yet, to our end customers,” says Rajan. “But the net gain of virtualisation and cloud is absolutely clear to everybody.”

Global Crossing has some 20 data centres globally, including 14 in Latin America. These serve government and enterprise customers with storage, connectivity and firewall managed services.

“We are not using lossless Ethernet,” says Mike Benjamin, vice president at Global Crossing. “The big push for us to move to that [DCB] would be doing storage as a standard across the Ethernet LAN. We today maintain a separate Fibre Channel fabric for storage. We are not prepared to make the leap to iSCSI or FCoE just yet.”

 

TRILL

Another networking protocol under development is the Internet Engineering Task Force’s (IETF) Transparent Interconnection of Lots of Links (TRILL), which promotes large-scale Ethernet networks. TRILL’s primary role is to replace the spanning tree algorithm, which was never designed to address the latest data centre requirements.

Spanning tree disables links in a layer-two network to avoid loops, ensuring traffic has only one way to reach a port. But disabling links can remove up to half the available network bandwidth. TRILL enables large layer-two networks of linked switches that avoid loops without turning off precious bandwidth.
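
The bandwidth penalty of spanning tree is structural: a tree over N switches keeps exactly N-1 links active, however many links the physical topology provides. A toy count (the eight-switch full mesh is an illustrative topology):

```python
def spanning_tree_links(switches: int, physical_links: int) -> tuple[int, int]:
    """A spanning tree over N switches keeps N-1 links; the rest are disabled."""
    kept = switches - 1
    return kept, physical_links - kept

# Hypothetical topology: 8 switches fully meshed, 8 * 7 // 2 = 28 links
kept, disabled = spanning_tree_links(8, 28)
print(f"{kept} links forwarding, {disabled} links disabled")  # 7 vs 21
```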

“TRILL treats the Ethernet network as the complex network it really is,” says Benjamin. “If you think of the complexity and topologies of IP networks today, TRILL will have similar abilities in terms of truly understanding a topology to forward across, and permit us to use load balancing which is a huge step forward.”

 

"Data centre operators are cognizant of the fact that they are sitting in the middle of the battle of [IT vendor] giants and they want to make the right decisions”

Cindy Borovick, IDC

Collapsing tiers

Switch vendors are also developing flatter switch architectures to reduce the switching tiers from three to two to ultimately one large, logical switch. This promises to reduce the overall number of platforms and their associated management as well as switch latency.

Global Crossing’s default data centre switch design is a two-tier switch. “Unless that top tier starts to hit scaling problems, at which time we move into a three-tier,” says Benjamin. “A two-tier switch architecture really does have benefits in terms of cost and low-latency switching.”

Juniper Networks is developing a single-layer logical switch architecture. Dubbed Stratus, the architecture will support tens of thousands of 10Gbps ports and span the data centre. While Stratus has still to be detailed, Juniper has said the design will be based on a 64x10-Gbps building block chip. Stratus will be in customer trials by early 2011. “We have some customers that have some very difficult networking challenges that are signed up to be our early field trials,” says Ingram.

Brocade is about to launch its virtual cluster switching (VCS) architecture. “There will be 10 switches within a cluster and they will be managed as if it is one chassis,” says Simon Pamplin, systems engineering pre-sales manager for Brocade UK and Ireland. VCS supports TRILL and DCB.

“We have the ability to make much larger flat layer two networks which ease management and the mobility of [servers’] virtual machines, whereas previously you were restricted to the size of the spanning tree layer-two domain you were happy to manage, which typically wasn’t that big,” says Pamplin.

Cisco’s Shaikh argues multi-tiered switching is still needed, for system scaling and separation of workloads: “Sometimes [switch] tiers are used for logical separation, to separate enterprise departments and their applications.” However, Cisco itself is moving to fewer tiers with the introduction of its FabricPath technology within its Nexus switches that support TRILL.

“There are reasons why you want a multi-tier,” agrees Force10 Network’s Garrison. “You may want a core and a top-of rack switch that denotes the server type, or there are some [enterprises] that just like a top-of-rack as you never really touch the core [switches]; with a single-tier you are always touching the core.”

Garrison argues that a flat network should not be equated with single tier: “What flat means is: Can I create a manageable domain that still looks like it is layer two to the packet even if it is a multi-tier?”

Global Crossing has been briefed by vendors such as Juniper and Brocade on their planned logical switch architectures and the operator sees much merit in these developments. But its main concern is what happens once such an architecture is deployed.

“Two years down the road, not only are we forced back to the same vendor no matter what other technology advancements another vendor has made, we also risk that they have phased out that generation of switch we installed,” says Benjamin. If the vendor does not remain backwards compatible, the risk is a complete replacement of the switches may be necessary.

Benjamin points out that while it is the proprietary implementations that enable the single virtual architectures, the switches also support the network standards. Accordingly, a data centre operator can always switch off the proprietary elements that enable the single virtual layer and revert to a traditional switched architecture.

 

Edge Virtual Bridging and Bridge Port Extension

A networking challenge caused by virtualisation is switching traffic between virtual machines and moving them between servers. A server’s software-based hypervisor that oversees the virtual machines comes with a virtual switch. But the industry consensus is that hardware, rather than software running on a server, is best for switching.

There are two standards under development to handle virtualisation requirements: the IEEE 802.1Qbg Edge Virtual Bridging (EVB) and the IEEE 802.1Qbh Bridge Port Extension.  The 802.1Qbg camp is backed by many of the leading switch and network interface card vendors, while 802.1Qbh is based on Cisco Systems’ VN-Tag technology.

Virtual Ethernet Port Aggregation (VEPA), used as part of 802.1Qbg, is the transport mechanism. In terms of networking, VEPA allows traffic to exit and re-enter the same physical server port to enable switching between virtual ports. EVB’s role is to provide the required virtual machine configuration and management.

“The network has to recognise the virtual machine appearing on the virtual interfaces and provision the network accordingly,” says Broadcom’s Ilyadis. “That is where EVB comes in, to recognise the virtual machine and use its credentials for the configuration.”

Brocade supports EVB and VEPA as part of its converged network adaptor (CNA) card that also supports DCB and FCoE. “You have software switches within the hypervisor, you have some capabilities in the CNA and some in the edge switch,” says Pamplin. “We don’t see the soft-switch as too beneficial, as some CPU cycles are stolen to support it.”

Instead Brocade does the switching within the CNA.  When a virtual machine within a server needs to talk to another virtual machine, the switching takes place at the adaptor. “We vastly reduce what needs to go out on the core network,” says Pamplin.  “If we do have traffic that we need to monitor and put more security around, we can take that traffic out through the adaptor to the switch and switch it back – ‘hairpin’ it - into the server.”

The 802.1Qbh Bridge Port Extension uses a tag that is added to an Ethernet frame. The tag is used to control and identify the virtual machine traffic, and enable port extension. According to Cisco, the port extension allows the aggregation of a large number of ports through hierarchical switches. “This provides a way of doing a large fan-out while maintaining smaller management tiering,” says Prashant Gandhi, technical leader internet business systems unit at Cisco.

For example, top-of-rack switches could be port-extenders and be managed by the next-tier switch. “This would significantly simplify provisioning and management of a large number of physical and virtual Ethernet ports,” says Gandhi.

 “The common goal of both [802.1Qbg and 802.1Qbh] standards is to help us with configuration management, to allow virtual machines to move with their entire configuration and not require us to apply and keep that configuration in sync across every single switch,” says Global Crossing’s Benjamin. “That is a huge step for us as an operator.”

“Our view is that VEPA will be needed,” says Gary Lee, director of product marketing at Fulcrum Microsystems, which has just announced its first Alta family switch chip that supports 72x10-Gigabit ports and can process over one billion packets per second.

Benjamin hopes both standards will be adopted by the industry: “I don’t think it’s a bad thing if they both evolve and you get the option to do the switching in software as well as in hardware based on the application or the technology that a certain data centre provider requires.” Broadcom and Fulcrum are supporting both standards to ensure their silicon will work in both environments.

“This [the Edge Virtual Bridging and Bridge Port Extension standards’ work] is still in flux,” says Ilyadis. “At the moment there are a lot of proprietary implementations but it is coming together and will be ready next year.”

 

The big picture

For Global Crossing, it will be economics or reliability considerations that determine which technologies are introduced first. DCB may be first, once its economics and reliability for storage cannot be ignored, says Benjamin. Or it will be the networking redundancy and reliability offered by the likes of TRILL that will be needed as the operator uses virtualisation more.

And whether it is five or ten years, what will be the benefit of all these new protocols and switch architectures?

Malcolm Mason, EMEA hosting product manager at Global Crossing, says there will be less equipment doing more, which will save power and require less cabling. The new technologies will also enable more stringent service level agreements to be met.

“The end-user won’t notice a lot of difference, but what they should notice is more consistent application performance,” says Yankee’s Kerravala. “From an IT perspective, the cost of computing should fall quite dramatically; if it doesn’t fall by half we will have failed.”

Meanwhile data centre operators are working to understand these new technologies. “I get a lot of questions about end-to-end architectures,” says Borovick. “They are cognizant of the fact that they are sitting in the middle of the battle of [IT vendor] giants and they want to make the right decisions.”

 

Click here for Part 1: Single-layer switch architectures

Click here for Part 2: Ethernet switch chips

 

