New MSA to enable four-lambda 400-gigabit modules
A new multi-source agreement (MSA) based on 100-gigabit single-wavelength technology has been created to provide the industry with 2km and 10km interfaces: single-wavelength 100-gigabit designs and four-wavelength 400-gigabit ones.
The MSA is backed by 22 founding companies including Microsoft, Alibaba and Cisco Systems.
The initiative started work two months ago and a draft specification is expected before the year end.
“Twenty-two companies is a very large MSA at this stage, which shows the strong interest in this technology,” says Mark Nowell, distinguished engineer, data centre switching at Cisco Systems and co-chair of the 100G Lambda MSA. “It is clear this is going to be the workhorse technology for the industry for quite a while.”
Phased approach
The 100G Lambda MSA is a phased project. In the first phase, three single-mode fibre optical interfaces will be specified: a 100-gigabit 2km link (100G-FR), a 100-gigabit 10km link (100G-LR), and a 2km 400-gigabit coarse wavelength-division multiplexed (CWDM) design, known as the 400G-FR4. A 10km version of the 400-gigabit CWDM design (400G-LR4) will be developed in the second phase.
For the specifications, the MSA will use work already done by the IEEE that has defined two 100-gigabit-per-wavelength specifications. The IEEE 802.3bs 400 Gigabit Ethernet Task Force has defined a 400-gigabit parallel fibre interface over 500m, referred to as DR4 (400GBASE-DR4). The second, the work of the IEEE 802.3cd 50, 100 and 200 Gigabit Ethernet Task Force, defines the DR (100GBASE-DR), a 100-gigabit single lane specification for 500m.
Twenty-two companies is a very large MSA at this stage, which shows the strong interest in this technology
“The data rate is known, the type of forward-error correction is the same, and we have a starting point with the DR specs - we know what their transmit levels and receive levels are,” says Nowell. The new MSA will need to contend with the extra signal loss to extend the link distances to 2km and 10km.
With the 2km 400G-FR4 specification, the design involves not only longer distances but also the loss introduced by the optical multiplexer and demultiplexer used to combine and separate the four wavelengths transmitted over the single-mode fibre.
“It is really a technical problem, one of partitioning the specifications to account for the extra loss of the link channel,” says Nowell.
One way to address the additional loss is to increase the transmitter’s laser power but that raises the design’s overall power consumption. And since the industry continually improves receiver performance - its sensitivity - over time, any decision to raise the transmitter power needs careful consideration. “There is always a trade off,” says Nowell. “You don't want to put too much power on the transmitter because you can’t change that specification.”
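The trade-off Nowell describes can be made concrete with a toy link-budget calculation. The numbers below are illustrative assumptions, not values from the MSA; the point is simply that extending reach, or adding a mux/demux, consumes margin that must be recovered either at the transmitter (more power) or at the receiver (better sensitivity).

```python
def link_margin_db(tx_power_dbm, rx_sensitivity_dbm, distance_km,
                   fibre_loss_db_per_km, mux_demux_loss_db=0.0):
    """Margin left after channel loss; the link closes only if this is positive."""
    received_dbm = (tx_power_dbm
                    - distance_km * fibre_loss_db_per_km
                    - mux_demux_loss_db)
    return received_dbm - rx_sensitivity_dbm

# Illustrative (assumed) numbers: 0 dBm launch, -10 dBm receiver sensitivity,
# 0.5 dB/km fibre loss. Going from a 500m link to a 2km link with a 3 dB
# mux/demux penalty costs 3.75 dB of margin.
margin_500m = link_margin_db(0.0, -10.0, 0.5, 0.5)        # 9.75 dB
margin_2km = link_margin_db(0.0, -10.0, 2.0, 0.5, 3.0)    # 6.0 dB
```

This is why the MSA's partitioning decision matters: whatever margin is not recovered by raising transmitter power becomes a receiver-sensitivity requirement that cannot easily be revisited once the specification is fixed.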
The MSA will need to decide whether the transmitter power is increased or is kept the same and then the focus will turn to the receiver technology. “This is where a lot of the hard work occurs,” he says.
Origins
The MSA came about after the IEEE 802.3bs 400 Gigabit Ethernet Task Force defined 2km (400GBASE-FR8) and 10km (400GBASE-LR8) interfaces based on eight 50-gigabit-per-second wavelengths. “There was concern or skepticism that some of the IEEE specifications for 2km and 10km at 400 gigabits were going to be the lowest cost,” says Nowell. Issues include fitting eight wavelengths within the modules as well as the cost of eight lasers. Many of the large cloud players wanted a four-wavelength solution and they wanted it specified.
The debate then turned to whether to get the work done within the IEEE or to create an MSA. Given the urgency that the industry wanted such a specification, there was a concern that it might take too long to get the project started and completed using an IEEE framework, so the decision was made to create the MSA.
“The aim is to write these specifications as quickly as we can but with the assumption that the IEEE will pick up the challenge of taking on the same scope,” says Nowell. “So the specs are planned to be written following IEEE methodology.” That way, when the IEEE does address this, it will have work it can reference.
“We are not saying that the MSA spec will go into the IEEE,” says Nowell. “We are just making it so that the IEEE, if they chose, can quickly and easily have a very good starting point.”
Form factors
The MSA specification does not dictate the modules to be used when implementing the 100-gigabit-per-wavelength designs. An obvious candidate for the single-wavelength 2km and 10km designs is the SFP-DD. And Nowell says the OSFP and the QSFP-DD pluggable optical modules, as well as COBO, the embedded optics specification, will be used to implement the 400G-FR4. “From Cisco’s point of view, we believe the QSFP-DD is where it is going to get most of its traction,” says Nowell, who is also co-chair of the QSFP-DD MSA.
Nowell points out that the industry knows how to build systems using the QSFP form factors: how the systems are cooled and how the high-speed tracks are laid down. The development of the QSFP-DD enables the industry to reuse this experience to build new high-density systems.
“And the backward compatibility of the QSFP-DD is massively important,” he says. A QSFP-DD port also supports the QSFP28 and QSFP modules. Nowell says there are customers that buy the latest 100-gigabit switches but use lower-speed 40-gigabit QSFP modules until their network needs 100 gigabits. “We have customers that say they want to do the same thing with 100 and 400 gigabits,” says Nowell. “That is what motivated us to solve that backward-compatibility problem.”
Roadmap
A draft specification of the phase one work will be published by the 22 founding companies this year. Once published, other companies - ‘contributors’ - will join and add their comments and requirements. Further refinement will then be needed before the final MSA specification, expected by mid-2018. Meanwhile, the development of the 10km 400G-LR4 interface will start during the first half of 2018.
The MSA work is focussed on developing the 100-gigabit and 400-gigabit specifications. But Nowell says the work will help set up what comes next after 400 gigabits, whether that is 800 gigabits, one terabit or whatever.
“Once a technology gets widely adopted, you get a lot of maturity around it,” he says. “A lot of knowledge about where and how it can be extended.”
There are now optical module makers building eight-wavelength optical solutions, while in the IEEE, work is starting on 100-gigabit electrical interfaces, he says: “There are a lot of pieces out there that are lining up.”
The 22 founding members of the 100G Lambda MSA Group are: Alibaba, Arista Networks, Broadcom, Ciena, Cisco, Finisar, Foxconn Interconnect Technology, Inphi, Intel, Juniper Networks, Lumentum, Luxtera, MACOM, MaxLinear, Microsoft, Molex, NeoPhotonics, Nokia, Oclaro, Semtech, Source Photonics and Sumitomo Electric.
Inphi unveils a second 400G PAM-4 IC family
Inphi has announced the Vega family of 4-level, pulse-amplitude modulation (PAM-4) chips for 400-gigabit interfaces.
The 16nm CMOS Vega IC family is designed for enterprise line cards and is Inphi’s second family of 400-gigabit chips that support eight lanes of 50-gigabit PAM-4.
Its first 8x50-gigabit family, dubbed Polaris, is used within 400-gigabit optical modules and was announced at the OFC show held in Los Angeles in March.
“Polaris is a stripped-down low-power DSP targeted at optical module applications,” says Siddharth Sheth, senior vice president, networking interconnect at Inphi (pictured). “Vega, also eight by 50-gigabits, is aimed at enterprise OEMs for their line-card retimer and gearbox applications.”
A third Inphi 400-gigabit chip family, supporting four channels of 100-gigabit PAM-4 within optical modules, will be announced later this year or early next year.
400G PAM-4 drivers
Inphi’s PAM-4 chips have been developed in anticipation of the emergence of next-generation 6.4-terabit and 12.8-terabit switch silicon and the accompanying 400-gigabit optical modules in form factors such as the OSFP and QSFP-DD.
Sheth highlights Broadcom’s Tomahawk-III, start-up Innovium’s Teralynx and Mellanox’s Spectrum-2 switch silicon. All have 50-gigabit PAM-4 interfaces implemented using 25-gigabaud signalling and PAM-4 modulation.
“What is required is that such switch silicon is available and mature in order for us to deploy our PAM-4 products,” says Sheth. “Everything we are seeing suggests that the switch silicon will be available by the end of this year and will probably go into production by the end of next year.”
Several optical module makers are starting to build 8x50-gigabit OSFP and QSFP-DD products
The other key product that needs to be available is the 400-gigabit optical module. The industry is pursuing two main form factors: the OSFP and the QSFP-DD. Google and switch maker Arista Networks are proponents of the OSFP form factor while the likes of Amazon, Facebook and Cisco back the QSFP-DD. Google has said that it will initially use an 8x50-gigabit module implementation for 400 gigabits. Such a solution uses existing, mature 25-gigabit optics and will be available sooner than the more demanding 4x100-gigabit design that Amazon, Facebook and Cisco are waiting for. The 4x100-gigabit design requires 50Gbaud optics and a 50Gbaud PAM-4 chip.
Inphi says several optical module makers are starting to build 8x50-gigabit OSFP and QSFP-DD products and that its Polaris and Vega family of chips anticipate such deployments.
“We expect 100-gigabit optics to be available sometime around mid-2018 and our next-generation 100-gigabit PAM-4 will be available in the early part of next year,” says Sheth.
Accordingly, the combination of the switch silicon and optics means that the complete ecosystem will be in place next year, he says.
Vega
The Polaris chip, used within an optical module, equalises the optical non-linearities of the incoming 50-gigabit PAM-4 signals. The optical signal is created using 25-gigabit lasers that are modulated using a PAM-4 signal that encodes two bits per signal. “When you run PAM-4 over fibre - whether multi-mode or single mode - the signal undergoes a lot of distortion,” says Sheth. “You need the DSP to clean up that distortion.”
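The two-bits-per-symbol encoding Sheth refers to can be sketched as a simple mapping of bit pairs onto four amplitude levels. A Gray-coded map is shown here for illustration; the exact level coding any given chip uses is an implementation detail.

```python
# Gray-coded mapping of bit pairs to the four PAM-4 amplitude levels, so
# adjacent levels differ by a single bit (limits the damage of a level slip).
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def pam4_encode(bits):
    """Pack a bit stream into PAM-4 symbols: two bits per symbol."""
    assert len(bits) % 2 == 0, "PAM-4 consumes bits in pairs"
    return [PAM4_LEVELS[(bits[i], bits[i + 1])]
            for i in range(0, len(bits), 2)]

# Eight bits become four symbols: the same symbol (baud) rate carries
# twice the bit rate of NRZ.
symbols = pam4_encode([0, 0, 0, 1, 1, 1, 1, 0])   # [-3, -1, 1, 3]
```

Because the four levels are packed into the same amplitude range that NRZ splits in two, the spacing between levels shrinks, which is why the received signal needs the DSP clean-up Sheth describes.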
The Vega chip, in contrast, sits on enterprise line cards and adds digital functionality that is not supported by the switch silicon. Most enterprise boxes support legacy data rates such as 10 gigabit and 1 gigabit. The Vega chip supports such legacy rates as well as 25, 50, 100, 200 and 400 gigabit, says Sheth.
The Vega chip can add forward-error correction to a data stream and decode it. As well as FEC, the chip also has physical coding sublayer (PCS) functionality. “Every time you need to encode a signal with FEC or decode it, you need to unravel the Ethernet data stream and then reassemble it,” says Sheth.
Also on-chip is a crossbar that can switch any lane to any other lane before feeding the data to the switch silicon.
Sheth stresses that not all switch chip applications need the Vega. For large-scale data centre applications that use stripped-down systems, the optical module would feed the PAM-4 signal directly into the switch silicon, requiring the use of the Polaris chip only.
A second role for Vega is driving PAM-4 signals across a system. “If you want to drive 50-gigabit PAM-4 signals electrically across a system line card and noisy backplane then you need a chip like Vega,” says Sheth.
A further application for the Vega chip is as a ‘gearbox’, converting between 50-gigabit and 25-gigabit line rates. Once high-capacity switch silicon with 50G PAM-4 interfaces is deployed, the Vega chip will enable the conversion between 50-gigabit PAM-4 and 25-gigabit non-return-to-zero (NRZ) signals. System vendors will then be able to interface 100-gigabit (4x25-gigabit) QSFP28 modules with these new switch chips.
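The rate arithmetic behind the gearbox can be sketched in a toy bit-level model. This is only an illustration of why the 2:1 ratio works out; a real gearbox operates on serial bit streams and is FEC-aware, and the level coding below is an assumption.

```python
# Hypothetical level coding for the sketch: each PAM-4 level carries two bits,
# so one 50G PAM-4 lane holds the traffic of two 25G NRZ lanes.
LEVEL_TO_BITS = {-3: (0, 0), -1: (0, 1), 1: (1, 1), 3: (1, 0)}

def gearbox_pam4_to_nrz(pam4_symbols):
    """Split one PAM-4 symbol stream into two NRZ bit streams (toy model)."""
    lane_a, lane_b = [], []
    for symbol in pam4_symbols:
        bit_a, bit_b = LEVEL_TO_BITS[symbol]
        lane_a.append(bit_a)
        lane_b.append(bit_b)
    return lane_a, lane_b

lane_a, lane_b = gearbox_pam4_to_nrz([-3, 3, 1])  # ([0, 1, 1], [0, 0, 1])
```

Run in the opposite direction, the same two-for-one mapping lets four 25-gigabit NRZ lanes from a QSFP28 feed two 50-gigabit PAM-4 ports on the switch silicon.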
One hundred gigabit modules will be deployed for at least another three to four years, and the price of such modules has come down significantly. “For a lot of the cloud players it comes down to cost: are 128 ports at 100 gigabits cheaper than 32 400-gigabit modules?” says Sheth. The company says it is seeing a lot of interest in this application.
We expect 100-gigabit optics to be available sometime around mid-2018 and our next-generation 100-gigabit PAM-4 will be available in the early part of next year
Availability
Inphi has announced two Vega chips: a 400-gigabit gearbox and a 400-gigabit retimer and gearbox IC. “We are sampling,” says Sheth. “We have got customers running traffic on their line cards.” General availability is expected in the first quarter of 2018.
As for the 4x100-gigabit PAM-4 chips, Sheth expects solutions to appear in the first half of next year: “We have to see how mature the optics are at that point and whether something can go into production in 2018.”
Inphi maintains that the 8x50-gigabit optical module solutions will go to market first and that the 4x100-gigabit variants will appear a year later. “If you look at our schedules, Polaris and the 4x100-gigabit PAM-4 chip are one year apart,” he says.
SFP-DD: Turning the SFP into a 100-gigabit module
An industry initiative has started to quadruple the data rate of the SFP, the smallest of the pluggable optical modules. The Small Form Factor Pluggable – Double Density (SFP-DD) is being designed to support 100 gigabits by doubling the SFP’s electrical lanes from one to two and doubling their speed.
The new multi-source agreement (MSA), to be completed during 2018, will be rated at 3.5W; the same power envelope as the current 100-gigabit QSFP module, even though the SFP-DD is expected to be 2.5x smaller in size.
The front panel of a 1-rack-unit box will be able to support up to 96 SFP-DD modules, a total capacity of 9.6 terabits.
The SFP-DD is adopting a similar philosophy as that being used for the 400-gigabit QSFP-DD MSA: an SFP-DD port will support legacy SFP modules - the 25-gigabit SFP28 and 10-gigabit SFP - just as the QSFP-DD will be backward compatible with existing QSFP modules.
“Time and time again we have heard with the QSFP-DD that plugging in legacy modules is a key benefit of that technology,” says Scott Sommers, group product manager at Molex and the chair of the new SFP-DD MSA. Sommers is also a co-chair of the QSFP-DD MSA.
Interest in the SFP-DD started among several like-minded companies at the OFC show held in March. Companies such as Alibaba, Molex, Hewlett Packard Enterprise and Huawei agreed on the need to extend the speed and density of the SFP similar to how the QSFP-DD is extending the QSFP.
The main interest in the SFP-DD is for server to top-of-rack switch connections. The SFP-DD will support one or two lanes of 28 gigabit-per-second (Gbps) or of 56Gbps using 4-level pulse-amplitude modulation (PAM-4).
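The lane options translate into module capacity as simple lane-rate arithmetic: a 56Gbps PAM-4 lane is a 28-gigabaud signal carrying two bits per symbol, and two such lanes give the SFP-DD its 100-gigabit-plus raw capacity. A quick sketch (nominal rates, ignoring FEC and other overhead):

```python
def lane_rate_gbps(gbaud, bits_per_symbol):
    """Lane bit rate: symbol rate times bits per symbol."""
    return gbaud * bits_per_symbol

def module_capacity_gbps(lanes, gbaud, bits_per_symbol):
    """Raw module capacity: lane count times lane bit rate."""
    return lanes * lane_rate_gbps(gbaud, bits_per_symbol)

# SFP-DD options from the article, at 28-gigabaud signalling:
nrz_lane = lane_rate_gbps(28, 1)            # 28 Gbps (NRZ, 1 bit/symbol)
pam4_lane = lane_rate_gbps(28, 2)           # 56 Gbps (PAM-4, 2 bits/symbol)
raw_module = module_capacity_gbps(2, 28, 2)  # 112 Gbps raw, ~100G after overhead
```

The same arithmetic explains the breakout case discussed later: four two-lane SFP-DD modules match the eight lanes of a 400-gigabit QSFP-DD.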
“We tried to find server companies and companies that could help with the mechanical form factor like connector companies, transceiver companies and systems companies,” says Sommers. Fourteen promoter companies supported the MSA at its launch in July.
Specification work
The SFP-DD MSA is developing a preliminary hardware release that will be published in the coming months. This will include the single-port surface mount connector, the cage surrounding it and the module’s dimensions.
The goal is that the module will be able to support 3.5W. “Once we pin down the form factor, we will be able to have a better idea whether 3.5W is achievable,” says Sommers. “But we are very confident with the goal.”
The publication of the mechanical hardware specification will lead to other companies - contributors - responding with their comments and suggestions. “This will make the specification better but it does slow down things,” says Sommers.
The MSA’s attention will turn to the module’s software management specification once the hardware release is published. The software must understand what type of SFP module is plugged into the SFP-DD port, for example.
Supporting two 56Gbps lanes using PAM-4 means that up to four SFP-DD modules can be interfaced to a 400-gigabit QSFP-DD. But the QSFP-DD is not the only 400-gigabit module the SFP-DD could be used with in such a ‘breakout’ mode. “I don’t want to discount the OSFP [MSA],” says Sommers. “That is a similar type of technology to the QSFP-DD where it is an 8-channel-enabling form factor.”
The SFP could eventually support a 200-gigabit capacity. “It is no secret that this industry is looking to double speeds every few years,” says Sommers. He stresses this isn't the goal at present but it is there: “This MSA, for now, is really focussed on 25-gigabit non-return-to-zero or 50-gigabit PAM-4.”
Challenges
One challenge Sommers highlights for the SFP-DD is achieving a mechanically robust design that meets the 3.5W power goal as well as the signal-integrity demands of two 56Gbps lanes.
The signal integrity advances achieved with the QSFP-DD work will be adopted for the SFP-DD. “That is why we don’t think it is going to take as long as the QSFP-DD,” he says.
The electro-optic components need to be squeezed into a smaller space and with the SFP-DD’s two lanes, there is a doubling of the copper lines going into the same opening. “This is not insurmountable but it is challenging,” says Sommers.
Further reading
Mellanox blog on the SFP-DD
The OIF’s 400ZR coherent interface starts to take shape
Part 2: Coherent developments
The Optical Internetworking Forum’s (OIF) group tasked with developing two styles of 400-gigabit coherent interface is now concentrating its efforts on one of the two.
When first announced last November, the 400ZR project planned to define a dense wavelength-division multiplexing (DWDM) 400-gigabit interface and a single-wavelength one. Now the work is concentrating on the DWDM interface, with the single-channel interface deemed secondary.
“It [the single channel] appears to be a very small percentage of what the fielded units would be,” says Karl Gass of Qorvo, optical vice chair of the OIF Physical and Link Layer working group, the group responsible for the 400ZR work.
The likelihood is that the resulting optical module will serve both applications. “Realistically, probably both [interfaces] will use a tunable laser because the goal is to have the same hardware,” says Gass.
The resulting module may also only have a reach of 80km, shorter than the original goal of up to 120km, due to the challenging optical link budget.
Origins and status
The 400ZR project began after Microsoft and other large-scale data centre players such as Google and Facebook approached the OIF to develop an interoperable 400-gigabit coherent interface they could then buy from multiple optical module makers.
The internet content providers’ interest in an 80km-plus link is to connect premises across the metro. “Eighty kilometres is the magic number from a latency standpoint so that multiple buildings can look like a single mega data centre,” says Nathan Tracy of TE Connectivity and the OIF’s vice president of marketing.
Since then, traditional service providers have shown an interest in 400ZR for their metro needs. The telcos’ requirements differ from those of the data centre players, causing the group to tweak the channel requirements. This is the current focus of the work, with the OIF collaborating with the ITU.
The catch is how much can we strip everything down and still meet a large percentage of the use cases
“The ITU does a lot of work on channels and they have a channel measurement methodology,” says Gass. “They are working with us as we try to do some division of labour.”
The group will choose a forward error correction (FEC) scheme once there is common agreement on the channel. “Imagine all those [coherent] DSP makers in the same room, each one recommending a different FEC,” says Gass. “We are all trying to figure out how to compare the FEC schemes on a level playing field.”
Meeting the link budget is challenging, says Gass, which is why the link might end up being 80km only. “The catch is how much can we strip everything down and still meet a large percentage of the use cases.”
The cloud is the biggest voice in the universe
400ZR form factors
Once the FEC is chosen, the power envelope will be fine-tuned and then the discussion will move to form factors. The OIF says it is still too early to discuss whether the project will select a particular form factor. Potential candidates include the OSFP MSA and the CFP8.
The industry assumption is that the 80km-plus 400ZR digital coherent optics module will consume around 15W, requiring a very low-power coherent DSP that will be made using 7nm CMOS.
“There is strong support across the industry for this project, evidenced by the fact that project calls are happening more frequently to make the progress happen,” says Tracy.
Why the urgency? “The cloud is the biggest voice in the universe,” says Tracy. To support the move of data and applications to the cloud, the infrastructure has to evolve, leading to the data centre players linking smaller locations spread across the metro.
“At the same time, the next-gen speed that is going to be used in these data centres - and therefore outside the data centres - is 400 gigabit,” says Tracy.
