Microsoft has trialled optical modules that use signalling technology developed by the Open Eye Consortium.
The webscale player says optical modules using Open Eye’s analogue 4-level pulse-amplitude modulation (PAM-4) technology consume less power than modules with a PAM-4 digital signal processor (DSP).
“Open Eye has shown us at least an ability that we can do better on power,” says Brad Booth, director, next cloud system architecture, Azure hardware systems and infrastructure at Microsoft, during an Open Eye webinar.
Optical module power consumption is a key element of the total power budget of data centres that can have as many as 100,000 servers and 50,000 switches.
“You want to avoid running past your limit because then you have to build another data centre,” says Booth.
But challenges remain before Open Eye becomes a mainstream technology, says Dale Murray, principal analyst at market research firm LightCounting.
Open Eye MSA
When the IEEE standards body developed specifications using 50-gigabit PAM-4 optical signals, the assumption was that a DSP would be needed for signal recovery given the optics’ limited bandwidth.
But as optics improved, companies wondered if analogue circuitry could be used after all.
Such PAM-4 analogue chips would be similar to non-return-to-zero (NRZ) signalling chips used in modules, as would the chip assembly and testing, says Timothy Vang, vice president of marketing and applications, signal integrity products group, Semtech. The analogue chips also promised to be cheaper than DSPs.
This led to the formation of the Open Eye multi-source agreement (MSA) in January 2019. Led by MACOM and Semtech, the MSA now has 37 member companies.
“We felt that if we could enable that capability, you could use the same low-cost optics and, with an Open Eye specification - an eye-mask specification - you get a manufacturable low-cost ecosystem,” says Vang. “That was our goal and we were not alone.”
But a key issue is whether Open Eye solutions will work with existing DSP-based PAM-4 modules that have their own testing procedure.
“Can they eliminate all concerns for interoperability between analogue and DSP based modules without dual testing?” says Murray. “And will end users go with a non-standard solution rather than an IEEE-standard solution?”
“We do face the dilemma LightCounting points out,” says Vang. “It is possible there are poor or older DSP-based modules that wouldn’t pass the Open Eye test, and that could lead data centres to say: ‘Well, that is not good enough’.”
“It is a concern,” says Microsoft’s Booth. The first Open Eye samples Microsoft received didn't talk to all the DSP-based modules, he says, but the next revision appeared to address the issue.
“Digital interfaces are certainly easier, but we're burning a lot of power with the DSPs, in the modules and the switch ASIC,” says Booth. “The switch ASIC needs it for direct attach copper (DAC) cables.”
However, the MSA believes that the cost, power and latency advantages of the Open Eye ICs will prove decisive.
Data centre considerations
Microsoft’s Booth outlined the challenges data centre operators face as bandwidth requirements grow exponentially.
The drivers for greater bandwidth include more home-workers using cloud services during the Covid-19 pandemic and the adoption of artificial intelligence and machine learning.
“With machine learning, the more machines you have talking to each other, the more intensive jobs you can handle,” says Booth. “But for distances greater than a few meters you fall into the realm of the 100m range, and that drives you to an optical solution.”
But optics are costly, and going from 100-gigabit to 400-gigabit optical modules has not reduced the power consumed per bit. Booth says 400-gigabit SR8 modules consume about 10W, while for 400-gigabit DR4 and FR4 modules the figure is 12W. For 100-gigabit modules, the power consumed is a quarter of these figures, so the energy per bit is unchanged.
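Booth's figures can be checked with a back-of-the-envelope calculation. The Python sketch below converts module power into energy per bit. The function name is illustrative, and the assumption that a 100-gigabit module draws exactly a quarter of the 400-gigabit figure comes from the wording above rather than from measured data.

```python
def picojoules_per_bit(power_w: float, rate_gbps: float) -> float:
    """Energy per bit in pJ: W / (Gb/s) gives nJ per bit, times 1000 gives pJ."""
    return power_w / rate_gbps * 1000.0

# Approximate 400-gigabit module powers quoted by Booth.
modules_400g = {"SR8": 10.0, "DR4/FR4": 12.0}

for name, power_w in modules_400g.items():
    e400 = picojoules_per_bit(power_w, 400)
    # Assumption: the 100-gigabit equivalent draws a quarter of the power.
    e100 = picojoules_per_bit(power_w / 4, 100)
    print(f"{name}: 400G {e400:.0f} pJ/bit, 100G {e100:.0f} pJ/bit")
```

On these numbers, the energy per bit is the same at both speeds: about 25pJ for SR8 and 30pJ for DR4 and FR4.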
Low latency is another requirement if data centres are to adopt disaggregated servers where memory is pooled and shared between platforms. “Adding latency to these links, which are fairly short, is an impediment to do this disaggregation scenario,” says Booth.
Microsoft trialled an eight-lane on-board optics COBO module using Open Eye and achieved a 30 per cent power saving compared to QSFP-DD or OSFP DSP-based pluggable modules.
Open Eye technology could also be used for co-packaged optics, promising a further 10 per cent power saving, says Booth.
Given future 51.2-terabit and 102.4-terabit switch silicon, with their significant connectivity, this will help reduce the overall thermal load and hence the cooling, which is part of a data centre’s overall power consumption.
“Anything that keeps that heat lower as I increase the bandwidth is an advantage,” says Booth.
Cost, power and latency
The Open Eye MSA claims it will cost a company $80 million to develop a next-generation 5nm CMOS PAM-4 DSP. Such a hefty development cost will need to be recouped, adding to a module's price.
Semtech says its Open Eye analogue ICs use a BiCMOS process which is a far cheaper approach.
The PAM-4 DSPs may consume more power, says Vang, but that will improve with newer CMOS processes. First-generation DSPs were implemented using 16nm CMOS while the latest devices are at 7nm CMOS.
So the power advantage of Open Eye devices will shrink, says Vang, although Semtech claims its second-generation Open Eye devices will reduce power by 20 per cent.
Open Eye also has a latency advantage. According to analysis from Nvidia (Mellanox), a PAM-4 DSP-based optical module adds 100ns of latency per link.
In a multi-hop network linking servers, the optical modules account for 40 per cent of the total latency, the rest being the switch, the network interface card and the optical flight time. Using Open Eye-based modules, the optical module portion shrinks to just eight per cent.
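The two percentages imply how much the module latency itself must shrink. The Python sketch below works this out, assuming the non-module latency (switch, network interface card and flight time) stays the same when DSP-based modules are swapped for Open Eye ones. The function name is illustrative, and the results are inferred from the quoted shares, not figures given by the MSA.

```python
def open_eye_latency_ratios(dsp_share: float, open_eye_share: float) -> tuple[float, float]:
    """Return (module latency ratio, total latency ratio) of Open Eye versus DSP.

    The DSP-based total latency is normalised to 1.0, and the non-module
    latency is assumed to be unchanged.
    """
    other = 1.0 - dsp_share                      # switch, NIC and flight time
    new_total = other / (1.0 - open_eye_share)   # 'other' is now (1 - share) of the total
    new_module = new_total - other
    return new_module / dsp_share, new_total

module_ratio, total_ratio = open_eye_latency_ratios(0.40, 0.08)
print(f"Module latency falls to {module_ratio:.0%} of the DSP figure")
print(f"End-to-end latency falls to {total_ratio:.0%} of the DSP figure")
```

Under these assumptions, the module latency drops to about 13 per cent of the DSP-based figure, or roughly 13ns against the 100ns cited, and the end-to-end latency falls by about a third.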
Specification status
The Open Eye MSA has specified 53-gigabit PAM-4 signalling for long-reach and short-reach optical links.
In particular, the MSA is adding 50-gigabit LR1 to its 200-gigabit FR4 specification, while ER1 lite and 200-gigabit LR4 specifications will be completed in early 2021. Meanwhile, the multi-mode 50-gigabit SR1, 200-gigabit SR4 and 400-gigabit SR8 specifications are done.
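The specification names map onto lane counts in a simple way. The Python sketch below assumes the usual Ethernet naming convention, in which the trailing digit is the number of lanes, and a nominal 50 gigabits per lane. The helper names are illustrative. It also shows the symbol rate of a 53-gigabit PAM-4 lane, given that four levels carry two bits per symbol.

```python
import math

LANE_RATE_GBPS = 50  # nominal per-lane rate; the MSA signals at 53 gigabits

def lanes(spec: str) -> int:
    """Lane count, taken as the trailing digit of names such as 'SR4' or 'LR1'."""
    return int(spec[-1])

def symbol_rate_gbaud(bit_rate_gbps: float, levels: int = 4) -> float:
    """PAM-N carries log2(N) bits per symbol."""
    return bit_rate_gbps / math.log2(levels)

for spec in ["LR1", "FR4", "LR4", "SR1", "SR4", "SR8"]:
    total = lanes(spec) * LANE_RATE_GBPS
    print(f"{spec}: {lanes(spec)} x {LANE_RATE_GBPS}G = {total}G")

print(f"53G PAM-4 lane: {symbol_rate_gbaud(53):.1f} GBaud")
```

The ER1 lite variant is left out, since its name does not end in a lane digit.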
The third phase of the Open Eye work, producing a 100-gigabit PAM-4 specification, is starting now. Achieving the specification is important for Open Eye since modules are moving to 100-gigabit PAM-4, says Murray.
Products
Semtech is already selling 200-gigabit Open Eye short-reach chips, part of its Tri-Edge family. The two 4x50-gigabit devices are dubbed the GN2558 and GN2559.
The GN2558 is the transmitter chip. It retimes four 50-gigabit signals from the host and feeds them to the integrated VCSEL drivers that generate the optical PAM-4 signals sent over four fibres. The four photo-detector outputs at the receiver are then fed to the GN2559, which includes trans-impedance amplifiers (TIAs) and clock data recovery.
Equalisation is used within both devices. “The eye is opened on the transmitter as well as on the receiver; they equalise the signal in each direction,” says Vang.
The Semtech devices are being used for a 200-gigabit SR4 module and for a 400-gigabit SR8 active optical cable where two of each chip are used.
Semtech will launch Tri-Edge long-reach Open Eye chips. The chips will drive externally-modulated lasers (EMLs), directly-modulated lasers (DMLs) and silicon photonics-based designs for single-mode fibre applications.
“We have early versions of these chips sampled and demonstrated,” says Vang. “In the Open Eye MSA, we have shown the chips interoperating with, for example, MACOM’s chipset.”
Semtech’s Tri-Edge solutions are in designs with over two dozen module customers, says Vang.
Meanwhile, pluggable module maker CIG detailed a 200-gigabit QSFP56-FR4 while Optomind discussed a 400-gigabit QSFP56-DD active optical cable design as part of the Open Eye webinar.