Arista Networks embeds optics to boost 100 Gig port density
Arista Networks' latest 7500E switch is designed to improve the economics of building large-scale cloud networks.
The platform packs 30 Terabits-per-second (Tbps) of switching capacity into an 11 rack unit (RU) chassis, the same chassis as Arista's existing 7500 switch, which, when launched in 2010, was described as capable of supporting several generations of switch design.

"The CFP2 is becoming available such that by the end of this year there might be supply for board vendors to think about releasing them in 2014. That is too far off."
Martin Hull, Arista Networks
The 7500E features new switch fabric and line cards. One of the line cards uses board-mounted optics instead of pluggable transceivers. Each of the line card's ports is 'triple speed', supporting 10, 40 or 100 Gigabit Ethernet (GbE). The 7500E platform can be configured with up to 1,152 10GbE, 288 40GbE or 96 100GbE interfaces.
The switch's Extensible Operating System (EOS) also plays a key role in enabling cloud networks. "The EOS software, run on all Arista's switches, enables customers to build, manage, provision and automate these large scale cloud networks," says Martin Hull, senior product manager at Arista Networks.
Applications
Arista, founded in 2004 and launched in 2008, has established itself as a leading switch player for the high-frequency trading market. Yet this is one market that its latest core switch is not being aimed at.
"With the exception of high-frequency trading, the 7500 is applicable to all data centre markets," says Hull. "That it not to say it couldn't be applicable to high-frequency trading but what you generally find is that their networks are not large, and are focussed purely on speed of execution of their transactions." Latency is a key networking performance parameter for trading.
The 7500E is being aimed at Web 2.0 companies and cloud service providers. The Web 2.0 players include large social networking and on-line search companies. Such players have huge data centres with up to 100,000 servers.
The same network architecture can also be scaled down to meet the requirements of large 'Fortune 500' enterprises. "Such companies are being challenged to deliver private cloud at the same competitive price points as the public cloud," says Hull.
The 7500 switches are typically used in a two-tier architecture. For the largest networks, 16 or 32 switches are used on the same switching tier in an arrangement known as a parallel spine.
A common switch architecture for traditional IT applications such as e-mail and e-commerce uses three tiers of switching. These include core switches linked to distribution switches, typically a pair of switches used in a given area, and top-of-rack or access switches connected to each distribution pair.
For newer data centre applications such as social networking, cloud services and search, the computation requirements result in far greater traffic shared on the same tier of switching, referred to as east-west traffic. "What has happened is that the single pair of distribution switches no longer has the capacity to handle all of the traffic in that distribution area," says Hull.
Customers address east-west traffic by deploying more switches in parallel. Eight or 16 distribution switches are used instead of a pair. "Every access switch is now connected to each one of those 16 distribution switches - we call them spine switches," says Hull.
The resulting two-tier design, comprising access switches and distribution switches, requires that each access switch has significant bandwidth between itself and any other access switch. As a result, many 7500 switches - 16 or 32 - can be used in parallel at the distribution layer.
"If I'm a Fortune 500 company, however, I don't need 16 of those switches," says Hull. "I can scale down, where four or maybe two [switches] are enough." Arista also offers a smaller 4-slot chassis as well as the 8 slot (11 RU) 7500E platform.
7500E specification
The switch has a capacity of 30Tbps. When the switch is fully configured with 1,152 10GbE ports, it equates to 23Tbps of duplex traffic. The system is designed with redundancy in place.
"We have six fabric cards in each chassis," says Hull, "If I lose one, I still have 25 Terabits [of switching fabric]; enough forwarding capacity to support the full line rates on all those ports." Redundancy also applies to the system's four power supplies. Supplies can fail and the switch will continue to work, says Hull.
The switch can process 14.4 billion 64-byte packets a second. This, says Hull, is another way of stating the switch capacity while confirming it is non-blocking.
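A quick sanity check of those headline figures, assuming nothing beyond the port counts quoted above: 1,152 ports at 10Gbps each gives 11.52Tbps in each direction, or roughly 23Tbps of duplex traffic, which is the number Arista cites. The short sketch below simply reproduces that arithmetic.

```python
# Rough check of the quoted 7500E capacity figures; a sketch of the arithmetic
# above, not an Arista-supplied calculation.
ports_10gbe = 1152                          # maximum 10GbE configuration
one_way_tbps = ports_10gbe * 10 / 1000      # 11.52 Tbps in each direction
duplex_tbps = 2 * one_way_tbps              # ~23 Tbps duplex, as quoted

print(f"{one_way_tbps:.2f} Tbps per direction, {duplex_tbps:.2f} Tbps duplex")
```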
The 7500E comes with four line card options: three use pluggable optics while the fourth uses embedded optics, as mentioned, based on 12 10Gbps transmit and 12 10Gbps receive channels (see table).
Using line cards supporting pluggable optics provides the customer the flexibility of using transceivers with various reach options, based on requirements. "But at 100 Gigabit, the limiting factor for customers is the size of the pluggable module," says Hull.
Using a CFP optical module, each card supports four 100Gbps ports only. The newer CFP2 modules will double the number to eight. "The CFP2 is becoming available such that by the end of this year there might be supply for board vendors to think about releasing them in 2014," says Hull. "That is too far off."
Arista's board-mounted optics deliver 12 100GbE ports per line card.
The board-mounted triple-speed ports adhere to the IEEE 100 Gigabit SR10 standard, with a reach of 150m over OM4 fibre. The channels can be used discretely for 10GbE, grouped in fours for 40GbE, or combined as a set of 10 for 100GbE.
"At 100 Gig, the IEEE spec uses 20 out of 24 lanes (10 transmit and 10 receive); we are using all 24," says Hull. "We can do 12 10GbE, we can do three 40GbE, but we can still only do one 100Gbps because we have a little bit left over but not enough to make another whole 100GbE." In turn, the module can be configured as two 40GbE and four 10GbE ports, or 40GbE and eight 10GbE.
Using board-mounted optics reduces the cost of 100Gbps line card ports. A full 96-port 100GbE switch configuration achieves a cost of $10k/port, whereas with existing CFP modules the cost is $30k-50k/port, claims Arista.
Arista quotes 10GbE as costing $550 per line card port, not including the pluggable transceiver. At 40GbE this scales to $2,200. For 100GbE, the $10k/port comprises the scaled-up port cost at 100GbE ($2.2k x 2.5) plus the cost of the embedded optics. The power consumption is under 4W/port when the system is fully loaded.
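Putting those numbers together: scaling the 40GbE port cost by 2.5 gives $5,500 for the line-card portion, implying that the embedded optics account for the remaining roughly $4,500 of the $10k figure. The optics share below is inferred from Arista's quoted numbers rather than stated by the company.

```python
# Reproducing the per-port cost arithmetic quoted above (list prices as cited).
cost_40gbe = 2200                     # $/port, line card only
cost_100gbe_total = 10000             # $/port, embedded-optics card, optics included

line_card_portion = cost_40gbe * 2.5  # $5,500 scaled-up 100GbE port cost
implied_optics = cost_100gbe_total - line_card_portion   # ~$4,500, inferred not quoted
print(f"line-card portion ${line_card_portion:,.0f}, "
      f"implied optics share ${implied_optics:,.0f}")
```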
The company uses merchant chips rather than an in-house ASIC for its switch platform. Can't other vendors develop similar performance systems based on the same ICs? "They could, but it is not easy," says Hull.
The company points out that merchant chip switch vendors use a CMOS process node that is typically a generation ahead of state-of-the-art ASICs. "We have high-performance forwarding engines, six per line card, each a discrete system-on-chip solution," says Hull. "These have the technology to do all the Layer 2 and Layer 3 processing." All these devices on one board talk to all the other chips on the other cards through the fabric.
In the last year, equipment makers have decided to bring silicon photonics technology in-house: Cisco Systems has acquired Lightwire while Mellanox Technologies has announced its plan to acquire Kotura.
Arista says it is watching silicon photonics developments with keen interest. "Silicon photonics is very interesting and we are following that," says Hull. "You will see over the next few years that silicon photonics will enable us to continue to add density."
There is a limit to where existing photonics will go, and silicon photonics overcomes some of those limitations, he says.
Extensible Operating System
Arista highlights several characteristics of its switch operating system. The EOS is standards-compliant, self-healing, and supports network virtualisation and software-defined networking (SDN).
The operating system implements such protocols as Border Gateway Protocol (BGP) and spanning tree. "We don't have proprietary protocols," says Hull. "We support VXLAN [Virtual Extensible LAN], an open-standards way of doing a Layer 2 overlay over [Layer] 3."
EOS is also described as self-healing. The modular operating system is composed of multiple software processes, each process described as an agent. "If you are running a software process and it is killed because it is misbehaving, when it comes back typically its work is lost," says Hull. EOS is self-healing in that should an agent need to be restarted, it can continue with its previous data.
"We have software logic in the system that monitors all the agents to make sure none are misbehaving," says Hull. "If it finds an agent doing stuff that it should not, it stops it, restarts it and the process comes back running with the same data." The data is not packet related, says Hull, rather the state of the network.
The operating system also supports network virtualisation, one aspect being VXLAN. VXLAN is one of the technologies that lets cloud providers offer a customer server resources over a logical network even when the server hardware is distributed across several physical networks, says Hull. "Even a VLAN can be considered as network virtualisation but VXLAN is the most topical."
Support for SDN has been an inherent part of EOS from its inception, says Hull. "EOS is open - the customers can write scripts, they can write their own C-code, or they can install Linux packages; all can run on our switches." These agents can talk back to the customer's management systems. "They are able to automate the interactions between their systems and our switches using extensions to EOS," he says.
"We encompass most aspects of SDN," says Hull. "We will write new features and new extensions but we do not have to re-architect our OS to provide an SDN feature."
Arista is terse about its switch roadmap.
"Any future product would improve performance - capacity, table sizes, price-per-port and density," says Hull. "And there will be innovation in the platform's software.
OFC/NFOEC 2013 industry reflections - Final part
Gazettabyte spoke with Jörg-Peter Elbers, vice president, advanced technology at ADVA Optical Networking about the state of the optical industry following the recent OFC/NFOEC exhibition.

"There were many people in the OFC workshops talking about getting rid of pluggability and the cages and getting the stuff mounted on the printed circuit board instead, as a cheaper, more scalable approach"
Jörg-Peter Elbers, ADVA Optical Networking
Q: What was noteworthy at the show?
A: There were three big themes and a couple of additional ones that were evolutionary. The headlines I heard most were software-defined networking (SDN), Network Functions Virtualisation (NFV) and silicon photonics.
Other themes included what needs to be done in next-generation data centres to drive greater interconnect capacity and switching, how we go beyond 100 Gig, and whether flexible grid is required or not.
The consensus is that flex grid is needed if we want to go to 400 Gig and one Terabit. Flex grid gives us the capability to form bigger pipes and get those chunks of signal through the network. But equally, it allows an interface not only to transport 400 Gig or 1 Terabit as one chunk of spectrum, but also to slice and dice the signal so that it can use holes in the network, similar to what radio does.
With the radio spectrum, you allocate slices to establish a communication link. In optics, you have the optical fibre spectrum and you want to get the capacity between Point A and Point B. You look at the spectrum, where the holes [spectrum gaps] are, and then shape the signal - think of it as software-defined optics - to fit into those holes.
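A minimal sketch of that idea: model the fibre spectrum as a row of fixed-width slots (12.5GHz is the usual flex-grid granularity, assumed here), and place a demand in one contiguous hole if possible, or slice it across several holes if not. This illustrates the 'slice and dice' concept Elbers describes, not any particular vendor's allocation algorithm, and the slot count for a 400 Gig super-channel is an assumption for the example.

```python
# Toy flexible-grid allocator: the spectrum is a list of 12.5GHz slots (True = free).
# A demand is filled from the largest free holes first, slicing it across holes
# when no single hole is big enough. Illustrative only.

def free_holes(spectrum):
    """Return (start, length) for each contiguous run of free slots."""
    holes, start = [], None
    for i, free in enumerate(spectrum + [False]):      # sentinel closes the last run
        if free and start is None:
            start = i
        elif not free and start is not None:
            holes.append((start, i - start))
            start = None
    return holes

def allocate(spectrum, slots_needed):
    """Mark slots as used and return their indices, or None if spectrum is short."""
    used = []
    for start, length in sorted(free_holes(spectrum), key=lambda h: -h[1]):
        take = min(length, slots_needed - len(used))
        used.extend(range(start, start + take))
        if len(used) == slots_needed:
            for i in used:
                spectrum[i] = False
            return used
    return None

# 20 slots with a few already occupied; assume a 400 Gig super-channel needs 6 slots.
spectrum = [True] * 20
for i in (4, 5, 6, 12, 13):
    spectrum[i] = False
print(allocate(spectrum, 6))
```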
There is a lot of SDN activity. People are thinking about what it means, and there were lots of announcements, experiments and demonstrations.
At the same time as OFC/NFOEC, the Open Networking Foundation agreed to found an optical transport working group to come up with OpenFlow extensions for optical transport connectivity. At the show, people were looking into use cases, the respective technology and what is required to make this happen.
SDN starts at the packet layer but there is value in providing big pipes for bandwidth-on-demand. Clearly with cloud computing and cloud data centres, people are moving from a localised model to a cloud one, and this adds merit to the bandwidth-on-demand scenario.
This is probably the biggest use case for extending SDN into the optical domain through an interface that can be virtualised and shared by multiple tenants.
"This is not the end of III-V photonics. There are many III-V players, vertically integrated, that have shown that they can integrate and get compact, high-quality circuits"
Network Functions Virtualisation: Why was that discussed at OFC?
At first glance, it was not obvious. But looking at it in more detail, much of the infrastructure over which those network functions run is optical.
Just take one Network Functions Virtualisation example: the mobile backhaul space. If you look at LTE/ LTE Advanced, there is clearly a push to put in more fibre and more optical infrastructure.
At the same time, you still have a bandwidth crunch. It is very difficult to have enough bandwidth to the antenna to support all the users and give them the quality of experience they expect.
Putting networking functions such as caching at a cell site, deeper within the network, and managing a virtualised session there, is an interesting trend that operators are looking at, and one for which we, through our partnership with Saguna Networks, have shown a solution.
Network functions such as caching, firewalling and wide area network (WAN) optimisation are higher-layer functions that can be virtualised. But as you do that, the network infrastructure needs to adapt dynamically.
You need orchestration that combines the control and the co-ordination of the networking functions. This is more IT infrastructure - server-based blades and open-source software.
Then you have SDN underneath, supporting changes in the traffic flow with reconfiguration of the network infrastructure.
There was much discussion about the CFP2 and Cisco's own silicon photonics-based CPAK. Was this the main silicon photonics story at the show?
There is much interest in silicon photonics not only for short reach optical interconnects but more generally, as an alternative to III-V photonics for integrated optical functions.
For light sources and amplification, you still need indium phosphide and you need to think about how to combine the two. But people have shown that even in the core network you can get decent performance at 100 Gig coherent using silicon photonics.
This is an interesting development because such a solution could potentially lower cost, simplify thermal management, and from a fab access and manufacturing perspective, it could be simpler going to a global foundry.
But a word of caution: there is big hype here too. This is not the end of III-V photonics. There are many III-V players, vertically integrated, that have shown that they can integrate and get compact, high-quality circuits.
You mentioned interconnect in the data centre as one evolving theme. What did you mean?
The capacities inside the data centre are growing much faster than the WAN interconnects. That is not surprising: people are trying to do as much as possible within the data centre because WAN interconnect is expensive.
People are looking increasingly at how to integrate the optics and the server hardware more closely. This is moving beyond the concept of pluggables all the way to mounted optics on the board or even on-chip to achieve more density, less power and less cost.
There were many people in the OFC workshops talking about getting rid of pluggability and the cages and getting the stuff mounted on the printed circuit board instead, as a cheaper, more scalable approach.
"Right now we are running 28 Gig on a single wavelength. Clearly with speeds increasing and with these kind of developments [PAM-8, discrete multi-tone], you see that this is not the end"
What did you learn at the show?
There wasn't anything that was radically new. But there were some significant silicon photonics demonstrations. That was the most exciting part for me although I'm not sure I can discuss the demos [due to confidentiality].
Another area we are interested in revolves around the ongoing IEEE work on short-reach 100 Gigabit serial interfaces. The original objective was 2km but they have now homed in on 500m.
PAM-8 - pulse amplitude modulation with eight levels - is one of the proposed solutions; another is discrete multi-tone (DMT). [With DMT] using a set of electrical sub-carriers and doing adaptive bit loading means that even with bandwidth-limited components, you can transmit over the required distances. There was a demo at the exhibition from Fujitsu Labs showing DMT over 2km using a 10 Gig transmitter and receiver.
This is of interest to us as we have a 100 Gigabit direct detection dense WDM solution today and are working on the product evolution.
We use the existing [component/ module] ecosystem for our current direct detect solution. These developments bring up some interesting new thoughts for our next generation.
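The adaptive bit-loading idea behind DMT mentioned above is simple to sketch: each electrical sub-carrier is assigned as many bits per symbol as its measured signal-to-noise ratio supports, so sub-carriers in the roll-off of a bandwidth-limited component still contribute some capacity. The snippet below is a simplified illustration of that principle; the SNR profile and margin are invented for the example and are not drawn from the Fujitsu Labs demonstration.

```python
import math

# Simplified DMT bit loading: give each sub-carrier the whole number of bits per
# symbol its SNR supports, using a crude Shannon-style estimate with a fixed
# margin. The SNR profile is invented for illustration, not measured data.
snr_db = [28, 27, 25, 22, 18, 14, 10, 6, 3, 1]    # falling off across the band
MARGIN_DB = 6                                     # assumed implementation margin

def bits_for(snr):
    linear = 10 ** ((snr - MARGIN_DB) / 10)
    return max(0, int(math.log2(1 + linear)))     # whole bits per symbol only

loading = [bits_for(s) for s in snr_db]
print(loading)                                    # more bits on the good sub-carriers
print("bits per DMT symbol:", sum(loading))
```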
So you can go beyond 100 Gigabit direct detection?
Right now we are running 28 Gig on a single wavelength. Clearly with speeds increasing and with these kind of developments [PAM-8, DMT], you see that this is not the end.
Part 1: Software-defined networking: A network game-changer
Part 2: OFC/NFOEC 2013 industry reflections
Part 3: OFC/NFOEC 2013 industry reflections
Part 4: OFC/NFOEC industry reflections
