Using an open-source model to spur AI adoption
The Linux Foundation’s (LF) Deep Learning Foundation has set itself the ambitious goal of providing companies with all the artificial intelligence (AI) software they will need.
“Everything AI, we want you to take from open source,” says Eyal Felstaine, a member of the LF Deep Learning governing board and also the CTO of Amdocs. “We intend to have the entire [software] stack.”
The Deep Learning Foundation is attracting telecom operators, large-scale data centre operators and other players. Orange, Ciena, Red Hat, the Chinese ride-sharing firm Didi, and Intel are the latest companies to join the initiative.
The Deep Learning Foundation’s first project is Acumos, a platform for developers to build, share and deploy AI applications. Two further projects have since been added: Angel and Elastic Deep Learning.
Goal
The 'democratisation of data' is what has motivated the founding of the deep-learning initiative, says Felstaine.
A company using a commercial AI platform must put its data in a single repository. “You are then stuck [in that environment],” says Felstaine. Fusing data from multiple sources only exacerbates the issue, since each dataset must be uploaded to that one platform.
Using an open-source approach will result in AI software that companies can download for free. “You can run it at your own place and you are not locked into any one vendor,” says Felstaine.
Deep learning, machine learning and AI
Deep learning is associated with artificial neural networks, which are one way to perform machine learning. And just as deep learning is a subset of machine learning, machine learning is a subset of AI, albeit the predominant way AI is undertaken today.
“Forty years ago if you had computer chess, the program’s developers had to know how to play chess,” says Felstaine. “That is AI but it is not machine learning.”
With machine learning, a developer need not know the rules of chess. “The software developer just needs to get the machine to see enough games of chess such that the machine will know how to play,” says Felstaine.
A neural network is composed of interconnected processing units, or neurons. Like AI, it is a decades-old computer science concept, but efficiently executing a neural network spread across processors has long been hampered by input-output constraints. Now, with the advent of internet content providers and the cloud, not only can huge datasets be used to train neural networks, but the ‘hyper-connectivity’ between the servers’ virtual machines or containers means large-scale neural networks can be used.
Containers offer a more efficient way to run many elements on a server. “The number of virtual machines on a CPU is maybe 12 if you are lucky; with containers, it is several hundred,” says Felstaine. Another benefit of developing an application using containers is that it can be ported across different platforms.
“This [cloud clustering] is a quantitative jump in the enabling technology for traditional neural networks because you can now have thousands and even tens of thousands of nodes [neurons] that are interconnected,” says Felstaine. Running the same algorithms on much larger neural networks has only become possible in the last five years, he says.
Felstaine cites the analysis of X-ray images as an example. Typically, X-rays are examined by a specialist. For AI, the images are sent to a firm where each is assessed and given a ‘label’. Millions of X-ray images can be labelled before being fed to a machine-learning framework such as TensorFlow or H2O. TensorFlow, for example, is open-source software that is readily accessible.
The resulting trained software, referred to as a predictor, is then capable of analysing an X-ray picture and giving a prognosis based on what it has learnt from the dataset of X-rays and labels created by experts. “This is pure machine learning because the person who defined TensorFlow doesn’t know anything about human anatomy,” says Felstaine. Using the framework creates an untrained model: “It’s an empty hollow brain that needs to be taught.”
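A minimal sketch of that workflow, in Python using the Keras API that ships with open-source TensorFlow, might look as follows. The directory layout (one sub-directory per label, such as ‘normal’ and ‘abnormal’), the image size and the tiny network are illustrative assumptions, not details of any real medical deployment.

```python
# Minimal sketch: training a predictor on expert-labelled X-ray images.
# The xrays/ directory layout and two-class labelling are assumptions
# made purely for illustration.
import tensorflow as tf

IMG_SIZE = (128, 128)

# Load labelled images; each sub-directory name is treated as the label.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "xrays/train", image_size=IMG_SIZE, batch_size=32)

# The "empty hollow brain": a small convolutional network before training.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. normal vs abnormal
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# "Teach" the model from the labelled dataset.
model.fit(train_ds, epochs=5)

# The trained predictor can now score a previously unseen image.
image = tf.keras.utils.load_img("xrays/new_case.png", target_size=IMG_SIZE)
batch = tf.expand_dims(tf.keras.utils.img_to_array(image), 0)
print(model.predict(batch))  # class probabilities for the new X-ray
```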
Moreover, the X-ray data could be combined with data from several other providers - life habits from a fitness watch, the results of a blood test, and heart data - to create a more complex model. And this is where an open-source framework that avoids vendor lock-in has an advantage.
Acumos
Acumos started as a collaboration between AT&T and the Indian IT firm, Tech Mahindra, and was contributed to the LF Deep Learning Foundation.
Felstaine describes Acumos as a way to combine, or federate, different AI tools that will enable users to fuse data from various sources "and make one whole out of it”.
There is already an alpha release of Acumos and, as with other open-source projects, the goal is to issue two new software releases a year.
How will such tools benefit telecom operators? Felstaine says AT&T is already using AI to save costs by helping field engineers maintain its cell towers. The field engineer uses a drone to inspect a tower, and AI analyses the drone’s images to guide the engineer as to what maintenance, if any, is needed.
One North American operator has said it has over 30 AI projects, including one that guides the operator on how to upgrade part of its network so as to minimise the project's duration and the resulting disruption.
One goal for Acumos is to benefit the Open Network Automation Platform (ONAP) that oversees Network Functions Virtualisation (NFV)-based networks. ONAP is an open-source project that is managed by the Linux Foundation Networking Fund.
NFV is being adopted by operators to help them launch and scale services more efficiently and deliver operational and capital expenditure savings. But the operation and management of NFV across a complex telecom network is a challenge to achieving such benefits, says Felstaine.
ONAP already has a Data Collection, Analytics, and Events (DCAE) subsystem which collects data regarding the network’s status. Adding Acumos to ONAP promises a way for machine learning to understand the network’s workings and provide guidance when faults occur, such as the freezing of a virtual machine running a networking function.
With such a fault, the AI could guide the network operations engineer, pointing out, for example, that human operators usually take a particular action next and that the action has an 85 percent success rate. It then gives the staff member the option to proceed or not. Ultimately, AI will control the networking actions and humans will be cut out of the loop. “AI as part of ONAP? That is in the future,” says Felstaine.
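In outline, that kind of decision support could resemble the Python sketch below. It is purely illustrative: the FaultEvent fields, the hard-coded playbook and its success rates stand in for what a real Acumos model trained on ONAP incident data would supply.

```python
# Illustrative only: decision support of the kind described above.
# The FaultEvent fields, the playbook and the success rates are
# hypothetical stand-ins for a trained model.
from dataclasses import dataclass

@dataclass
class FaultEvent:
    vnf_id: str   # the virtual network function affected
    symptom: str  # e.g. a frozen virtual machine

def recommend_action(event: FaultEvent) -> tuple:
    """Return the action operators usually take next and its historical
    success rate. A real system would query a model trained on past
    incident data; here a lookup table stands in for it."""
    playbook = {
        "vm_frozen": ("restart_vm", 0.85),
    }
    return playbook.get(event.symptom, ("escalate_to_engineer", 0.0))

def handle_fault(event: FaultEvent, auto_execute: bool = False) -> None:
    action, success_rate = recommend_action(event)
    print(f"{event.vnf_id}: suggested action '{action}' "
          f"(historical success rate {success_rate:.0%})")
    if auto_execute:
        # In the closed-loop future Felstaine describes, the AI would
        # trigger the action itself rather than wait for approval.
        print(f"executing '{action}' automatically")

handle_fault(FaultEvent(vnf_id="vFW-3", symptom="vm_frozen"))
```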
The two new framework projects - Angel and Elastic Deep Learning - have been contributed to the Foundation by the Chinese internet content providers Tencent and Baidu, respectively.
Both projects address scale and how to do clustering. “They are not AI, more ways to distribute and scale neural networks,” says Felstaine.
The Deep Learning Foundation was launched in March by the firms Amdocs, AT&T, B.Yond, Baidu, Huawei, Nokia, Tech Mahindra, Tencent, Univa, and ZTE.
OPNFV's releases reflect the evolving needs of the telcos
The open-source group, part of the Linux Foundation, specialises in the system integration of network functions virtualisation (NFV) technology.
The OPNFV issued Fraser, its latest platform release, earlier this year, while its next release, Gambia, is expected soon.
Moreover, the telcos’ continual need for new features and capabilities means the OPNFV’s work is not slowing down.
“I don’t see us entering maintenance-mode anytime soon,” says Heather Kirksey, vice president, community and ecosystem development, The Linux Foundation and executive director, OPNFV.
Meeting a need
The OPNFV was established in 2014 to address an industry shortfall.
“When we started, there was a premise that there were a lot of pieces for NFV but getting them to work together was incredibly difficult,” says Kirksey.
Open-source initiatives such as OpenStack, used to control computing, storage, and networking resources in the data centre, and the OpenDaylight software-defined networking (SDN) controller, lacked elements needed for NFV. “No one was integrating and doing automated testing for NFV use cases,” says Kirksey.
OPNFV set itself the task of identifying what was missing from such open-source projects to aid their deployment. This involved working with the open-source communities to add NFV features, testing software stacks, and feeding the results back to the groups.
The nature of the OPNFV’s work explains why it is different from other, single-task, open-source initiatives that develop an SDN controller or NFV management and orchestration, for example. “The code that the OPNFV generates tends to be for tools and installation - glue code,” says Kirksey.
OPNFV has gained considerable expertise in NFV since its founding. It uses advanced software practices and has hardware spread across several labs. “We have a large diversity of hardware we can deploy to,” says Kirksey.
One of the OPNFV’s advanced software practices is continuous integration/continuous delivery (CI/CD). Continuous integration refers to how code is added to a software build while it is still being developed, unlike the traditional approach of waiting for a complete software release before starting the integration and testing work. For this to be effective, however, automated code testing is required.
Continuous delivery, meanwhile, builds on continuous integration by automating a release’s update and even its deployment.
“Using our CI/CD system, we will build various scenarios on a daily, two-daily or weekly basis and write a series of tests against them,” says Kirksey, adding that the OPNFV has a large pool of automated tests, and works with code bases from various open-source projects.
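In rough outline, such a nightly cycle amounts to deploying each scenario and running the automated test pool against it. The Python sketch below is a hypothetical illustration of that loop; the scenario names and the deploy_scenario and run_test_suite helpers are invented placeholders rather than real OPNFV tooling.

```python
# Hypothetical outline of a nightly CI run across several scenarios.
# The scenario names and the deploy_scenario()/run_test_suite() helpers
# are invented placeholders, not real OPNFV tools.
SCENARIOS = ["scenario-nosdn", "scenario-odl"]  # illustrative names only

def deploy_scenario(name: str) -> None:
    """Stand-in for an installer deploying the scenario onto lab hardware."""
    print(f"deploying {name} ...")

def run_test_suite(name: str) -> bool:
    """Stand-in for running the automated test pool against the deployment."""
    print(f"running tests for {name} ...")
    return True  # simulated pass, purely for illustration

def nightly_run() -> None:
    for scenario in SCENARIOS:
        deploy_scenario(scenario)
        passed = run_test_suite(scenario)
        # In practice, results are fed back to the upstream projects.
        print(f"{scenario}: {'PASS' if passed else 'FAIL'}")

if __name__ == "__main__":
    nightly_run()
```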
Kirksey cites two examples to illustrate how the OPNFV works with the open-source projects.
When OPNFV first worked with OpenStack, the open-source cloud platform took far too long - about 10 seconds - to detect a faulty virtual machine used to implement a network function running on a server. “We had a team within OPNFV, led by NEC and NTT Docomo, to analyse what it would take to be able to detect faults much more quickly,” says Kirksey.
The result required changes to 11 different open-source projects, while the OPNFV created test software to validate that the resulting telecom-grade fault-detection worked.
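A validation test of that kind boils down to injecting a fault and checking how long the platform takes to raise an alarm. The Python sketch below is a self-contained, hypothetical illustration: the FakePlatform class simulates the platform under test, and the one-second budget is an arbitrary example figure, not the project's actual target.

```python
# Hypothetical check of fast fault detection. FakePlatform simulates a
# platform that raises an alarm 0.1 s after a VM fault is injected; the
# one-second budget is an arbitrary example figure.
import time

FAULT_DETECTION_BUDGET_S = 1.0  # illustrative target, well below ~10 s

class FakePlatform:
    """Simulated platform under test."""
    def __init__(self) -> None:
        self.fault_time = None

    def inject_vm_fault(self, vm_id: str) -> None:
        """Stand-in for deliberately breaking a virtual machine."""
        self.fault_time = time.monotonic()

    def alarm_raised(self, vm_id: str) -> bool:
        """Stand-in for polling the platform's alarm interface."""
        return (self.fault_time is not None
                and time.monotonic() - self.fault_time >= 0.1)

def test_fault_detected_within_budget() -> None:
    platform = FakePlatform()
    start = time.monotonic()
    platform.inject_vm_fault("vm-under-test")
    while not platform.alarm_raised("vm-under-test"):
        assert time.monotonic() - start < FAULT_DETECTION_BUDGET_S
        time.sleep(0.01)
    print(f"fault detected in {time.monotonic() - start:.2f} s")

if __name__ == "__main__":
    test_fault_detected_within_budget()
```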
Another example cited by Kirksey was enabling IPv6 support, which required changes to OpenStack, OpenDaylight and FD.io, the fast data plane open-source initiative.
OPNFV Fraser
In May, the OPNFV issued its sixth platform release, dubbed Fraser, which progresses its technology on several fronts.
Fraser offers enhanced support for cloud-native technology that uses microservices and containers, an alternative to virtual-machine-based network functions.
The OPNFV is working with the Cloud Native Computing Foundation (CNCF), another open-source organisation overseen by the Linux Foundation.
CNCF is undertaking several projects addressing the building blocks needed for cloud-native applications. The best known is Kubernetes, which is used to automate the deployment, scaling and management of containerised applications.
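As a small illustration of that automation, the sketch below uses the official Kubernetes Python client (the 'kubernetes' package) to change the replica count of a containerised workload. The deployment name, namespace and replica count are assumptions made for the example, and running it requires access to a cluster.

```python
# Minimal sketch: scaling a containerised workload with the official
# Kubernetes Python client. The deployment name "vfirewall" and the
# namespace "vnf" are assumptions for illustration only.
from kubernetes import client, config

def scale_deployment(name: str, namespace: str, replicas: int) -> None:
    """Patch the deployment's replica count; Kubernetes then creates or
    removes container instances to match the desired state."""
    config.load_kube_config()  # uses the local kubeconfig credentials
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        name=name,
        namespace=namespace,
        body={"spec": {"replicas": replicas}},
    )

if __name__ == "__main__":
    scale_deployment("vfirewall", "vnf", replicas=3)
```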
“The reason cloud-native is getting a lot of excitement is that it is much more lightweight with its containers versus virtual machines,” says Kirksey. “It means more density of what you can put on your [server] box and that means capex benefits.”
Meanwhile, for applications such as edge computing, where smaller devices will be deployed at the network edge, lightweight containers and Kubernetes are attractive, says Kirksey.
Another benefit of containers is faster communications. “Because you don’t have to go between virtual machines, communications between containers is faster,” she says. “If you are talking about network functions, things like throughput start to become important.”
The OPNFV is working with cloud-native technology in the same way it started working with OpenStack. It is incorporating the technology within its frameworks and undertaking proof-of-concept work for the CNCF, identifying shortfalls and developing test software.
OPNFV has incorporated Kubernetes in all its installers and is adopting other CNCF work such as the Prometheus project used for monitoring.
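To give a flavour of Prometheus-style monitoring, the sketch below exposes a single metric for a Prometheus server to scrape, using the open-source prometheus_client Python library. The metric name and the simulated CPU reading are placeholders rather than anything drawn from OPNFV's actual collectors.

```python
# Minimal sketch: exposing a metric for Prometheus to scrape, using the
# open-source prometheus_client library. The metric name and the
# simulated CPU reading are placeholders for illustration.
import random
import time

from prometheus_client import Gauge, start_http_server

# A gauge Prometheus can scrape from http://<host>:8000/metrics
host_cpu_usage = Gauge("host_cpu_usage_percent",
                       "Simulated CPU usage of an NFV infrastructure host")

if __name__ == "__main__":
    start_http_server(8000)  # serve the /metrics endpoint
    while True:
        host_cpu_usage.set(random.uniform(5, 95))  # stand-in for a real probe
        time.sleep(15)
```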
“There is a lot of networking work happening in CNCF right now,” says Kirksey. “There are even a couple of projects on how to optimise cloud-native for NFV that we are also involved in.”
OPNFV’s Fraser also enhances carrier-grade features. Infrastructure maintenance work can now be performed without interrupting virtual network functions.
Also expanded are the metrics that can be extracted from the underlying hardware, while the OPNFV’s Calipso project has added modules for service assurance as well as support for Kubernetes.
Fraser has also improved the support for testing and can allocate hardware dynamically across its various labs. “Basically we are doing more testing across different hardware and have got that automated as well,” says Kirksey.
Linux Foundation Networking Fund
In January, the Linux Foundation combined the OPNFV with five other open-source telecom projects it is overseeing to create the Linux Foundation Networking Fund (LNF).
The other five LNF projects are the Open Network Automation Platform (ONAP), OpenDaylight, FD.io, the PNDA big data analytics project, and the SNAS streaming network analytics system.
“We wanted to break down the silos across the different projects,” says Kirksey. There was also overlap with members sitting on several projects’ boards. “Some of the folks were spending all their time in board meetings,” says Kirksey.
Service provider Orange is using the OPNFV Fraser functional testing framework as it adopts ONAP. Orange used the functional testing to create its first test container for ONAP in one day. Orange also achieved a tenfold reduction in memory demands, going from a 1-gigabyte test virtual machine to a 100-megabyte container. And the operator has used the OPNFV’s CI/CD toolchain for the ONAP work.
By integrating the CI/CD toolchain across projects, OPNFV says it is much easier to incorporate new code on a regular basis and provide valuable feedback to the open source projects.
The next code release, Gambia, could be issued as early as November.
Gambia will offer more support for cloud-native technologies. There is also a need for more work around Layer 2 and Layer 3 networking as well as edge computing work involving OpenStack and Kubernetes.
“Edge is becoming a bigger and more important use-case for a lot of the operators,” says Kirksey.
OPNFV is also continuing to enhance its test suites for the various projects. “We want to ensure we can support the service providers’ real-world deployment needs,” concludes Kirksey.
