IBM Storage and NVIDIA have teamed up to enhance artificial intelligence (AI) project development and streamline the AI data pipeline. Their approach, IBM SpectrumAI with NVIDIA DGX Reference Architecture, gives data scientists and other AI project team members a solid framework that can guide AI deployments and that culminates in a design based on IBM and NVIDIA system and software products.

The companies’ partnership is important not only because the field of AI is growing very rapidly, but also because major AI projects can be a real challenge for any organization. IBM Storage, in combination with NVIDIA and their joint channel partners, offers the skills, resources and products that enable organizations to overcome whatever challenges their AI workloads might present.

The AI Revolution

Information technology (IT) always seems to be in the throes of a major revolution. AI is one such revolution and, despite all that is going on, it is still in its infancy. Many years hence, AI may still not even be at the knee of the curve of a decades-long exponential growth. Every day it seems that there is a new or expanded practical use of AI technology, such as self-driving cars, a huge number of customer sentiment and sensor-based analysis examples, threat analysis, and image interpretation. Almost all organizations should be able to benefit from AI technology, now or in the future. And infrastructure vendors are thrilled by the prospect, since AI projects often demand seemingly inexhaustible compute and storage resources.

From reference architecture to a converged infrastructure solution

AI projects are data-driven, in contrast to the process orientation of online transaction processing (OLTP) systems. An AI data pipeline consists of ingest, classification and analysis/training phases that require considerable development time and thought, so an AI reference architecture can substantially aid the efforts of project teams. In general, reference architectures are increasing in popularity because they provide a frame of reference for a particular domain. Reference architectures are available for specific industries and processes, such as banking, telecommunications, and manufacturing and supply chains.

These play an important role, but so can vendor-supplied reference architectures, such as the IBM SpectrumAI with NVIDIA DGX reference architecture. Vendor-specific reference architectures lead AI project teams down a path to purchasing products that implement an AI infrastructure solution. This is not a problem if AI project teams understand up front what they are getting into and are comfortable with the vendors.

The roles of IBM and NVIDIA in the IBM SpectrumAI with NVIDIA DGX Reference Architecture

Most, if not all, organizations should be comfortable with IBM and NVIDIA, two of the giants in the AI industry. Of course, IBM Watson is familiar to many, but the company also has strengths and expertise in non-Watson-related AI activities. NVIDIA notably invented the GPU (graphics processing unit), which has become a chief computing element in AI (such as in NVIDIA DGX servers), where it accelerates the highly parallel processing that AI projects typically demand. This is now complemented on the storage side by IBM Spectrum Scale, which is, at heart, a software-based storage system with a long-proven and well-accepted parallel file system that enables close integration with DGX servers.

The net result is a powerful combination of IBM and NVIDIA technology for AI workloads: all the necessary computing, storage and networking hardware, accompanied by all the required supporting software, in a single physical rack put together by IBM’s and NVIDIA’s channel partners.

The system consists of NVIDIA DGX-1 servers with Tesla V100 Tensor Core GPUs for computing. IBM supplies the storage with ESS (Elastic Storage Server) GS4S all-flash (non-NVMe) storage systems for immediate use, with a move to NVMe flash arrays planned for mid-2019, according to IBM. That should be sufficient time, as typical large AI projects have a significant gestation period. Mellanox InfiniBand (IB) networking provides the necessary connectivity between the servers and storage elements.

But don’t forget the software. The NVIDIA DGX software stack is specifically designed to deliver GPU-accelerated training performance, and it includes the new RAPIDS framework, whose purpose is to accelerate data science workflows. At the heart of IBM’s software-defined storage (SDS) for files is IBM Spectrum Scale v5, which was specifically architected for the high-performance demands of modern AI workloads.

Now, NVIDIA’s arrangement with IBM is not an exclusive one (DDN Storage, NetApp and Pure Storage also work with the company on AI-related solutions), so how does IBM differentiate itself from these strong competitors? IBM claims a performance advantage, stating that it will have a 1.5x NVMe advantage over competitors. Additionally, IBM Spectrum Scale is already used extensively in AI workloads, including in two AI reference architectures with IBM Power servers, and IBM has vast experience in the HPC-like needs of AI use cases.

IBM SpectrumAI with NVIDIA DGX will be sold only through selected channel partners supported by both companies. This makes a great deal of sense as major AI projects require a level of planning and design knowledge, along with collaboration and coordination skills, that only selected channel partners can bring to the table.

Mesabi musings

If you have not already done so, the time may be right to hop on the AI bandwagon. If you agree, looking into vendor-sponsored reference architectures, such as IBM SpectrumAI with NVIDIA DGX, might be a good starting point. Just be aware that these vendors will eventually propose an AI deployment involving their products.

Still, you are not planning such efforts just for the fun of it, so eventually a converged infrastructure solution could provide an ideal way forward. IBM and NVIDIA are both leaders in their respective parts of the AI domain and their new IBM SpectrumAI with NVIDIA DGX offering makes a strong case for the companies.