Author Archives: mesabigroup

IBM Spotlights Storage Made Simple for Data and AI

IBM first turned its “storage made simple” theme to block-based systems, most notably its FlashSystem offerings as the non-mainframe portion of its storage for hybrid multicloud. Now the company has applied the same theme to storage for its data and AI systems, most prominently, the file-based Elastic Storage System (ESS) storage hardware that is managed by Spectrum Scale software-defined storage (SDS) and object-based Cloud Object Storage hardware that is managed by IBM Cloud Object Storage SDS. The role these products play in edge computing was discussed in my recent blog (see IBM Storage Solutions for Edge Computing).

We can now turn our attention to the role of these products in an information architecture (IA), without which AI is impossible. In contrast to a transaction processing system, where the role of the application is to serve as the platform for creating data using business process rules (i.e., a process-driven application), AI is a data-driven application. That means that the AI application must be able to respond dynamically to different data types. The IA for AI is divided into three buckets: collect, organize, and analyze. Visually, think of data being infused into a data lake on the collect side of the house. Then the organize phase prepares the necessary data for the analyze phase. But remember, it all starts with a functional, accessible, manageable data lake.

A data lake can be unbelievably large

We’re all familiar with data storage media and systems measured in megabytes (MB), gigabytes (GB) and terabytes (TB), and recently we have started hearing more about petabyte (PB, or 1,024 TB, recalling that storage capacities are conventionally measured in powers of 2 rather than powers of 10) systems. The exabyte (EB) is the next step of roughly three orders of magnitude, followed by the zettabyte (ZB) and, finally, topped off by the unbelievably huge yottabyte (YB).
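For readers who like to see the arithmetic, here is a minimal sketch of the binary unit ladder just described, where each step up is 2^10 (1,024) times the previous one:

```python
# The binary storage-unit ladder: each unit is 1,024x the one before it.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the named units (powers of 2)."""
    return 1024 ** (UNITS.index(unit) + 1)

print(bytes_in("PB") // bytes_in("TB"))  # 1024 TB per PB
print(bytes_in("YB"))                    # 1208925819614629174706176 (2**80)
```

One yottabyte works out to 2^80 bytes, which conveys just how far beyond today's petabyte-class systems the top of the ladder sits.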

A data lake is a storage repository of file and/or object data in a natural/raw format. Incredibly huge as a yottabyte is, some end use cases (to wit, autonomous vehicles en masse) may someday require data lakes that exceed a YB in size.

IBM Focuses on the Collect Stage

As with a natural lake that is constantly refreshed with water from springs or rivers that flow into it, so is a data lake refreshed with the infusion of fresh new data.

IBM has just announced the addition of the IBM Elastic Storage System 5000. Unlike its slightly older brother, the ESS 3000, an NVMe all-flash storage system announced in October 2019, the ESS 5000 is an all hard disk drive (HDD) storage system. Isn’t this a “back to the past” moment? The answer is no. We are talking about random access bulk storage demands where ingestion speed in terms of bandwidth is the key metric, not IOPS or latency, the territory where the ESS 3000 shines in the analysis stage of the infrastructure. And, of course, on a bulk basis, HDDs are still considerably more cost-effective than flash media.

The ESS 5000 scales up to 8 YB per data lake, although realistically the actual size is likely to be far smaller. The ESS 5000 can use one of two enclosures: an IBM standard size (SL) rack with a capacity of ½ PB to 8 PB, or the expanded size (SC) rack with a capacity of 1 PB to 13.5 PB. Each enclosure is powered by two IBM Power9 servers. Up to 6 enclosures can be accommodated in an SL rack, whereas an SC rack can handle 8 enclosures. IBM claims that the ESS 5000 has significant density and performance advantages that lead to both CAPEX and OPEX savings over two chief rivals, namely Dell EMC Isilon and NetApp FAS6000/DS460C.
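A rough sketch of the per-rack capacity figures quoted above; the capacity and enclosure counts are the article's numbers, while the per-enclosure division is my own back-of-the-envelope inference rather than an IBM specification:

```python
# Per-rack figures from the article; per-enclosure math is an inference.
SL = {"min_pb": 0.5, "max_pb": 8.0, "max_enclosures": 6}   # standard rack
SC = {"min_pb": 1.0, "max_pb": 13.5, "max_enclosures": 8}  # expanded rack

def max_pb_per_enclosure(rack: dict) -> float:
    """Average capacity per enclosure at full build-out."""
    return rack["max_pb"] / rack["max_enclosures"]

print(round(max_pb_per_enclosure(SL), 2))  # 1.33 PB per SL enclosure
print(round(max_pb_per_enclosure(SC), 2))  # 1.69 PB per SC enclosure
```

Either way, a single rack tops out in the low double digits of petabytes, so a data lake approaching the 8 YB namespace ceiling would span an enormous number of racks.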

A key point from the AI perspective is that continuous real-time updates of metadata, enabled by IBM Spectrum Discover, lead to faster insights without the need to rescan (which could be a slow process).

Note that the ESS 5000 is file-oriented and uses IBM Spectrum Scale. So, what do we do about bringing object data into the data lake? The answer is IBM Spectrum Scale Data Acceleration for AI, which allows access and data movement between IBM Spectrum Scale and object storage on premises or in the cloud (planned GA Q4 2020).

IBM Cloud Object Storage (COS), in addition to its traditional roles of backup and archive, can now take a greater role in AI with faster data collection and integration, which according to IBM is “designed to increase system performance to 55 GB/s in a 12-node configuration, improving reads by 300% and writes by 150%, depending on object size.” (planned August 2020 GA)

IBM strengthens the organize stage by bringing Red Hat OpenShift into the mix

IBM Spectrum Discover is the software heart of the organize stage. It delivers the real-time metadata management and indexing that AI software, such as IBM Watson and IBM Cloud Pak for Data, can access through the Spectrum Discover API and needs for both the collect and analyze stages. In another piece of news, IBM Spectrum Scale, Spectrum Discover, and Red Hat OpenShift are now integrated. Why is this important? The ability to leverage Red Hat OpenShift in this process makes multicloud deployments easier and, according to IBM, requires up to 50% less memory. The original deployment configuration for Spectrum Discover was on virtual machines; with this announcement, Spectrum Discover can also be deployed in a container configuration. This launch gives end users the choice of a container configuration, a virtual configuration, or both, if their needs require it. In addition, the Spectrum Discover policy engine has been upgraded to take better advantage of low-cost options for data movement and migration (planned GA Q4), including support for third-party data movers, such as Moonwalk.

Mesabi musings

IBM has long emphasized the importance of AI, but supporting artificial intelligence is not just about the software necessary to analyze the data; it is also about the collection of the data and the ability to organize it to support those analyses. With the introduction of the ESS 5000 for file data and the ability to incorporate object storage into the mix, IBM can now meet the storage needs of even the largest data lakes. But physically meeting the storage requirements is, in and of itself, not enough. The information required for AI workloads also has to be organized so that the right data can be found and processed in a timely manner, which IBM accomplishes in the organize phase with Spectrum Discover, especially with integration with Red Hat OpenShift.

All in all, IBM has built an impressive narrative and solution portfolio for simplifying the storage required to support data analysis and AI.


IBM Storage Solutions for Edge Computing

Edge computing continues to grow in importance for organizations seeking to make better data-driven decisions. Since edge computing is a big generator of data, and since data requires storage, storage is a major player. Let’s examine briefly how IBM positioned its approach to edge computing at its recent Think 2020 Digital conference, how it is expanding its role in edge computing, and how IBM Storage, notably IBM Spectrum Scale and IBM Cloud Object Storage, is assisting those efforts.

IBM Expands Its Role in Edge Computing

Edge computing is generally thought of as a distributed IT architecture in which the data collected by an edge device can be processed locally to improve response time and save on bandwidth. This definition is somewhat confining, as we shall see. Now, an edge device may be an endpoint on a private or public network, such as a mobile computing device like a smartphone, or a pervasive computing device, say a sensor-based Internet of Things (IoT) solution.

At Think 2020, IBM and Red Hat announced new edge computing solutions for 5G network technologies, which bring with them unprecedented speed and extremely low latency. The goal is to enable enterprises to overcome the complexity of managing workloads across a massive number of devices. In particular, the IBM Telco Network Cloud Manager offering is designed to help telcos to quickly deliver 5G-based edge-enabled services to customers. In general, IBM plans to enable users to deliver value-adding insights through the application of AI and analytics close to the edge.

Now, back to the definition of edge computing. How local is local? Local should mean close enough to do the job properly in the most cost-efficient manner. All talk about computing being distributed or decentralized vs. centralized or core would seem to be irrelevant. For example, IBM uses Red Hat OpenShift as a linchpin for its edge computing work. Red Hat OpenShift even works at the mainframe level with IBM’s storage (see IBM Continues to Extend Its Mainframe Storage Solutions). Yet unless the mainframe can serve edge computing from its current site, other IBM storage options could be used to place the necessary storage as close to the edge as necessary.

The Role of IBM Storage in Edge Computing

Although IBM’s data and AI storage solutions are often installed in central computing installations, they can also be deployed at the edge as edge computing storage solutions.

Start off with the two members of the IBM Spectrum Storage software-defined storage (SDS) suite that provide the software horsepower for its data and AI storage solutions, namely IBM Spectrum Scale and IBM Cloud Object Storage. IBM Spectrum Scale would be the choice if the decision is made to manage the data at the edge as file data, whereas IBM Cloud Object Storage would be the choice if the data can best be managed as objects.

Please note that IBM’s SDS software can be sold separately from its storage hardware. This means that a partner within IBM’s edge ecosystem could choose non-IBM storage hardware to go along with the chosen Spectrum Storage software. Naturally, IBM would prefer to sell software and storage bundled together and the company believes that the value proposition of its integrated storage hardware/software solutions is profound.

In the case of IBM Spectrum Scale, the bundled solution for edge storage goes under the name Elastic Storage System (ESS) 3000, whereas for IBM Cloud Object Storage the bundled solution has the same name, IBM Cloud Object Storage, as the standalone software product. Let’s focus on what each software product brings to the edge computing table, as the hardware is covered elsewhere, starting with IBM Reinforces Storage Portfolio for AI and Big Data.

IBM Spectrum Scale’s global single namespace enables management of data as a single virtual pool at any necessary scale. That capability has given IBM Spectrum Scale many years of successful deployment in large centralized installations, but it also brings features, such as Active File Management (AFM), that make it attractive for edge computing. AFM automatically shares and caches data across geographically distributed sites to achieve performance levels similar to local data access. This allows nondisruptive, intelligent data integration between edge locations and central data centers using single-pane-of-glass management.
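The caching idea behind AFM can be illustrated with a toy read-through cache: edge reads are served locally when possible and fetched from the central site (and cached) on a miss. This is only the general pattern, not AFM's actual design:

```python
# Toy read-through cache: the pattern AFM embodies, not its implementation.
class EdgeCache:
    def __init__(self, central_store: dict):
        self.central = central_store   # stands in for the home/central site
        self.local = {}                # stands in for the edge cache

    def read(self, path: str):
        if path not in self.local:            # cache miss: fetch from center
            self.local[path] = self.central[path]
        return self.local[path]               # subsequent reads are local

central = {"/data/model.bin": b"weights"}
cache = EdgeCache(central)
assert cache.read("/data/model.bin") == b"weights"  # first read fetches
assert "/data/model.bin" in cache.local             # now cached at the edge
```

After the first access, repeat reads at the edge never touch the central site, which is exactly the property that makes distant data feel local.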

IBM Cloud Object Storage uses its concurrent parallel access capability to bring data from the edge quickly and efficiently with geo-dispersed data protection. It enables enterprises to store and manage massive amounts of data efficiently and securely, with extreme system reliability and accessibility from any location.
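The dispersal idea can be sketched with a toy example: split data into two slices plus an XOR parity slice, so that any two of the three slices reconstruct the original. IBM COS uses far more sophisticated erasure coding across sites; this only illustrates the principle:

```python
# Toy 2+1 dispersal: two data slices plus XOR parity; losing any one
# slice is survivable. (Illustrative only; not COS's actual algorithm.)
def disperse(data: bytes):
    half = (len(data) + 1) // 2
    a, b = data[:half], data[half:].ljust(half, b"\0")  # pad odd lengths
    parity = bytes(x ^ y for x, y in zip(a, b))
    return a, b, parity

data = b"edge-data!"
a, b, parity = disperse(data)

# Suppose slice b is lost: recover it as a XOR parity.
recovered_b = bytes(x ^ y for x, y in zip(a, parity))
assert a + recovered_b.rstrip(b"\0") == data
```

Spreading slices across sites is what gives geo-dispersed protection: no single site failure loses data, and no single site holds a readable copy.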

IBM Spectrum Scale and IBM Cloud Object Storage are also key parts of IBM Storage Suite for IBM Cloud Paks, enterprise-grade container software packages that are designed to offer customers faster, more reliable ways to build, move and manage applications and workloads in hybrid clouds. Enterprises use this Suite, which is a pre-tested reference architecture with data resources for file, block, and object requirements, to speed up their development processes.

Mesabi musings

At IBM Think 2020, IBM and Red Hat launched new edge computing solutions for the 5G era. But edge computing is not only about processing power; storage also plays a major role. And IBM can fill that role with IBM Spectrum Scale and IBM Cloud Object Storage, which deliver the flexibility and scalability, along with the AI and analytics capabilities, that many edge computing applications, such as those of telcos, demand. Edge computing in the 5G era should be quite exciting.



IBM Continues to Extend its Mainframe Storage Solutions

According to IBM, its Z mainframes process 30 billion transactions per day, including 87% of all credit card transactions in the world. These two facts alone show the importance of the mainframe in the world economy, but they also underscore the importance of the complementary storage systems that all mainframes must have. Hence IBM continues to extend its mainframe storage solutions: the DS8900F all-flash storage systems for production data and the TS7770 Virtual Tape Library (VTL) for data protection.

Where IBM mainframe storage fits within its storage portfolio

IBM generally divides its storage portfolio into two categories: 1) storage for hybrid multicloud environments, and 2) storage for AI and big data, namely the Elastic Storage System and Cloud Object Storage. The storage for hybrid multicloud side of the house has two categories: non-mainframe storage, which is composed of the FlashSystem family as well as the SAN Volume Controller (SVC), and mainframe storage, which contains the DS8900F storage systems and the TS7700 Virtual Tape Library.

For background on the DS8900F storage system and the TS7700 VTL see IBM Introduces DS8900F Storage Systems and IBM Enhances the TS7700 Virtual Tape Library for IBM Z Platforms.

Focusing on four critical areas

For the latest additions to its Z System mainframe portfolio, the z15 Model T02 and the LinuxONE III T02, IBM emphasizes four key aspects where its mainframe storage systems play major roles: cloud native, encryption everywhere, cyber resilience, and flexible storage.

  1. Cloud native — the vast majority of enterprises already operate in a hybrid multicloud environment and that trend seems likely to continue until nearly all companies operate in a hybrid multicloud world.
  2. Encryption everywhere — 100% encryption everywhere that it is needed, to wit, at rest, in motion, and in the cloud, is essential to ensure privacy of data from prying eyes.
  3. Cyber resilience — malware and ransomware are two major types of security threats against which enterprises have to guard.
  4. Flexible storage — more package options are available.

Let’s look at these more closely.

Cloud native

Mainframe IT organizations on top of their game want to build new applications that creatively use the hybrid multicloud, while at the same time modernizing core parts of their business. Containerization is the name of the game that facilitates digital transformation in the hybrid multicloud. DS8900F storage works in conjunction with Red Hat OpenShift and IBM Cloud Paks to facilitate this digital transformation. IBM Cloud Paks deliver enterprise-class containerized software solutions that run wherever Red Hat OpenShift runs. The end result is the unification of the traditional mainframe storage (i.e., DS8900F storage) with cloud native storage that delivers the reliability, availability and security that is necessary to manage mission-critical containers.

Encryption everywhere

Encryption is an essential tool in the data privacy and cyber resiliency toolbox. Data privacy as part of data protection has, in many aspects, become mandatory for regulatory compliance and not just common sense for preserving the value of an organization’s information assets. Hence, IBM’s emphasis on encryption everywhere across its mainframe storage portfolio for both the DS8900F and the TS7770.

For the IBM Storage for Mainframe offerings this means encryption at rest on-premises, extending to hybrid cloud data movement with encryption in-flight. Using AES-256 (Advanced Encryption Standard with 256-bit keys), one of the strongest encryption standards in common commercial use, IBM implements storage encryption in hardware rather than software, so there is no performance impact. Avoiding a performance impact with storage-based data encryption eases IT professionals’ concerns about encryption’s effect on applications and workloads (IBM also offers encryption with no performance impact in its FlashSystem family of solutions).

Cyber resiliency

The continual threat of cyber-attacks, such as malware and ransomware, easily leads to loss of sleep. IBM mainframe storage protects against this. For example, Safeguarded Copy (discussed in greater detail in IBM Introduces DS8900F Storage Systems) uses up to 500 immutable incremental snapshots per volume as a means of recovering to a point in time before an attack occurred.
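The recovery logic behind snapshot-based protection is simple to sketch: keep timestamped, read-only copies and restore the newest one that predates the attack. This is a schematic illustration of the idea, not IBM's Safeguarded Copy implementation:

```python
# Point-in-time recovery from immutable snapshots (schematic only).
from datetime import datetime

snapshots = [  # (timestamp, snapshot id), in chronological order
    (datetime(2020, 7, 1, 0, 0), "snap-001"),
    (datetime(2020, 7, 1, 6, 0), "snap-002"),
    (datetime(2020, 7, 1, 12, 0), "snap-003"),
]

def restore_point(attack_time: datetime) -> str:
    """Return the newest snapshot taken strictly before the attack."""
    candidates = [s for t, s in snapshots if t < attack_time]
    if not candidates:
        raise RuntimeError("no clean snapshot predates the attack")
    return candidates[-1]

print(restore_point(datetime(2020, 7, 1, 9, 30)))  # snap-002
```

Because the snapshots are immutable, ransomware that encrypts the live volume cannot corrupt the recovery points themselves.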

A second method is to recover and restore from a good “air-gapped” (logically or physically separated) copy of the data that cyber-criminals cannot reach through external and internal networks. This can be done using logical air gapping to the public cloud or true physical air gapping to tape. If a DS8900F creates a copy of the data as an object store on a public cloud, this is a logical air gap. Technically, a hacker could still reach the data copy, but extending an attack to this additional step would seem to be very difficult, if not impossible. An IBM TS7770 VTL can connect to one of IBM’s tape solutions using physical tape media. Since the tape media can be physically removed and placed on an offline rack, you can create a true physical air gap. These protections aside, it might be wise to create an additional offline copy of the data before remounting it, in case a hacker has left a piece of malicious software awaiting its return.

Flexible storage

IBM has long provided its mainframe storage mounted in its own racks (which are not the standard 19-inch racks). Now, in addition to offering its own racks for those who are comfortable with them, IBM also offers rack-less versions of the DS8900F and the TS7700. This does not mean that they operate without a rack, but rather that they can be mounted in a customer-supplied 19-inch industry-standard rack.

Mesabi musings

IBM mainframe storage customers face the same set of challenges that face the non-mainframe crowd, if not more, because they typically have increased needs for data availability, reliability, performance and scalability, including strong security needs. Thus, these storage users must understand the changes that are happening around them, while at the same time preserving and modernizing their existing base.

Supporting the containerization capabilities of Red Hat OpenShift with IBM Cloud Paks for use with DS8900F storage systems (through the FlexVolume driver) meets the cloud-native functionality needed for the hybrid multicloud. The z15’s and IBM Storage’s core “encryption everywhere” is a critical solution for maximizing data security without impacting system performance. Safeguarded Copy and air gapping provide the much-needed cyber resiliency that enterprise customers have come to depend on.

All in all, with these newest IBM Storage offerings the company continues to extend sophisticated essential features and deliver enterprise-class solutions of critical importance to IBM Z mainframe customers.


Avid Labs Announces What Could Be a Game Changing COVID-19 Test

Avid Labs, an Indiana-based company focused on innovative product design, has just announced what could be a game-changing COVID-19 test that delivers results in as little as five minutes. This has major implications in both the short term and the long term. Details are sketchy, but portability is critical. FDA approval came last Friday, and the company promises to ship 50,000 kits per week starting this week.

Due to the shortage of commercial products, COVID-19 testing has been limited to those suspected of having the coronavirus and the results of the tests come back far too slowly. That is a critical issue since someone infected with COVID-19 can be infectious for days before displaying symptoms. Although the speed of testing is improving, many people, especially those at the frontline, including doctors and nurses, need to quickly determine whether or not they need to be self-quarantined to avoid spreading the virus to others.

Short-term, the objective is to scale the availability of tests to quickly determine who has the virus. As the number of kits expands rapidly beyond the triage level, a paradigm change needs to occur to prove that people are uninfected and are not a danger to others. This is a major shift in viewpoint and requires not only fast but also relatively inexpensive testing.

In such a scenario, a person who is tested could receive a badge with the time and date of the test, the test results and the length of time that the test is valid. That information could be entered into an online database for reference by authorized users and for analytical purposes.
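To make the badge idea concrete, here is a hypothetical sketch of such a record and its validity check; the field names and the validity rule are illustrative assumptions, not a description of any real system:

```python
# Hypothetical test-badge record; names and rules are illustrative only.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class TestBadge:
    tested_at: datetime
    result: str            # e.g. "negative"
    valid_for: timedelta   # how long the result is trusted

    def is_valid(self, now: datetime) -> bool:
        """Badge is valid only while a negative result is still fresh."""
        return self.result == "negative" and now - self.tested_at <= self.valid_for

badge = TestBadge(datetime(2020, 4, 1, 8, 0), "negative", timedelta(hours=72))
print(badge.is_valid(datetime(2020, 4, 2, 8, 0)))  # True: within 72 hours
print(badge.is_valid(datetime(2020, 4, 5, 8, 0)))  # False: expired
```

A database of such records would let an authorized gatekeeper, say at an airport boarding gate, answer the only question that matters: is this person's negative result still current?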

In the short term, frontline medical personnel would receive that information not only on themselves and their fellow coworkers, but also on anybody else they come in contact with. This can quickly be expanded to all medical personnel and facilities in order to get them back to as full operations as possible.

This approach also promises benefits outside the medical industry in areas including package delivery, warehouses and supply chains, and manufacturing/production.

Then we can consider the long term. The great barrier we face in resuming our normal lives is that, without social distancing, just one person out of a hundred or a thousand can restart the epidemic. Let’s say every person boarding a plane has to be tested. The same is true of a hotel or a cruise ship, or a factory, and so on. The goal here is to make sure that everyone tests negative.

Of course, cost is always an issue and who pays is always a concern. I suspect that we will find a way. Avid Labs could partner with other companies to manufacture the kits and thereby take advantage of the famous learning curve, whereby costs fall with increased volume.

Mesabi musings

Unfortunately, we have not seen all the horrors that the coronavirus pandemic is likely to impose upon us. However, rapid testing at a very large scale could help mitigate health concerns in the short-term and, in the long-term, might enable a faster and safer restart of the economy than would have otherwise been thought possible. Therefore, Avid Labs is to be commended for bringing this testing capability to the marketplace.


DH2i Helps Out During the COVID-19 Crisis with its Free Work at Home Software


The coronavirus has given a terrible new meaning to the term “March Madness.” As a major side effect of the COVID-19 pandemic, working at home, for those who can do so, is not an option but a mandatory requirement for the foreseeable future. Yet many organizations do not have the software in place to safely allow their employees to fully access from their homes all the applications and data that they can from their work computers. DH2i solves this problem with its DxOdyssey Work From Home (WFH) software offering, which it is making available free of charge to any organization, large or small, public or private.

What DH2i does

DH2i defines itself as making software-defined perimeter and Smart Availability software for Windows and Linux environments. In plain English, DH2i acts as a “hamburger helper” that enhances a current IT infrastructure to make it “always secure and always on.” DxOdyssey is a current member of its software portfolio, and DH2i is now focusing on its work-at-home capabilities.

How DH2i makes its DxOdyssey WFH Software available free of charge

Employees who want to install the software on their home PCs can do so via the DH2i Work From Home Client Portal on the Internet. The user first installs the DxOdyssey software client on the work computer and then on the home PC. The user then connects the two systems over the Internet using a one-time passkey that creates a secure network tunnel. DH2i does not consider this to be a VPN (virtual private network), but, in essence, DxOdyssey acts as one.

For those who do not have the budget or technical expertise to set up a VPN or other alternative, DxOdyssey offers an easy, secure way to access the same applications and data that they have permission to use on their work computer. However, DH2i would also argue that even those who have access to a VPN can benefit from using DxOdyssey to avoid what the company considers the inherent security risks of a VPN, such as potentially allowing unauthorized access to business and/or home networks. Of course, one would be wise to get the permission of the appropriate business managers and IT staff before using the software.

What DH2i is providing to DxOdyssey WFH users:

  • Free DxOdyssey software downloads until at least August 31, 2020 — at which time it will reevaluate the situation.
  • Free technical support — this should relieve the anxiety about any technical issues that one might have, such as setup and configuration.
  • No obligation download — no personal information is collected during the download; this means that users do not have to worry about unwanted sales communications.


Mesabi musings

Although many people have long been able to work at home using a computer (or been permitted or even encouraged to do so, either full or part time), many more are now being forced to do so because of the coronavirus crisis. Those who are new to the game may suffer anxiety in getting everything to work as it should. Many of those used to working at home may still benefit from a new way of connecting to their business applications or information, for better security or availability.

Of course, even as we all do our best to avoid the physical risks of actually contracting COVID-19, we will all suffer, in one way or another, emotionally and economically, from the disease. Enabling more people to work efficiently and effectively from home preserves productivity and jobs, reduces or prevents some job-loss anxiety, and makes a small positive statement about how we are trying to keep the economy rolling.

With those points in mind, DH2i deserves strong praise for making its DxOdyssey WFH software freely available.