Storage technology explained: AI and data storage

Artificial intelligence (AI) and machine learning (ML) have changed the automation landscape and the fundamentals of IT. Antony Adshead, Storage Editor at Computer Weekly, shares a post detailing storage technology and AI’s effects on data storage. Storage is a key part of AI, Adshead notes: it supplies data for training, stores the potentially huge volumes of data generated, and supports inference, when the results of AI are applied to real-world workloads.

What are the key features of AI workloads? “There are three key phases and deployment types to AI workloads:

  1. Training, where recognition is worked into the algorithm from the AI model dataset, with varying degrees of human supervision;
  2. Inference, during which the patterns identified in the training phase are put to work, either in standalone AI deployments and/or;
  3. Deployment of AI to an application or sets of applications.”

What are the I/O characteristics of AI workloads? “Training and inferencing in AI workloads usually requires massively parallel processing, using graphics processing units (GPUs) or similar hardware that offload processing from central processing units (CPUs). Processing performance needs to be exceptional to handle AI training and inference in a reasonable timeframe and with as many iterations as possible to maximize quality. Infrastructure also potentially needs to be able to scale massively to handle very large training datasets and outputs from training and inference. It also requires speed of I/O between storage and processing, and potentially also to be able to manage portability of data between locations to enable the most efficient processing.”
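To make the I/O requirement concrete, here is a minimal back-of-the-envelope sketch of the read throughput storage would need to sustain so that GPUs are not left waiting for data. The function name and every figure (GPU count, samples per second, sample size) are hypothetical assumptions for illustration and do not come from the article.

```python
def required_read_throughput_gbps(num_gpus, samples_per_sec_per_gpu, avg_sample_bytes):
    """Aggregate read throughput in GB/s needed to keep every GPU fed.

    Simplifying assumption: each GPU reads every sample from storage once,
    with no caching and no overlap between GPUs.
    """
    bytes_per_sec = num_gpus * samples_per_sec_per_gpu * avg_sample_bytes
    return bytes_per_sec / 1e9


# Hypothetical cluster: 64 GPUs, each consuming 2,000 samples/s of 150 KB each.
throughput = required_read_throughput_gbps(64, 2_000, 150_000)
print(f"{throughput:.1f} GB/s")  # prints "19.2 GB/s"
```

Real pipelines cache, prefetch, and compress, so treat the result as a rough upper bound for sizing discussions rather than a measurement.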

What kind of storage do AI workloads need? “The task of storage is to supply those GPUs [with data] as quickly as possible to ensure these very costly hardware items are used optimally. More often than not, that means flash storage for low latency in I/O. Capacity required will vary according to the scale of workloads and the likely scale of the results of AI processing, but hundreds of terabytes, even petabytes, is likely.

Storage for AI projects will range from that which provides very high performance during training and inference to various forms of longer-term retention because it won’t always be clear at the outset of an AI project what data will be useful.”
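As a rough illustration of how capacity might split between a high-performance tier and longer-term retention, the sketch below adds up a few contributors. The `Project` fields, the assignment of data to tiers, and all of the numbers are hypothetical assumptions, not figures from the article.

```python
from dataclasses import dataclass


@dataclass
class Project:
    dataset_tb: float        # training dataset held on fast storage
    checkpoint_tb: float     # size of one model checkpoint
    checkpoints_kept: int    # checkpoints retained during training
    daily_output_tb: float   # inference results written per day
    retention_days: int      # how long results are kept


def capacity_by_tier(p):
    """Estimate capacity (TB) for a flash tier and a retention tier.

    Assumption: dataset and checkpoints live on flash, while inference
    outputs move to a cheaper long-term tier.
    """
    flash_tb = p.dataset_tb + p.checkpoint_tb * p.checkpoints_kept
    retention_tb = p.daily_output_tb * p.retention_days
    return {"flash_tb": flash_tb, "retention_tb": retention_tb}


# Hypothetical project: 300 TB dataset, 20 checkpoints of 2 TB each,
# 0.5 TB of inference output per day kept for one year.
print(capacity_by_tier(Project(300, 2, 20, 0.5, 365)))
# prints {'flash_tb': 340, 'retention_tb': 182.5}
```

Even with these modest made-up inputs the total passes 500 TB, which is consistent with the "hundreds of terabytes" scale the article describes.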

Is cloud storage good for AI workloads? “Cloud storage could be a viable consideration for AI workload data. The advantage of holding data in the cloud brings an element of portability, with data able to be ‘moved’ nearer to its processing location. Many AI projects start in the cloud because you can use the GPUs for the time you need them. The cloud is not cheap, but to deploy hardware on-premise, you need to have committed to a production project before it is justified.”
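The cloud versus on-premise trade-off can be framed as a simple break-even calculation. The following is a minimal sketch; the function name and all cost figures are hypothetical assumptions chosen only to show the arithmetic.

```python
def breakeven_months(cloud_cost_per_month, onprem_upfront, onprem_cost_per_month):
    """Months of sustained use after which on-premise hardware becomes cheaper.

    Returns None when the cloud is no more expensive per month, in which
    case the upfront purchase never pays for itself.
    """
    monthly_saving = cloud_cost_per_month - onprem_cost_per_month
    if monthly_saving <= 0:
        return None
    return onprem_upfront / monthly_saving


# Hypothetical figures: 40,000 per month in the cloud versus a 600,000
# hardware purchase plus 10,000 per month to run it.
print(breakeven_months(40_000, 600_000, 10_000))  # prints 20.0
```

Under these assumed numbers the hardware only pays off after 20 months of steady use, which illustrates why a committed production project is usually needed to justify it.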

 

