Some cloud-based AI systems are returning to on-premises data centers

As a concept, artificial intelligence is very old. My first job out of college almost 40 years ago was as an AI systems developer using Lisp. Many of the concepts from back then are still in use today. However, it’s about a thousand times less expensive now to build, deploy, and operate AI systems for any number of business purposes.

Cloud computing revolutionized AI and machine learning, not because the hyperscalers invented it but because they made it affordable. Nevertheless, I and some others are seeing a shift in thinking about where to host AI/ML processing and AI/ML-coupled data. Using the public cloud providers was pretty much a no-brainer for the past few years. These days, the value of hosting AI/ML and the needed data on public cloud providers is being called into question. Why?

Cost, of course. Many businesses have built game-changing AI/ML systems in the cloud, and when they get the cloud bills at the end of the month, they quickly understand that hosting AI/ML systems, including terabytes or petabytes of data, is pricey. Moreover, data egress costs (what you pay to move data out of your cloud provider to your data center or another cloud provider) will run up that bill significantly.

Companies are looking at other, more cost-effective options, including managed service providers and co-location providers (colos), or even moving those systems to the old server room down the hall. This last group is returning to “owned platforms” largely for two reasons.

First, the cost of traditional compute and storage equipment has fallen a great deal in the past five years or so. If you’ve never used anything but cloud-based systems, let me explain. We used to go into rooms called data centers where we could physically touch our computing equipment—equipment that we had to purchase outright before we could use it. I’m only half kidding.

When it comes down to renting versus buying, many are finding that traditional approaches, including the burden of maintaining your own hardware and software, are actually much cheaper than the ever-increasing cloud bills.
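To make the rent-versus-buy math concrete, here is a minimal sketch of the kind of back-of-the-envelope comparison I'm describing. Every dollar figure here is a hypothetical assumption for illustration, not real pricing from any provider; plug in your own cloud bill, hardware quotes, and operational costs.

```python
# Hypothetical rent-vs-buy comparison for an AI/ML workload.
# All figures are illustrative assumptions, not real cloud or hardware pricing.

def annual_cloud_cost(compute_per_month, storage_per_month, egress_per_month):
    """Total yearly spend on a public cloud provider."""
    return 12 * (compute_per_month + storage_per_month + egress_per_month)

def annual_onprem_cost(hardware_capex, amortization_years, ops_per_year):
    """Hardware amortized over its useful life, plus power, staff, and maintenance."""
    return hardware_capex / amortization_years + ops_per_year

# Assumed monthly cloud spend for a data-heavy AI/ML system.
cloud = annual_cloud_cost(compute_per_month=40_000,
                          storage_per_month=15_000,
                          egress_per_month=5_000)

# Assumed owned-platform costs: servers and storage bought outright,
# amortized over four years, plus yearly operational overhead.
onprem = annual_onprem_cost(hardware_capex=600_000,
                            amortization_years=4,
                            ops_per_year=200_000)

print(f"Cloud:   ${cloud:,.0f}/year")
print(f"On-prem: ${onprem:,.0f}/year")
print(f"Savings: {100 * (cloud - onprem) / cloud:.0f}%")
```

With these assumed numbers, the owned platform comes in around half the cloud bill, which is the order of savings some enterprises are reporting. The comparison is only as good as its inputs: egress volume, hardware refresh cycles, and staffing costs swing the result considerably.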

Second, many are experiencing latency with cloud-based systems. The slowdowns happen because most enterprises consume cloud-based systems over the open internet, and the multitenancy model means you're sharing processors and storage systems with many others at the same time. Occasional latency can translate into many thousands of dollars of lost revenue a year, depending on what your specific AI/ML system is doing.

Many of the AI/ML systems that are available from cloud providers are also available on traditional systems. Migrating from a cloud provider to a local server is cheaper and faster, and more akin to a lift-and-shift process, if you’re not locked into an AI/ML system that only runs on a single cloud provider.

What’s the bottom line here? Cloud computing will continue to grow; traditional computing systems whose hardware we own and maintain, not as much. That trend won’t slow down. However, some systems, especially AI/ML systems that consume large amounts of data and processing and happen to be latency sensitive, won’t be as cost-effective in the cloud. The same could be true for some larger analytical applications such as data lakes and data lakehouses.

Some could save half the yearly cost of hosting on a public cloud provider by repatriating the AI/ML system on-premises. That business case is just too compelling to ignore, and many won’t.

Cloud computing prices may come down to accommodate the workloads that are now cost-prohibitive to run on public cloud providers. Indeed, many such workloads may not be built there in the first place, which is what I suspect is happening now. It is no longer always a no-brainer to leverage cloud for AI/ML.
