Kevin Cochrane, CMO of Vultr, spotlights seven trends to watch in 2024 as enterprises operationalize generative AI on a global scale.
Predictive AI has already been forcing a remaking of IT operations in every enterprise and every sector, and the emergence of generative AI over the past year has accelerated that transformation. In 2023, the world got a taste of what generative AI can do. Now the technology has been turned over to the global community of innovators, who will dream up new incarnations of large language models (LLMs) and generative AI applications that will fundamentally reimagine every facet of business and society.
In 2024, digital startups and enterprises alike will move beyond the art of the possible and start building practical roadmaps for establishing large language model operations (LLMOps) across the enterprise. The traditional IT stakeholders, including the CIO, CTO, and CISO, will partner with the Chief Data Officer (CDO) and/or the Chief Data Science Officer (CDSO) to fundamentally rethink all aspects of IT operations to accommodate the rising prominence of both MLOps and LLMOps.
Here's how we expect to see that accelerating transformation play out in the coming year.
1. Generative AI will transform every web and mobile application.
The use of GenAI to power new customer and employee experiences will move into the mainstream. Enterprises will start embedding GenAI to create new modes of interaction across industries worldwide.
2. Open source will accelerate LLM innovation to scale AI.
For organizations looking to harness the power of AI (GenAI included), there will be a shift from big, monolithic LLM clusters to smaller, highly specialized, open source LLMs. The cost to train and run LLMs will be a major driver of the shift, but it is not the only impetus. Smaller, specialized LLMs with domain-specific tuning will find favor with enterprise data science teams looking for better accuracy in specific use cases than monolithic, all-in-one LLMs can provide.
3. Enterprise architects will control the future of generative AI.
CIOs and enterprise architects who are rethinking application stacks will begin to reshape their infrastructure stacks around the principles of composability. The only way enterprises can keep up with the relentless pace of innovation in the technologies underpinning generative AI will be to choose modular, atomic, orchestratable GPU stack components that can be quickly, easily, and cost-effectively added and replaced as business requirements or component capabilities change.
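As a rough illustration of what composability can look like in application code, the sketch below hides the model behind a small interface so that one component can be swapped for another without touching the rest of the application. Every name here (`TextGenerator`, `EchoModel`, `UppercaseModel`, `Application`) is hypothetical and invented for this example; the two "models" are trivial stand-ins, not real LLM clients.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """Hypothetical interface that any swappable model component satisfies."""

    def generate(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in component: returns the prompt with a tag in front."""

    def generate(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class UppercaseModel:
    """A second stand-in component with different behavior."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


class Application:
    """Depends only on the interface, never on a concrete model class."""

    def __init__(self, generator: TextGenerator) -> None:
        self.generator = generator

    def answer(self, question: str) -> str:
        return self.generator.generate(question)


if __name__ == "__main__":
    app = Application(EchoModel())
    print(app.answer("hello"))

    # Replacing the component is one assignment; Application is unchanged.
    app.generator = UppercaseModel()
    print(app.answer("hello"))
```

The point of the design is that `Application` never names a concrete model, so replacing a component as requirements or capabilities change stays a local change.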
4. AI and machine learning will go full cloud-native.
CPU-based Kubernetes clusters will be extended for containerized services and models to run on GPU-based clusters. Principles of cloud engineering and architecture will be transferred to the GPU world so that developers who are already using Kubernetes and container registries for app development on CPUs will be empowered to shift their development efforts to GPUs.
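For developers who already deploy containers to Kubernetes, the move to GPUs can be a small change to the pod specification. The Python sketch below builds a pod manifest that requests a GPU. It assumes the cluster runs the NVIDIA device plugin, which exposes GPUs as the `nvidia.com/gpu` extended resource; the function name, pod name, and image URL are hypothetical placeholders.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Kubernetes Pod manifest that requests `gpus` NVIDIA GPUs.

    Assumes the NVIDIA device plugin is installed on the cluster, so GPUs
    are schedulable as the extended resource `nvidia.com/gpu`.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested under `limits`, like other
                    # extended resources.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, used only for illustration.
    manifest = gpu_pod_manifest(
        "llm-inference", "registry.example.com/llm-server:latest"
    )
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on a cluster configured as assumed above.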
5. The CISO will step in to help govern compliance and proper use of data and AI/ML.
While more than capable of functioning within the DevOps paradigm, data scientists are not security experts. Data scientists should not be the only ones thinking about whether or not the data being used to train AI/ML models contains bias. Rather, the onus is on the CISO to step in and establish standard governance and compliance policies around integrating security into AI/ML pipelines and models to avoid model bias.
6. Data and data storage strategies will be reimagined.
Data is everywhere, so enterprises need to radically rethink the underlying infrastructure supporting it. Datasets are too large, transfer costs are too great, and the potential misuse of data continues to expand. As a result, enterprises will take a new approach to sourcing, securing, transferring, and ensuring governance and compliance of the large-scale datasets needed to power future AI/ML applications.
7. Content delivery networks (CDNs) will be replaced by cloud compute.
In the early days of the web, CDNs were all about caching content at the edge. Looking ahead, that content will be generated on the fly at the edge, personalized to a specific user by generative AI. There will be millions of autonomous agents automatically assembling context from an incoming user request and generating content accordingly. CDNs will be transformed into inference-optimized GPU compute resources to power new AI-driven, cloud-native applications.
This blog was originally published by David Marshall on VMBlog.com.