Interest in AI has surged since 2020 and has dominated conversations across headlines and boardrooms ever since. So it’s unsurprising that business development has followed suit — 81% of IT leaders listed AI and machine learning as an important or top priority in their 2024 budgets, according to survey results in Confluent’s Data Streaming Report.
But is all this attention and investment leading to a near-term future where AI is ubiquitous and functions as intended? It all depends on whether businesses have a sufficient pipeline of engineers equipped with new skills, the right tools and trustworthy data to turn AI promises into real-world capabilities.
Let’s take a look at the AI trends that will have the biggest impact on engineering teams and our advice on how to overcome these challenges.
Large language models (LLMs) continue to follow a corollary to Moore’s Law, growing exponentially in the amount of training data they consume, the number of parameters that define them, and the size of the context window their attention can cover. Yet model interpretability remains elusive. LLMs are generally poor reasoning agents, so coupling them with machinery we already know works will get us a lot further in overcoming the hallucinations that result from relying on LLMs alone.
Developer impact
LLMs are stochastic in nature, while many traditional QA best practices assume the tested system is deterministic. Developers will have to rely on different approaches to test and build confidence in LLM-enabled applications. We can apply historically useful machine learning (ML) techniques and other technologies to measure output quality and minimize hallucinations. With application-specific guardrails in place, engineers can build LLM-enabled applications capable of reliably identifying when they're hallucinating or providing information with low confidence scores.
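One way to test a stochastic system is to sample it many times and assert on a pass rate rather than on a single exact output. The sketch below is illustrative only: `fake_model` is a hypothetical stand-in for a real LLM call, and the 0.8 acceptance bar is an assumed threshold, not a recommendation.

```python
import random
from typing import Callable


def pass_rate(generate: Callable[[str], str],
              check: Callable[[str], bool],
              prompt: str,
              n_samples: int = 200) -> float:
    """Return the fraction of sampled outputs that satisfy `check`."""
    passed = sum(check(generate(prompt)) for _ in range(n_samples))
    return passed / n_samples


# Hypothetical stand-in for an LLM: correct roughly 90% of the time.
_rng = random.Random(7)


def fake_model(prompt: str) -> str:
    return "Paris" if _rng.random() < 0.9 else "Lyon"


rate = pass_rate(fake_model, lambda out: out == "Paris", "Capital of France?")
meets_bar = rate >= 0.8  # assumed acceptance threshold for this sketch
```

In a real test suite, `check` would encode an application-specific guardrail (for example, a format validator or a comparison against a reference answer), and the threshold would be tuned to the risk of the use case.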
Compared to a general-purpose LLM, a small fine-tuned language model can often provide a better response for a specific task. However, developers need to feed it the right insights, such as events and timely, personalized data, so that it can succeed. One promising pattern developers can use to reduce the impact of hallucinations is retrieval-augmented generation (RAG), which couples prompts at inference time with relevant domain-specific information.
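The RAG pattern can be sketched in a few lines. This is a toy illustration under loud assumptions: retrieval here is a simple word-overlap score over an in-memory list, whereas production systems typically use embeddings and a vector store. The names `retrieve`, `build_prompt`, and `DOCS` are hypothetical.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of words."""
    for ch in "?.,":
        text = text.replace(ch, " ")
    return set(text.lower().split())


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Couple the question with retrieved context at inference time."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical domain-specific snippets.
DOCS = [
    "Order 1042 shipped on Tuesday and is due Friday.",
    "Refunds are processed within five business days.",
    "The loyalty program awards one point per dollar spent.",
]

prompt = build_prompt("When is order 1042 due?", DOCS, k=1)
```

The resulting `prompt` would then be sent to the model, so the answer is grounded in the retrieved text rather than in the model's parameters alone.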
Agentic AI systems promise to make decisions and act independently on behalf of specific business functions, teams, and even individuals within the business. However, as AI models become more sophisticated, they tend to lose transparency, and that introduces difficult questions for engineering teams to answer when building and deploying AI agents.
Developer impact
Compared to less automated systems, it’s much more difficult to recognize errors in AI before dependent systems consume the outputs. Implementing RAG with real-time data pipelines can help augment agentic AI solutions with important context to improve environmental awareness and decision-making.
As these solutions go from conception to development and production, more organizations will need a data-streaming platform (DSP) complete with streaming, processing, and governance capabilities to sustainably build and scale these capabilities in the long term. Event-driven architectures enabled with a DSP provide an implementation framework for agentic systems, modeling them as asynchronous workflows of composable microservices. This approach promotes the reusability of individual components of agentic systems and makes the larger systems easier to analyze and scale than if they were built as large monoliths.
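To make the idea of an asynchronous workflow of composable steps concrete, here is a minimal sketch. It is an assumption-heavy illustration: in-process `asyncio.Queue` objects stand in for topics on a streaming platform, and the `enrich` and `decide` handlers are hypothetical business logic.

```python
import asyncio


async def stage(handler, inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Generic worker: read events, apply a handler, publish the result."""
    while True:
        event = await inbox.get()
        if event is None:  # sentinel: pass shutdown downstream and stop
            await outbox.put(None)
            return
        await outbox.put(handler(event))


def enrich(event: dict) -> dict:
    # Hypothetical enrichment step that adds context to the raw event.
    tier = "gold" if event["spend"] > 1000 else "standard"
    return {**event, "customer_tier": tier}


def decide(event: dict) -> dict:
    # Hypothetical decision step acting on the enriched event.
    action = "offer_upgrade" if event["customer_tier"] == "gold" else "no_action"
    return {**event, "action": action}


async def run_pipeline(events: list) -> list:
    # Queues play the role of topics between independent services.
    raw, enriched, decisions = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    workers = [
        asyncio.create_task(stage(enrich, raw, enriched)),
        asyncio.create_task(stage(decide, enriched, decisions)),
    ]
    for event in events:
        await raw.put(event)
    await raw.put(None)

    results = []
    while (item := await decisions.get()) is not None:
        results.append(item)
    await asyncio.gather(*workers)
    return results


results = asyncio.run(run_pipeline([{"id": 1, "spend": 1500},
                                    {"id": 2, "spend": 20}]))
```

Because each stage only knows its inbox and outbox, a stage can be replaced, reused in another workflow, or scaled independently, which is the property the monolith lacks.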
Rising demand for dynamic or real-time data access isn’t unique to AI/ML initiatives, but has contributed to the growth of real-time intelligence. Over the last decade, engineering teams have increasingly used open source streaming engines like Apache Kafka® and Apache Flink® to power real-time recommendations, predictions and anomaly detection.
Developer impact
This trend will also affect the infrastructure and teams behind these projects. The shift to real-time data access will allow for more flexible and dynamic data organization, enabling human users, chatbots and even AI agents to quickly access and query a wide range of data.
Companies looking for ways to reduce complexity and costs when building real-time AI solutions need to shift data processing left, and use data contracts to enable dynamic access to trustworthy data products. The resulting data products can be consumed as either data streams or open table formats. This approach allows data teams to facilitate efficient data processing, supplying engineers with data in a clean, consistent format, and enabling them to build dynamic AI applications with more confidence and less risk.
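As a rough illustration of a data contract applied early in the pipeline, the sketch below checks each event against an agreed schema and sets aside the ones that violate it. The `CONTRACT` fields and the helper names are hypothetical; real deployments would more likely rely on a schema registry and formats such as Avro, Protobuf, or JSON Schema.

```python
from typing import Dict, List, Tuple

# Hypothetical contract: field name -> required Python type.
CONTRACT: Dict[str, type] = {"order_id": int, "amount": float, "currency": str}


def violations(event: dict, contract: Dict[str, type]) -> List[str]:
    """List every way the event breaks the contract (empty if valid)."""
    problems = []
    for field, expected in contract.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            actual = type(event[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


def apply_contract(events: List[dict],
                   contract: Dict[str, type]) -> Tuple[List[dict], List[tuple]]:
    """Split events into valid ones and (event, problems) rejects."""
    valid, rejected = [], []
    for event in events:
        problems = violations(event, contract)
        if problems:
            rejected.append((event, problems))
        else:
            valid.append(event)
    return valid, rejected


valid, rejected = apply_contract(
    [
        {"order_id": 1, "amount": 19.99, "currency": "USD"},
        {"order_id": "2", "amount": 5.0, "currency": "EUR"},  # wrong type
        {"order_id": 3, "currency": "GBP"},                   # missing amount
    ],
    CONTRACT,
)
```

Validating at the source means downstream consumers, including AI applications, only ever see events that already match the agreed shape.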
However, providing engineers with trustworthy data isn’t enough to ensure AI initiatives succeed. Leaders also need to motivate experienced engineers to train and mentor junior team members, and give them the time, resources and support to focus on building differentiated applications.
Data engineers can use LLMs and GenAI tools to develop their prompt engineering skills and increase their familiarity with coding templates. It can help if engineers stay sharp on computer science fundamentals, increase their proficiency in popular languages in the data and AI/ML spaces like Python and Java, and understand real-time data processing and event guarantees.
This article originally appeared on The New Stack.