How To Unlock Better Data Analytics with AI Agents
The new year brings with it new use cases. The speed with which the data industry evolves is incredible. It seems LLMs only appeared on the wider scene a year ago, yet we already have a plethora of exciting applications for them across multiple businesses, ranging from automation of admin tasks to data analytics. I was at a Microsoft partner event the other week, and companies are running dozens of Copilot proofs of concept (PoCs). While there are still few full-scale implementations of AI, it is just a matter of time before it goes mainstream.
Naturally, we are fully occupied helping to bring the power of AI to “business as usual” (BAU). The hardest part now is converting AI PoCs into production applications. There are numerous issues standing in the way: technical, organisational, financial, and regulatory. Working through them all is not simple (and some will remain challenging for years to come).
Technical Challenges
Scalability Issues: Pilots are easy. AI pilots often run on limited datasets in controlled environments. Scaling them to real-world applications requires more data, infrastructure, automation and performance optimisation.
Data Quality & Availability: Many pilots succeed because they rely on curated data. In BAU, maintaining high-quality, real-time, and unbiased data becomes a challenge.
Integration with Legacy Systems: Many companies operate on legacy software that may not be compatible with AI solutions, usually requiring costly modifications.
Model Drift: AI models need continuous monitoring and retraining as business conditions and data patterns change over time.
Organisational Challenges
Lack of AI Literacy: Employees resist AI adoption due to the fear of job displacement or a lack of understanding of AI benefits. Fear is a powerful blocker.
Cultural Resistance: Shifting from traditional processes to AI-driven decision-making requires a change in mindset and company culture. It’s a classic case of a technological revolution and people are naturally resistant to the change. They don’t understand how the AI works “under the bonnet”, so they view it with suspicion.
Limited Talent & Expertise: AI requires skilled professionals for deployment and maintenance, but talent shortages slow down implementation. It is becoming easier to work with AI as the tooling catches up. But talent shortages will continue to bite for the foreseeable future.
Financial & Resource Constraints
High Implementation Costs: Scaling AI solutions often requires significant investments in cloud computing, data infrastructure, and cybersecurity. CFOs quickly change their minds when they see the numbers. The good news is that AI will get cheaper to run as the tech becomes better optimised.
Unclear ROI: AI pilots are usually low-cost experiments, but proving long-term ROI for full deployment is complex and uncertain. As much as I am a proponent of the tech, a lot of pilots currently appear to be solutions in search of a problem. Until the use cases become clear, the ROI story will remain dubious.
Ongoing Maintenance Costs: AI systems need continuous monitoring, retraining, and updates, which can be costly. You either have to do it in-house, meaning expensive talent, or you outsource it to specialist houses (not cheap either).
Regulatory & Compliance Issues
Data Privacy & Security: AI often relies on sensitive user data, which must comply with GDPR, HIPAA, or local regulations. These are huge concerns for banking, insurance and healthcare industries at the moment.
Ethical & Bias Concerns: AI systems must be fair and unbiased, and companies need governance structures to prevent unintended discrimination.
Industry-Specific Regulations: In sectors like healthcare and finance, AI models require regulatory approval (multiple layers of such), slowing down deployment.
Business Process Alignment
Defining Clear Use Cases: Many AI pilots are exploratory, but moving to BAU requires well-defined business objectives and alignment with strategic goals.
Ownership & Accountability: Who owns the AI system once it becomes BAU? We see many companies struggle with defining responsibility between IT, data science teams, and business units.
User Adoption & Trust: Employees and customers need confidence in AI-driven decisions. Transparent explanations and user-friendly interfaces are critical. No one likes a “black box” answer, especially if you then need to clearly justify the answers.
The IOblend AI agent use case
We are working to help companies navigate the transition of AI into BAU. Naturally, we focus on making the application of AI simpler, regardless of the use case. We are a data integration company at our core, so we focus on the technical parts. If we can take away the cost and time-to-value obstacles, we can accelerate the cultural and regulatory journeys too. That’s the idea.
I am a practical guy. Theory is exciting and all, but unless I can see AI work in practice, I won’t be buying it. And neither will many CFOs, incidentally. With that view as our guiding principle, we had to make sure our customers could see immediate benefits.
As a data integrator, we applied the power of LLMs to the ETL process. Our latest use case focused on combining structured data from an Oracle database and three operational systems with a document repository (essentially X12 messaging and PDF contracts) to provide a clear financial overview of the operations.
Embedding AI into the ETL
The brilliance of IOblend is that, by design, it allows the implementation of all sorts of stored procedures using a Python script. This means we can embed AI agents directly into the data pipeline. As the diagram above shows, we pull the data from the unstructured documents and the various systems, then validate and transform it on the fly. In effect, we do the data grounding automatically.
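To make that a bit more tangible, here is a minimal sketch of what an embedded agent step could look like as a plain Python stored procedure. This is not IOblend’s actual interface; the function names, fields and stubbed LLM call are purely illustrative.

```python
# Illustrative sketch only: not IOblend's real stored-procedure API.
# All names (process_document, llm_extract, REQUIRED_FIELDS) are hypothetical.
import json

REQUIRED_FIELDS = {"contract_id", "counterparty", "start_date", "total_value"}

def llm_extract(document_text: str) -> dict:
    """Placeholder for the LLM call that pulls structured fields out of a contract.
    In production this would invoke the chosen LLM with an extraction prompt."""
    # Stubbed response so the sketch runs end to end.
    return {
        "contract_id": "C-1001",
        "counterparty": "Acme Ltd",
        "start_date": "2024-01-15",
        "total_value": 250000.0,
    }

def process_document(record: dict) -> dict:
    """Stored-procedure-style step: enrich a pipeline record with fields
    extracted from the unstructured document it points to."""
    fields = llm_extract(record["document_text"])
    missing = REQUIRED_FIELDS - fields.keys()
    record["extracted"] = fields
    record["quarantine"] = bool(missing)  # flag incomplete extractions for an SME
    return record

if __name__ == "__main__":
    sample = {"document_id": "doc-42", "document_text": "…contract text…"}
    print(json.dumps(process_document(sample), indent=2))
```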
We invoke a stored procedure that calls a specific LLM to find and analyse an unstructured PDF. The LLM then runs the extraction logic to pull structured data out of the document, based on the agentic parameters. The results are plugged back into the ETL pipeline to perform validations. Abnormalities (~5% of records) are quarantined for an SME to check.
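The validation step can be thought of along these lines. Again, a hedged sketch rather than the production rules: the tolerance value and field names are made up for illustration, and the real pipeline cross-checks far more than two fields.

```python
# Illustrative validation of LLM-extracted fields against a booked ledger row.
TOLERANCE = 0.01  # allow 1% variance between extracted and booked values (example only)

def validate_against_ledger(extracted: dict, ledger_row: dict) -> list[str]:
    """Return a list of validation failures; an empty list means the record passes."""
    issues = []
    if extracted["contract_id"] != ledger_row["contract_id"]:
        issues.append("contract_id mismatch")
    booked = ledger_row["booked_value"]
    if booked and abs(extracted["total_value"] - booked) / booked > TOLERANCE:
        issues.append("total_value outside tolerance")
    return issues

record = {"contract_id": "C-1001", "total_value": 250000.0}
ledger = {"contract_id": "C-1001", "booked_value": 248000.0}
failures = validate_against_ledger(record, ledger)
print("quarantine" if failures else "pass", failures)
```

Anything that fails these checks goes to the quarantine queue for an SME to review, as described above.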
We then enrich the output with historical and operational data to create a rounded picture. All of this is event-based and runs on CDC (change data capture). For this client, we output to the cloud for archiving (AWS, in JSON format) and to an operational system as a packaged data product for immediate consumption. We then have another AI agent running to enhance the work of the analysts.
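The sink side of the flow is conceptually simple. The sketch below assumes AWS S3 (via boto3) for the JSON archive and a REST endpoint for the operational system; the bucket, key pattern and URL are hypothetical stand-ins, not the client’s setup.

```python
# Illustrative sink step: archive to S3 as JSON and push a data product to an
# operational system. Bucket name, key pattern and URL are hypothetical.
import json
import boto3
import requests

s3 = boto3.client("s3")

def publish(record: dict) -> None:
    key = f"contracts/{record['contract_id']}.json"
    # Archive the enriched record to S3 in JSON format.
    s3.put_object(Bucket="example-archive-bucket", Key=key,
                  Body=json.dumps(record).encode("utf-8"))
    # Hand the packaged record to the operational system for immediate use.
    requests.post("https://ops.example.internal/api/contracts",
                  json=record, timeout=10)
```

In the real pipeline this step fires per CDC event rather than on a schedule, which is what keeps the downstream view current.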
The benefits of using AI agents as part of the data integration process
We achieved several benefits here that contributed directly to a positive ROI:
- By embedding the AI agent into the data integration process, we saved employees many hours of sifting through contract information. We tuned the pipeline to validate the output thoroughly so everyone was comfortable with a reliable (and repeatable) result. Any abnormalities are flagged and quarantined for an SME to investigate.
- We improved the security of document handling, as access was now strictly controlled.
- We integrated the information from the unstructured documents with the structured data on the fly, removing the need for post-processing in a staging layer. We used all our usual in-built data management, quality and validation steps. This saved developer time and improved the reliability of the output.
- We then made the output easily digestible by the SMEs via another AI agent, removing the need for complex prompt engineering (a minimal sketch of the idea follows below). This saved SME time and unlocked new ways to view and analyse the data.
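Here is the sketch referenced in the last point: the agent wraps the analyst’s plain-English question with the already-validated data product, so the analyst never writes a prompt. The ask_llm call and the in-line data are placeholders, not the actual implementation.

```python
# Illustrative analyst-facing agent: the user asks a plain question and the
# agent builds the grounded prompt itself. ask_llm is a stand-in for the real LLM call.
import json

def ask_llm(prompt: str) -> str:
    """Placeholder for the real LLM call."""
    return "Total contracted value for Acme Ltd: 250,000."

def answer(question: str, data_product: list[dict]) -> str:
    """Wrap the user's question with the relevant, already-validated data
    so the model answers from grounded facts rather than from memory."""
    context = json.dumps(data_product[:50])  # cap the context for the sketch
    prompt = (
        "Answer using only the data below. Cite the contract_id for each figure.\n"
        f"Data: {context}\nQuestion: {question}"
    )
    return ask_llm(prompt)

print(answer("What is the total contracted value for Acme Ltd?",
             [{"contract_id": "C-1001", "counterparty": "Acme Ltd", "total_value": 250000.0}]))
```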
The other benefit was the shift in attitudes towards AI. Folks who had eyed the tech dismissively were now at least more curious. It’s a small step, I know, but a step in the right direction. They have already inquired about further AI applications within their own functions.
In conclusion
What people want to see is a “helpful colleague” AI, not something they are scared of or get easily frustrated with. They do not want to constantly write prompts, or even learn how to write them correctly – they don’t have time for that. But they do want to use the tech if they can access it more easily. They don’t want to constantly question the outputs either (poor prompts will do that), but they do want oversight of the final output. They are actually quite happy to do a visual scan of the results, even if only to confirm that people are still making the ultimate decisions. If we can enable them to use AI this way, it’s all for the better.
Oh, and BTW, we did the above integration in three weeks. In prod. Once the LLM is trained on the client data, the integration job takes no time at all with IOblend. If you deliver fast results, people take notice.
If you want to learn more about how we do data integration and can help you, do feel free to get in touch. We love what we do and want to make complex data integration simple for all.
IOblend presents a ground-breaking approach to IoT and data integration, revolutionizing the way businesses handle their data. It’s an all-in-one data integration accelerator, boasting real-time, production-grade, managed Apache Spark™ data pipelines that can be set up in mere minutes. This facilitates a massive acceleration in data migration projects, whether from on-prem to cloud or between clouds, thanks to its low code/no code development and automated data management and governance.
IOblend also simplifies the integration of streaming and batch data through Kappa architecture, significantly boosting the efficiency of operational analytics and MLOps. Its system enables the robust and cost-effective delivery of both centralized and federated data architectures, with low latency and massively parallelized data processing, capable of handling over 10 million transactions per second. Additionally, IOblend integrates seamlessly with leading cloud services like Snowflake and Microsoft Azure, underscoring its versatility and broad applicability in various data environments.
At its core, IOblend is an end-to-end enterprise data integration solution built with DataOps capability. It stands out as a versatile ETL product for building and managing data estates with high-grade data flows. The platform powers operational analytics and AI initiatives, drastically reducing the costs and development efforts associated with data projects and data science ventures. It’s engineered to connect to any source, perform in-memory transformations of streaming and batch data, and direct the results to any destination with minimal effort.
IOblend’s use cases are diverse and impactful. It streams live data from factories to automated forecasting models and channels data from IoT sensors to real-time monitoring applications, enabling automated decision-making based on live inputs and historical statistics. Additionally, it handles the movement of production-grade streaming and batch data to and from cloud data warehouses and lakes, powers data exchanges, and feeds applications with data that adheres to complex business rules and governance policies.
The platform comprises two core components: the IOblend Designer and the IOblend Engine. The IOblend Designer is a desktop GUI used for designing, building, and testing data pipeline DAGs, producing metadata that describes the data pipelines. The IOblend Engine, the heart of the system, converts this metadata into Spark streaming jobs executed on any Spark cluster. Available in Developer and Enterprise suites, IOblend supports both local and remote engine operations, catering to a wide range of development and operational needs. It also facilitates collaborative development and pipeline versioning, making it a robust tool for modern data management and analytics.

Unify Clinical & Financial Data to Cut Readmissions
Clinical-Financial Synergy: The Seamless Integration of Clinical and Financial Data to Minimise Readmissions 🚑 Did You Know? Unnecessary hospital readmissions within 30 days represent a colossal financial burden, often reflecting suboptimal transitional care. Clinical-Financial Synergy: The Seamless Integration of Clinical and Financial Data to Minimise Readmissions The Convergence of Clinical and Financial Data The convergence of clinical and financial

Agentic Pipelines and Real-Time Data with Guardrails
The New Era of ETL: Agentic Pipelines and Real-Time Data with Guardrails For years, ETL meant one thing — moving and transforming data in predictable, scheduled batches, often using a multitude of complementary tools. It was practical, reliable, and familiar. But in 2025, well, that’s no longer enough. Let’s have a look at the shift

Real-Time Insurance Claims with CDC and Spark
From Batch to Real-Time: Accelerating Insurance Claims Processing with CDC and Spark 💼 Did you know? In the insurance sector, the move from overnight batch processing to real-time stream processing has been shown to reduce the average claims settlement time from several days to under an hour in highly automated systems. Real-Time Data and Insurance

Agentic AI: The New Standard for ETL Governance
Autonomous Finance: Agentic AI as the New Standard for ETL Governance and Resilience 📌 Did You Know? Autonomous data quality agents deployed by leading financial institutions have been shown to proactively detect and correct up to 95% of critical data quality issues. The Agentic AI Concept Agentic Artificial Intelligence (AI) represents the progression beyond simple prompt-and-response

IOblend: Simplifying Feature Stores for Modern MLOps
IOblend: Simplifying Feature Stores for Modern MLOps Feature stores emerged to solve a real challenge in machine learning: managing features across models, maintaining consistency between training and inference, and ensuring proper governance. To meet this need, many solutions introduced new infrastructure layers—Redis, DynamoDB, Feast-style APIs, and others. While these tools provided powerful capabilities, they also

Rethinking the Feature Store concept for MLOps
Rethinking the Feature Store concept for MLOps Today we talk about Feature Stores. The recent Databricks acquisition of Tecton raised an interesting question for us: can we make a feature store work with any infra just as easily as a dedicated system using IOblend? Let’s have a look. How a Feature Store Works Today Machine

