How To Unlock Better Data Analytics with AI Agents
The new year brings with it new use cases. The speed with which the data industry evolves is incredible. It seems LLMs only appeared on the wider scene a year ago, yet we already have a plethora of exciting applications for them across multiple businesses, ranging from automation of admin tasks to data analytics. I was at a Microsoft partner event the other week, and companies are running dozens of Copilot proofs of concept (PoCs). While there are still few full-scale implementations of AI, it is just a matter of time before it goes mainstream.
Naturally, we are fully occupied helping to bring the power of AI to “business as usual” (BAU). The hardest part now is converting AI PoCs into production applications. There are numerous issues standing in the way: technical, organisational, financial, and regulatory. Working through them all is not simple (and some will remain challenging for years to come).
Technical Challenges
Scalability Issues: Pilots are easy. AI pilots often run on limited datasets in controlled environments. Scaling them to real-world applications requires more data, infrastructure, automation and performance optimisation.
Data Quality & Availability: Many pilots succeed because they rely on curated data. In BAU, maintaining high-quality, real-time, and unbiased data becomes a challenge.
Integration with Legacy Systems: Many companies operate on legacy software that may not be compatible with AI solutions, usually requiring costly modifications.
Model Drift: AI models need continuous monitoring and retraining as business conditions and data patterns change over time.
Organisational Challenges
Lack of AI Literacy: Employees resist AI adoption due to the fear of job displacement or a lack of understanding of AI benefits. Fear is a powerful blocker.
Cultural Resistance: Shifting from traditional processes to AI-driven decision-making requires a change in mindset and company culture. It’s a classic case of a technological revolution and people are naturally resistant to the change. They don’t understand how the AI works “under the bonnet”, so they view it with suspicion.
Limited Talent & Expertise: AI requires skilled professionals for deployment and maintenance, but talent shortages slow down implementation. It is becoming easier to work with AI as the tooling catches up. But talent shortages will continue to bite for the foreseeable future.
Financial & Resource Constraints
High Implementation Costs: Scaling AI solutions often requires significant investments in cloud computing, data infrastructure, and cybersecurity. CFOs quickly change their minds when they see the numbers. The good news is that AI will get cheaper to run as the tech becomes better optimised.
Unclear ROI: AI pilots are usually low-cost experiments, but proving long-term ROI for full deployment is complex and uncertain. As much as I am a proponent of the tech, a lot of pilots currently appear to be solutions in search of a problem. Until the use cases become clear, the ROI story will remain dubious.
Ongoing Maintenance Costs: AI systems need continuous monitoring, retraining, and updates, which can be costly. You either have to do it in-house, meaning expensive talent, or you outsource it to specialist houses (not cheap either).
Regulatory & Compliance Issues
Data Privacy & Security: AI often relies on sensitive user data, which must comply with GDPR, HIPAA, or local regulations. These are huge concerns for banking, insurance and healthcare industries at the moment.
Ethical & Bias Concerns: AI systems must be fair and unbiased, and companies need governance structures to prevent unintended discrimination.
Industry-Specific Regulations: In sectors like healthcare and finance, AI models require regulatory approval (multiple layers of such), slowing down deployment.
Business Process Alignment
Defining Clear Use Cases: Many AI pilots are exploratory, but moving to BAU requires well-defined business objectives and alignment with strategic goals.
Ownership & Accountability: Who owns the AI system once it becomes BAU? We see many companies struggle with defining responsibility between IT, data science teams, and business units.
User Adoption & Trust: Employees and customers need confidence in AI-driven decisions. Transparent explanations and user-friendly interfaces are critical. No one likes a “black box” answer, especially if you then need to clearly justify the answers.
The IOblend AI agent use case
We are working to help companies navigate the transition of AI into BAU. Naturally, we focus on making the application of AI simpler, regardless of the use case. We are a data integration company at our core, so we focus on the technical parts. If we can take away the cost and time-to-value obstacles, we can accelerate the cultural and regulatory journeys too. That’s the idea.
I am a practical guy. Theory is exciting and all, but unless I can see AI work in practice, I won’t be buying it. And neither will many CFOs, incidentally. With that view as our guiding principle, we had to make sure our customers could see immediate benefits.
As a data integrator, we applied the power of LLMs to the ETL process. Our latest use case focused on combining structured data from an Oracle database and three operational systems with a document repository (essentially X12 messaging and PDF contracts) to provide a clear financial overview of the operations.
Embedding AI into the ETL
The brilliance of IOblend is that, by design, it allows the implementation of all sorts of stored procedures using a Python script. This means we can embed AI agents directly into the data pipeline. As the diagram above shows, we pull the data from the unstructured documents and the various systems, then validate and transform it on the fly. In effect, we do the data grounding automatically.
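To make that a bit more tangible, here is a minimal sketch of what an embedded agent step could look like as a plain Python stored procedure. This is not IOblend’s actual interface; the function names, fields and stubbed LLM call are purely illustrative.

```python
# Illustrative sketch only: not IOblend's real stored-procedure API.
# All names (process_document, llm_extract, REQUIRED_FIELDS) are hypothetical.
import json

REQUIRED_FIELDS = {"contract_id", "counterparty", "start_date", "total_value"}

def llm_extract(document_text: str) -> dict:
    """Placeholder for the LLM call that pulls structured fields out of a contract.
    In production this would invoke the chosen LLM with an extraction prompt."""
    # Stubbed response so the sketch runs end to end.
    return {
        "contract_id": "C-1001",
        "counterparty": "Acme Ltd",
        "start_date": "2024-01-15",
        "total_value": 250000.0,
    }

def process_document(record: dict) -> dict:
    """Stored-procedure-style step: enrich a pipeline record with fields
    extracted from the unstructured document it points to."""
    fields = llm_extract(record["document_text"])
    missing = REQUIRED_FIELDS - fields.keys()
    record["extracted"] = fields
    record["quarantine"] = bool(missing)  # flag incomplete extractions for an SME
    return record

if __name__ == "__main__":
    sample = {"document_id": "doc-42", "document_text": "…contract text…"}
    print(json.dumps(process_document(sample), indent=2))
```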
We invoke a stored procedure that calls a specific LLM to find and analyse an unstructured PDF. The LLM then runs the extraction logic to pull structured data out of the document, based on the agentic parameters. The results are plugged back into the ETL pipeline to perform validations. Abnormalities (~5% of records) are quarantined for an SME to check.
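The validation step can be thought of along these lines. Again, a hedged sketch rather than the production rules: the tolerance value and field names are made up for illustration, and the real pipeline cross-checks far more than two fields.

```python
# Illustrative validation of LLM-extracted fields against a booked ledger row.
TOLERANCE = 0.01  # allow 1% variance between extracted and booked values (example only)

def validate_against_ledger(extracted: dict, ledger_row: dict) -> list[str]:
    """Return a list of validation failures; an empty list means the record passes."""
    issues = []
    if extracted["contract_id"] != ledger_row["contract_id"]:
        issues.append("contract_id mismatch")
    booked = ledger_row["booked_value"]
    if booked and abs(extracted["total_value"] - booked) / booked > TOLERANCE:
        issues.append("total_value outside tolerance")
    return issues

record = {"contract_id": "C-1001", "total_value": 250000.0}
ledger = {"contract_id": "C-1001", "booked_value": 248000.0}
failures = validate_against_ledger(record, ledger)
print("quarantine" if failures else "pass", failures)
```

Anything that fails these checks goes to the quarantine queue for an SME to review, as described above.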
We then enrich the output with historical and operational data to create a rounded picture. All of this is event-based and runs on CDC (change data capture). For this client, we output to the cloud for archiving (AWS, in JSON format) and to an operational system as a packaged data product for immediate consumption. We then have another AI agent running to enhance the work of the analysts.
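The sink side of the flow is conceptually simple. The sketch below assumes AWS S3 (via boto3) for the JSON archive and a REST endpoint for the operational system; the bucket, key pattern and URL are hypothetical stand-ins, not the client’s setup.

```python
# Illustrative sink step: archive to S3 as JSON and push a data product to an
# operational system. Bucket name, key pattern and URL are hypothetical.
import json
import boto3
import requests

s3 = boto3.client("s3")

def publish(record: dict) -> None:
    key = f"contracts/{record['contract_id']}.json"
    # Archive the enriched record to S3 in JSON format.
    s3.put_object(Bucket="example-archive-bucket", Key=key,
                  Body=json.dumps(record).encode("utf-8"))
    # Hand the packaged record to the operational system for immediate use.
    requests.post("https://ops.example.internal/api/contracts",
                  json=record, timeout=10)
```

In the real pipeline this step fires per CDC event rather than on a schedule, which is what keeps the downstream view current.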
The benefits of using AI agents as part of the data integration process
We achieved several benefits here that contributed directly to a positive ROI:
- By embedding the AI agent into the data integration process, we saved employees many hours of sifting through contract information. We tuned the pipeline to validate the output thoroughly so everyone was comfortable with a reliable (and repeatable) result. Any abnormalities are flagged and quarantined for an SME to investigate.
- We improved the security of document handling, as access was now strictly controlled.
- We integrated the information from the unstructured documents with the structured data on the fly, removing the need for post-processing in a staging layer. We used all our usual in-built data management, quality and validation steps. This saved developer time and improved the reliability of the output.
- We then made the output easily digestible by the SMEs via another AI agent, removing the need for complex prompt engineering (a minimal sketch of the idea follows below). This saved SME time and unlocked new ways to view and analyse the data.
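Here is the sketch referenced in the last point: the agent wraps the analyst’s plain-English question with the already-validated data product, so the analyst never writes a prompt. The ask_llm call and the in-line data are placeholders, not the actual implementation.

```python
# Illustrative analyst-facing agent: the user asks a plain question and the
# agent builds the grounded prompt itself. ask_llm is a stand-in for the real LLM call.
import json

def ask_llm(prompt: str) -> str:
    """Placeholder for the real LLM call."""
    return "Total contracted value for Acme Ltd: 250,000."

def answer(question: str, data_product: list[dict]) -> str:
    """Wrap the user's question with the relevant, already-validated data
    so the model answers from grounded facts rather than from memory."""
    context = json.dumps(data_product[:50])  # cap the context for the sketch
    prompt = (
        "Answer using only the data below. Cite the contract_id for each figure.\n"
        f"Data: {context}\nQuestion: {question}"
    )
    return ask_llm(prompt)

print(answer("What is the total contracted value for Acme Ltd?",
             [{"contract_id": "C-1001", "counterparty": "Acme Ltd", "total_value": 250000.0}]))
```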
The other benefit was the shift in attitudes towards AI. Folks who had eyed the tech dismissively were now at least more curious. It’s a small step, I know, but a step in the right direction. They have already inquired about further AI applications within their own functions.
In conclusion
What people want to see is a “helpful colleague” AI, not something they are scared of or get easily frustrated with. They do not want to constantly write prompts, or even learn how to write them correctly – they don’t have time for that. But they do want to use the tech if they can access it more easily. They don’t want to constantly question the outputs either (poor prompts will do that), but they do want oversight of the final output. They are actually quite happy to do a visual scan of the results, even if only to confirm that people are still making the ultimate decisions. If we can enable them to use AI this way, it’s all for the better.
Oh, and BTW, we did the above integration in three weeks. In prod. Once the LLM is trained on the client data, the integration job takes no time at all with IOblend. If you deliver fast results, people take notice.
If you want to learn more about how we do data integration and can help you, do feel free to get in touch. We love what we do and want to make complex data integration simple for all.
IOblend presents a ground-breaking approach to IoT and data integration, revolutionizing the way businesses handle their data. It’s an all-in-one data integration accelerator, boasting real-time, production-grade, managed Apache Spark™ data pipelines that can be set up in mere minutes. This facilitates a massive acceleration in data migration projects, whether from on-prem to cloud or between clouds, thanks to its low code/no code development and automated data management and governance.
IOblend also simplifies the integration of streaming and batch data through Kappa architecture, significantly boosting the efficiency of operational analytics and MLOps. Its system enables the robust and cost-effective delivery of both centralized and federated data architectures, with low latency and massively parallelized data processing, capable of handling over 10 million transactions per second. Additionally, IOblend integrates seamlessly with leading cloud services like Snowflake and Microsoft Azure, underscoring its versatility and broad applicability in various data environments.
At its core, IOblend is an end-to-end enterprise data integration solution built with DataOps capability. It stands out as a versatile ETL product for building and managing data estates with high-grade data flows. The platform powers operational analytics and AI initiatives, drastically reducing the costs and development efforts associated with data projects and data science ventures. It’s engineered to connect to any source, perform in-memory transformations of streaming and batch data, and direct the results to any destination with minimal effort.
IOblend’s use cases are diverse and impactful. It streams live data from factories to automated forecasting models and channels data from IoT sensors to real-time monitoring applications, enabling automated decision-making based on live inputs and historical statistics. Additionally, it handles the movement of production-grade streaming and batch data to and from cloud data warehouses and lakes, powers data exchanges, and feeds applications with data that adheres to complex business rules and governance policies.
The platform comprises two core components: the IOblend Designer and the IOblend Engine. The IOblend Designer is a desktop GUI used for designing, building, and testing data pipeline DAGs, producing metadata that describes the data pipelines. The IOblend Engine, the heart of the system, converts this metadata into Spark streaming jobs executed on any Spark cluster. Available in Developer and Enterprise suites, IOblend supports both local and remote engine operations, catering to a wide range of development and operational needs. It also facilitates collaborative development and pipeline versioning, making it a robust tool for modern data management and analytics.

Unify Clinical & Financial Data to Cut Readmissions
Clinical-Financial Synergy: The Seamless Integration of Clinical and Financial Data to Minimise Readmissions 🚑 Did You Know? Unnecessary hospital readmissions within 30 days represent a colossal financial burden, often reflecting suboptimal transitional care. Clinical-Financial Synergy: The Seamless Integration of Clinical and Financial Data to Minimise Readmissions The Convergence of Clinical and Financial Data The convergence of clinical and financial

Agentic Pipelines and Real-Time Data with Guardrails
The New Era of ETL: Agentic Pipelines and Real-Time Data with Guardrails For years, ETL meant one thing — moving and transforming data in predictable, scheduled batches, often using a multitude of complementary tools. It was practical, reliable, and familiar. But in 2025, well, that’s no longer enough. Let’s have a look at the shift

Real-Time Insurance Claims with CDC and Spark
From Batch to Real-Time: Accelerating Insurance Claims Processing with CDC and Spark 💼 Did you know? In the insurance sector, the move from overnight batch processing to real-time stream processing has been shown to reduce the average claims settlement time from several days to under an hour in highly automated systems. Real-Time Data and Insurance

Agentic AI: The New Standard for ETL Governance
Autonomous Finance: Agentic AI as the New Standard for ETL Governance and Resilience 📌 Did You Know? Autonomous data quality agents deployed by leading financial institutions have been shown to proactively detect and correct up to 95% of critical data quality issues. The Agentic AI Concept Agentic Artificial Intelligence (AI) represents the progression beyond simple prompt-and-response

IOblend: Simplifying Feature Stores for Modern MLOps
IOblend: Simplifying Feature Stores for Modern MLOps Feature stores emerged to solve a real challenge in machine learning: managing features across models, maintaining consistency between training and inference, and ensuring proper governance. To meet this need, many solutions introduced new infrastructure layers—Redis, DynamoDB, Feast-style APIs, and others. While these tools provided powerful capabilities, they also

Rethinking the Feature Store concept for MLOps
Rethinking the Feature Store concept for MLOps Today we talk about Feature Stores. The recent Databricks acquisition of Tecton raised an interesting question for us: can we make a feature store work with any infra just as easily as a dedicated system using IOblend? Let’s have a look. How a Feature Store Works Today Machine

