The Complex World of Enterprise Data Estates
What a difference a modern data architecture makes to an enterprise. We were recently asked to help stitch four downstream systems together so they would “talk” to each other automatically in real time. Our bread and butter. But what a joy it was working on this project.
The data estate was top class. Everything was in the cloud (Azure stack), with central governance, straightforward data access and data quality checks in place. Very sleek. Perfectly architected for a heavily data-driven business of recent vintage. We were in and out of there within two weeks.
On reflection, I compared it to our experience with a large, well-established enterprise. Night and day difference.
The Challenge of Large Enterprise Data Estates
Managing data in large, established enterprises is a very challenging endeavour. The challenges arise from the sprawling and disparate nature of their data estates, and keeping those estates under control is a constant struggle for many businesses.
That’s why we love working with large enterprises! They have data issues coming out of their ears. Particularly with integration. Their challenges never end. Even when we deliver a successful solution to one problem, we find ten more. These days everyone wants AI in everything, which uncovers a ton of data integration and quality issues long swept under the carpet.
It’s a data vendor and consultant’s field of dreams.
But I wish enterprise data estates were more streamlined. Like in the newer companies.
The Simplicity Conundrum
Most enterprise analytics is not rocket science. Businesses want to develop fast insights to increase revenue and improve operations. But what they end up with is a constant struggle with departmental silos.
Try getting a report from outside your function: “Sorry, not simple to get that data. The system is old and only Helen has access to it. She’s away on holiday now. Try again in three weeks”.
The business can’t wait three weeks to generate a report.
So, what does the business do? They create a whole grassroots industry of workarounds, patches and “quick and dirty” insights from whatever data they can access. They lobby for analytics capabilities of their own (and get them!) and buy their own tools and systems.
What the enterprise gets is a bunch of additional headcount and small fortunes spent on tech that serves only Marketing, or only Logistics, or only BI. These systems are not interconnected with each other or, at times, even with the EDW.
Such an approach is completely dysfunctional at the enterprise level – it creates more complexity and more silos that just end up accumulating a massive tech debt burden. Scale the issue across multiple jurisdictions, corporate entities and affiliates and imagine the implications.
The Fragmented Data Landscape
Unfortunately, simplicity of the data architecture is not something enterprises are known for. The older the business, the more complex their architecture.
Top level: Typically, data in these businesses is structured hierarchically. At the top, you have the enterprise data warehouse (EDW), a central repository that consolidates data from across the organization. If you are lucky. Often, you will find multiple disparate warehouses (on-prem and Cloud) that are not connected to each other. If you dig deep enough, you may even find a mainframe still running core functions.
Next level down: Beneath all this sit the data marts, specialized subsets of the EDW tailored for specific business units or functions. At the granular level, there are databases, which might be relational (e.g. SQL Server or Oracle) or NoSQL (MongoDB or Cassandra), each storing data for a specific application or function.
Frontlines: At the lower levels, the data is rarely made shareable in a simple manner. You need to go through gatekeepers and work with summaries that come without metadata or clear semantics. Have fun deciphering those. It takes weeks to discover discrepancies in the source data and trace the errors, usually only after C-level execs have called them out in reviews.
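To make the pain tangible, here is a minimal sketch of what even a trivial cross-silo check looks like in practice. All connection details, table and collection names below are invented for illustration; the point is simply that reconciling a single revenue figure already means stitching together two very different access paths.

```python
# Illustrative only: reconciling one monthly revenue figure between
# an EDW data mart (SQL Server) and a departmental MongoDB store.
# Every name and connection string here is made up.
import pyodbc
from pymongo import MongoClient

MONTH = "2024-05"

# Finance's number, straight from the relational data mart
edw = pyodbc.connect("DSN=finance_mart;Trusted_Connection=yes")
cur = edw.cursor()
cur.execute("SELECT SUM(net_amount) FROM sales_fact WHERE invoice_month = ?", MONTH)
edw_revenue = float(cur.fetchone()[0] or 0)

# Marketing's number, from their own NoSQL application store
mongo = MongoClient("mongodb://marketing-db:27017/")
pipeline = [
    {"$match": {"month": MONTH}},
    {"$group": {"_id": None, "total": {"$sum": "$order_value"}}},
]
doc = next(mongo.campaigns.orders.aggregate(pipeline), {"total": 0})
crm_revenue = float(doc["total"])

# Flag anything more than 1% apart as worth investigating (tolerance is arbitrary)
if abs(edw_revenue - crm_revenue) > 0.01 * max(edw_revenue, crm_revenue, 1):
    print(f"Mismatch for {MONTH}: EDW={edw_revenue:,.0f} vs Marketing={crm_revenue:,.0f}")
```

Now scale that to hundreds of metrics, dozens of systems and a gatekeeper for each one, and the weeks-long discrepancy hunts start to make sense.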
If Planning were doing strategic analysis and required data from sales, finance, marketing and operations, they could not just access it. They wouldn’t even know the originating sources or the business logic that went into compiling that data. That knowledge sits with the relevant teams and is only ever shared within. Your requests to obtain it are usually treated with deep suspicion. “Why do you need this data? We will give you only what we think you need.”
There is no coordinated knowledge of the enterprise data, no one place where someone could state with full confidence what sits where, how it all fits together and how to get access to it. There are islands of good data practice in every organisation, usually at the individual departmental level and in central IT. But they are not comprehensive.
Behind the Scenes
Even if the enterprise decides to implement a coherent data strategy, it still faces a major obstacle in bringing it to fruition. The technology stack and data practices are vast and varied, often shaped by what was prevalent at the inception of the business and during its subsequent growth phases.
Databases and Storage Systems
Mainframes, on-prem warehouses, cloud storage and data lakes can all co-exist in the same business. Each technology requires a separate skillset to develop, maintain and govern. The teams operating such systems are often distinct, adding to the complexity and cost. Intimate knowledge of mainframes is not easy to come by these days.
Data Governance
There is strict central governance for the EDW and mission-critical systems. But elsewhere in the business, proper governance is often lacking, leading to all sorts of data issues. The further you move from the EDW, the more “wild west” it becomes.
Data Integration and Analytics
Central IT: You could write a history book on this. There are generations of data integration tools in use in large enterprises, including obscure logic written a decade ago in SSIS. No documentation exists. No one dares unpick it.
Then you get to the more sophisticated tools like Informatica that congregate around the EDW and ensure robust production pipelines and governance there.
BI: The recently established BI department came trained on Google’s stack, so they chose it for their data integration and analytics.
Finance: Finance use on-prem cubes and spreadsheets, as well as legacy accounting software.
Marketing: Marketing lobbied for simple-to-use ELT tools to do their own data integration, so they went with Snowflake and MDS.
Ops: The Ops guys picked Databricks, but only in the two departments where they needed real-time data. The rest rely on the trusty SSIS packages feeding their SQL Server and spreadsheet reports.
HR: HR have SAP that never got properly integrated, so they use a small army of analysts to wrangle paper forms and data manually, with all the errors that come with such a process.
The organisation we encountered had hundreds of disparate systems, databases and stores they had to manage. You can only imagine the scale of their data challenges.
Given the diversity, integrating these technologies into a unified estate is a never-ending task.
As you can see, navigating the world of enterprise data estates is no small feat. The sheer complexity and operating costs are a major drain on enterprise resources and one of the key drivers of data silos and inefficiencies. But this complexity is also what has helped these businesses get to where they are now, so not all is bad. It all somehow works.
And keeps us data vendors and consultants in play, mind.
Ideally, every enterprise would apply a clean-sheet approach and redesign the whole estate from the ground up to make it as sleek as possible. But that is too disruptive for larger businesses, so they upgrade in a modular manner instead. Technology has a knack for rapid advances, so a modular stack is a solid approach. You just need to make the different pieces talk and work with each other seamlessly. Robust integration, governance and data management are key. The rest falls into place.
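One generic way to make modular pieces talk is sketched below, purely as an illustration: this is a textbook adapter pattern with invented names, not a description of any particular product's API. Every source, old or new, is wrapped behind one small, common contract, so swapping a system means writing one new adapter rather than rewiring every pipeline.

```python
# Generic adapter-pattern sketch (illustrative names only): every source,
# legacy or modern, is wrapped behind the same small contract, so downstream
# pipelines never depend on vendor-specific plumbing.
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass
class Record:
    source: str
    key: str
    payload: dict


class SourceAdapter(Protocol):
    def extract(self, since: str) -> Iterable[Record]: ...


class LegacySqlAdapter:
    """Stand-in for an adapter over an on-prem SQL feed."""
    def extract(self, since: str) -> Iterable[Record]:
        # A real implementation would query the legacy system here;
        # this sketch just yields a canned, normalised record.
        yield Record("legacy_sql", "42", {"status": "shipped", "since": since})


class CloudApiAdapter:
    """Stand-in for an adapter over a newer SaaS / cloud source."""
    def extract(self, since: str) -> Iterable[Record]:
        yield Record("cloud_api", "42", {"status": "delivered", "since": since})


def run_pipeline(adapters: list[SourceAdapter], since: str) -> list[Record]:
    # Downstream logic sees only Record objects, never the systems behind them.
    return [rec for adapter in adapters for rec in adapter.extract(since)]


if __name__ == "__main__":
    print(run_pipeline([LegacySqlAdapter(), CloudApiAdapter()], "2024-05-01"))
```

The design point is that swapping, say, an on-prem source for a cloud warehouse only touches one adapter; the governance and data management layers on top stay put.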
Efficient data integration and management
At IOblend, we solve such integration challenges by replacing multiple technologies and manual work with a single automated tool. We bring technical governance and data management together in one place. That way, the business achieves robustness, standardisation and a massive reduction in data integration operating costs. As an additional benefit, it can later swap upstream and downstream systems for newer ones without much ado.
We push heavily for simplification of the data estates. Why maintain four disparate systems (legacy fears!) when you now only depend on one? Why spend money and time on five data integration and management tools to do the job of one?
Simplify your Data Estate
Enterprises are often risk-averse about big-ticket upgrades, having experienced past failures. The “if it ain’t broke…” principle keeps their mainframes alive. However, older systems cost a fortune to maintain and run, let alone integrate with new tech. Those costs are not always surfaced centrally, so they are hard to pin down. Marketing, for example, have their own budget for systems and data.
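To see why these scattered costs are so easy to underestimate, here is a deliberately crude back-of-envelope sketch. Every figure below is invented and exists only to show how departmental line items that never meet on a central budget can quietly add up to a very large annual sum.

```python
# Back-of-envelope illustration only: every figure below is invented.
# The point is that costs booked on departmental budgets never meet
# in one place, so nobody sees the enterprise-wide total.
departmental_spend = {
    # department: (annual licences, analysts maintaining the tooling)
    "Marketing": (120_000, 2),
    "Finance": (80_000, 3),
    "Ops": (200_000, 4),
    "BI": (150_000, 2),
    "HR": (40_000, 5),
}

AVG_ANALYST_COST = 70_000  # fully loaded annual cost, again invented

total = sum(
    licences + analysts * AVG_ANALYST_COST
    for licences, analysts in departmental_spend.values()
)
print(f"Hidden integration spend across silos: ~£{total:,.0f} per year")
```

None of these line items looks alarming on its own, which is exactly why the total rarely gets challenged.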
Simplification of the data estates and heavy automation will lead to material benefits. There are immediate tangible wins from upgrading (new ways to earn revenue or reduce costs). New tech breakthroughs (Cloud, GenAI) unlock new commercial opportunities, while greatly simplifying BAU.
We saved that legacy client seven figures on their integration project over twelve months by automating their data integration with IOblend and migrating data to the Cloud quickly. Just in one department. That got their instant attention.
Data estate complexity is the biggest obstacle to enterprise progress. It breeds inefficiency and drains resources. Businesses thrive on decision speed and agility; that is how they make money. Simplifying your data estate will go a long way financially for your business.
In addressing the complexities of enterprise data estates, IOblend offers a transformative approach that harmonizes the often fragmented and unwieldy data structures within large organizations. Traditional enterprise data landscapes, burdened by sprawling, disparate systems and a mix of old and new technologies, create significant challenges in data management and integration. Enterprises grapple with a variety of systems, including legacy mainframes, on-prem warehouses, cloud storage, and data lakes, each requiring distinct skills and governance approaches. This diversity leads to silos, inefficient governance outside mission-critical systems, and a patchwork of data integration tools, resulting in inefficiencies and increased costs.
IOblend’s modern data architecture provides a streamlined solution to these challenges. By automating data integration and replacing multiple technologies with a single tool, IOblend centralizes technical governance and data management. This approach not only fosters robustness and standardization but also significantly reduces operating costs. It simplifies the enterprise data estate, eliminating the need to maintain disparate systems and multiple data integration tools. This simplification, coupled with automation, yields tangible benefits, including new revenue opportunities and cost reductions.
By implementing IOblend, enterprises can overcome the inherent complexities of their data estates. This approach promotes decision speed and agility, which are crucial for business success. For example, IOblend helped a legacy client save significantly on their integration project by automating data integration and facilitating swift cloud migration, demonstrating the substantial financial and operational advantages of simplifying and modernizing the data estate.