IOblend

technology

making it work

Build and run production-grade Apache Spark data pipelines

Powerful, versatile and simple - one tool for all data integration jobs

We believe in simplicity and versatility of data integration tools. This is why we created a “Swiss army knife” solution to allow you to do all data integration jobs with just a single tool.

You will drastically reduce the effort and cost of production-grade ETL development, multi-tool ecosystem maintenance and manual data wrangling.

If you want to get the most out of your valuable data or deploy the power of AI fast, choose IOblend.

Technical highlights

Desktop application

IOblend is a desktop application designed to run on the client’s infrastructure (local, on-prem, cloud or hybrid). It supports Windows, macOS and Linux.

Uses Apache Spark framework

Build real-time, production-grade, managed Apache Spark™ data pipelines in minutes, with an easy-to-use designer and a powerful engine for any data pipeline architecture (ETL/ELT/ReETL).

Automatic in-built DataOps

Record-level lineage, full spectrum CDC, metadata, schema, eventing, de-duping, SCD, chained aggregations, MDM, cataloguing, regressions, windowing, compaction, embedded AI agents – all features are in-built and require no additional Spark coding.
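
To make one of these features concrete, here is a minimal sketch of snapshot-based change data capture (CDC) in plain Python. It is a generic illustration of the idea only: the function name `diff_snapshots` and the sample rows are hypothetical, and nothing here reflects how IOblend implements CDC internally.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def diff_snapshots(old: Dict[str, Row], new: Dict[str, Row]) -> List[Tuple[str, str]]:
    """Return (operation, key) pairs describing how `new` differs from `old`.

    Hypothetical helper for illustration; snapshots are keyed by primary key.
    """
    changes: List[Tuple[str, str]] = []
    # Keys that only exist in the new snapshot are inserts.
    for key in sorted(new.keys() - old.keys()):
        changes.append(("insert", key))
    # Keys that disappeared from the new snapshot are deletes.
    for key in sorted(old.keys() - new.keys()):
        changes.append(("delete", key))
    # Keys present in both snapshots are updates only if the row changed.
    for key in sorted(old.keys() & new.keys()):
        if old[key] != new[key]:
            changes.append(("update", key))
    return changes


previous = {
    "c1": {"name": "Ada", "city": "London"},
    "c2": {"name": "Bob", "city": "Leeds"},
}
current = {
    "c1": {"name": "Ada", "city": "Paris"},
    "c3": {"name": "Cy", "city": "York"},
}
changes = diff_snapshots(previous, current)
print(changes)
```

In a real pipeline the change events would typically come from a database log or a streaming source rather than from comparing full snapshots.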

Easy integration of real-time streaming (transactional event) and batch data

IOblend is built around Kappa architecture, which enables easy handling of both batch and real-time data, saving you time and cost.
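
The core idea of a Kappa-style design is that batch data is treated as a bounded stream, so one piece of processing logic serves both. The sketch below shows that idea in plain Python under simplifying assumptions; `running_totals`, `live_feed` and the sample events are hypothetical names, not IOblend or Spark APIs.

```python
from typing import Dict, Iterable, Iterator


def running_totals(events: Iterable[Dict]) -> Iterator[Dict]:
    """Single processing path: consumes any iterable of events, one at a time."""
    totals: Dict[str, float] = {}
    for event in events:
        account = event["account"]
        totals[account] = totals.get(account, 0.0) + event["amount"]
        yield {"account": account, "total": totals[account]}


# "Batch": a bounded, already-stored history of events.
history = [
    {"account": "a", "amount": 10.0},
    {"account": "b", "amount": 5.0},
    {"account": "a", "amount": 2.5},
]


def live_feed() -> Iterator[Dict]:
    """Stand-in for an unbounded source; here it simply replays the history."""
    for event in history:
        yield event


# The same function handles the stored batch and the (simulated) stream.
batch_result = list(running_totals(history))
stream_result = list(running_totals(live_feed()))
print(batch_result == stream_result)
```

Because there is only one code path, reprocessing historical data amounts to replaying it through the same logic used for live events.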

Process streaming data at ultra-low latencies (measured at P99) and at high volumes (more than 1 million transactions per second).
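
For readers unfamiliar with the term, P99 latency is the value that 99% of measured latencies fall at or below. A minimal sketch of how it can be computed with the nearest-rank method follows; the function name and the synthetic sample data are assumptions made for illustration and are not IOblend measurements.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(samples)
    # 1-based rank of the percentile value within the sorted samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Synthetic latencies of 1 ms, 2 ms, ..., 200 ms (illustrative data only).
latencies_ms = [float(i) for i in range(1, 201)]
p99 = percentile_nearest_rank(latencies_ms, 99)
p50 = percentile_nearest_rank(latencies_ms, 50)
print(p50, p99)
```

Monitoring tools often use interpolating percentile definitions instead, so reported values can differ slightly from the nearest-rank result.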

Low code / no code development

Low code development without the usual downsides.

We specifically made sure you could use IOblend for any data integration job, no matter the complexity. We just abstracted away the coding complexity associated with Spark to reduce the development burden.

Use SQL or Python for data transformations to handle your business rules, specific quality policies and other constraints.
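
As a rough illustration of expressing the same business rule in either SQL or Python, the sketch below classifies orders both ways and checks that the results agree. It uses Python's standard `sqlite3` module purely as a stand-in SQL engine; the table, columns and rule are hypothetical, and this is not IOblend's transformation API.

```python
import sqlite3

# Illustrative rows: (order_id, country, amount).
orders = [(1, "GB", 120.0), (2, "FR", 80.0), (3, "GB", 40.0)]

# Hypothetical business rule: UK orders of 100 or more are "priority".
RULE_SQL = """
SELECT order_id,
       CASE WHEN amount >= 100 THEN 'priority' ELSE 'standard' END AS tier
FROM orders
WHERE country = 'GB'
ORDER BY order_id
"""


def classify_with_sql(rows):
    """Apply the rule as SQL against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (order_id INTEGER, country TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        return conn.execute(RULE_SQL).fetchall()
    finally:
        conn.close()


def classify_with_python(rows):
    """Apply the same rule as plain Python."""
    return [
        (order_id, "priority" if amount >= 100 else "standard")
        for order_id, country, amount in sorted(rows)
        if country == "GB"
    ]


sql_result = classify_with_sql(orders)
python_result = classify_with_python(orders)
print(sql_result == python_result)
```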

IOblend will automatically build, run and manage efficient Spark pipelines in the background for you.

Applicable for all data integration use cases

No data challenge is beyond reach with IOblend

Data migration, system integration, real-time IoT analytics, central or federated data architectures, data synchronisation, simple ingests, full end-to-end in-flight data processing, feature stores and embedded AI agents: IOblend caters to all data integration use cases.

Work with any data wherever it resides

Real-time integration with Snowflake, GCP, AWS, Salesforce, Oracle, Databricks, Microsoft Azure, SAP plus many more.

IOblend connects to all data sources and sinks via JDBC, ODBC, API, ESB, dataframes or flat files.

Low latency, massively parallelised data processing

IOblend was designed with real-time data applications in mind. Run real-time workloads with low P99 latencies or process large batches of data; it devours both use cases.

We have optimised Spark for extreme performance on modest hardware to improve computational efficiency and reduce cost (more than 1 million transactions per second in production settings).

Agentic AI ETL

Embed custom AI agents using our in-built Python module. Execute, validate and enrich AI output right inside your ETL pipeline, significantly reducing development time and management overhead. Bring GenAI to reality faster than ever before.
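
The general pattern of validating AI output before it flows further down a pipeline can be sketched as below. This is a hedged illustration only: `stub_agent` is a hypothetical stand-in for a real model call, the record fields and allowed labels are made up, and none of this is IOblend's Python module.

```python
import json
from typing import Callable, Dict

ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


def stub_agent(text: str) -> str:
    """Hypothetical stand-in for a real model call; returns a JSON string."""
    label = "negative" if "late" in text.lower() else "positive"
    return json.dumps({"sentiment": label, "confidence": 0.9})


def enrich(record: Dict, agent: Callable[[str], str]) -> Dict:
    """Call the agent, validate its output, and attach the result to the record."""
    raw = agent(record["comment"])
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    if not isinstance(parsed, dict):
        # Output was not a JSON object: keep the record but flag it as invalid.
        return {**record, "sentiment": None, "ai_valid": False}
    sentiment = parsed.get("sentiment")
    confidence = parsed.get("confidence")
    valid = (
        isinstance(sentiment, str)
        and sentiment in ALLOWED_SENTIMENTS
        and isinstance(confidence, (int, float))
        and 0 <= confidence <= 1
    )
    return {**record, "sentiment": sentiment if valid else None, "ai_valid": valid}


good = enrich({"id": 1, "comment": "Delivery was late"}, stub_agent)
bad = enrich({"id": 2, "comment": "Great service"}, lambda text: "not json")
print(good, bad)
```

Flagging invalid output rather than dropping the record keeps the row available for retries or manual review downstream.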

Flexible and cost-effective deployment

IOblend runs on any environment – local, on-prem, edge, cloud and hybrid.

Storage and compute are decoupled, so you can deploy the processing engine independently of your data repositories for best performance and cost.

Example: source the data from your on-prem Oracle ERP, process it using AWS EC2 or EMR, push the results into MS Azure for analytics.

Watch IOblend in action

See how easy it is to build and run Apache Spark data pipelines with IOblend.

Product Demo

How it works

IOblend consists of two core components

IOblend Designer and IOblend Engine.

IOblend Designer is a desktop GUI for interactively designing, building and testing data pipeline DAGs. This process produces IOblend metadata describing the data pipelines that need to be executed.

IOblend Engine is the heart of IOblend that takes data pipeline metadata and converts it into Spark streaming jobs to be executed on any Spark cluster.

Metadata driven development

IOblend data pipelines are defined in playbooks, which store configuration, run parameters and the business logic.

The playbook components are stored as JSON files and can be easily reused and shared among developers to speed up further development work.

The IOblend Engine dynamically converts playbook information to Apache Spark™ streaming jobs and executes them efficiently without the need to code.
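
To illustrate the metadata-driven idea in general terms, the sketch below reads a small JSON document and turns each declared step into an executed operation. The JSON layout, the step names (`filter`, `rename`, `dedupe`) and the `run_playbook` function are all invented for this example; they are not the actual IOblend playbook format, and the real Engine generates Spark jobs rather than running Python list operations.

```python
import json

# Hypothetical playbook layout, invented for illustration only.
PLAYBOOK_JSON = """
{
  "name": "demo_pipeline",
  "steps": [
    {"op": "filter", "field": "status", "equals": "active"},
    {"op": "rename", "from": "cust_id", "to": "customer_id"},
    {"op": "dedupe", "key": "customer_id"}
  ]
}
"""


def op_filter(rows, step):
    """Keep rows whose field equals the configured value."""
    return [r for r in rows if r.get(step["field"]) == step["equals"]]


def op_rename(rows, step):
    """Rename one column in every row."""
    return [
        {(step["to"] if k == step["from"] else k): v for k, v in r.items()}
        for r in rows
    ]


def op_dedupe(rows, step):
    """Keep the first row seen for each key value."""
    seen, out = set(), []
    for r in rows:
        if r[step["key"]] not in seen:
            seen.add(r[step["key"]])
            out.append(r)
    return out


OPERATIONS = {"filter": op_filter, "rename": op_rename, "dedupe": op_dedupe}


def run_playbook(playbook_text, rows):
    """Interpret the metadata: look up each step's operation and apply it in order."""
    playbook = json.loads(playbook_text)
    for step in playbook["steps"]:
        rows = OPERATIONS[step["op"]](rows, step)
    return rows


source_rows = [
    {"cust_id": 1, "status": "active"},
    {"cust_id": 1, "status": "active"},
    {"cust_id": 2, "status": "closed"},
]
result = run_playbook(PLAYBOOK_JSON, source_rows)
print(result)
```

The point of the pattern is that changing the pipeline means editing the metadata, not the interpreter code.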

Develop and run pipelines with a single tool

IOblend gives you the power to develop and run data pipelines, simplifying the engineering process considerably.

The intuitive Designer front end interface allows for easy, interactive data pipeline development.

You can add or remove as many dataflow components as required, build and test them one at a time, in sequence or in their entirety, and create fully productionised data pipelines.
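
Testing components one at a time or in sequence relies on knowing the dependency order of the pipeline DAG. A minimal sketch of that idea using Python's standard `graphlib` module (Python 3.9+) is shown below; the component names and the `components_needed` helper are hypothetical and do not describe the Designer's internals.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each component maps to the set of components it depends on.
dependencies = {
    "clean_orders": {"read_orders"},
    "join": {"clean_orders", "read_customers"},
    "write_sink": {"join"},
}

# A valid execution order for running the whole pipeline in sequence.
order = list(TopologicalSorter(dependencies).static_order())


def components_needed(target, deps):
    """Return the target plus everything upstream of it, for testing one component."""
    needed, stack = set(), [target]
    while stack:
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(deps.get(node, ()))
    return needed


print(order)
print(sorted(components_needed("clean_orders", dependencies)))
```

Running a single component then only requires executing its upstream set, in the order given by the topological sort.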

We made it very easy to build real-time streaming and batch data pipelines

Create advanced dataflows for streaming or batch data from any source to any sink in on-prem, cloud or hybrid environments, and push your data back to the source just as easily if required.

We have templated and annotated the Designer options to help you develop data pipelines faster.

Development mode

In Dev mode, we have added a visual interface to let you test and inspect each step of your data pipeline development as you progress.

Pause, amend and update your sources, transforms and sinks while running data pipelines, without the need to stop the job while you work on it. This approach makes it much easier and faster to test your logic.

You can easily see and export the schemas, pipeline components, metadata, error logs and run parameters to your preferred repositories for further reuse.

All IOblend data pipelines are stored as JSON metadata files, which means they can be placed in any code repository and versioned, just like standard software development.
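
One practical detail when versioning JSON metadata is writing it in a stable form, so that diffs only show real changes. The sketch below shows one way to do that with the Python standard library; the pipeline dictionary and both function names are hypothetical, and IOblend's own export format may differ.

```python
import hashlib
import json


def canonical_json(pipeline: dict) -> str:
    """Serialise with sorted keys and fixed indentation so output is stable across runs."""
    return json.dumps(pipeline, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def fingerprint(pipeline: dict) -> str:
    """Short content hash, handy for spotting whether a pipeline definition changed."""
    return hashlib.sha256(canonical_json(pipeline).encode("utf-8")).hexdigest()[:12]


# Two hypothetical definitions with the same content but different key order.
version_a = {"name": "demo", "sink": {"type": "file", "path": "out.csv"}}
version_b = {"sink": {"path": "out.csv", "type": "file"}, "name": "demo"}
# A genuine change: the sink path differs.
version_c = {"name": "demo", "sink": {"type": "file", "path": "out_v2.csv"}}

print(fingerprint(version_a) == fingerprint(version_b))
print(fingerprint(version_a) == fingerprint(version_c))
```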

Deploy anywhere

Local deployment

Local machine deployment is best suited for dataflow development.

IOblend ships with a containerised Spark environment, so it works out of the box – no need for developers, analysts and data scientists to build local Spark environments.

The software connects to the client’s data systems via the existing security protocols, so no data is ever exposed externally.

Cloud deployment

Only the Designer is installed on the local machine.

IOblend Engine is installed in any Cloud or on-prem environment, either as a Linux container or directly into your Spark infrastructure, such as Databricks, Azure HDInsight, Google Cloud Dataproc, AWS EMR or an on-prem Spark cluster.

The Linux image contains all the essential IOblend components within.

The user can easily interact with and run data pipelines from the Cloud against any external systems (cloud and on-prem).

IOblend supports all major Cloud systems and allows you to work with multiple Clouds simultaneously (e.g. split your store and compute between clouds).

IOblend resides entirely inside the client’s environment, inheriting full security protocols for complete peace of mind.

Ask us all about IOblend

Request an in-depth technical discussion and demo today.