Interview: The distinction between ESB and ETL is no longer meaningful relative to business requirements

EAI: ETL versus ESB systems

There are many abbreviations – ETL, ESB, EAI, EDI, SOA and APIM for instance – relating to data traffic, and it is easy to get lost in them. Some of them refer to similar things, others dovetail neatly together, and yet others overlap in some aspects.

In this interview, Edouard Cante, General Product Manager at Blueway, shares his view of the difference between ETL and ESB, and what he believes the consequences are for businesses. Is this the right discussion to have, in view of business requirements?

In a previous article, we reached the conclusion that ESB and API management were two sides of the same coin. Does the same apply to ESB and ETL?

No, but their respective scopes do partly overlap: they both relate to data transport and transformation within an information system. Given their nature, one could conclude that there is a choice to be made between them, depending on the type of data flow. This was not the case with ESB and APIM, which complement each other.

The historical difference is primarily based on the architectural dimension. The specific features of ETL and ESB are not easy to see for those who are not experts in data management. In addition, the landscape has changed over recent years.

Historically, and perhaps somewhat simplistically, ETL is effective in handling huge volumes where performance is important, but the number of interchanges is not that great. It processes data as a set, wholesale. For example, if you want total turnover broken down by customer, ETL will aggregate all the rows and apply a process to all of them. This set-based approach is adopted in particular by Business Intelligence and Data Warehousing. In contrast, ESB is effective when processing large numbers of high-frequency interchanges, each with a limited volume of data, combined with an algorithmic aspect to the processing. It certainly plays a role in de-siloing data and supporting a Service-Oriented Architecture by acting as a secure exchange bus across the entire IS.
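To make the contrast concrete, here is a minimal sketch in Python of the two processing styles described above. It is an illustration only, not how any particular ETL or ESB product works: the order rows, the `MessageBus` class, the `order.created` topic and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical order rows as (customer, amount). Illustrative data only.
ORDERS = [
    ("acme", 120.0),
    ("globex", 75.5),
    ("acme", 30.0),
    ("initech", 210.0),
    ("globex", 24.5),
]


def etl_turnover_by_customer(rows):
    """Set-based (ETL-style): take the whole data set and aggregate it in one job."""
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


class MessageBus:
    """Message-based (ESB-style): each small message is routed to subscribers on arrival."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler registered for this topic.
        for handler in self._subscribers[topic]:
            handler(message)


# ETL style: one batch run over all rows.
batch_totals = etl_turnover_by_customer(ORDERS)

# ESB style: many small interchanges, each handled individually.
bus = MessageBus()
running_totals = defaultdict(float)


def on_order(message):
    running_totals[message["customer"]] += message["amount"]


bus.subscribe("order.created", on_order)
for customer, amount in ORDERS:
    bus.publish("order.created", {"customer": customer, "amount": amount})
```

Both paths end up with the same totals per customer. What differs is when the work happens: once over the full set, or incrementally with every interchange.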

To summarise using extreme examples, ETL is used to build data warehouses from ERP and CRM systems. ESB is for a business that wants to use a semi-interface to expose estimates, orders and customers from its CRM, for instance, and allow other applications to fetch this data by connecting to the application bus.

Edouard CANTE

However, I firmly believe these historical differences now verge on the simplistic. Can anyone really say these days that they are buying an ETL system solely for BI?

Why do you believe that the distinction between ESB and ETL no longer holds true?

Another difference often put forward is that ETL is a “pull” technology that works on demand, whereas ESB is a “push” technology that produces messages. But from the customer’s standpoint, is it feasible to only pull or only push? When a business has implemented an ETL system and then needs to “push”, that capability cannot reasonably be built into each application individually. Doing so requires specific developments and highlights the problem at the heart of this distinction.
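As a rough sketch of the point about pull and push, the Python snippet below shows a single data source offering both access modes, so that consuming applications do not each need their own custom development. The `CustomerStore` class and its methods are hypothetical names chosen for illustration; they do not describe any vendor's API.

```python
class CustomerStore:
    """Hypothetical shared data source that supports both pull and push."""

    def __init__(self):
        self._records = {}
        self._listeners = []

    def fetch_all(self):
        # Pull: a consumer asks for a snapshot on demand (ETL-style extraction).
        return dict(self._records)

    def on_change(self, listener):
        # Push: a consumer registers once and is notified of every change.
        self._listeners.append(listener)

    def upsert(self, customer_id, data):
        self._records[customer_id] = data
        for listener in self._listeners:
            listener(customer_id, data)


store = CustomerStore()

# A push consumer: receives each change as it happens.
received = []
store.on_change(lambda customer_id, data: received.append((customer_id, data)))

store.upsert("c1", {"name": "Acme"})
store.upsert("c2", {"name": "Globex"})

# A pull consumer: reads everything when it decides to.
snapshot = store.fetch_all()
```

Here the push consumer has seen two change events and the pull consumer obtains the same two records in one read, both served by the same component.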

While compartmentalising ESB and ETL was reasonable ten years ago, I firmly believe it no longer holds true from a business point of view. Different concepts do not need to entail different solutions. Asking the IT department to choose between two different tools depending on whether it wants to expose or interact is approaching the problem from the wrong angle. The business requirement cannot be constrained by technical factors.

For the majority of businesses, there is no ROI from installing a pure ETL on the one hand and a pure ESB on the other.

Edouard CANTE

How has the market developed since the appearance of the ETL and ESB concepts?

Originally, there were some genuine technological differences. The markets split and have developed separately. Some ETL publishers have encroached on ESB’s patch to win market share, and vice versa. At the same time, marketing pitches have compartmentalised requirements based on technology, to confirm their relative positioning. ESB software publishers have also sought to define themselves as purists.

The battle has mainly been waged on a technological front, and not in relation to the actual requirements. This is a mistake!

Edouard CANTE

The boundaries consequently became blurred from the customer’s point of view, and simplistic stances have been the result. There is widespread misunderstanding about what an ESB is, and its potential. An ESB was sometimes summarised as a data transporter, completely neglecting its role in organising data traffic in its entirety. The result is that many projects have failed because of these dogmatic approaches.

I saw one example with a retailer, where the project started with a theoretical mapping of data traffic on a magnificent diagram, to argue the case for 100% ESB. Ultimately, when the project kicked off, this extreme position came up against the cold reality that some applications could not use it in real time. The project was a complete failure.

So if making ETL and ESB mutually exclusive is a mistake, what is the real issue?

The real issue is to see things the other way round. It is because of marketing pushing “either ESB or ETL” that we have these failures! This dichotomy no longer makes any sense in most cases.

The IT department finds itself forced to choose between two tools when its actual requirement is for both: in most cases, it wants to de-silo data and circulate it around the various applications in the IS, produce some BI and centralise data within an MDM system. Matching each concept to a different solution forces it to deviate from the business requirement with no benefit to itself.

They should instead be joined. The solution should meet the business requirement, and not some technological dogma.

Edouard CANTE

This challenge arises even in user communities that have some proficiency in the tools rather than the concepts. Users then have a reduced perspective on data flow management as such. It is by understanding the concepts that users can become more proficient, and more readily adopt different tools.

If the question is not ETL versus ESB, what questions should be asked to re-engineer systems successfully?

First, start by accepting reality, and dealing with it. While it is important to map processes and take a step back to gain some perspective, a theoretical outlook of “everything will be SOA” or “everything will communicate using APIs” will not pass muster. A vision of the future is fine, but in reality, applications need to be able to communicate now!

The IT tools need to converge to adapt to what the business needs, not the other way round. The objective is to unify collaboration in the organisation.

The battle between ETL and ESB is pointless. If IT department needs are to be met, the two cannot continue to be kept separate. The distinction now is between data transport and transformation on the one hand, and data exploitation on the other.

Edouard CANTE

If there has to be a difference, I would place it at another level. We can differentiate data transformation solutions, which provide secure data traffic and make data available, from highly functional data preparation tools designed for specific roles such as data scientists and BI analysts.

Business areas consequently have access to highly functional tools designed for them, and the data manager keeps the central role of ensuring the quality, transformation and availability of data. It is these data preparation tools that are changing the market. It makes perfect sense for business departments to be independent, without having to get involved in how data circulates. Data traffic has to meet the crucial challenges of performance and legality.

Data traffic should be understood as a whole, regardless of the method used to convey data for a particular business need. The real difference in terms of technical tools is now between data transformation and data preparation.

Edouard CANTE

I therefore firmly believe that it is necessary both to meet current standards and to handle the famous legacy systems that any information system is still running. Customers using ETL/ESB/EAI solutions have to use them with these applications; they have no choice. As software publishers, we therefore have to do so as well. We have to be the toolbox that enables customers to move data around.

At Blueway, we have never set much store by the difference between ESB and ETL. Our wish is to provide a comprehensive response to IS re-engineering challenges with a modular platform that unifies various concepts in terms of business issues and people. There are some extreme cases where an ultra-ETL system is necessary, but in reality they are few and far between.
