A Peek at Denodo Data Virtualization

Yue
4 min read · Apr 25, 2022

I recently had the chance to work with Denodo, a data virtualization platform. As I’m quite new to this technology and to data engineering, I think it might be a good opportunity to note down what I learned.

Data Virtualization

Denodo is a platform used to virtualize data across an organization and create a single point of view over it. This can be useful when the data is scattered across systems in various forms. Compared to the traditional data warehouse approach, where we integrate multiple data sources into a single database, data virtualization (as Denodo claims) reduces the need for ETL and allows quick and easy analysis across different data sources.

Build a Quick Demo with Denodo

Deploy a Denodo Instance on Azure

Denodo is on Azure Marketplace, where we can simply deploy a VM with the Denodo image.

Deploy the Denodo instance with free trial option

After making sure all Denodo services are up and running (I did that by using RDP to connect to the Windows server and inspecting the Denodo platform), we can connect to the Denodo Design Studio through our browser: http://{xx:xx:xx:xx}:9090/denodo-design-studio/#/

Replace xx:xx:xx:xx with the public IP address of your Denodo VM.

Denodo Design Studio

Connect to Data Sources

Now that we have our Denodo instance up and running, we can try connecting to some sample data.

Denodo supports various kinds of data sources, which is kinda the purpose of data virtualization.

I went for a JDBC connection for my sample data source in Azure SQL DB.

Filling in the connection details

Give the connection a name, select Azure SQL as the Database Adapter and update the URL of your db along with the credentials. Click the Test Connection option on the top right, and you will see a notification of a successful connection.
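For the URL field, here is a minimal sketch of what an Azure SQL JDBC URL typically looks like, assuming the standard Microsoft SQL Server JDBC format. The server name `my-server` and database name `sampledb` are made-up placeholders, and the exact template that Denodo pre-fills in its form may differ from this.

```python
def azure_sql_jdbc_url(server: str, database: str, port: int = 1433) -> str:
    """Build a SQL Server style JDBC URL for an Azure SQL database.

    Assumption: the server lives under the default *.database.windows.net
    domain and listens on the default port 1433.
    """
    host = f"{server}.database.windows.net"
    # encrypt=true asks the driver for an encrypted (TLS) connection.
    return f"jdbc:sqlserver://{host}:{port};databaseName={database};encrypt=true"


# Hypothetical server and database names, for illustration only.
print(azure_sql_jdbc_url("my-server", "sampledb"))
```

The username and password go into their own fields in the form, so they are left out of the URL here.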

Create a Base View from Connected Data Sources

A base view, as defined by Denodo, is a view that comes directly from a connected data source. It can be created while connecting to a new data source.

Select tables that you want to create base views from

Create Derived views

Derived views are the result of some kind of integration of other views. They can be created through the Denodo GUI by choosing any of the following options:

Or you can simply create a new VQL Shell and use CREATE VIEW statements (which are close to SQL). I personally find it more flexible and easier to use than the designer UI.
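To give a feel for what such a statement does, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in. It is not Denodo and not VQL: the two tables play the role of base views, and `customer_totals` plays the role of a derived view that joins and aggregates them. All table, column and view names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the virtual layer (assumption:
# plain SQL views are close enough to illustrate the idea of a derived view).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two tables playing the role of base views.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales_order (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO sales_order VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);

    -- The "derived view": a join plus an aggregation over the base views.
    CREATE VIEW customer_totals AS
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM customer c
    JOIN sales_order o ON o.customer_id = c.id
    GROUP BY c.name;
""")

# Querying the view runs the join at query time; nothing is copied.
rows = conn.execute(
    "SELECT name, total FROM customer_totals ORDER BY name"
).fetchall()
print(rows)
```

The main difference in Denodo is that the base views can live in completely different systems, while the statement you write stays in one SQL-like dialect.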

The result of a derived view

You can publish derived views as a Web API for further analytic use.

Some Thoughts and Research

At this stage you should have some grasp of how Denodo works. It creates a virtualized SQL layer above different data sources and returns an integrated result.

Is Denodo good? I think it’s definitely useful in some specific use cases. However, since data virtualization is not about replicating data, will querying live systems cause any trouble?

Apart from that, I think Denodo serves more as a one-way data integration platform: it can serve the aggregated data for a reporting layer, but is it possible to operate on this data if needed? Just reading through the OpenAPI documentation for manipulating Denodo views, I can only find endpoints for getting the data and posting new rows, which kinda makes sense, as I can imagine updating an aggregated row won’t be an easy task. How would it be reflected in the source systems?
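The difficulty with updating an aggregated row can be seen in a small sketch, again using Python's built-in sqlite3 module as a stand-in rather than Denodo itself. SQLite treats views as read-only, so the update below is rejected. How Denodo handles the same situation is not something I have verified, and the table and view names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_order (customer TEXT, amount REAL);
    INSERT INTO sales_order VALUES ('Alice', 20.0), ('Alice', 5.0);

    -- Aggregated view: one row per customer, built from many source rows.
    CREATE VIEW customer_totals AS
    SELECT customer, SUM(amount) AS total
    FROM sales_order
    GROUP BY customer;
""")


def try_update_total(connection):
    """Try to overwrite an aggregated value; return the error text, if any."""
    try:
        connection.execute(
            "UPDATE customer_totals SET total = 30.0 WHERE customer = 'Alice'"
        )
    except sqlite3.OperationalError as exc:
        # SQLite refuses to write through a view.
        return str(exc)
    return None


error = try_update_total(conn)
print(error)
```

Even if the engine allowed it, there is no obvious answer to which of the two underlying rows should change so that the total becomes 30, which is probably why write support on integrated views tends to be limited.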
