Cisco Data Virtualization

Data Virtualization provides a rich set of data source connectors for RDBMS, Hadoop, XML, web services, etc.

Internally, tables from different data sources can be joined together as views, and the views can be exposed to the business. The views are defined in a SQL-like language and are accessible through a Java API.

Thoughts:

The views can be treated as standardized data interfaces, regardless of which source the underlying data comes from. This makes it possible to virtualize the Data Vault layer as views such as view_HUB and view_SAT. Presentation layer views can in turn be built on top of the Data Vault views, so an external EDW can query the Presentation layer views and persist the results on its own side for querying.

Because Cisco Data Virtualization performs the joins between data sources, performance depends on the quality of the execution plan and on the amount of data that has to be transferred across the network. If millions of rows need to be pulled out of a source, performance could be very poor.
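To get a feel for the network cost, here is a rough back-of-envelope sketch in Python. The row count, row size, and link speed are made-up numbers for illustration, not measurements. The second case assumes the execution plan can apply a filter at the source so that only 1% of the rows cross the network, which may or may not be true for a given query.

```python
def transfer_seconds(rows: int, bytes_per_row: int, bandwidth_mbps: float) -> float:
    """Estimate the seconds needed to move `rows` rows over a link of
    `bandwidth_mbps` megabits per second (ignores latency and protocol overhead)."""
    total_bits = rows * bytes_per_row * 8
    return total_bits / (bandwidth_mbps * 1_000_000)


# Hypothetical numbers: 5 million rows, 200 bytes each, 100 Mbit/s link.
full_pull = transfer_seconds(5_000_000, 200, 100)

# Same link, but assume a filter is applied at the source and only 1% of rows are shipped.
filtered_pull = transfer_seconds(50_000, 200, 100)

print(f"full pull: {full_pull:.1f} s, filtered at source: {filtered_pull:.1f} s")
```

Even this simplified estimate shows why the amount of data leaving the source matters more than the join itself.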

If the Java API supports programmatic creation of views, then the Data Vault layer can be automated.
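As a minimal sketch of what that automation could look like, the Python below generates view definitions for a hub and a satellite from a small piece of metadata. Everything here is an assumption for illustration: the SourceTable structure, the function names, the example table crm.customer, and the generic CREATE VIEW syntax are all hypothetical. Whether Cisco Data Virtualization accepts view definitions in this form, through the Java API or otherwise, is not confirmed. A real Data Vault hub would also carry a hash key, load date, and record source, which are left out to keep the idea visible.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceTable:
    """Hypothetical metadata describing one source table."""
    name: str              # qualified source table name, e.g. "crm.customer"
    business_key: str      # column that identifies the business entity
    attributes: List[str]  # descriptive columns that belong in the satellite


def hub_view_sql(table: SourceTable) -> str:
    """Build a generic view definition listing the distinct business keys."""
    entity = table.name.split(".")[-1]
    return (
        f"CREATE VIEW view_HUB_{entity} AS "
        f"SELECT DISTINCT {table.business_key} AS business_key FROM {table.name}"
    )


def sat_view_sql(table: SourceTable) -> str:
    """Build a generic view definition with the business key and its attributes."""
    entity = table.name.split(".")[-1]
    columns = ", ".join(table.attributes)
    return (
        f"CREATE VIEW view_SAT_{entity} AS "
        f"SELECT {table.business_key} AS business_key, {columns} FROM {table.name}"
    )


# Example metadata (made up); in practice this would come from a source catalog.
customer = SourceTable("crm.customer", "customer_id", ["name", "email"])
for statement in (hub_view_sql(customer), sat_view_sql(customer)):
    print(statement)
```

The point is that once the source metadata is available, the Data Vault views follow mechanically from it, so the remaining question is only how the generated definitions get submitted.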