DATAZEN 2023 - DOCUMENTATION IS UNDER CONSTRUCTION
Click here to access the DataZen 2022 User Guide
Click here to access the DataZen 2022 Installation Instructions
Introduction
DataZen is a data extraction and replication platform that allows you to copy data from any source system into any
target system, with optional automatic change detection that forwards only the records
that have actually changed. Because DataZen creates a universal Change Log, changes
can be forwarded to virtually any target platform in the shape that platform expects.
Use Cases
DataZen supports the following high-level use cases:
-
Centralize All Your Data
Build a centralized data mart of all your key data, from any source system, including social media feeds, SharePoint Online, databases and more, so you can quickly get a centralized view of your information.
Advanced options include support for schema drift, data enrichment, and multi-casting changes to multiple systems.
-
Copy Any Data, Anywhere
Copy records from any source system into one or more target systems.
This includes support for virtually any HTTP/S REST API, databases, NoSQL databases, ODBC drivers, files (XML/JSON/CSV/Parquet), Enzo Server, and messaging platforms (RabbitMQ, Azure EventHub/Message Bus, Kafka...), both as a source and as a target, in any combination.
-
Native or Synthetic CDC
Identify changes made to any source system and forward them to one or more target systems.
This capability includes support for any source system, even if the system does not offer its own Change Data Capture (CDC) mechanism.
-
Messaging Integration
Listen for messages from any supported messaging platform (Kafka, MSMQ, RabbitMQ...) and forward them to any target system or any other messaging platform, with the ability to change the batching option and enhance the message content.
-
Data Pipeline
Transform, mask, enrich, apply schema changes, perform data quality operations, and call custom .NET libraries for advanced operations on the fly.
-
Replay, Share
Keep your changes so you can replay them later on any target platform, or safely share them with business partners so they can react to your internal data or business events using FTP, Azure Blobs, AWS Containers, or Google Drive.
The following diagram depicts the various scenarios that are possible with DataZen. Because DataZen can inspect messages, perform relevant message conversions, and supports a large number of authentication mechanisms, it can forward full record sets or only the identified changes in the correct target format.

Pipeline Architecture
To better understand how DataZen works, let's review the major components of the
DataZen Pipeline architecture.

- Source: data is read from the source system (HTTP/S API, Database, ODBC, Enzo Server, Files...); uses an optional High Watermark to only read the necessary records
- Aggregation: data is optionally aggregated for certain data sources (e.g., HTTP/S APIs using Dynamic Parameters, or Files data sources)
- Change Data Capture: when Key Columns are identified, the DataZen Synthetic CDC engine eliminates records that have been neither changed nor deleted
- Data Pipeline: when defined, a data pipeline executes to enrich, filter, translate or transform data
- Change Log: the Sync File (Change Log) is created for most jobs at this point
- Partitioning: the Sync File is read and the data is optionally partitioned (depending on the target system)
- Target: the Sync File (Change Log) data is forwarded to the target system
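
To make the Source, Change Data Capture, and Partitioning stages more concrete, the following Python sketch shows one way these ideas could work. It is an illustration only and does not reflect DataZen's actual implementation or API: the function names (read_source, detect_changes, partition), the hash-based comparison, and the in-memory state dictionary are all assumptions made for this example.

```python
import hashlib
import json
from collections import defaultdict


def row_hash(row, key_columns):
    """Hash the non-key values of a row so changes can be detected cheaply."""
    payload = {k: v for k, v in row.items() if k not in key_columns}
    text = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def read_source(rows, watermark_column, high_watermark):
    """Source stage (hypothetical): keep only rows past the last high watermark."""
    if high_watermark is None:
        return list(rows)
    return [r for r in rows if r[watermark_column] > high_watermark]


def detect_changes(rows, key_columns, previous_state, detect_deletes=False):
    """Synthetic CDC stage (hypothetical).

    previous_state maps a key tuple to the hash seen on the previous run.
    Unchanged rows are dropped; inserts and updates go to the change log.
    Deletes are only inferred when `rows` is a full read of the source.
    """
    change_log = []
    new_state = dict(previous_state)
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        seen.add(key)
        digest = row_hash(row, key_columns)
        if key not in previous_state:
            change_log.append({"op": "insert", "key": key, "data": row})
        elif previous_state[key] != digest:
            change_log.append({"op": "update", "key": key, "data": row})
        else:
            continue  # unchanged record: eliminated from the change log
        new_state[key] = digest
    if detect_deletes:
        for key in previous_state:
            if key not in seen:
                change_log.append({"op": "delete", "key": key, "data": None})
                del new_state[key]
    return change_log, new_state


def partition(change_log, column):
    """Partitioning stage (hypothetical): group change log entries by a column."""
    parts = defaultdict(list)
    for entry in change_log:
        value = entry["data"].get(column) if entry["data"] else None
        parts[value].append(entry)
    return dict(parts)


# Two consecutive runs against a small, made-up data set.
state = {}
run1 = [{"id": 1, "region": "east", "total": 10},
        {"id": 2, "region": "west", "total": 20}]
log1, state = detect_changes(run1, ["id"], state, detect_deletes=True)

run2 = [{"id": 1, "region": "east", "total": 15}]
log2, state = detect_changes(run2, ["id"], state, detect_deletes=True)
print([entry["op"] for entry in log1])
print([entry["op"] for entry in log2])
```

In this sketch the first run records every row as an insert, while the second run records only the modified row as an update and the missing row as a delete. Delete detection is made optional here because a read filtered by a high watermark returns only part of the source, so absent keys cannot be treated as deletions.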