High Watermark / Pointers

Some jobs can track the "last highest value" of a field in the data source so that future calls retrieve only the data that changed. This value is normally a DateTime or Timestamp, or an integer (or long). For example, a database system may have a timestamp field that can serve as a High Watermark, Twitter offers an id field that contains an ever-increasing numeric value, and a SharePoint List contains a LastModified field that can be used for this purpose.

Generally speaking, a High Watermark is an optimization technique that limits how future data is retrieved so that only the changes are extracted. High Watermark values are usually not necessary when the source system is itself a CDC stream or a messaging platform.
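The idea can be illustrated with a minimal Python sketch. This is not DataZen code: the in-memory ROWS list, the read_changes function, and the "modified" field name are hypothetical and stand in for a real source system and its tracking field.

```python
from datetime import datetime

# Hypothetical in-memory "source"; a real job would query a database or an API.
ROWS = [
    {"id": 1, "modified": datetime(2023, 1, 1, 9, 0, 0)},
    {"id": 2, "modified": datetime(2023, 1, 2, 9, 0, 0)},
    {"id": 3, "modified": datetime(2023, 1, 3, 9, 0, 0)},
]


def read_changes(rows, field, last_value=None):
    """Return the rows whose `field` exceeds last_value, plus the new high watermark."""
    # With no stored pointer (None), every row qualifies.
    changed = [r for r in rows if last_value is None or r[field] > last_value]
    # The new pointer is the highest value seen; it is unchanged if nothing was read.
    new_value = max((r[field] for r in changed), default=last_value)
    return changed, new_value


# First run: no pointer yet, so all rows are read.
first_batch, pointer = read_changes(ROWS, "modified")

# Second run: nothing changed at the source, so nothing is read.
second_batch, pointer_after = read_changes(ROWS, "modified", pointer)
```

In this sketch, starting with a pointer other than None skips older records on the first run, and passing None again causes all available data to be read again.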

Specifying a Job Reader High Watermark

When creating or updating a job reader, you can specify the field to use as the High Watermark in the Timestamp / DateTime field of the Replication Settings tab.

The initial values can be set manually if the intent is to avoid reading all source records the first time the job runs. To do so, click on the Set initial pointers (high value)... link.

This screen shows how to define the Last Read Value setting of a job. However, some jobs can also hold a last "deleted" pointer, separately from the last "read" pointer. See the Database Job Reader section for more information.

Specifying an HTTP Job Reader High Watermark

When creating an HTTP job reader, you can specify the field to use as the High Watermark in the High Watermark field of the Capture Strategy tab. To set this field, you must first select a Capture Strategy that uses a WINDOW operation (WINDOW READ or WINDOW READ + CDC).

The initial values can be set manually if the intent is to avoid reading all source records the first time the job runs. To do so, click on the set initial values... link.

When editing an HTTP Job Reader, the screen looks a bit different; however, the same fields are visible on the screen and can be modified the same way.

Viewing/Editing High Watermark Values

To edit High Watermark values (last read or last deleted), select the desired job from the list of jobs in DataZen Manager. Shortly after clicking on it, the right panel shows most job settings, including the current High Watermark value in Last Read Pointer (in this example, a DateTime).

If the job holds a High Watermark, the Edit Pointers button on the right panel will be enabled; click on it.



This screen shows both the Last Read Pointer and the Last Delete Pointer when available. You can manually edit either value using one of the following options:

  • Reset (null): resets the value to NULL; all available data will be read again
  • Date/Time Value: select a date/time value from a date picker
  • Numeric Value: enter a numeric value
  • Custom Value: enter free-form text

When a job holds multiple pointers, this setting may be an array. When entering a date as free-form text, use the following notation: YYYY-MM-DD hh:mm:ss.nnn
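If the value is produced by a script before being pasted in as free-form text, the notation can be generated as in the Python sketch below. It assumes that hh is a 24-hour value and that nnn means milliseconds; neither is stated on this page. The function names to_pointer_text and from_pointer_text are hypothetical helpers, not part of DataZen.

```python
from datetime import datetime


def to_pointer_text(value: datetime) -> str:
    """Format a datetime as YYYY-MM-DD hh:mm:ss.nnn (nnn assumed to be milliseconds)."""
    # %f yields six microsecond digits; dropping the last three leaves milliseconds.
    return value.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


def from_pointer_text(text: str) -> datetime:
    """Parse text in the same notation back into a datetime."""
    # strptime's %f accepts one to six fractional digits.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


example = to_pointer_text(datetime(2023, 5, 17, 14, 30, 5, 123000))
round_trip = from_pointer_text(example)
```

Parsing the text back before using it is a cheap way to catch a malformed value.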

Setting this value incorrectly may cause the job to fail.






© 2023 - Enzo Unified