Dataset Materialization Overview
Observe powers its Observability Cloud with a process called materialization, in which background transform queries incrementally compute and update datasets. These queries process incoming events and keep the dataset state up to date, so queries run quickly without reprocessing all data from scratch. When a dataset definition is modified, Observe recomputes the affected state to reflect the change; this is often called re-acceleration or re-materialization.
The Acceleration Window
The acceleration window is the time range for which a dataset is pre-processed and kept readily accessible. For example:
- Upon creating or editing a dataset, Observe automatically accelerates the prior 7 days of data.
- As new events arrive, the window extends forward to keep the dataset current.
Re-acceleration occurs when changes force Observe to recompute this materialized state beyond routine updates.
More generally, dataset re-materialization is the process of recomputing and updating the state of a dataset in response to changes in its input data or configuration. It is part of dataset acceleration, which keeps datasets ready for fast and efficient querying.
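As a rough illustration of the window behavior described above, the following Python sketch models an acceleration window that starts 7 days before a dataset is created or edited and extends forward as events arrive. The class and method names are hypothetical and do not correspond to any Observe API; only the 7-day default comes from this article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Default look-back taken from the text above; everything else is illustrative.
DEFAULT_LOOKBACK = timedelta(days=7)


@dataclass
class AccelerationWindow:
    """Hypothetical model of an acceleration window (not an Observe API)."""
    start: datetime
    end: datetime

    @classmethod
    def on_create_or_edit(cls, now: datetime) -> "AccelerationWindow":
        # Creating or editing a dataset accelerates the prior 7 days.
        return cls(start=now - DEFAULT_LOOKBACK, end=now)

    def extend_to(self, event_time: datetime) -> None:
        # New events only ever move the end of the window forward.
        if event_time > self.end:
            self.end = event_time

    def covers(self, t: datetime) -> bool:
        # True when t falls inside the pre-processed range.
        return self.start <= t <= self.end


now = datetime(2024, 1, 8, tzinfo=timezone.utc)
window = AccelerationWindow.on_create_or_edit(now)
window.extend_to(now + timedelta(hours=1))
print(window.start.isoformat(), window.end.isoformat())
```

In this model, a query for data older than `window.start` falls outside the accelerated range, which is the situation that manual acceleration of a broader range addresses.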
How Re-Acceleration Works with Modified Dataset Definitions
When you modify a dataset’s definition, for example by altering its OPAL pipeline, schema, or freshness settings, Observe triggers re-materialization so that the dataset aligns with the new configuration. Key triggers include:
- Edits to the OPAL Pipeline: Changing the logic (e.g., adding a filter or aggregation) requires reprocessing the data to match the updated definition.
- Schema Updates: Modifying fields or data types invalidates the prior materialized state, prompting a full recomputation.
- Freshness Goal Changes: Tightening the goal (e.g., from 10 minutes to 1 minute) increases re-materialization frequency to meet the new standard.
- Historical Data Expansion: Manually accelerating a broader time range (e.g., via “Accelerate Full Range”) re-materializes data outside the existing window.
During re-acceleration, transform queries reprocess the affected data (newly ingested events, historical records, or both) and consume Observe Compute Credits (OCC) based on the volume and complexity of the operation.
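The definition-related triggers listed above can be sketched as a simple comparison between the old and new versions of a dataset definition. This is a minimal illustration, not how Observe detects changes internally: the `DatasetDefinition` type, its field names, and the trigger labels are all assumptions made for the example. Historical data expansion is left out because it is a manual action rather than a change to the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetDefinition:
    """Hypothetical stand-in for a dataset definition (not an Observe type)."""
    opal_pipeline: str
    schema: tuple  # (field name, type) pairs
    freshness_goal_seconds: int


def rematerialization_triggers(old: DatasetDefinition, new: DatasetDefinition) -> list:
    """Return which of the triggers described above apply to an edit."""
    triggers = []
    if old.opal_pipeline != new.opal_pipeline:
        triggers.append("opal_pipeline_edit")
    if old.schema != new.schema:
        triggers.append("schema_update")
    if new.freshness_goal_seconds < old.freshness_goal_seconds:
        # Only a tighter goal is described as increasing re-materialization frequency.
        triggers.append("freshness_goal_tightened")
    return triggers


old = DatasetDefinition("filter status >= 500", (("status", "int64"),), 600)
new = DatasetDefinition("filter status >= 400", (("status", "int64"),), 60)
print(rematerialization_triggers(old, new))
```

An empty list from this check means none of the listed definition changes were detected; it says nothing about how many credits a real re-acceleration would consume.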
After you modify a dataset definition, such as its OPAL pipeline or schema, Observe allows you to manually accelerate the dataset for up to 30 days of historical data without requiring intervention from Observe support. This lets you quickly re-materialize the dataset over a substantial time range so that your queries remain accurate and fast. To initiate this, use the “Accelerate Full Range” option or specify a custom range within that 30-day window; the re-acceleration consumes available Observe Compute Credits (OCC).
However, if you need to accelerate data beyond 30 days, we recommend contacting Observe support first. This step helps avoid unexpected costs, as re-materializing larger historical ranges can significantly increase OCC consumption due to the volume of data and processing required. Support can assist in assessing the scope, optimizing the process, and ensuring your budget aligns with the acceleration needs.
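The 30-day guidance can be expressed as a small pre-check before requesting a manual acceleration. The sketch below is an assumption-laden illustration: the function name and return values are hypothetical, and only the 30-day threshold comes from this article.

```python
from datetime import datetime, timedelta, timezone

SELF_SERVICE_LIMIT = timedelta(days=30)  # limit stated in the text above


def acceleration_route(range_start: datetime, now: datetime) -> str:
    """Return 'self-service' or 'contact-support' for a requested range start.

    Hypothetical helper; it does not call any Observe API.
    """
    if range_start > now:
        raise ValueError("range_start must not be in the future")
    if now - range_start <= SELF_SERVICE_LIMIT:
        return "self-service"
    return "contact-support"


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(acceleration_route(now - timedelta(days=14), now))
print(acceleration_route(now - timedelta(days=90), now))
```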
If you attempt to accelerate beyond 30 days, you are presented with a message similar to:
How to Potentially Avoid Re-Materialization When Modifying a Dataset
Observe offers a new Edit-forward setting that helps avoid unnecessary re-materialization of datasets. It can improve performance by skipping re-materialization in certain cases, but it is not a guaranteed solution in every scenario: changes to the dataset schema may still trigger re-materialization. If you need to make changes to an underlying dataset, you can reach out to us via the Support Portal and we can enable Edit-forward on a per-dataset basis.
Additionally, datasets that use makeresource or incremental_aggregation transforms should not be edited forward, because doing so can cause errors in the transformer. We are currently investigating methods to allow edit-forward in some cases.
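Before asking support to enable Edit-forward, it may help to check a dataset against the caveats above. The following sketch is purely illustrative: the function is hypothetical, it relies on the caller supplying the list of transform names, and it does not inspect a real OPAL pipeline. The two transform names are taken from this article.

```python
# Transform names are taken from the text above; the check itself is illustrative.
EDIT_FORWARD_UNSAFE = frozenset({"makeresource", "incremental_aggregation"})


def edit_forward_assessment(transforms, schema_changed=False):
    """Return (ok, reason) for a hypothetical pre-check before requesting Edit-forward.

    transforms: names of the transforms the dataset uses (supplied by the caller).
    schema_changed: whether the planned edit alters the dataset schema.
    """
    unsafe = sorted(EDIT_FORWARD_UNSAFE.intersection(transforms))
    if unsafe:
        return False, "uses transforms that should not be edited forward: " + ", ".join(unsafe)
    if schema_changed:
        return True, "eligible, but the schema change may still trigger re-materialization"
    return True, "eligible"


print(edit_forward_assessment(["filter", "makeresource"]))
print(edit_forward_assessment(["filter"], schema_changed=True))
```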
