Time Series Configuration
How Time Series Works
Step has its own implementation of time series, designed to efficiently manage and visualize large datasets over extended periods. This feature powers data visualization in the Performance tab of the execution view and the Analytics view. By grouping data into collections with different resolutions, the system optimizes both storage and retrieval for short-term and long-term analysis, ensuring high performance and resource efficiency. By default, Step uses several collections with varying resolutions to store data:
- A main collection with a configurable resolution (1 second by default).
- Aggregated collections that group data by minute, hour, day, and week.
Each lower-resolution collection ingests data from the next finer-resolution collection. For example:
- Minute-level data is aggregated from the main collection.
- Daily data is aggregated from the hourly data, and so on.
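The cascade described above can be illustrated with a minimal sketch. It is not Step's actual implementation: the `aggregate` function and the `(count, total)` bucket layout are assumptions chosen for illustration only.

```python
from collections import defaultdict

def aggregate(buckets, resolution_ms):
    """Roll finer buckets up into coarser ones (hypothetical helper).

    `buckets` maps a bucket start time (epoch ms) to a (count, total) pair.
    """
    coarser = defaultdict(lambda: (0, 0))
    for start, (count, total) in buckets.items():
        # Align the bucket start to the coarser resolution.
        coarse_start = start - start % resolution_ms
        c, t = coarser[coarse_start]
        coarser[coarse_start] = (c + count, t + total)
    return dict(coarser)

# Main collection at 1-second resolution: one 10 ms measurement per second for 2 minutes.
main = {second * 1000: (1, 10) for second in range(120)}

# The minute collection is fed from the main collection,
# and the hour collection is fed from the minute collection.
minute = aggregate(main, 60_000)
hour = aggregate(minute, 3_600_000)
```

Because each level reads from the level just below it, the coarse collections never need to rescan the raw 1-second data.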
This structure allows for efficient retrieval of large datasets. When visualizing data, Step automatically selects the most suitable collection for the requested time range.
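How Step chooses a collection is not documented here. The following sketch shows one plausible heuristic, assuming the goal is to keep the number of returned data points under a limit. The function name `pick_collection` and the `max_points` parameter are hypothetical.

```python
# Collection resolutions in milliseconds, matching the default collections.
RESOLUTIONS_MS = {
    "main": 1_000,
    "minute": 60_000,
    "hour": 3_600_000,
    "day": 86_400_000,
    "week": 604_800_000,
}

def pick_collection(range_ms, max_points=1000):
    """Return the finest collection whose bucket count fits within max_points.

    Assumed heuristic, not Step's actual selection logic.
    """
    for name, resolution in sorted(RESOLUTIONS_MS.items(), key=lambda kv: kv[1]):
        if range_ms / resolution <= max_points:
            return name
    # Fall back to the coarsest collection for very long ranges.
    return "week"

ten_minutes = 10 * 60_000
one_day = 86_400_000
one_year = 365 * 86_400_000
```

With these assumptions, a 10-minute range is served from the main collection, a one-day range from the hour collection, and a one-year range from the day collection.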
Data Flushing
When multiple collections are in use, data is accumulated in memory and periodically flushed to the database for performance reasons. The flush process reduces the frequency of database writes by batching the data in memory before persisting it. The flush interval can be configured using the properties below, allowing for fine-tuned control over when the data is saved to the database.
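The batching behavior can be sketched as follows. This is an illustration under stated assumptions, not Step's code: the `BufferedCollection` class is hypothetical, the `write` callable stands in for the database, and the stored data is assumed to be merged with any bucket already persisted.

```python
class BufferedCollection:
    """Accumulates buckets in memory and flushes them periodically (hypothetical)."""

    def __init__(self, flush_period_ms, write):
        self.flush_period_ms = flush_period_ms
        self.write = write          # callable that persists one batch
        self.pending = {}           # bucket start (ms) -> (count, total)
        self.last_flush_ms = 0

    def ingest(self, bucket_start_ms, value, now_ms):
        count, total = self.pending.get(bucket_start_ms, (0, 0))
        self.pending[bucket_start_ms] = (count + 1, total + value)
        # Flush only once the configured period has elapsed.
        if now_ms - self.last_flush_ms >= self.flush_period_ms:
            self.flush(now_ms)

    def flush(self, now_ms):
        if self.pending:
            self.write(dict(self.pending))
            self.pending.clear()
        self.last_flush_ms = now_ms

# Minute collection flushed every 30 seconds; `writes` records each database write.
writes = []
minute = BufferedCollection(flush_period_ms=30_000, write=writes.append)
for second in range(90):
    now = second * 1000
    minute.ingest(now - now % 60_000, 10, now)
```

In this run, 90 ingested measurements result in only two writes. A longer flush period reduces the write frequency further, at the cost of more data held in memory between flushes.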
Configuring Time Series
By default, all collections are enabled, and their flush period is set to match their resolution. These default settings can be customized using the following properties, depending on specific requirements.
# The resolution of the main collection, in milliseconds (default: 1 second).
timeseries.resolution.period=1000
# Specifies how frequently the main collection will flush data to storage.
timeseries.flush.period=2000
# Enable the minute collection and set its flush period.
timeseries.collections.minute.enabled=true
timeseries.collections.minute.flush.period=30000
# Enable the hour collection and set its flush period.
timeseries.collections.hour.enabled=true
timeseries.collections.hour.flush.period=60000
# Enable the day collection and set its flush period.
timeseries.collections.day.enabled=true
timeseries.collections.day.flush.period=3600000
# Enable the week collection and set its flush period.
timeseries.collections.week.enabled=true
timeseries.collections.week.flush.period=3600000
Time to Live (TTL)
Each collection has a Time to Live (TTL) setting, which defines how long data remains available in that collection before it is purged. By default, the TTL is set to 0 for every collection, meaning the data is never deleted. Configure the TTLs according to your data retention needs: purging expired data keeps each collection at a manageable size over extended periods.
TTL can be updated in the Admin Settings page of the UI.
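The TTL semantics can be sketched as below. The `purge_expired` function is a hypothetical helper, and expressing the TTL in milliseconds is an assumption made for this example; only the rule that a TTL of 0 disables purging comes from the description above.

```python
def purge_expired(buckets, ttl_ms, now_ms):
    """Return the buckets that are still within the TTL.

    A TTL of 0 means data is never deleted.
    """
    if ttl_ms == 0:
        return dict(buckets)
    cutoff = now_ms - ttl_ms
    return {start: value for start, value in buckets.items() if start >= cutoff}

DAY_MS = 86_400_000

# Three daily buckets, starting at day 0, day 1, and day 2.
daily = {0: (10, 100), DAY_MS: (20, 200), 2 * DAY_MS: (30, 300)}
now = 3 * DAY_MS

kept_forever = purge_expired(daily, ttl_ms=0, now_ms=now)
kept_two_days = purge_expired(daily, ttl_ms=2 * DAY_MS, now_ms=now)
```

With a two-day TTL, the bucket starting at day 0 is dropped while the two more recent buckets are kept.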
By configuring the time series feature correctly, you ensure that performance data is stored efficiently, retrieved quickly, and managed effectively.