Tuning a data profile

The defaults work, but matching a profile to your data's rhythm makes forecasts noticeably better. Here's how to choose each setting. (Settings are defined in Data profiles.)

Match interval to your cadence

The single most important setting is interval — the bucket, in seconds, that your observations are resampled onto. Set it to your data's real sampling period. If you record orders once an hour, use 3600. If you have one figure per day, use 86400.

Get this wrong in either direction and the model struggles:

  • Too fine — an interval shorter than your true cadence pads the series with filled-in buckets, spending the context window on noise and empty space instead of signal.
  • Too coarse — an interval longer than your cadence averages real structure away. An hourly demand pattern simply disappears if you bucket it to a day.

The hourly_orders profile uses interval: 3600 because store_metrics records one orders figure per hour. Start by matching the period at which your data actually arrives.
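
If you are unsure of your cadence, you can estimate it from the timestamps you already have. The following is a minimal sketch, not part of Temporis: the helper name estimate_interval_seconds is hypothetical, and it assumes you can load your timestamps as Python datetime objects. It takes the median gap between consecutive observations, so an occasional missing reading does not skew the result.

```python
from datetime import datetime, timedelta
from statistics import median


def estimate_interval_seconds(timestamps):
    """Return the median gap, in seconds, between consecutive timestamps."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two timestamps to estimate a cadence")
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return int(median(gaps))


# Hourly readings with one missing hour (hour 4): the median ignores that gap.
start = datetime(2024, 1, 1)
hourly = [start + timedelta(hours=h) for h in (0, 1, 2, 3, 5, 6, 7)]
suggested_interval = estimate_interval_seconds(hourly)
print(suggested_interval)  # 3600
```

A result close to 3600 suggests interval: 3600; a result close to 86400 suggests interval: 86400.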

How much history: num_patches and min_patches

These two settings set the upper and lower bounds on how much history goes into a prediction.

num_patches is the context window: the maximum number of patches of recent history the model reads. More patches put more past cycles in view, so the model can pick up more seasonality; raise it when you want it to "see" weekly or yearly structure. The gains taper off, though: past a point you are feeding in old history that no longer reflects the present, which costs more tokens without improving the forecast.

min_patches is the floor: the minimum history a profile requires before it will serve at all. Below it, a predict call returns 409 "Not enough data for this data profile." Raise min_patches when you'd rather refuse a low-confidence forecast than return one built on a handful of points — useful for a brand-new series where an early guess would be misleading.

Think of it as a range: min_patches is "don't even try below this," and num_patches is "stop looking back past this."
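
The range can be modeled in a few lines. This is an illustrative sketch of the behavior described above, not Temporis code: the function name history_decision and the example values (min_patches of 8, num_patches of 64) are assumptions chosen for the demonstration.

```python
def history_decision(available_patches, min_patches, num_patches):
    """Model the min_patches / num_patches range.

    Returns ("refuse", 0) below the floor, otherwise ("serve", n), where n
    is the number of most-recent patches that would be read.
    """
    if available_patches < min_patches:
        # Below the floor, the service answers 409 "Not enough data".
        return ("refuse", 0)
    # Above the ceiling, older patches are simply not looked at.
    return ("serve", min(available_patches, num_patches))


# Hypothetical profile: min_patches = 8, num_patches = 64.
for available in (3, 20, 500):
    print(available, history_decision(available, 8, 64))
```

With these values, 3 patches is refused, 20 patches are all read, and 500 patches are cut down to the most recent 64.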

Resolution vs. horizon: patch_len and patch_spacing

A patch is the chunk of points the model reads and writes at a time. patch_len sets how many points are in that chunk. patch_spacing sets the stride between the points you actually use — it downsamples the series and, in doing so, stretches how far ahead each generated step reaches.

The key relationship: the effective time step between forecast points is interval * patch_spacing seconds. With interval: 3600 and patch_spacing: 1, each step is one hour. Bump patch_spacing to 3 and each step covers three hours — the same number of generated points now reaches three times further into the future, at coarser resolution.

So this is a trade between detail and reach. Keep spacing tight when you need fine-grained near-term forecasts; widen it when you care about a longer horizon and can accept a coarser step.
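
The arithmetic behind that trade is simple enough to check by hand. The sketch below only restates the interval * patch_spacing relationship; the function names and the figure of 24 generated points are assumptions for illustration.

```python
def effective_step_seconds(interval, patch_spacing):
    """Time between consecutive forecast points, in seconds."""
    return interval * patch_spacing


def horizon_seconds(interval, patch_spacing, n_points):
    """How far ahead n_points generated points reach, in seconds."""
    return effective_step_seconds(interval, patch_spacing) * n_points


# Same hourly data, same 24 generated points, two spacings.
tight = horizon_seconds(interval=3600, patch_spacing=1, n_points=24)
wide = horizon_seconds(interval=3600, patch_spacing=3, n_points=24)
print(tight // 3600, "hours at 1-hour steps")  # 24 hours at 1-hour steps
print(wide // 3600, "hours at 3-hour steps")   # 72 hours at 3-hour steps
```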

Levels vs. growth: use_diffs

use_diffs (the log-diff transform) tells the model to work on relative change — log growth from point to point — instead of the raw level.

  • Turn it on for series that grow or compound multiplicatively: revenue, prices, user counts, anything with a trend or that "doubles" rather than "adds." Modeling growth keeps a rising series from being anchored to where it used to be.
  • Leave it off for bounded or stationary levels that hover in a fixed range: utilization percentage, temperature, a saturation rate. These have a natural level to return to, and diffing them just adds noise.

A quick test: if the typical size of a jump scales with the level of the series, so that a series twice as large moves twice as much, it's multiplicative. Turn use_diffs on.
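
To see why growth is the more stable quantity for a compounding series, here is a minimal sketch of a log-diff transform and its inverse. It illustrates the general technique only; the exact transform Temporis applies may differ, and the function names and revenue figures are made up for the example. Note that a log transform needs strictly positive values.

```python
import math


def log_diffs(levels):
    """Log growth between consecutive points; levels must be positive."""
    if any(x <= 0 for x in levels):
        raise ValueError("log-diff needs strictly positive values")
    return [math.log(b / a) for a, b in zip(levels, levels[1:])]


def undo_log_diffs(first_level, diffs):
    """Rebuild levels from a starting level and a list of log diffs."""
    levels = [first_level]
    for d in diffs:
        levels.append(levels[-1] * math.exp(d))
    return levels


# Hypothetical revenue growing 10% per step: the raw jumps keep widening
# (10, 11, 12.1), but the log diffs are all the same value, log(1.1).
revenue = [100.0, 110.0, 121.0, 133.1]
growth = log_diffs(revenue)
rebuilt = undo_log_diffs(revenue[0], growth)
print([round(g, 4) for g in growth])
```

On a bounded series such as a utilization percentage, the same transform mostly amplifies small wiggles around the resting level, which is why leaving use_diffs off is the better choice there.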

Masking dead time: exclude windows

Some series have recurring stretches where nothing happens by design — a market that's closed overnight, a store outside business hours. If you leave that dead time in, the model treats those flat or zero stretches as real signal and learns a pattern that isn't there.

An exclude_window (a timezone plus a start and stop time-of-day) masks that recurring range so it's dropped from the series the model reads. Set one for the hours your series is structurally idle — say, masking overnight on a profile that reads market prices — so the open-hours behavior stays clean.
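
The masking logic can be sketched as a time-of-day filter. This is an assumption-laden illustration rather than Temporis's implementation: the function names, the half-open [start, stop) convention, and the market hours are all hypothetical. A fixed UTC-5 offset stands in for a named timezone to keep the example dependency-free; a real profile would use a named zone so daylight-saving shifts are handled.

```python
from datetime import datetime, time, timedelta, timezone


def in_window(local_time, start, stop):
    """True if local_time falls in [start, stop), allowing wrap past midnight."""
    if start <= stop:
        return start <= local_time < stop
    return local_time >= start or local_time < stop


def drop_excluded(points, tz, start, stop):
    """Keep only (timestamp, value) pairs outside the recurring window.

    Timestamps must be timezone-aware; each is converted to tz before the
    time-of-day comparison.
    """
    return [
        (ts, value)
        for ts, value in points
        if not in_window(ts.astimezone(tz).time(), start, stop)
    ]


# Hypothetical market, closed from 16:00 to 09:30 local time (UTC-5 here).
market_tz = timezone(timedelta(hours=-5))
base = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)
points = [(base + timedelta(hours=h), 100.0 + h) for h in range(24)]
open_hours = drop_excluded(points, market_tz, time(16, 0), time(9, 30))
print(len(open_hours))  # 6
```

Of the 24 hourly points, only those at 10:00 through 15:00 local time survive; the 09:00 point falls before the 09:30 open and is masked with the rest of the overnight stretch.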

Tolerating gaps: fill_limit

fill_limit is the longest run of consecutive missing buckets that Temporis will forward-fill. A gap no longer than the limit is bridged by carrying the last value forward; a gap longer than the limit is left as a hole, and those rows are dropped.

  • Raise it to bridge longer outages — a sensor that occasionally drops offline for a while, where carrying the last reading forward is reasonable.
  • Lower it when you'd rather not invent data across big holes — when a long gap genuinely means "unknown" and a flat carried-forward line would be misleading.
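
The effect of the limit can be sketched on a small series. This is an illustration under stated assumptions, not Temporis's implementation: missing buckets are represented as None, the function name forward_fill is hypothetical, and a gap that exceeds the limit is dropped whole rather than partly filled, which is one reading of the behavior described above.

```python
def forward_fill(values, fill_limit):
    """Forward-fill runs of None no longer than fill_limit; drop longer runs.

    Leading None values have nothing to carry forward, so they are dropped.
    """
    filled = []
    last = None  # most recent real value seen
    run = 0      # length of the current run of missing buckets
    for v in values:
        if v is None:
            run += 1
            continue
        if last is not None and 0 < run <= fill_limit:
            filled.extend([last] * run)  # bridge the gap
        filled.append(v)
        last = v
        run = 0
    # Assumption: a short trailing gap is bridged the same way.
    if last is not None and 0 < run <= fill_limit:
        filled.extend([last] * run)
    return filled


series = [10, None, None, 13, None, None, None, 17]
print(forward_fill(series, fill_limit=2))  # [10, 10, 10, 13, 17]
print(forward_fill(series, fill_limit=3))  # [10, 10, 10, 13, 13, 13, 13, 17]
```

With fill_limit at 2 the three-bucket gap stays a hole and its rows disappear; raising it to 3 bridges both gaps.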

Troubleshooting

| Symptom | Likely cause | Try |
| --- | --- | --- |
| 409 "Not enough data" | Too few points, or min_patches set too high | Ingest more history, or lower min_patches |
| Forecasts ignore the daily pattern | interval too coarse, or context too short | Match interval to your cadence; raise num_patches |
| Forecasts overshoot on a trending series | Modeling raw levels instead of growth | Enable use_diffs |
| Weird values across a known closed period | Dead time included in the series | Set an exclude_window for that range |
| Rows dropped / series looks sparse | Gaps exceed fill_limit | Raise fill_limit, or densify the data you send |

Change one setting at a time and compare forecasts. Tuning is easier to reason about when each adjustment has a visible, isolated effect.
