PromQL

tags : Infrastructure,Database,Observability

Doubts ⚠

Placement of `by`, are these the same?

sum by (job)(rate(http_requests_total{job="node"}[5m]))

sum (rate(http_requests_total{job="node"}[5m]))  by (job)

what are modifiers (offset, bool)

TODO Range Vector vs Range Selector vs Range Query?

Different things?
PromQL query and HTTP query different things obviously

Terms	Description
Range Vector
Range Vector Selector
Range Query (HTTP)
Range function
Instant Vector
Instant Vector Selector
Instant Query (HTTP)

Instant vector

FAQ

Other Notes

One word of warning when using timestamps on exported metrics, you can easily run into staleness problems which will create gaps in Prometheus. Or if the timestamp is too far off, Prometheus will drop the data entirely. - the great superQ
https://github.com/cloudflare/pint
Why are Prometheus queries hard? - Blog

TSDB vs AWS Cloudwatch

Is there a cloudwatch equivalent for prometheus Counter?

Data Model

# stores values over time where the identifier stays the same.
<identifier> -> [(t0,v0), (t1,v1), ...]
t0 = int64, miliseconds unix timestamps
v0 = float64
# each of the following are totally different time series
prometheus_http_requests_total{code="200", handler="/api/v1/query", instance="localhost:9090", job="prometheus"}
prometheus_http_requests_total{code="200", handler="/graph", instance="localhost:9090", job="prometheus"}

We use metric name with labels for the identifier
Vertical writes: In a short timeframe, we update every current active time series not necessarily at the same time but in one scrape cycle.
On disk: Each two-hour block consists of a directory containing one or more chunk files that contain all time series samples for that window of time, as well as a metadata file and index file.
index file: indexes metric names and labels to time series in the chunk files.

Functions & Operators

Type	Sub Types	Operands (expression types)	Label dropping
Operators	Binary	Scalar & Instant Vectors	Does not drop labels, unless it’s the match label
	Aggregation	Instant vectors	drops based on use of `by` / `without`
Functions	-	Can be anything depending on the function
Modifiers/keywords	-	depends, Eg. `bool`, `on`, `ignoring`, `group-left` etc.

Operators

Binary

Types
- Arithmetic binary operators + - \* / % ^
- Comparison binary operators = != > < >= <==
- Logical/set binary operators and (intersection) | or (union) | unless (complement)
Works with both scalar and instant vectors; let & be some binary operator.
See official docs for proper info

Operation Type	Evaluation	Resultant	Impact
Arithmetic (`+,- etc.`)	Scalar & Scalar	Scalar	Scalar
	Scalar & Instant Vector	Instant vector	Metric name dropped
	Instant Vector & Instant Vector	Instant vector	Metric name dropped, non-matching entries dropped, group labels added

Comparison (`==,!=,>= etc.`)	Scalar & Scalar	Scalar	w `bool`: 0 or 1
	Scalar & Instant Vector	Instant vector	w/o `bool`: drop, w `bool`: 0 if false, 1 if true
	Instant Vector & Instant Vector	Instant vector	Filter, w/o `bool`: drop not matching, w `bool`: 0 if false, 1 if true

Logical (`and, or, etc.`)	Scalar & Scalar	DOES NOT APPLY
	Scalar & Instant Vector	DOES NOT APPLY
	Instant Vector & Instant Vector	Instant vector	Result depends on `or`, `and` / `unless`

Matching for instant vectors (on and ignoring)
- When doing instant vector x instant vector operation, we need to do matching of LHS & RHS
  - i.e For an operation to happen between instant-vector (s), they must “match”. To make things match, we can use on and ignoring as needed.
  - Simply using on and ignoring handles 1-1 matches, which is OK in most cases.
  - If we want n-1 / 1-n we need to use the group-right, group-left modifiers.
- matching happens based on two keywords: on and ignoring
  - These keywords are sort of part of the operation rather than the operand. i.e if you’re doing X / on(abc) Y it’s (X, / on(abc), Y). The on modifier says, only do / when things match on abc.
  - We provide the labels to match on to on/ignoring

Aggregation

only takes instant vectors as inputs and only return instant vectors as outputs. Eg. sum,min,max,stddev,stdvar

Modifiers

bool is a modifier, you usually use them right after the operatator (Eg. w/o bool: something==1 , w bool: something == bool 1)
on without a label param matches things on all

Functions

Takes argument of any promql type, gives output in any promql type.

`rate(v range_vector)`

It should be always used with counter variables as it calculates the per second increase of your counter in the specified time range.
It makes no sense of taking rate of a gauge variable.
It’s a good rule not never compare raw counters and always use rate(), rate makes use of all the datapoints returned by the range vector unlike delta because it returns the per-second average rate of increase of the time series in the range vector.
Use rate for alerts and slow-moving counters, a small dip in the rate can reset the FOR clause when alerting if using irate so prefer using rate.
irate=/instant rate is basically taking =rate of the last two samples; i.e looks back at the last two samples under a sliding window.
irate should only be used when graphing volatile, fast-moving counters. See comparison here.

When taking rate it’s advisable to take the range time-frame be 4x the scraping interval. Prometheus default scrape_interval is 15s so it should be minimum 1m because rate needs at least two data points to calculate the rate of increase.

More on rate

Querying

There are terms such as Instant Queries,Range Queries, Instant and Range vector selectors, Offset Modifiers, Subqueries, the time, start and end query parameters, grafana’s $__interval, steps and resolutions, which can be pretty intimidating at first to understand how all these relate. The documentation actually explains everything nicely but here’s a summary.

There are basically 2 ways you can select time series, they are called the time series selectors;

instant vector selectors
range vector selectors.

The offset modifier can be used to get historical instant or range vectors which can be useful for alerts when comparing against the past.

Instant vs Range vectors (Incorrect)

Instant vector returns the most recent value for any time series. This is how are we getting an instant vector at any(past,present) point in time even if the scrapes are happening at specific intervals.

Range query basically just dumps samples back from the current instant. The timestamps for different timeseries will mostly be different but the it’s interesting to notice that the samples mostly differ by the scrape_interval.

The API

Request and Response combinations (INCORRECT)

THIS PART MIXES range query with range vector , this is super wrong. A range query can return an instant vector/scalar other things. i.e ValScalar ValVector ValMatrix ValString

See https://pkg.go.dev/github.com/prometheus/client_golang/api/prometheus/v1#API for QueryRange and its return type.

i.e range query and range vector have no direct relation

Following needs fixing, i’ll comeback later.

Instant Vector

See the parameters here.

endpoint	query	notes	`resposeType`
`/query`	instant vector with `time`	The prometheus UI in console mode uses this.	vector
`/query`	instant vector with `time` and `offset`	Even if offset is set and values will be different, the timestamp of an instant vector result is always that of the `evaluation time`.	vector
`/query_range`	instant vector with `start`, `end` and `step`	The prometheus UI in graph mode and grafana uses this to get instant vectors to plot.	matrix
`/query_range`	instant vector with `start`, `end` , `step` and `offset`	Similar to `/query` with `offset`	matrix

Range Vector

See the parameters here.

endpoint	query	notes	`resposeType`
`/query`	range vector with `time`	-	matrix
`/query`	range vector with `time` and `offset`	The timestamps shown will be from the time of the `offset`	matrix
`/query_range`	range vector with `start`, `end` and `step`	Not Allowed	-
`/query_range`	range vector with `start`, `end` , `step` and `offset`	Not Allowed	-

Steps and Resolution

Chris’s Wiki blog/sysadmin/PrometheusQuerySteps

These are specific to range queries(/query_range).

Instant queries(/query) do not have a step parameter, as they are evaluated at a single point in time (the current time, or a custom time if set)

Resolution in promethus world can be measured in seconds, eg. 1ms resolution > 1s resolution. 1ms resolution will have way higher noise. When querying prometheus data, we can use the step parameter in /query_range to set our resolution, which then evaluates the given query at each step independent of the stored samples. So we can represent larger time ranges without being accurate enough.

Sometimes query step/query resolution/resolution step/evaluation step/=interval=(source) are used interchangbly. The=step=,=start=and=end= paratemeters determines how many points you will get back.

What if you specify a resolution of 5s for a time series scraped with scape_interval set to 1m, Prometheus doesn’t have more datapoints to evaluate!

This explanation is equally true for any value for the resolution. Sample timestamps for different time series can be at arbitrary intervals and not time-aligned with each other but you still need to be able to select multiple series and aggregate over them, so they need to be artificially aligned.

The way promql does this by having an independent evaluation interval (the resolution step that you chose) for the query as a whole, independent of the underlying details of the data. So scrape_interval=1m and step=5s, simply means that new data is put into the time series at every scrape_interval but query is evaluated at each step for every timeseries that is matched for that identifier. Functions like rate don’t know whether they’re being called as part of a range query, nor do they know what the step is.

promql engine basically runs an evaluation loop where we start at the range’s start timestamp, then increase the timestamp by interval on every iteration, and abort the loop when the timestamp becomes larger than the range’s end timestamp.

Example timeseries: [(t,v)] => [(1,1),(4,4),(7,7),(11,11),(15,15)]

Running the evaluation loop on this time series with start=0,=end=16= and step=5 will give us [(0,0),(5,4),(10,7),(15,15)]

PromQL selects the value of the sample that is the most recent before (or exactly at) the evaluation step timestamp, it never looks at “future” data, so for t=0=/=start=0, the result would simply be empty instead of 1.

Prometheus UI and Grafana

Grafana has a few parameters/options to access the prometheus api endpoints in a few different ways, it’s best to consult the official grafana prometheus datasource doc.

Since Grafana has to deal with graphs and not just the metrics, it has to take extra care when plotting things so that there is never more data points than the the number of has pixels the graph has etc. It has few options which can be easily confused with what parameters prometheus api takes in.

Tweaking these grafanap options really depends on what one wants to see in the graph, but there are some variables and options which are not well stated in the official documentation.

$__interval (seconds) : It is a global variable that is dinamically calculated based on the time range and the width of the graph(the number of pixels). There are the $__from and $__to variables to specify timerange, . The formula is $__interval = ( $__from - $__to)/resolution, resolution here means the display resolution(the width of the panel). For example, If the panel is 1099px wide and time range is 24h = 86400s, $__interval = 86400/1099 = 78s, 78s is then rounded up to 2m which is then set to the $__interval variable.
When using the prometheus data source, the value of the $__interval variable is used as the step parameter for the prometheus range queries(/query_range)
In the prometheus query editor we get another option called Resolution which is a ratio(1/1,=1/2=,…). Resolution here basically means how many pixels per datapoint, (1/1 > 1/10) This basically modifies the $__interval variable ergo the step.

Example,

If $__interval was 2m and Resolution was set to 1/1, step would be 2m.

If $__interval was 2m and Resolution was set to 1/2, step would be 2m=x2==4m.

The minimum value of step is (or can be) limited in various ways, both for the entire panel of queries (Min interval) and in the individual metrics queries (Min step)
In grafana you can make your rate() range intervals auto-adjust to fit the steps. One could use $__interval to choose an appropriate value such as rate(a_metric_name[$__interval]) for the range vector.
The range interval you likely want to use for rate() depends partly on your query step:
- Might want the rate() range interval of the query step
- Might want the rate() range somewhat more than the query step
- If we try to do something like rate(metric_name[3s]) where the step greater than the range of the vector, we’ll undersample and skip over some data! because now range durations only include one metric point and none of rate or irate can operate on one data point.
There is also the $__range,= $__{r} an g e_{s} = an d ‘$ __range_msvariables which is basicallyto - from` (CONFUSION!!!)
Takeaway, when working with prometheus and Grafana there are 3 different meanings of the same word resolution. The actual prometheus resolution/=step=(seconds), The panel width resolution and the Resolution ratio option of the prometheus datasource.

Issues

In console mode, Older react UI on time increment just increments by 10 mins, the new react UI does that by 30 mins. Sometime it increments the microseconds in the classic UI. weird.
This article points to some more complex senarios.

Stallness

At every resolution timestep, for every series the PromQL expression references, the last sample value is chosen (if it is not older than the staleness period of 5 minutes).
Note that there will be always one returned sample for every increment of the interval (5), even if it would be the same repeated value from the same sample (if “real” samples are less frequent than the eval step). The exception is when there is an explicit staleness marker or the last sample for a given series is >5min before the evaluation timestamp (promql.LookbackDelta = 5 * time.Minute).
https://www.robustperception.io/staleness-and-promql
https://github.com/prometheus/prometheus/pull/7011
https://github.com/prometheus/prometheus/issues/398
https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/

🐏 mogoz

Table of Contents

PromQL

Links

PromQL

Alerting

Read

Doubts ⚠

Placement of `by`, are these the same?

TODO Range Vector vs Range Selector vs Range Query?

Instant vector

Range Vector

Range Vector Selector

Range function

Instant vs Range Query (HTTP)

FAQ

Other Notes

TSDB vs AWS Cloudwatch

Data Model

Functions & Operators

Operators

Binary

Aggregation

Modifiers

Functions

`rate(v range_vector)`

Querying

Instant vs Range vectors (Incorrect)

The API

Request and Response combinations (INCORRECT)

Steps and Resolution

Prometheus UI and Grafana

Issues

Stallness

Subquery

Graph View

Backlinks

🐏 mogoz

Table of Contents

PromQL

Links §

PromQL §

Alerting §

Read §

Doubts ⚠ §

Placement of by, are these the same? §

TODO Range Vector vs Range Selector vs Range Query? §

Instant vector §

Range Vector §

Range Vector Selector §

Range function §

Instant vs Range Query (HTTP) §

FAQ §

Other Notes §

TSDB vs AWS Cloudwatch §

Data Model §

Functions & Operators §

Operators §

Binary §

Aggregation §

Modifiers §

Functions §

rate(v range_vector) §

Querying §

Instant vs Range vectors (Incorrect) §

The API §

Request and Response combinations (INCORRECT) §

Steps and Resolution §

Prometheus UI and Grafana §

Issues §

Stallness §

Subquery §

Graph View

Backlinks

Links

PromQL

Alerting

Read

Doubts ⚠

Placement of `by`, are these the same?

TODO Range Vector vs Range Selector vs Range Query?

Instant vector

Range Vector

Range Vector Selector

Range function

Instant vs Range Query (HTTP)

FAQ

Other Notes

TSDB vs AWS Cloudwatch

Data Model

Functions & Operators

Operators

Binary

Aggregation

Modifiers

Functions

`rate(v range_vector)`

Querying

Instant vs Range vectors (Incorrect)

The API

Request and Response combinations (INCORRECT)

Steps and Resolution

Prometheus UI and Grafana

Issues

Stallness

Subquery