Extra Metrics
There are metrics provided by re_data but are not computed by default in monitored tables. You can compute them by updating the configuration for the specific table or adding them to a metrics groups which are computed by default.
__ title rental_rate rating created_at
1 Chamber Italian 4.99 NC-17 2021-09-01T11:00:00
2 Grosse Wonderful 4.99 R 2021-09-01T12:00:00
3 Airport Pollock 4.99 R 2021-09-01T15:00:00
4 Bright Encounters 4.99 PG-13 2021-09-01T09:00:00
5 Academy Dinosaur 0.99 PG-13 2021-09-01T08:00:00
6 Ace Goldfinger 4.99 G 2021-09-01T10:00:00
7 Adaptation Holes 2.99 NC-17 2021-09-01T11:00:00
8 Affair Prejudice 2.99 G 2021-09-01T19:00:00
9 African Egg 2.99 G 2021-09-01T20:00:00
10 Agent Truman 2.99 PG 2021-09-01T07:00:00
11 Airplane Sierra 4.99 PG-13 2021-09-02T09:00:00
12 Alabama Devil 2.99 PG-13 2021-09-02T10:00:00
13 Aladdin Calendar 4.99 NC-17 2021-09-02T11:00:00
14 Alamo Videotape 0.99 G 2021-09-02T12:00:00
15 Alaska Phantom 0.99 PG 2021-09-02T13:00:00
16 Date Speed 0.99 R 2021-09-02T14:00:00
17 Ali Forever 4.99 PG 2021-09-02T15:00:00
18 Alice Fantasia 0.99 NC-17 2021-09-02T16:00:00
19 Alien Center 2.99 NC-17 2021-09-02T17:00:00
Extra Table Metricsโ
distinct_table_rowsโ
(source code)โ
This metric computes the distinct number of rows in the given table
time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
distinct_rows = 10
Extra Column Metricsโ
regex_match_expression
is resolved at runtime depending on the database in use.
match_regexโ
(source code)โ
Determines the count of values in a given column that matches the specified regex. Suppose we want to check if the rating column matches a specific regular expression pattern and we define it in our dbt_project.yml file.
regex
must be specified for this metric to work else a compiler exception would be raised.
vars:
re_data:monitored:
- tables:
- name: sample_table
time_filter: created_at
metrics:
column:
rating:
- match_regex:
regex: ([0-9]+)
select coalesce(
sum(
case when {{ regex_match_expression('rating', '([0-9]+)') }}
then 1
else 0
end
), 0
) from your_table
where created_at between time_window_start and time_window_end
------------------------------------------------------------------------
1 Chamber Italian 4.99 NC-17 2021-09-01T11:00:00
4 Bright Encounters 4.99 PG-13 2021-09-01T09:00:00
5 Academy Dinosaur 0.99 PG-13 2021-09-01T08:00:00
7 Adaptation Holes 2.99 NC-17 2021-09-01T11:00:00
match_regex = 4 where created_at is between 2021-09-01T00:00:00 and 2021-09-02T00:00:00
match_regex_percentโ
(source code)โ
Determines the percentage of values in a given column that matches the specified regex.
Suppose we use the same configuration for the match_regex metric above, we have
match_regex_percent = 40 where created_at is between 2021-09-01T00:00:00 and 2021-09-02T00:00:00
not_match_regexโ
(source code)โ
Determines the count of values in a given column that does not match the specified regex.
Suppose we pass in ([0-9]+) as our regex parameter,
not_match_regex = 6 where created_at is between 2021-09-01T00:00:00 and 2021-09-02T00:00:00
distinct_valuesโ
(source code)โ
Determines the count of values in a given column that are unique.
rating count
-----------------
PG-13 2
G 3
NC-17 2
PG 1
R 2
time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
distinct_values = 5. (PG)
duplicate_valuesโ
(source code)โ
Determines the count of values in a given column that are duplicated.
rating count
-----------------
PG-13 2
G 3
NC-17 2
PG 1
R 2
time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
duplicate_values = 4. (PG-13, G, NC-17, R)
duplicate_rowsโ
(source code)โ
Determines the count of rows in a given column that have values which are duplicates.
rating count
-----------------
PG-13 2
G 3
NC-17 2
PG 1
R 2
time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
duplicate_count = 9. (PG-13[2], G[3], NC-17[2], R[2])
unique_rowsโ
(source code)โ
Determines the count of rows in a given column that have values which are unique.
rating count
-----------------
PG-13 2
G 3
NC-17 2
PG 1
R 2
time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
distinct_count = 1 (PG)
approx_distinct_valuesโ
(source code)โ
Determines the approximate distinct count of values in a given column. This metric is useful in large tables where an approximation is sufficient and query performance is required.
Note: Postgres does not support for approximate count of distinct values unlike bigquery, snowflake and redshift.