Base metrics
Base metrics are computed for all monitored tables. If you would rather not compute some of them, you can change the list of base metrics through the re_data:metrics_base variable.
| row | title | rental_rate | rating | created_at |
|---|---|---|---|---|
| 1 | Chamber Italian | 4.99 | NC-17 | 2021-09-01T11:00:00 |
| 2 | Grosse Wonderful | 4.99 | R | 2021-09-01T12:00:00 |
| 3 | Airport Pollock | 4.99 | R | 2021-09-01T15:00:00 |
| 4 | Bright Encounters | 4.99 | PG-13 | 2021-09-01T09:00:00 |
| 5 | Academy Dinosaur | 0.99 | PG-13 | 2021-09-01T08:00:00 |
| 6 | Ace Goldfinger | 4.99 | G | 2021-09-01T10:00:00 |
| 7 | Adaptation Holes | 2.99 | NC-17 | 2021-09-01T11:00:00 |
| 8 | Affair Prejudice | 2.99 | G | 2021-09-01T19:00:00 |
| 9 | African Egg | 2.99 | G | 2021-09-01T20:00:00 |
| 10 | Agent Truman | 2.99 | PG | 2021-09-01T07:00:00 |
| 11 | Airplane Sierra | 4.99 | PG-13 | 2021-09-02T09:00:00 |
| 12 | Alabama Devil | 2.99 | PG-13 | 2021-09-02T10:00:00 |
| 13 | Aladdin Calendar | 4.99 | NC-17 | 2021-09-02T11:00:00 |
| 14 | Alamo Videotape | 0.99 | G | 2021-09-02T12:00:00 |
| 15 | Alaska Phantom | 0.99 | PG | 2021-09-02T13:00:00 |
| 16 | Date Speed | 0.99 | R | 2021-09-02T14:00:00 |
| 17 | Ali Forever | 4.99 | PG | 2021-09-02T15:00:00 |
| 18 | Alice Fantasia | 0.99 | NC-17 | 2021-09-02T16:00:00 |
| 19 | Alien Center | 2.99 | NC-17 | 2021-09-02T17:00:00 |
Below is a list of currently available metrics and how they are computed internally by re_data:

Base table level metrics

row_count
Number of rows added to the table in a specific time range.
row_count = 10 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
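
As a rough illustration of the half-open time window used in this example, the sketch below counts the sample rows in plain Python. It is not re_data's implementation (re_data computes metrics in SQL); the names `CREATED_AT` and `row_count` are hypothetical helpers introduced only for this example.

```python
from datetime import datetime

# created_at values for rows 1-19 of the sample table.
CREATED_AT = [
    "2021-09-01T11:00:00", "2021-09-01T12:00:00", "2021-09-01T15:00:00",
    "2021-09-01T09:00:00", "2021-09-01T08:00:00", "2021-09-01T10:00:00",
    "2021-09-01T11:00:00", "2021-09-01T19:00:00", "2021-09-01T20:00:00",
    "2021-09-01T07:00:00", "2021-09-02T09:00:00", "2021-09-02T10:00:00",
    "2021-09-02T11:00:00", "2021-09-02T12:00:00", "2021-09-02T13:00:00",
    "2021-09-02T14:00:00", "2021-09-02T15:00:00", "2021-09-02T16:00:00",
    "2021-09-02T17:00:00",
]


def row_count(timestamps, start, end):
    """Count rows whose timestamp lies in the half-open window [start, end)."""
    return sum(start <= datetime.fromisoformat(ts) < end for ts in timestamps)


first_day = row_count(CREATED_AT, datetime(2021, 9, 1), datetime(2021, 9, 2))
second_day = row_count(CREATED_AT, datetime(2021, 9, 2), datetime(2021, 9, 3))
print(first_day, second_day)
```

Because the window excludes its end point, a row stamped exactly 2021-09-02T00:00:00 would count toward the second day, not the first.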

freshness
Information about the latest record in a given time frame. Suppose we calculate the freshness metric for the table above and the time window [2021-09-01T00:00:00, 2021-09-02T00:00:00). The latest record in that time frame appears in row 9, with created_at=2021-09-01T20:00:00. freshness is the difference, in seconds, between the end of the time window and the latest record in the time frame. For this example, re_data would calculate freshness as:
2021-09-02T00:00:00 - 2021-09-01T20:00:00 = 14400
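
The same subtraction can be sketched in Python. This is only an illustration of the arithmetic, not re_data's own code; the `freshness` function is a hypothetical helper, and returning `None` for a window with no rows is an assumption of this sketch.

```python
from datetime import datetime

# created_at values for rows 1-10 of the sample table (all on 2021-09-01).
CREATED_AT = [
    "2021-09-01T11:00:00", "2021-09-01T12:00:00", "2021-09-01T15:00:00",
    "2021-09-01T09:00:00", "2021-09-01T08:00:00", "2021-09-01T10:00:00",
    "2021-09-01T11:00:00", "2021-09-01T19:00:00", "2021-09-01T20:00:00",
    "2021-09-01T07:00:00",
]


def freshness(timestamps, start, end):
    """Seconds between the window end and the latest record in [start, end)."""
    parsed = (datetime.fromisoformat(ts) for ts in timestamps)
    in_window = [t for t in parsed if start <= t < end]
    if not in_window:
        return None  # assumption: no rows in the window means no value
    return (end - max(in_window)).total_seconds()


seconds = freshness(CREATED_AT, datetime(2021, 9, 1), datetime(2021, 9, 2))
print(seconds)  # 4 hours expressed in seconds
```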

schema_changes
Information about schema changes in the monitored table. Stored separately from the rest of the metrics in the re_data_schema_changes model.
Caution: this metric differs from the rest. Because information about schema changes is gathered by comparing schemas between re_data runs, this metric does not filter changes to the specified time window and in fact does not use the time_window settings at all.
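
To make the idea of comparing schemas between runs more concrete, here is a minimal sketch that diffs two schema snapshots. Everything in it is hypothetical: the `diff_schemas` function, the snapshot dictionaries, and the change labels are illustrative only and do not describe how re_data stores or detects changes.

```python
def diff_schemas(previous, current):
    """Compare two {column_name: data_type} snapshots and list the changes."""
    changes = []
    # Columns present now but not in the previous snapshot.
    for column in sorted(current.keys() - previous.keys()):
        changes.append(("column_added", column, None, current[column]))
    # Columns that existed before but are gone now.
    for column in sorted(previous.keys() - current.keys()):
        changes.append(("column_removed", column, previous[column], None))
    # Columns present in both snapshots whose type differs.
    for column in sorted(previous.keys() & current.keys()):
        if previous[column] != current[column]:
            changes.append(
                ("type_changed", column, previous[column], current[column])
            )
    return changes


# Hypothetical snapshots taken on two consecutive runs.
previous_run = {"title": "text", "rental_rate": "numeric", "rating": "text"}
current_run = {
    "title": "text",
    "rental_rate": "double precision",
    "created_at": "timestamp",
}

changes = diff_schemas(previous_run, current_run)
for change in changes:
    print(change)
```

Note that nothing in this comparison depends on a time window: only the two snapshots matter, which is why the metric ignores the time_window settings.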

Base column level metrics

min
Minimum value appearing in a given numeric column.
min(rental_rate) = 0.99 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

max
Maximum value appearing in a given numeric column.
max(rental_rate) = 4.99 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

avg
Average of all values appearing in a given numeric column.
avg(rental_rate) = 3.79 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

stddev
The standard deviation of all values appearing in a given numeric column.
stddev(rental_rate) = 1.3984117975602022 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

variance
The variance of all values appearing in a given numeric column.
variance(rental_rate) = 1.9555555555555557 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
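
The five numeric examples above can be checked with Python's standard `statistics` module. This is a verification sketch, not re_data's implementation. One observation from the numbers: the documented stddev and variance match the sample estimators (dividing by n - 1); the population variance of the same ten values would be 1.76.

```python
import statistics

# rental_rate values for rows 1-10 of the sample table,
# i.e. the rows with created_at inside the 2021-09-01 window.
RENTAL_RATES = [4.99, 4.99, 4.99, 4.99, 0.99, 4.99, 2.99, 2.99, 2.99, 2.99]

metrics = {
    "min": min(RENTAL_RATES),
    "max": max(RENTAL_RATES),
    "avg": statistics.mean(RENTAL_RATES),
    # stdev/variance are the sample (n - 1) estimators;
    # pstdev/pvariance would give the population versions.
    "stddev": statistics.stdev(RENTAL_RATES),
    "variance": statistics.variance(RENTAL_RATES),
}

for name, value in metrics.items():
    print(f"{name}(rental_rate) = {value}")
```

Floating-point rounding may change the last printed digit or two, so compare against the documented values with a small tolerance.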

min_length
Minimum length of all strings appearing in a given column.
min_length(rating) = 1 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

max_length
Maximum length of all strings appearing in a given column.
max_length(rating) = 5 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

avg_length
The average length of all strings appearing in a given column.
avg_length(rating) = 2.7 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
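
The string-length metrics can be reproduced from the rating column of the sample table. The snippet below is a plain-Python sketch, not re_data's SQL; `RATINGS` and `length_metrics` are names chosen for this example.

```python
# rating values for rows 1-10 of the sample table (the 2021-09-01 window).
RATINGS = ["NC-17", "R", "R", "PG-13", "PG-13", "G", "NC-17", "G", "G", "PG"]

# Length of each string, in characters.
lengths = [len(rating) for rating in RATINGS]

length_metrics = {
    "min_length": min(lengths),
    "max_length": max(lengths),
    "avg_length": sum(lengths) / len(lengths),
}

for name, value in length_metrics.items():
    print(f"{name}(rating) = {value}")
```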

nulls_count
Number of nulls in a given column.
nulls_count(rating) = 0 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

missing_count
Number of nulls and empty string values in a given column for the specific time range.
missing_count(rating) = 0 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

missing_percent
Percentage of nulls and empty string values in a given column for the specific time range.
missing_percent(rating) = 0 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00

nulls_percent
Percentage of null values in a given column for the specific time range.
nulls_percent(rating) = 0 where time window is >= 2021-09-01T00:00:00 and < 2021-09-02T00:00:00
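
The difference between the "nulls" and "missing" metrics only shows up when a column contains empty strings, which the sample table does not. The sketch below computes all four metrics and then applies them to a small made-up column. It rests on assumptions: `null_and_missing_metrics` is a hypothetical helper, percentages are expressed on a 0-100 scale, and an empty input yields `None` for the percentages; none of this is confirmed as re_data's behavior.

```python
def null_and_missing_metrics(values):
    """Count nulls (None) and missing values (None or empty string)."""
    total = len(values)
    nulls = sum(value is None for value in values)
    missing = sum(value is None or value == "" for value in values)
    return {
        "nulls_count": nulls,
        "missing_count": missing,
        # Assumption: percentages use a 0-100 scale; None when there are no rows.
        "nulls_percent": 100 * nulls / total if total else None,
        "missing_percent": 100 * missing / total if total else None,
    }


# rating values for rows 1-10 of the sample table: no nulls, no empty strings.
RATINGS = ["NC-17", "R", "R", "PG-13", "PG-13", "G", "NC-17", "G", "G", "PG"]
sample = null_and_missing_metrics(RATINGS)

# Made-up column (not from the sample table) with one null and one empty string.
made_up = null_and_missing_metrics(["G", None, "", "PG"])

print(sample)
print(made_up)
```

In the made-up column the empty string counts as missing but not as null, so missing_count is 2 while nulls_count is 1.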