Monday, 11 August 2025

Approximation of the most frequent values

 Syntax: 

APPROX_TOP_K( expr, k, counters )

Arg1: the column name for which you want to find the most common values.

Arg2: if you want to see the top 10 most common values, then set k to 10

Arg3: Max no.of distinct values that can be tracked

 

Normal method to get most frequent value from the table,

    SELECT customer_id, COUNT(customer_id) FROM table_name GROUP BY 1;

Using estimation function,

    SELECT APPROX_TOP_K(customer_id,5,20) FROM table_name;

Benchmark: For 150,000,000 rows

  •     Normal method took, 12s
  •     Estimation function took, 5.6s

Also check for:

APPROX_TOP_K_ACCUMULATE , 
APPROX_TOP_K_COMBINE, 
APPROX_TOP_K_ESTIMATE

No comments:

Snowflake - Cost Optimization

Reduce auto-suspend to 60 seconds Reduce virtual warehouse size Ensure minimum clusters are set to 1 Consolidate warehouses Separate warehou...