Syntax:
APPROX_TOP_K( expr, k, counters )
Arg1: the column name for which you want to find the most common values.
Arg2: if you want to see the top 10 most common values, then set
k
to 10Arg3: Max no.of distinct values that can be tracked
Normal method to get most frequent value from the table,
SELECT customer_id, COUNT(customer_id) FROM table_name GROUP BY 1;
Using estimation function,
SELECT APPROX_TOP_K(customer_id,5,20) FROM table_name;
Benchmark: For 150,000,000 rows
- Normal method took, 12s
- Estimation function took, 5.6s
Also check for:
APPROX_TOP_K_ACCUMULATE ,
APPROX_TOP_K_COMBINE,
APPROX_TOP_K_ESTIMATE
No comments:
Post a Comment