A very common request is to get the probability distribution function of a discrete variable, in other words, to get the percentage of each value over the total. Snowflake has a built-in function called ratio_to_report() for this report. Let’s provide some examples based on Snowflake’s documentation.
Let’s create a temporary table for exhibition purposes:
create temporary table store_profit ( store_id integer, province varchar, profit numeric(11, 2)); insert into store_profit (store_id, province, profit) values (1, 'Ontario', 300), (2, 'Saskatchewan', 250), (3, 'Ontario', 450), (4, 'Ontario', null) -- hasn't opened yet, so no profit yet. ;
STORE_ID | PROVINCE | PROFIT |
1 | Ontario | 300 |
2 | Saskatchewan | 250 |
3 | Ontario | 450 |
4 | Ontario | NULL |
Let’s get the percentage of profit using the ratio_to_report function.
select store_id, profit, ratio_to_report(profit) over () as percent_profit from store_profit order by store_id;
STORE_ID | PROFIT | PERCENT_PROFIT |
1 | 300 | 0.3 |
2 | 250 | 0.25 |
3 | 450 | 0.45 |
4 | NULL | NULL |
Finally, let’s get the percentage of profit within each province generated by each store in that province:
select province, store_id, profit, ratio_to_report(profit) over (partition by province) as percent_profit from store_profit order by province, store_id;
PROVINCE | STORE_ID | PROFIT | PERCENT_PROFIT |
Ontario | 1 | 300 | 0.4 |
Ontario | 2 | 450 | 0.6 |
Ontario | 4 | NULL | NULL |
Saskatchewan | 2 | 250 | 1.0 |