我有一個 2 列和 N 行的表格,如下所示,
email | phone_num_list
----------------------
'a' | {"1", "2", "3"}
'a' | {"1", "4"}
'c' | {"5", "1", "6"}
'd' | {"3", "7", "1"}
wherephone_num_list是陣列型別。
我的要求是獲取最常用的電話號碼及其分數,其中分數=number of distinct email associated with the phone_num * total frequency of phone_num
因此,對于上面的示例,與 phone_num 關聯的不同電子郵件1是["a", "c", "d"]。
因此,phone_num 的分數1是3(i.e total distinct emails) * 4(i.e total frequency of 1)
所有 phone_num 的分數計算如下 -
phone_num | distinct emails associated | freq of phone_num | final_score
-------------------------------------------------------------------------
1 | ["a", "c", "d"] | 4 | 4*3 = 12
2 | ["a"] | 1 | 1*1 = 1
3 | ["a", "d"] | 2 | 2*2 = 4
4 | ["a"] | 1 | 1*1 = 1
5 | ["c"] | 1 | 1*1 = 1
6 | ["c"] | 1 | 1*1 = 1
7 | ["d"] | 1 | 1*1 = 1
我想要的輸出是 ->
phone | score
-------------
1 | 12
3 | 4
2 | 1
4 | 1
5 | 1
6 | 1
7 | 1
請幫助我在 PostgreSql 中查詢。謝謝
uj5u.com熱心網友回復:
準備測驗用例:
create temporary table t (email text, phone_num_list text[]);
insert into t(email, phone_num_list) values
('a', '{"1", "2", "3"}'),
('a', '{"1", "4"}'),
('c', '{"5", "1", "6"}'),
('d', '{"3", "7", "1"}');
將表格“標準化”為ntCTE,然后計算頻率乘以每個電話號碼的不同電子郵件數量。
with nt as (select email, unnest(phone_num_list) as phone from t)
select phone, count(*) * count(distinct email) as score
from nt group by phone order by score desc;
phone|score|
----- -----
1 | 12|
3 | 4|
5 | 1|
6 | 1|
7 | 1|
4 | 1|
2 | 1|
uj5u.com熱心網友回復:
您可以使用“unnest”功能。
select phone_number,
count(phone_number) * count(distinct email) as score
from
(
select email, unnest(phone_num_list) as phone_number
from t1
)z
group by 1
order by 2 desc
db-fiddle在這里:
轉載請註明出處,本文鏈接:https://www.uj5u.com/qukuanlian/344752.html
標籤:sql PostgreSQL
