我在 Apache Spark 中創建了一個查詢,希望將多行客戶資料匯總為一行,顯示他們打開的產品型別。所以資料看起來像這樣:
Customer Product
1 Savings
1 Checking
1 Auto
最終看起來像這樣:
Customer Product
1 Savings/Checking/Auto
查詢當前仍有多行。我試過 group by,但這并沒有顯示客戶擁有的多種產品,而是只顯示一種產品。
有沒有辦法做到這一點是 Apache Spark 或 SQL(與 apache 非常相似)?不幸的是,我沒有 MYSQL,也不認為它會為我安裝它。
SELECT
"ACCOUNT"."account_customerkey" AS "account_customerkey",
max(
concat(case when Savings=1 then ' Savings'end,
case when Checking=1 then ' Checking 'end,
case when CD=1 then ' CD /'end,
case when IRA=1 then ' IRA /'end,
case when StandardLoan=1 then ' SL /'end,
case when Auto=1 then ' Auto /'end,
case when Mortgage=1 then ' Mortgage /'end,
case when CreditCard=1 then ' CreditCard 'end)) AS Description
FROM "ACCOUNT" "ACCOUNT"
inner join (
SELECT
"ACCOUNT"."account_customerkey" AS "customerkey",
CASE WHEN "ACCOUNT"."account_producttype" = 'Savings' THEN 1 ELSE NULL END AS Savings,
CASE WHEN "ACCOUNT"."account_producttype" = 'Checking' THEN 1 ELSE NULL END AS Checking,
CASE WHEN "ACCOUNT"."account_producttype" = 'CD' THEN 1 ELSE NULL END AS CD,
CASE WHEN "ACCOUNT"."account_producttype" = 'IRA' THEN 1 ELSE NULL END AS IRA,
CASE WHEN "ACCOUNT"."account_producttype" = 'Standard Loan' THEN 1 ELSE NULL END AS StandardLoan,
CASE WHEN "ACCOUNT"."account_producttype" = 'Auto' THEN 1 ELSE NULL END AS Auto,
CASE WHEN "ACCOUNT"."account_producttype" = 'Mortgage' THEN 1 ELSE NULL END AS Mortgage,
CASE WHEN "ACCOUNT"."account_producttype" = 'Credit Card' THEN 1 ELSE NULL END AS CreditCard
FROM "ACCOUNT" "ACCOUNT"
)a on "account_customerkey" =a."customerkey"
GROUP BY
"ACCOUNT"."account_customerkey"
uj5u.com熱心網友回復:
請試試這個。
scala> df.show()
-------- --------
|Customer| Product|
-------- --------
| 1| Savings|
| 1|Checking|
| 1| Auto|
| 2| Savings|
| 2| Auto|
| 3|Checking|
-------- --------
scala> df.groupBy($"Customer").agg(collect_list($"Product").as("Product")).select($"Customer",concat_ws(",",$"Product").as("Product")).show(false)
-------- ---------------------
|Customer|Product |
-------- ---------------------
|1 |Savings,Checking,Auto|
|3 |Checking |
|2 |Savings,Auto |
-------- ---------------------
scala>
uj5u.com熱心網友回復:
請參閱https://docs.microsoft.com/en-us/azure/databricks/sql/language-manual/functions/collect_list 和相關函式
您需要使用collect_list可用于 SQL 或 %sql 的哪個。
%sql
select id, collect_list(num)
from t1
group by id
我用的是我自己的資料,你需要量身定做。只是以更原生的 SQL 形式進行演示。
轉載請註明出處,本文鏈接:https://www.uj5u.com/qukuanlian/363125.html
