This is the DataFrame I currently have; assume there are 4 days in total, {1,2,3,4}:
+---+----+-----+
|key|Time|Value|
+---+----+-----+
|1  |1   |1    |
|1  |2   |2    |
|1  |4   |3    |
|2  |2   |4    |
|2  |3   |5    |
+---+----+-----+
What I want is one row per key for every day, with a null Value where no data exists:
+---+----+-----+
|key|Time|Value|
+---+----+-----+
|1  |1   |1    |
|1  |2   |2    |
|1  |3   |null |
|1  |4   |3    |
|2  |1   |null |
|2  |2   |4    |
|2  |3   |5    |
|2  |4   |null |
+---+----+-----+
Is there a way to get this result?
Answer from a uj5u.com user:
Say df1 is our main table:
+---+----+-----+
|key|Time|Value|
+---+----+-----+
|1  |1   |1    |
|1  |2   |2    |
|1  |4   |3    |
|2  |2   |4    |
|2  |3   |5    |
+---+----+-----+
We can use the following transformation:
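import org.apache.spark.sql.functions.{col, explode, lit, sequence}

val data = df1
  // group by key and attach the full sequence of days 1 to 4 (the range given in the question)
  .groupBy("key")
  .agg(sequence(lit(1), lit(4)).as("Time"))
  // explode the sequence so that each key gets one row per Time
  .withColumn("Time", explode(col("Time")))
  // left join back to the main table on key and Time; missing combinations get a null Value
  .join(df1, Seq("key", "Time"), "left")
  // sort by key and Time, since row order after a join is not guaranteed
  .orderBy("key", "Time")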
This gives the following output:
+---+----+-----+
|key|Time|Value|
+---+----+-----+
|1  |1   |1    |
|1  |2   |2    |
|1  |3   |null |
|1  |4   |3    |
|2  |1   |null |
|2  |2   |4    |
|2  |3   |5    |
|2  |4   |null |
+---+----+-----+
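For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea written a slightly different way: build every (key, Time) pair with a cross join, then left join the original rows back in. The object name FillMissingTimes, the helper fill, and its parameters firstDay and lastDay are made up for this illustration and do not appear in the original answer. The local SparkSession settings are likewise only an assumption for running it on one machine.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object FillMissingTimes {

  // Hypothetical helper: returns one row per (key, Time) for every day in
  // firstDay..lastDay, with a null Value where df has no matching row.
  def fill(df: DataFrame, firstDay: Int, lastDay: Int): DataFrame = {
    // One row per distinct key
    val keys = df.select("key").distinct()

    // One row per day; range's upper bound is exclusive, hence lastDay + 1.
    // The id column is a Long, so cast it to match the Int Time column.
    val times = df.sparkSession
      .range(firstDay.toLong, lastDay.toLong + 1)
      .select(col("id").cast("int").as("Time"))

    keys.crossJoin(times)                    // every key paired with every day
      .join(df, Seq("key", "Time"), "left")  // keep all pairs, attach Value if present
      .orderBy("key", "Time")                // make the row order deterministic
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed setup)
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fill-missing-times")
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the question
    val df1 = Seq((1, 1, 1), (1, 2, 2), (1, 4, 3), (2, 2, 4), (2, 3, 5))
      .toDF("key", "Time", "Value")

    fill(df1, 1, 4).show()
    spark.stop()
  }
}
```

The bounds 1 and 4 are hard-coded here because the question states them. If the range of days is not known in advance, it could instead be computed from the data, for example with min and max over the Time column.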
This should be what you are looking for. Good luck!
Tags: scala, apache-spark