I have this table, and I am trying to sort it so that the values in the "Sex" column alternate. Here is the table in question:
+--------------------+------+---+
|       Employee_Name|salary|Sex|
+--------------------+------+---+
|  Adinolfi, Wilson K| 62506| M |
|Ait Sidi, Karthik...|104437| M |
|   Akinkuolie, Sarah| 64955|  F|
|        Alagbe,Trina| 64991|  F|
|    Anderson, Carol | 50825|  F|
|    Anderson, Linda | 57568|  F|
|     Andreola, Colby| 95660|  F|
|         Athwal, Sam| 59365| M |
|    Bachiochi, Linda| 47837|  F|
|  Bacong, Alejandro | 50178| M |
| Baczenski, Rachael | 54670|  F|
|     Barbara, Thomas| 47211| M |
|    Barbossa, Hector| 92328| M |
| Barone, Francesco A| 58709| M |
|       Barton, Nader| 52505| M |
|       Bates, Norman| 57834| M |
|     Beak, Kimberly | 70131|  F|
| Beatrice, Courtney | 59026|  F|
|       Becker, Renee|110000|  F|
|       Becker, Scott| 53250| M |
+--------------------+------+---+
The task I was given is to write a statement that produces output like this:
+---+-------+
|sex|EMpName|
+---+-------+
| M |Kevin  |
| F |Carol  |
| M |Josh   |
| F |Linda  |
| M |Sam    |
| F |Sam    |
+---+-------+
Please help. Any hints or concepts would be greatly appreciated.
A uj5u.com user replied:
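You need to add an "ordering" column to get the expected result. Here is a solution using row_number: number the rows within each sex, then sort by that number so that rows sharing the same number end up next to each other.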
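from pyspark.sql import functions as F, Window

# assuming df is your dataframe
# Number the rows within each Sex group. The constant sort key means the
# order inside a group is arbitrary; use a real column if it matters.
w = Window.partitionBy("Sex").orderBy(F.lit(1))
(
    df.withColumn("ordering", F.row_number().over(w))
    # Sex descending puts "M" before "F", as in the expected output
    .orderBy("ordering", F.col("Sex").desc())
    .drop("ordering")
    .show()
)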
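To see why the row-number trick produces an alternating order, the same idea can be sketched in plain Python without a Spark session. This is only an illustration under a few assumptions: the helper name `interleave_by_sex` is hypothetical, the sample rows are a handful taken from the question's table with surrounding whitespace stripped from the Sex values, and "M" is placed first within each pair to match the expected output in the question.

```python
from collections import defaultdict

# A few (Employee_Name, Sex) rows from the question's table.
# Assumption: whitespace around the Sex values has been stripped.
rows = [
    ("Adinolfi, Wilson K", "M"),
    ("Akinkuolie, Sarah", "F"),
    ("Alagbe,Trina", "F"),
    ("Athwal, Sam", "M"),
    ("Bachiochi, Linda", "F"),
]


def interleave_by_sex(rows):
    """Mimic row_number() per Sex partition, then sort by (row_number, Sex)."""
    counters = defaultdict(int)
    numbered = []
    for name, sex in rows:
        counters[sex] += 1  # running row number inside this Sex group
        numbered.append((counters[sex], sex, name))
    # Sort by row number first; for equal numbers, "M" goes before "F".
    numbered.sort(key=lambda t: (t[0], t[1] != "M"))
    return [(sex, name) for _, sex, name in numbered]


result = interleave_by_sex(rows)
for sex, name in result:
    print(sex, name)
```

Once one group runs out of rows, the remaining rows of the other group simply follow one after another, since there is nothing left to alternate with. The Spark version behaves the same way.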
