I have some data in a DataFrame, as shown below:
+-------+---------+-----------+---------------------------------+
| Noun  | Pronoun | Adjective | Metadata                        |
+-------+---------+-----------+---------------------------------+
| Homer | Simpson | Engineer  | {"Age": "50", "Country": "USA"} |
| Elon  | Musk    | King      | {"Age": "45", "Country": "RSA"} |
| Bart  | Lee     | Cricketer | {"Age": "35", "Country": "AUS"} |
| Lisa  | Jobs    | Daughter  | {"Age": "35", "Country": "IND"} |
| Joe   | Root    | Player    | {"Age": "31", "Country": "ENG"} |
+-------+---------+-----------+---------------------------------+
I want to append the value of another column (say, Adjective) to the Metadata column, so that the final DataFrame looks like this:
+-------+---------+-----------+-----------------------------------------------------------+
| Noun  | Pronoun | Adjective | Metadata                                                  |
+-------+---------+-----------+-----------------------------------------------------------+
| Homer | Simpson | Engineer  | {"Age": "50", "Country": "USA", "Adjective": "Engineer"}  |
| Elon  | Musk    | King      | {"Age": "45", "Country": "RSA", "Adjective": "King"}      |
| Bart  | Lee     | Cricketer | {"Age": "35", "Country": "AUS", "Adjective": "Cricketer"} |
| Lisa  | Jobs    | Daughter  | {"Age": "35", "Country": "IND", "Adjective": "Daughter"}  |
| Joe   | Root    | Player    | {"Age": "31", "Country": "ENG", "Adjective": "Player"}    |
+-------+---------+-----------+-----------------------------------------------------------+
Please suggest how to implement this.
Reply from a uj5u.com user:
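Assuming your Metadata column contains JSON strings, you can first convert it to a MapType with the from_json function, then add the column you want with map_concat, and finally convert the result back to a JSON string with to_json: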
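import org.apache.spark.sql.functions.{col, from_json, lit, map, map_concat, to_json}

// Step 1: parse the JSON string in Metadata into a map<string,string>.
val df2 = df.withColumn(
  "Metadata",
  from_json(col("Metadata"), lit("map<string,string>"))
).withColumn(
  // Step 2: append the Adjective entry, then serialize the map back to JSON.
  "Metadata",
  to_json(map_concat(col("Metadata"), map(lit("Adjective"), col("Adjective"))))
)

df2.show(false)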
//+-----+-------+---------+----------------------------------------------------+
//|Noun |Pronoun|Adjective|Metadata                                            |
//+-----+-------+---------+----------------------------------------------------+
//|Homer|Simpson|Engineer |{"Age":"50","Country":"USA","Adjective":"Engineer"} |
//|Elon |Musk   |King     |{"Age":"45","Country":"RSA","Adjective":"King"}     |
//|Bart |Lee    |Cricketer|{"Age":"35","Country":"AUS","Adjective":"Cricketer"}|
//|Lisa |Jobs   |Daughter |{"Age":"35","Country":"IND","Adjective":"Daughter"} |
//|Joe  |Root   |Player   |{"Age":"31","Country":"ENG","Adjective":"Player"}   |
//+-----+-------+---------+----------------------------------------------------+
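This can also be done by converting to a StructType instead of a MapType, but a map is more general in this case, since it does not require knowing the JSON keys in advance.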
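For comparison, here is a minimal sketch of the StructType variant, assuming every Metadata value has exactly the keys Age and Country. The names metadataSchema, parsed and df3, the two sample rows, and the local SparkSession setup are illustrative assumptions and not part of the original answer. It is written as top-level statements for spark-shell or a script; wrap it in an object with a main method to compile it as an application.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json, struct, to_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Assumption: a local session for experimentation (spark-shell already provides `spark`).
val spark = SparkSession.builder().master("local[*]").appName("metadata-struct").getOrCreate()
import spark.implicits._

// Two sample rows taken from the question.
val df = Seq(
  ("Homer", "Simpson", "Engineer", """{"Age": "50", "Country": "USA"}"""),
  ("Elon", "Musk", "King", """{"Age": "45", "Country": "RSA"}""")
).toDF("Noun", "Pronoun", "Adjective", "Metadata")

// Unlike the map approach, the struct schema must list every expected key up front.
val metadataSchema = StructType(Seq(
  StructField("Age", StringType),
  StructField("Country", StringType)
))

val df3 = df
  // Parse the JSON string into a struct with fields Age and Country.
  .withColumn("parsed", from_json(col("Metadata"), metadataSchema))
  // Build a new struct with the extra Adjective field and serialize it back to JSON.
  .withColumn(
    "Metadata",
    to_json(struct(
      col("parsed.Age").as("Age"),
      col("parsed.Country").as("Country"),
      col("Adjective")
    ))
  )
  .drop("parsed")

df3.show(false)
```

Any key that is not listed in metadataSchema is silently dropped by from_json, which is the main reason the map version is the more general choice when the JSON keys vary from row to row.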
Please credit the source when reposting. Link to this article: https://www.uj5u.com/net/403043.html
