I am using the code below to collect some information:
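from pyspark.sql.functions import col, date_format, date_trunc

df = (
    df
    .select(
        # Truncate reference_date to the first day of its month, formatted as yyyy-MM-dd
        date_format(date_trunc('month', col("reference_date")), 'yyyy-MM-dd').alias("month"),
        col("id"),
        col("name"),
        col("item_type"),
        col("sub_group"),
        col("latitude"),
        col("longitude")
    )
)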
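My latitude and longitude values use a dot as the decimal separator, for example: -30.130307 -51.2060018. I need to replace the dot with a comma. I have already tried .replace() and .regexp_replace(), but neither of them worked. Can you help me?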
Reply from a uj5u.com user:
Take the following DataFrame as an example.
df.show()
+-------------------+-------------------+
|           latitude|          longitude|
+-------------------+-------------------+
|  85.70708380916193| -68.05674981929877|
| 57.074495803252404|-42.648691976080215|
|  2.944303748172473| -62.66186439333423|
| 119.76923402031701|-114.41179457810185|
|-138.52573939229234|  54.38429596238362|
+-------------------+-------------------+
You should be able to use the regexp_replace function from pyspark.sql.functions:
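from pyspark.sql import functions

# [.] is a character class that matches only a literal dot (a bare . would match any character)
df = df.withColumn("longitude", functions.regexp_replace("longitude", r"[.]", ","))
df = df.withColumn("latitude", functions.regexp_replace("latitude", r"[.]", ","))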
df.show()
+-------------------+-------------------+
|           latitude|          longitude|
+-------------------+-------------------+
|  85,70708380916193| -68,05674981929877|
| 57,074495803252404|-42,648691976080215|
|  2,944303748172473| -62,66186439333423|
| 119,76923402031701|-114,41179457810185|
|-138,52573939229234|  54,38429596238362|
+-------------------+-------------------+
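The question does not show the attempt that failed, so the cause is a guess. One common pitfall is passing an unescaped dot as the pattern, since in a regular expression a bare `.` matches any character. The sketch below shows the difference using Python's own `re` module rather than Spark, so that it runs without a cluster. Spark's regexp_replace uses Java regular expressions, but the dot metacharacter behaves the same way in this respect. The helper name `dot_to_comma` is hypothetical and used only for illustration.

```python
import re


def dot_to_comma(value: float) -> str:
    """Render a number as text and swap the decimal point for a comma."""
    # [.] is a character class containing only a literal dot,
    # so nothing except the decimal point is replaced.
    return re.sub(r"[.]", ",", str(value))


sample = "-30.130307"

# A bare "." matches every character, so the whole string turns into commas.
unescaped = re.sub(r".", ",", sample)

# The character class touches only the decimal point.
escaped = dot_to_comma(-30.130307)

print(unescaped)
print(escaped)
```

Note that after the replacement the values are text, not numbers, which is also what the regexp_replace answer above produces: the resulting columns are strings.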
