I have a DataFrame with the following content:
scala> true_nomar.show(1)
+--------+--------------+--------------------+------+------+--------------------+
|category|topicUpPredict|               topic|ciTrue|upTrue|              normal|
+--------+--------------+--------------------+------+------+--------------------+
|the_thao|      the_thao|[the_thao, the_gioi]|  true|  true| Khi các m?c s? m...|
+--------+--------------+--------------------+------+------+--------------------+
only showing top 1 row
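But when I show it without truncation, the normal column does not appear to contain the full text, and the other columns show no content: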
scala> true_nomar.show(1,false)
-------- -------------- -------------------- ------ ------ --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|category|topicUpPredict|topic |ciTrue|upTrue|normal |
-------- -------------- -------------------- ------ ------ --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Thích thú tr??c hai v? h?c trò ??c bi?t này, ?ng Eriksson nói: "Bóng ?á c?n nhi?u ng??i nh? là hai v? m?c s? Charles và Tim ?? t?o cho tr? em th?t nhi?u c? h?i ??n v?i bóng ?á”. Th?m chí Geoff Hurst, c?u ng?i sa|?i l?i, hai m?c s? Crosland và Smith cùng các con chiên s? c?u nguy?n cho ??i tuy?n Anh trong VCK World Cup 2006 mà tr??c m?t là c?u nguy?n cho ch?n th??ng c?a ti?n ??o Michael Owen s?m h?i ph?c.
-------- -------------- -------------------- ------ ------ --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
only showing top 1 row
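A uj5u.com user replied:
This is most likely caused by one or more carriage return (CR) characters (\r in Scala string literals) embedded somewhere in the text. When the terminal encounters a CR, it moves the caret back to the beginning of the line, which scrambles the output: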
scala> "123\r456"
4560: String = 123
Here the output should be res0: String = 123..., but the caret position is reset after 123, so 456 overwrites res. The same thing happens when a DataFrame is printed:
scala> Seq(("baz", "foofoofoo\rbarbar")).toDF("cat", "normal").show(false)
+---+----------------+
|cat|normal          |
+---+----------------+
barbar|ofoofoo
+---+----------------+
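The caret reset described above can be reproduced without a terminal. The following is a minimal sketch in plain Scala; the object CaretDemo and the function renderWithCR are hypothetical names introduced only for illustration, and the simulation handles CR alone (no other control characters).

```scala
object CaretDemo {
  // Simulates how a terminal draws one line of text that contains CRs:
  // each '\r' moves the caret to column 0, and later characters overwrite
  // whatever is already on the line instead of being appended.
  def renderWithCR(line: String): String = {
    val screen = new StringBuilder
    var caret = 0
    for (ch <- line) {
      if (ch == '\r') {
        caret = 0 // carriage return: back to the start of the line
      } else {
        if (caret < screen.length) screen.setCharAt(caret, ch) // overwrite
        else screen.append(ch)                                  // extend the line
        caret += 1
      }
    }
    screen.toString
  }

  def main(args: Array[String]): Unit = {
    // The REPL echo from the answer: "456" overwrites "res".
    println(renderWithCR("res0: String = 123\r456"))
    // The DataFrame row from the answer: "barbar|" overwrites "|baz|fo".
    println(renderWithCR("|baz|foofoofoo\rbarbar|"))
  }
}
```

Both calls reproduce the scrambled lines shown in the transcripts, which suggests the data itself is intact and only its rendering is affected.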
If you look closely at your output, you can find the closing | of the row, so the full text is there; it is only scrambled:
--------------------
--------------------
c?u ng?i sa|?i l?i,
--------------------
           ^
           ^
end of "normal" column
Use regexp_replace($"normal", "\r", "\\\\r") to replace every CR with its escaped representation, \r:
scala> val df = Seq(("baz", "foofoofoo\rbarbar")).toDF("cat", "normal")
df: org.apache.spark.sql.DataFrame = [cat: string, normal: string]
scala> df.show(false)
+---+----------------+
|cat|normal          |
+---+----------------+
barbar|ofoofoo
+---+----------------+
scala> df.withColumn("normal", regexp_replace($"normal", "\r", "\\\\r")).show(false)
+---+-----------------+
|cat|normal           |
+---+-----------------+
|baz|foofoofoo\rbarbar|
+---+-----------------+
Tags: Apache Spark
