My DataFrame returns the following results as a String.
QueryResult{status='success', finalSuccess=true, parseSuccess=true, allRows=[{"cbcnt": 0}], signature={"cbcnt": "number"}, info=N1qlMetrics{resultCount=1, errorCount=0, warningCount=0, mutationCount=0, sortCount=0, resultSize=11, elapsedTime='5.080179ms', executionTime='4.931124ms'}, profileInfo={}, errors=[], requestId='754d19f6-7ec1-4609-bf2a-54214d06c57c', clientContextId='542bc4c8-1a56-4afb-8c2f-63d81e681cb4'}
QueryResult{status='success', finalSuccess=true, parseSuccess=true, allRows=[{"cbcnt": "2021-07-30T00:00:00-04:00"}], signature={"cbcnt": "String"}, info=N1qlMetrics{resultCount=1, errorCount=0, warningCount=0, mutationCount=0, sortCount=0, resultSize=11, elapsedTime='5.080179ms', executionTime='4.931124ms'}, profileInfo={}, errors=[], requestId='754d19f6-7ec1-4609-bf2a-54214d06c57c', clientContextId='542bc4c8-1a56-4afb-8c2f-63d81e681cb4'}
I only want:
"cbcnt":0   <-- the numeric part of this
Expected output:
col
----
0
20210730
What I tried:
.withColumn("CbRes", regexp_extract($"Col", """"cbcnt":(\S*\d+)""", 1))
Output:
col
----
0
"2021-07-30 00:00:00   <-- this additional value is also coming through
Answer from a uj5u.com user:
Extract it with a regex:
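// Triple quotes let the sample keep its inner double quotes without escaping.
val value = """QueryResult{status='success', finalSuccess=true, parseSuccess=true, allRows=[{"cbcnt": 0}], signature={"cbcnt": "number"}, info=N1qlMetrics{resultCount=1, errorCount=0, warningCount=0, mutationCount=0, sortCount=0, resultSize=11, elapsedTime='5.080179ms', executionTime='4.931124ms'}, profileInfo={}, errors=[], requestId='754d19f6-7ec1-4609-bf2a-54214d06c57c', clientContextId='542bc4c8-1a56-4afb-8c2f-63d81e681cb4'}"""
// \s* tolerates the space after the colon; unanchored lets the pattern match anywhere in the string.
val regex = """"cbcnt":\s*(\d+)""".r.unanchored
// Pattern matching on the s interpolator requires Scala 2.13 or later.
val s"${regex(result)}" = value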
println(result)
Output:
0
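As a quick way to see what a digits-only pattern does with both sample rows from the question, here is a minimal sketch in plain Python using the standard `re` module (no Spark involved). The rows are shortened copies of the question's strings, and the helper name `extract_digits` is made up for illustration. It returns an empty string when nothing matches, which is how Spark's `regexp_extract` behaves for a non-matching row.

```python
import re

# Shortened copies of the two rows from the question (only the relevant fields kept).
rows = [
    'QueryResult{status=\'success\', allRows=[{"cbcnt": 0}], signature={"cbcnt": "number"}}',
    'QueryResult{status=\'success\', allRows=[{"cbcnt": "2021-07-30T00:00:00-04:00"}], signature={"cbcnt": "String"}}',
]

# Digits-only capture group; \s* tolerates the space after the colon.
pattern = re.compile(r'"cbcnt":\s*(\d+)')


def extract_digits(text):
    """Return the first captured run of digits, or '' when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else ""


results = [extract_digits(row) for row in rows]
print(results)  # ['0', '']
```

The second row yields an empty string because its value starts with a double quote rather than a digit, so this pattern alone covers only the numeric row.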
Answer from a uj5u.com user:
Using the PySpark function regexp_extract:
from pyspark.sql import functions as F
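# df = <a DataFrame with one column "text" that contains the input data>
# Raw string so \d and \s reach the regex engine unchanged; \s* tolerates the space after the colon.
df.withColumn("col", F.regexp_extract("text", r'"cbcnt":\s*(\d+)', 1)).show()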
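The expected output in the question also lists 20210730 for the timestamp row. One possible way to cover both rows is sketched below in plain Python so that it can be run without a cluster. The pattern, the helper name `cbcnt_value`, and the idea of dropping the dashes from the date are assumptions for illustration, not something confirmed by the question or the answers.

```python
import re

# Optional opening quote, then either a yyyy-mm-dd date or a plain run of digits.
PATTERN = re.compile(r'"cbcnt":\s*"?(\d{4}-\d{2}-\d{2}|\d+)')


def cbcnt_value(text):
    """Return the number, or the date with its dashes removed, or '' when nothing matches."""
    match = PATTERN.search(text)
    if match is None:
        return ""
    return match.group(1).replace("-", "")


# Shortened copies of the two rows from the question.
number_row = 'allRows=[{"cbcnt": 0}], signature={"cbcnt": "number"}'
date_row = 'allRows=[{"cbcnt": "2021-07-30T00:00:00-04:00"}], signature={"cbcnt": "String"}'

print(cbcnt_value(number_row))  # 0
print(cbcnt_value(date_row))    # 20210730
```

In PySpark the same pattern could presumably be passed to `F.regexp_extract`, with `F.regexp_replace` applied to the result to strip the dashes. That combination has not been run against the original data here.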
