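Split the strings in a DataFrame column into lists of fixed-length substrings?
Here is a DataFrame. I need to split each string in the column into chunks of 10 characters, stored as a list.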
| contact_num |
| -------------------------------|
| 01111784885788634878 |
| 247782788869775178889785427889 |
| not available |
| 2478544756 |
Expected output:
| contact_num |
| ----------------------------------|
| [0111178488,5788634878] |
| [2477827888,6977517888,9785427889]|
| not available |
| [2478544756] |
A helpful uj5u.com user replied:
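Try:
# select only the rows that consist entirely of digits (at least 10 of them)
mask = df["contact_num"].str.contains(r"^\d{10,}$", regex=True)
# replace those rows with the list of consecutive 10-digit groups
df.loc[mask, "contact_num"] = df.loc[mask, "contact_num"].str.findall(r"\d{10}")
print(df)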
Prints:
contact_num
0 [0111178488, 5788634878]
1 [2477827888, 6977517888, 9785427889]
2 not available
3 [2478544756]
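One detail of the regex approach is worth checking against your data: `\d{10}` matches only complete 10-digit groups, so a number whose length is not a multiple of 10 loses its trailing digits. pandas' `str.findall` applies `re.findall` to each element, so the behavior can be illustrated with the standard `re` module alone. This is a minimal sketch, assuming you would want to keep the shorter final chunk; the 14-digit value is made up for illustration and does not come from the question.

```python
import re

# Hypothetical 14-digit value (not from the question's data)
number = "24785447561234"

# Only complete 10-digit groups are matched; the trailing "1234" is dropped
full_only = re.findall(r"\d{10}", number)

# \d{1,10} is greedy: it takes 10 digits at a time and keeps the shorter remainder
with_tail = re.findall(r"\d{1,10}", number)

print(full_only)
print(with_tail)
```

If keeping the remainder is what you need, the same pattern `r"\d{1,10}"` can be passed to `str.findall` in the answer above.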
A helpful uj5u.com user replied:
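You can use pandas `apply` with a custom function. This is just to keep explicit control over what happens to each value; the same thing can be done in a more Pythonic, less verbose way.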
import pandas as pd

# sample data as a list of dicts
data = [
    {"contact_num": "01111784885788634878"},
    {"contact_num": "247782788869775178889785427889"},
    {"contact_num": "not available"},
    {"contact_num": "2478544756"}
]
df = pd.DataFrame(data)

def split_func(row):
    if row.contact_num.isnumeric():  # check if current value is numeric
        # slice the string into consecutive 10-character pieces
        return [row.contact_num[i:i + 10] for i in range(0, len(row.contact_num), 10)]
    return row.contact_num  # if not numeric, return current value unchanged

df["contact_num"] = df.apply(split_func, axis=1)  # apply function to each row
print(df)
The output will be:
contact_num
0 [0111178488, 5788634878]
1 [2477827888, 6977517888, 9785427889]
2 not available
3 [2478544756]
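The less verbose variant mentioned in this answer is not spelled out. One possible sketch, assuming the values are plain Python strings, is to map a small helper over the column with `Series.map` instead of applying a function to whole rows. The helper name `split_chunks` and its `size` parameter are hypothetical, chosen here for illustration.

```python
import pandas as pd

def split_chunks(value, size=10):
    """Split an all-numeric string into pieces of `size` characters.

    Anything else (text such as "not available", or a missing value)
    is returned unchanged.
    """
    if isinstance(value, str) and value.isnumeric():
        return [value[i:i + size] for i in range(0, len(value), size)]
    return value

df = pd.DataFrame({
    "contact_num": [
        "01111784885788634878",
        "247782788869775178889785427889",
        "not available",
        "2478544756",
    ]
})

# Series.map calls the helper once per value, so no axis=1 row handling is needed
df["contact_num"] = df["contact_num"].map(split_chunks)
print(df)
```

Unlike the `\d{10}` regex, slicing keeps a shorter final piece when the length is not a multiple of 10, which may or may not be what you want.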
