I have a pandas DataFrame that includes a structured column:
sequences
-------------
[(1838, 2038)]
[]
[]
[(809, 1090)]
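I need to loop over it row by row, so I wrote this loop:
for index, row in df.iterrows():
    true_anom_seq = json.loads(row['sequences'])
What I want is to build a nested list, [[1838, 2038], [], [], [809, 1090]], so that I can iterate over it. The problem is that my code raises an error: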
JSONDecodeError: Expecting value: line 1 column 2 (char 1)
I also tried printing row['sequences'][0], which gave me [, so each value is being read as a string.
How do I convert this string to a list?
A uj5u.com user replied:
import pandas as pd
import re
col = {'index': [1, 2, 3, 4], 'sequence': ['[(1838, 2038)]', '[]', '[]', '[(809, 1090)]']}
new_sequence = []
new_df = pd.DataFrame(col)
for index, row in new_df.iterrows():
    one_item = []
    # \d+ matches each run of digits in the string
    true_anom_seq = re.findall(r'\d+', row['sequence'])
    for match in true_anom_seq:
        one_item.append(int(match))  # convert the matched text to an integer
    new_sequence.append(one_item)
print(new_sequence)
A uj5u.com user replied:
Use ast.literal_eval to convert a string to a list, dict, and so on:
>>> from ast import literal_eval
>>> literal_eval('[1,2,3]')
[1, 2, 3]
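As a minimal sketch of why json.loads fails on the question's data while literal_eval succeeds: the sample string below is copied from the question, and its parentheses are Python tuple syntax rather than JSON, so the JSON parser gives up at the second character.

```python
import json
from ast import literal_eval

raw = '[(1838, 2038)]'  # one cell from the question's column, stored as text

# json.loads rejects the string because "(" is not valid JSON.
try:
    json.loads(raw)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# literal_eval parses Python literals, including tuples.
parsed = literal_eval(raw)

print(json_ok)  # False
print(parsed)   # [(1838, 2038)]
```

literal_eval only evaluates literals (strings, numbers, tuples, lists, dicts, sets, booleans, None), which makes it a safer choice than eval for text like this.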
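A uj5u.com user replied:
There is no need to iterate over the DataFrame itself or to use regular expressions. Just apply the literal_eval function to every row of the sequence column and wrap the result in a list: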
from ast import literal_eval
import pandas as pd
col = {'index': [1, 2, 3, 4], 'sequence': ['[(1838, 2038)]', '[]', '[]', '[(809, 1090)]']}
new_df = pd.DataFrame(col)
# Parse each string in the column into a Python object, then collect the results
print(list(new_df.sequence.apply(literal_eval)))
[[(1838, 2038)], [], [], [(809, 1090)]]
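The question asked for [[1838, 2038], [], [], [809, 1090]] rather than a list of tuples. One possible follow-up step, sketched here on a plain list of strings standing in for the column values (the helper name to_flat_list is hypothetical, not part of any library), flattens the tuples in each row:

```python
from ast import literal_eval

# Stand-in for the column values; with a DataFrame these would come
# from the string column.
cells = ['[(1838, 2038)]', '[]', '[]', '[(809, 1090)]']

def to_flat_list(cell):
    """Parse one cell and flatten its tuples into a single list of ints."""
    return [value for pair in literal_eval(cell) for value in pair]

nested = [to_flat_list(cell) for cell in cells]
print(nested)  # [[1838, 2038], [], [], [809, 1090]]
```

With pandas, the same helper could presumably be passed to Series.apply in place of literal_eval.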
Please credit the source when reposting. Link to this article: https://www.uj5u.com/yidong/412340.html
