I have a DataFrame:
import pandas as pd
data = {'token_1': [['cat', 'run','today'],['dog', 'eat', 'meat']],
'token_2': [[ 'in', 'the' , 'morning','cat', 'run', 'today',
'very', 'quick'],['dog', 'eat', 'meat', 'chicken', 'from', 'bowl']]}
df = pd.DataFrame(data)
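I need to find the words from column token_1 in column token_2 and collect their indices. The result should be a list of indices for each row. I expect something like this: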
lst_indexes = [[3,4,5],
[0,1,2]]
A uj5u.com user replied:
Use a list comprehension with enumerate to get the indices:
L = [[i for i, x in enumerate(b) if x in a] for a, b in zip(df['token_1'], df['token_2'])]
print(L)
[[3, 4, 5], [0, 1, 2]]
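The comprehension above tests `x in a` against a list, which scans the list for every token. For longer token lists, a variant that converts each token_1 list to a set first may be faster. The following is a minimal sketch under that assumption; the helper name `positions_in` is hypothetical and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'token_1': [['cat', 'run', 'today'], ['dog', 'eat', 'meat']],
    'token_2': [['in', 'the', 'morning', 'cat', 'run', 'today', 'very', 'quick'],
                ['dog', 'eat', 'meat', 'chicken', 'from', 'bowl']],
})

def positions_in(targets, tokens):
    """Return every position in `tokens` whose word appears in `targets`."""
    wanted = set(targets)  # set membership is O(1) on average
    return [i for i, word in enumerate(tokens) if word in wanted]

# Pair the two columns row by row, as in the answer above
lst_indexes = [positions_in(a, b) for a, b in zip(df['token_1'], df['token_2'])]
print(lst_indexes)  # [[3, 4, 5], [0, 1, 2]]
```

The result is the same as the list-based version; only the membership test changes.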
A uj5u.com user replied:
You can use a dictionary/list comprehension:
# first compute a dictionary of indices for efficiency
indices = [{w: i for i,w in enumerate(l)} for l in df['token_2']]
# then map the indices
[[d.get(x,None) for x in l] for d, l in zip(indices, df['token_1'])]
Output:
[[3, 4, 5], [0, 1, 2]]
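The enumerate-based and dictionary-based approaches agree on the sample data, but they appear to differ when token_2 contains duplicates or when a token_1 word is missing. The sketch below uses made-up token lists (not from the question) to illustrate the difference.

```python
# Hypothetical data chosen to expose the difference
token_1 = ['run', 'cat', 'fish']
token_2 = ['cat', 'run', 'cat']

# enumerate-based approach: every matching position, in token_2 order
by_position = [i for i, x in enumerate(token_2) if x in token_1]

# dictionary-based approach: one entry per token_1 word, in token_1 order
index_of = {w: i for i, w in enumerate(token_2)}  # later duplicates overwrite earlier ones
by_word = [index_of.get(x) for x in token_1]      # missing words map to None

print(by_position)  # [0, 1, 2]
print(by_word)      # [1, 2, None]
```

Which behavior is preferable depends on whether you want all positions in token_2 or exactly one index per token_1 word.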
A uj5u.com user replied:
You can iterate over the data dictionary and append the values to a new list:
data = {'token_1': [['cat', 'run','today'],['dog', 'eat', 'meat']],
'token_2': [[ 'in', 'the' , 'morning','cat', 'run', 'today',
'very', 'quick'],['dog', 'eat', 'meat', 'chicken', 'from', 'bowl']]}
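l = []
for i in range(len(data["token_1"])):
    l.append([])
    for j in range(len(data["token_1"][i])):
        word = data["token_1"][i][j]
        # list.index raises ValueError for a missing item (it never returns -1),
        # so check membership before looking up the position
        if word in data["token_2"][i]:
            l[i].append(data["token_2"][i].index(word))
print(l)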
Note that the other solutions are cleaner and more readable; this is just an alternative to list comprehensions.
Output:
[[3, 4, 5], [0, 1, 2]]
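One detail worth keeping in mind with a loop-based lookup: `list.index` raises `ValueError` when the item is absent; it does not return -1. A minimal sketch of a helper that handles this case is shown below; the name `first_index` and the word 'fish' are hypothetical, added only for illustration.

```python
def first_index(tokens, word):
    """Return the first position of `word` in `tokens`, or None if absent."""
    try:
        return tokens.index(word)
    except ValueError:  # list.index raises instead of returning -1
        return None

token_2 = ['in', 'the', 'morning', 'cat', 'run', 'today', 'very', 'quick']
found = [first_index(token_2, w) for w in ['cat', 'run', 'fish']]
print(found)  # [3, 4, None]
```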
