I have a DataFrame:
Text
Background
Clinical
Method
Direct
Background
Direct
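Now I want to assign each row a group number in a new column, based on its first word. For example, Background belongs to group 1, Clinical belongs to group 2, and so on.
Expected output:
A DataFrame: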
Text Group
Background 1
Clinical 2
Method 3
Direct 4
Background 1
Direct 4
A uj5u.com user replied:
Try this:
import pandas as pd

text = ['Background', 'Clinical', 'Method', 'Direct', 'Background', 'Direct']
df = pd.DataFrame(text, columns=['Text'])

def create_idx_map():
    # Map each distinct value to a 1-based group number, in order of first appearance
    idx = 1
    values = {}
    for item in list(df['Text']):
        if item not in values:
            values[item] = idx
            idx += 1  # move on to the next group number
    return values

values = create_idx_map()
df['Group'] = [values[x] for x in list(df['Text'])]
print(df)
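As a more compact alternative to the hand-written loop above, the same numbering can be sketched with `pd.factorize`. This is a different technique from the one in the answer, not the answerer's method. It assumes the group key is the first whitespace-separated word of `Text`; for the single-word values in the question that step changes nothing.

```python
import pandas as pd

df = pd.DataFrame({"Text": ["Background", "Clinical", "Method", "Direct", "Background", "Direct"]})

# Assumption: the key is the first whitespace-separated word of each entry
first_word = df["Text"].str.split().str[0]

# factorize returns (codes, uniques); codes are 0-based and follow order of first appearance
codes, uniques = pd.factorize(first_word)
df["Group"] = codes + 1  # shift to 1-based group numbers
print(df)
```

Because `factorize` numbers values in the order they first appear, Background gets 1, Clinical 2, Method 3 and Direct 4, matching the expected output in the question.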
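A uj5u.com user replied:
Idea: build a list of the unique values in the Text column, then set each row's Group to the index of its Text value in that list. Code example: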
import pandas as pd

df = pd.DataFrame({"Text": ["Background", "Clinical", "Clinical", "Method", "Background"]})
# List of unique values of column `Text`, in order of first appearance
groups = list(df["Text"].unique())
# Assign each value in `Text` its index in that list
# (write `groups.index(text) + 1` if the first group should be 1)
df["Group"] = df["Text"].map(lambda text: groups.index(text))
# Output for df
print(df)
### Result:
Text Group
0 Background 0
1 Clinical 1
2 Clinical 1
3 Method 2
4 Background 0
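The same idea can be sketched with a dictionary lookup instead of `list.index`, which avoids rescanning the list for every row. This is only a variation on the answer above; the name `group_of` is made up for illustration, and `start=1` is chosen so the numbering begins at 1 as the question asks.

```python
import pandas as pd

df = pd.DataFrame({"Text": ["Background", "Clinical", "Clinical", "Method", "Background"]})

# Build the value -> group number lookup once; start=1 makes the groups 1-based
group_of = {value: number for number, value in enumerate(df["Text"].unique(), start=1)}

# Series.map accepts a dict and looks up each value in it
df["Group"] = df["Text"].map(group_of)
print(df)
```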
A uj5u.com user replied:
A possible solution is as follows:
import pandas as pd
data = pd.DataFrame([["A B", 1], ["A C", 2], ["B A", 3], ["B C", 5]], columns=("name", "value"))
data.groupby(by=[x.split(" ")[0] for x in data.loc[:,"name"]])
You can select the first few words with x.split(" ")[:NUMBER_OF_WORDS]. Then apply the aggregation you need to the resulting groupby object.
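A minimal sketch of how that groupby could be carried through to a result, under a few assumptions that are not in the answer: the leading words are joined back into one string so each key is a single hashable value, the keys are wrapped in a Series aligned with `data`, `sum` stands in for whatever aggregation is needed, and `ngroup` is used to produce the group numbers the question asks for.

```python
import pandas as pd

data = pd.DataFrame([["A B", 1], ["A C", 2], ["B A", 3], ["B C", 5]], columns=["name", "value"])

NUMBER_OF_WORDS = 1  # assumed setting: how many leading words form the key

# Join the leading words back into one string per row and align the keys with the rows
keys = pd.Series(
    [" ".join(x.split(" ")[:NUMBER_OF_WORDS]) for x in data.loc[:, "name"]],
    index=data.index,
)

# Example aggregation: total of `value` within each group
totals = data.groupby(by=keys)["value"].sum()
print(totals)

# 1-based group number per row; sort=False numbers groups in order of first appearance
data["Group"] = data.groupby(by=keys, sort=False).ngroup() + 1
print(data)
```

With `NUMBER_OF_WORDS = 1` the rows starting with A and the rows starting with B form the two groups; raising it to 2 would make every row in this sample its own group.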
