I am trying to replace several categorical values in a DataFrame column with a standard set of values.
I tried the following code:
data['Gender'] = data['Gender'].replace(to_replace={"male","M","m","female","f","F"}, value={"Male","Male","Male","Female", "Female", "Female"})
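I want every m, M, or male to be replaced with Male, and likewise every f, F, or female with Female.
I get this error:
ValueError: Replacement lists must match in length. Expecting 6 got 2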
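A uj5u.com user replied:
The problem with your code is that you are passing sets as the arguments to replace(). The cardinality happens to work out for to_replace, because all of its elements are unique. For value, however, the set you defined is actually {"Male", "Female"}, which does not match the cardinality of to_replace. Even if the cardinalities did match, sets do not guarantee order, so a set is not the right data structure for this job. If you use lists or tuples instead, it will work: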
data['Gender'] = data['Gender'].replace(to_replace=("male","M","m","female","f","F"), value=("Male","Male","Male","Female", "Female", "Female"))
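To make the cardinality point concrete, here is a minimal sketch. The sample column and the variable names (set_size, result) are made up for illustration and are not from the question. It shows the set collapsing to two elements and the list version replacing element by element.

```python
import pandas as pd

# A set literal silently drops duplicates, so six values collapse to two.
value_as_set = {"Male", "Male", "Male", "Female", "Female", "Female"}
set_size = len(value_as_set)

# Lists keep both duplicates and order, so positions line up one to one.
to_replace = ["male", "M", "m", "female", "f", "F"]
value = ["Male", "Male", "Male", "Female", "Female", "Female"]

# Hypothetical sample data covering every spelling in the question.
data = pd.DataFrame({"Gender": ["male", "M", "m", "female", "f", "F"]})
data["Gender"] = data["Gender"].replace(to_replace=to_replace, value=value)

result = data["Gender"].tolist()
print(set_size, result)
```

The size of the set (2) is exactly the "got 2" reported in the ValueError.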
That said, using a dict may make the code easier to read, because each original value sits right next to its replacement:
data["Gender"] = data["Gender"].replace({"m" : "Male", "M" : "Male", "male": "Male", "f": "Female", "F": "Female", "female": "Female"})
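As a quick check of the dict form, here is a small sketch on made-up sample data (the values "Male" and "other" in the column are assumptions added for illustration). Values that are not keys of the mapping should pass through unchanged.

```python
import pandas as pd

# Same mapping as in the answer above: each spelling next to its replacement.
mapping = {
    "m": "Male", "M": "Male", "male": "Male",
    "f": "Female", "F": "Female", "female": "Female",
}

# Hypothetical sample column; "Male" and "other" are not keys in the mapping.
data = pd.DataFrame({"Gender": ["m", "female", "Male", "other"]})
data["Gender"] = data["Gender"].replace(mapping)

result = data["Gender"].tolist()
print(result)
```

Because unmatched entries are left as they are, it can be worth inspecting data["Gender"].unique() afterwards to spot spellings the mapping missed.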
A uj5u.com user replied:
Here is one way to do it.
import pandas as pd
df = pd.DataFrame({'Gender': ['m', 'M', 'f', 'F', 'm']})
print(df)
  Gender
0      m
1      M
2      f
3      F
4      m
replace_values = {'m': 'Male', 'M': 'Male', 'f': 'Female', 'F': 'Female'}
df = df.replace({"Gender": replace_values})
df
  Gender
0    Male
1    Male
2  Female
3  Female
4    Male
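The nested dict passed to DataFrame.replace limits the replacement to the named column. The sketch below illustrates the difference; the second column, Initial, is a hypothetical addition that is not in the answer.

```python
import pandas as pd

replace_values = {"m": "Male", "M": "Male", "f": "Female", "F": "Female"}

# "Initial" is a made-up column that happens to hold the same raw values.
df = pd.DataFrame({"Gender": ["m", "F"], "Initial": ["m", "F"]})

# Nested dict: only the "Gender" column is touched.
scoped = df.replace({"Gender": replace_values})

# Flat dict: every column in the frame is searched.
unscoped = df.replace(replace_values)

print(scoped)
print(unscoped)
```

Scoping by column name is the safer default when other columns might contain the same short codes with a different meaning.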
