df_pm = dataset[["names","pop_mig"]].copy()
starring_letter = str(input("starring_letter:"))
df_pm is a DataFrame. I want to list the names that start with starring_letter, and then find which of them has the highest pop_mig value. pop_mig is a column of integers.
trythis = df_pm[df_pm["names"]== starring_letter in df_pm["names"]][df_pm[df_pm["pop_mig"]==df_pm["pop_mig"].max()]]
It returned this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
A sample of the DataFrame:
names pop_mig
0 Afghanistan 38991266
1 Albania 2891797
2 Algeria 43861044
3 Angola 32859859
4 Antigua and Barbuda 97929
.. ... ...
196 Vietnam 97418579
197 Western Sahara 591757
Expected output:
starring_letter = C
output1 = China
output2 = China's pop_mig value
A uj5u.com user replied:
This should work:
df = df[df.names.str.startswith(input_letter.title())].nlargest(n=1,columns = ['pop_mig'])
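For what it's worth, the original expression fails because Python reads `a == b in c` as a chained comparison, `(a == b) and (b in c)`, and the `and` forces the boolean Series into a single True/False, which pandas refuses to do. Below is a minimal sketch of the one-liner above, run on the sample rows from the question. It assumes the question's names (`df_pm`, `starring_letter`) instead of `df` and `input_letter`, and the hard-coded letter stands in for the `input()` call.

```python
import pandas as pd

# Sample rows copied from the question; the real dataset has more rows.
df_pm = pd.DataFrame({
    "names": ["Afghanistan", "Albania", "Algeria", "Angola",
              "Antigua and Barbuda", "Vietnam", "Western Sahara"],
    "pop_mig": [38991266, 2891797, 43861044, 32859859,
                97929, 97418579, 591757],
})

starring_letter = "a"  # assumption: replaces input("starring_letter:")

# Boolean mask: True for each name that starts with the (capitalised) letter.
mask = df_pm["names"].str.startswith(starring_letter.title())

# Keep the matching rows, then take the single row with the largest pop_mig.
top = df_pm[mask].nlargest(n=1, columns="pop_mig")
print(top)
```

The result is still a one-row DataFrame, so `top["names"].iloc[0]` and `top["pop_mig"].iloc[0]` give the two values asked for.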
A uj5u.com user replied:
If you plan to look up any letter eventually, compute a DataFrame holding the maximum for every letter:
df2 = (df
       .sort_values(by='pop_mig', ascending=False)
       .groupby(df['names'].str[0].rename('letter'))
       .first()
      )
Output:
names pop_mig
letter
A Algeria 43861044
V Vietnam 97418579
W Western Sahara 591757
Or, a little more efficiently:
df2 = df['pop_mig'].groupby(df['names'].str[0]).idxmax().rename_axis('letter').reset_index(name='idx')
df2 = df2.merge(df, left_on='idx', right_index=True).drop(columns='idx')
Output:
letter names pop_mig
0 A Algeria 43861044
1 V Vietnam 97418579
2 W Western Sahara 591757
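Once such a per-letter table exists, answering for any letter is a single lookup. Here is a rough sketch on the sample rows from the question, built along the lines of the first variant (the one indexed by letter). The variable names `ranked` and `row` are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; the real dataset has more rows.
df = pd.DataFrame({
    "names": ["Afghanistan", "Albania", "Algeria", "Angola",
              "Antigua and Barbuda", "Vietnam", "Western Sahara"],
    "pop_mig": [38991266, 2891797, 43861044, 32859859,
                97929, 97418579, 591757],
})

# Sort so the largest pop_mig comes first, then keep the first row per letter.
ranked = df.sort_values(by="pop_mig", ascending=False)
df2 = ranked.groupby(ranked["names"].str[0].rename("letter")).first()

# Look up one letter; check membership first, since an absent letter
# would make .loc raise a KeyError.
letter = "V"
if letter in df2.index:
    row = df2.loc[letter]
    print(row["names"], row["pop_mig"])
```

The table is computed once, so repeated lookups cost only an index access.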
Otherwise, to compute it separately for each request:
starting_letter = 'A'
idx = df.loc[df['names'].str.startswith(starting_letter), 'pop_mig'].idxmax()
out = df.loc[idx]
Output:
names Algeria
pop_mig 43861044
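One thing to watch with the per-request version: if no name starts with the given letter, `idxmax()` is called on an empty Series and raises a ValueError. Below is one possible way to guard against that, as a sketch only. `top_by_letter` is a hypothetical helper name, not part of pandas.

```python
from typing import Optional

import pandas as pd


def top_by_letter(df: pd.DataFrame, letter: str) -> Optional[pd.Series]:
    """Return the row with the largest pop_mig among names starting with
    `letter`, or None when no name matches (hypothetical helper)."""
    candidates = df.loc[df["names"].str.startswith(letter), "pop_mig"]
    if candidates.empty:
        # idxmax() on an empty Series would raise, so bail out early.
        return None
    return df.loc[candidates.idxmax()]


# Sample rows copied from the question; the real dataset has more rows.
df = pd.DataFrame({
    "names": ["Afghanistan", "Albania", "Algeria", "Angola",
              "Antigua and Barbuda", "Vietnam", "Western Sahara"],
    "pop_mig": [38991266, 2891797, 43861044, 32859859,
                97929, 97418579, 591757],
})

best = top_by_letter(df, "A")
print(best["names"], best["pop_mig"])

missing = top_by_letter(df, "Z")  # no sample name starts with "Z"
print(missing)
```

Returning None leaves it to the caller to decide what to show when a letter has no matches.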
