I am new here. English is not my native language, so please excuse any grammatical errors. I need to work with the following df:
# 1. Here we import pandas
import pandas as pd
# 2. Here we import numpy
import numpy as np
np.random.seed(0)
df = pd.DataFrame({'Age':[18, 21, 28, 19, 23, 22, 18, 24, 25, 20],
'Hair colour':['Blonde', 'Brown', 'Black', 'Blonde', 'Blonde', 'Black','Brown', 'Brown', 'Black', 'Black'],
'Length (in cm)':np.random.normal(175, 10, 10).round(1),
'Weight (in kg)':np.random.normal(70, 5, 10).round(1)},
index = ['Leon', 'Mirta', 'Nathan', 'Linda', 'Bandar', 'Violeta', 'Noah', 'Niji', 'Lucy', 'Mark'],)
I am supposed to end up with a vector that is labelled with the names.
First, I wrote a function for the BMI:
# function
def BMI():
    df['weight (in kg)'] / (df['Length']/100)**2
However, I do not know what my next step should be.
Can you tell me how to find the average BMI for each hair colour?
Answer from a uj5u.com user:
You can use df.groupby(), which is a pandas method.
For your specific case, once df has a BMI column, you can use
df.groupby('Hair colour')['BMI'].mean()
which gives the output
Hair colour
Black 23.003356
Blonde 18.806844
Brown 23.271460
Name: BMI, dtype: float64
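The one-liner above assumes that df already has a BMI column, which the frame in the question does not. Here is a minimal sketch of both steps together. To keep it deterministic, it uses the rounded heights and weights printed in this thread rather than the random draw, and mean_bmi is just an illustrative name.

```python
import pandas as pd

# Heights and weights copied from the printed table in this thread,
# so the example does not depend on the random number generator.
df = pd.DataFrame(
    {
        "Hair colour": ["Blonde", "Brown", "Black", "Blonde", "Blonde",
                        "Black", "Brown", "Brown", "Black", "Black"],
        "Length (in cm)": [192.6, 179.0, 184.8, 197.4, 193.7,
                           165.2, 184.5, 173.5, 174.0, 179.1],
        "Weight (in kg)": [70.7, 77.3, 73.8, 70.6, 72.2,
                           71.7, 77.5, 69.0, 71.6, 65.7],
    },
    index=["Leon", "Mirta", "Nathan", "Linda", "Bandar",
           "Violeta", "Noah", "Niji", "Lucy", "Mark"],
)

# BMI = weight in kg divided by the square of the height in metres.
df["BMI"] = df["Weight (in kg)"] / (df["Length (in cm)"] / 100) ** 2

# Select the BMI column first, then average within each hair colour.
mean_bmi = df.groupby("Hair colour")["BMI"].mean()
print(mean_bmi)
```

Selecting the column before calling mean() avoids averaging every other column only to throw the results away.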
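Answer from a uj5u.com user:
You can either filter or use groupby.
Your BMI function does not make sense:
- it references columns that do not exist
- it does nothing with the result, so the result is discarded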
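If you do want to keep a function, one possible repair for the two problems listed above is sketched here. The name bmi, its parameters, and the two-row frame people are made up for illustration. The function takes the frame as an argument, uses the column names that actually exist, and returns the result.

```python
import pandas as pd


def bmi(frame, weight_col="Weight (in kg)", height_col="Length (in cm)"):
    """Return the BMI of every row as a Series that keeps the frame's index."""
    height_m = frame[height_col] / 100  # centimetres to metres
    return frame[weight_col] / height_m ** 2


# Two rows taken from the table in this thread, only to show the call.
people = pd.DataFrame(
    {"Length (in cm)": [179.0, 184.5], "Weight (in kg)": [77.3, 77.5]},
    index=["Mirta", "Noah"],
)

result = bmi(people)
print(result)
```

Because the returned Series keeps the row index, each BMI value stays labelled with the person's name, which seems to be the "vector with names" the question asks for.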
Filtering:
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame({'Age':[18, 21, 28, 19, 23, 22, 18, 24, 25, 20],
'Hair colour':['Blonde', 'Brown', 'Black', 'Blonde',
'Blonde', 'Black','Brown', 'Brown', 'Black',
'Black'],
'Length (in cm)':np.random.normal(175, 10, 10).round(1),
'Weight (in kg)':np.random.normal(70, 5, 10).round(1)},
index = ['Leon', 'Mirta', 'Nathan', 'Linda', 'Bandar',
'Violeta', 'Noah', 'Niji', 'Lucy', 'Mark'],)
print(df)
# calculate BMI - not as function, using correct column names
df["BMI"] = df['Weight (in kg)'] / (df['Length (in cm)']/100)**2
print(df)
# filter to brown
brown = df[df["Hair colour"] == "Brown"]
print(brown)
print(brown["BMI"].mean())
Output:
# calculated BMI
Age Hair colour Length (in cm) Weight (in kg) BMI
Leon 18 Blonde 192.6 70.7 19.059296
Mirta 21 Brown 179.0 77.3 24.125339
Nathan 28 Black 184.8 73.8 21.609884
Linda 19 Blonde 197.4 70.6 18.118006
Bandar 23 Blonde 193.7 72.2 19.243229
Violeta 22 Black 165.2 71.7 26.272359
Noah 18 Brown 184.5 77.5 22.767165
Niji 24 Brown 173.5 69.0 22.921875
Lucy 25 Black 174.0 71.6 23.649095
Mark 20 Black 179.1 65.7 20.482087
# filtered output
Age Hair colour Length (in cm) Weight (in kg) BMI
Mirta 21 Brown 179.0 77.3 24.125339
Noah 18 Brown 184.5 77.5 22.767165
Niji 24 Brown 173.5 69.0 22.921875
# avg BMI
23.271459786871446
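The filter above handles one hair colour at a time. A rough sketch of repeating it for every colour with a boolean mask follows. The frame is rebuilt from the BMI values printed above, and means is an illustrative name.

```python
import pandas as pd

# Hair colours and BMI values copied from the printed table above.
df = pd.DataFrame(
    {
        "Hair colour": ["Blonde", "Brown", "Black", "Blonde", "Blonde",
                        "Black", "Brown", "Brown", "Black", "Black"],
        "BMI": [19.059296, 24.125339, 21.609884, 18.118006, 19.243229,
                26.272359, 22.767165, 22.921875, 23.649095, 20.482087],
    },
    index=["Leon", "Mirta", "Nathan", "Linda", "Bandar",
           "Violeta", "Noah", "Niji", "Lucy", "Mark"],
)

means = {}
for colour in df["Hair colour"].unique():
    mask = df["Hair colour"] == colour          # True for rows of this colour
    means[colour] = df.loc[mask, "BMI"].mean()  # average BMI of those rows

print(means)
```

This works, but it scans the frame once per colour, which is the bookkeeping that groupby does for you.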
Groupby:
# use groupby
grouped = df.groupby('Hair colour')
print(*grouped, sep="\n\n")
# https://stackoverflow.com/questions/51091331
print(grouped.get_group("Brown")["BMI"].mean())
Output:
# grouped output
('Black', Age Hair colour Length (in cm) Weight (in kg) BMI
Nathan 28 Black 184.8 73.8 21.609884
Violeta 22 Black 165.2 71.7 26.272359
Lucy 25 Black 174.0 71.6 23.649095
Mark 20 Black 179.1 65.7 20.482087)
('Blonde', Age Hair colour Length (in cm) Weight (in kg) BMI
Leon 18 Blonde 192.6 70.7 19.059296
Linda 19 Blonde 197.4 70.6 18.118006
Bandar 23 Blonde 193.7 72.2 19.243229)
('Brown', Age Hair colour Length (in cm) Weight (in kg) BMI
Mirta 21 Brown 179.0 77.3 24.125339
Noah 18 Brown 184.5 77.5 22.767165
Niji 24 Brown 173.5 69.0 22.921875)
# avg BMI
23.271459786871446
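get_group returns a single group. To summarise every hair colour in one call, named aggregation on the grouped BMI column is one option. This is a sketch only: the frame is rebuilt from the BMI values printed above, and summary, mean_bmi and people are names chosen for illustration.

```python
import pandas as pd

# Hair colours and BMI values copied from the printed table above.
df = pd.DataFrame(
    {
        "Hair colour": ["Blonde", "Brown", "Black", "Blonde", "Blonde",
                        "Black", "Brown", "Brown", "Black", "Black"],
        "BMI": [19.059296, 24.125339, 21.609884, 18.118006, 19.243229,
                26.272359, 22.767165, 22.921875, 23.649095, 20.482087],
    },
    index=["Leon", "Mirta", "Nathan", "Linda", "Bandar",
           "Violeta", "Noah", "Niji", "Lucy", "Mark"],
)

# One row per hair colour: the mean BMI and how many people are in the group.
summary = df.groupby("Hair colour")["BMI"].agg(mean_bmi="mean", people="count")
print(summary)
```

Showing the group size next to the mean is a useful reminder that each average here rests on only three or four people.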
