Suppose I have a dataset with the columns movieId, title, year, and rating. Here is a sample subset of my data:
| movieId | title | rating | year |
|:-------:|:--------:|:------:|:----:|
| 1 | abc | 3.5 | 1995 |
| 1 | abc | 3 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 3 | 1995 |
| 1 | abc | 5 | 1995 |
| 1 | abc | 3.5 | 1995 |
| 1 | abc | 4.5 | 1995 |
| 1 | abc | 0.5 | 1995 |
| 1 | abc | 3.5 | 1995 |
| 1 | abc | 4.5 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 5 | 1995 |
| 1 | abc | 4.5 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 3 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 3.5 | 1995 |
| 1 | abc | 3 | 1995 |
| 1 | abc | 4 | 1995 |
| 1 | abc | 5 | 1995 |
| 1 | abc | 4.5 | 1995 |
| 1 | abc | 5 | 1995 |
| 2 | xyz | 3 | 2000 |
| 2 | xyz | 2 | 2000 |
| 2 | xyz | 3.5 | 2000 |
| 2 | xyz | 4 | 2000 |
| 2 | xyz | 3.5 | 2000 |
| 2 | xyz | 5 | 2000 |
| 2 | xyz | 3.5 | 2000 |
| 2 | xyz | 3 | 2000 |
| 2 | xyz | 3 | 2000 |
| 2 | xyz | 2 | 2000 |
| 2 | xyz | 3.5 | 2000 |
| 2 | xyz | 3 | 2000 |
| 2 | xyz | 3 | 2000 |
| 2 | xyz | 4 | 2000 |
| 2 | xyz | 2 | 2000 |
| 2 | xyz | 3.5 | 2000 |
| 2 | xyz | 1 | 2000 |
| 3 | pqr | 3 | 1997 |
| 3 | pqr | 2 | 1997 |
| 3 | pqr | 3.5 | 1997 |
| 3 | pqr | 3.5 | 1997 |
| 3 | pqr | 3 | 1997 |
| 3 | pqr | 3 | 1997 |
| 3 | pqr | 3 | 1997 |
| 3 | pqr | 3 | 1997 |
| 3 | pqr | 4.5 | 1997 |
| 3 | pqr | 3.5 | 1997 |
| 3 | pqr | 4 | 1997 |
| 3 | pqr | 1.5 | 1997 |
| 3 | pqr | 2 | 1997 |
| 3 | pqr | 2 | 1997 |
| 3 | pqr | 2.5 | 1997 |
| 4 | def | 3 | 1999 |
| 4 | def | 2.5 | 1999 |
| 4 | def | 2.5 | 1999 |
| 4 | def | 0.5 | 1999 |
| 4 | def | 2 | 1999 |
| 4 | def | 3 | 1999 |
| 5 | movie123 | 4 | 2006 |
| 5 | movie123 | 4 | 2006 |
| 5 | movie123 | 3 | 2006 |
| 5 | movie123 | 1.5 | 2006 |
| 5 | movie123 | 3 | 2006 |
| 5 | movie123 | 2 | 2006 |
| 5 | movie123 | 2.5 | 2006 |
| 5 | movie123 | 3 | 2006 |
| 5 | movie123 | 4 | 2006 |
| 5 | movie123 | 0.5 | 2006 |
| 5 | movie123 | 1 | 2006 |
| 5 | movie123 | 3.5 | 2006 |
| 5 | movie123 | 2 | 2006 |
| 5 | movie123 | 3 | 2006 |
| 5 | movie123 | 1.5 | 2006 |
| 5 | movie123 | 2.5 | 2006 |
| 5 | movie123 | 4 | 2006 |
| 5 | movie123 | 4 | 2006 |
| 5 | movie123 | 3.5 | 2006 |
| 5 | movie123 | 3 | 2006 |
| 6 | movie456 | 4 | 2012 |
| 6 | movie456 | 3.5 | 2012 |
| 6 | movie456 | 3.5 | 2012 |
| 6 | movie456 | 4 | 2012 |
| 6 | movie456 | 5 | 2012 |
| 6 | movie456 | 2.5 | 2012 |
| 6 | movie456 | 4 | 2012 |
| 6 | movie456 | 4 | 2012 |
| 6 | movie456 | 3.5 | 2012 |
| 6 | movie456 | 5 | 2012 |
| 6 | movie456 | 2 | 2012 |
| 6 | movie456 | 4 | 2012 |
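I want to define a function that computes a score from each movie's mean rating, its rating count, a minimum number of ratings, and the mean rating across the whole dataset. So I first compute the number of ratings and the mean rating for each movie.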

    avg_rating = df.groupby(['movieId', 'year'])['rating'].agg([('Count', 'size'), ('Mean', 'mean')]).sort_values(by='Mean', ascending=False)

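For reference, the same aggregation can be tried on a tiny in-memory frame. This is only a minimal sketch: the four sample rows are invented for illustration, and it uses pandas named aggregation (`Count=...`, `Mean=...`), which is an alternative spelling of the list-of-tuples form above rather than the asker's exact code.

```python
import pandas as pd

# Hypothetical mini-sample in the same shape as the table above.
df = pd.DataFrame({
    "movieId": [1, 1, 2, 2],
    "title": ["abc", "abc", "xyz", "xyz"],
    "rating": [3.5, 4.5, 3.0, 2.0],
    "year": [1995, 1995, 2000, 2000],
})

# One row per (movieId, year): number of ratings and their mean,
# sorted so the highest mean comes first.
avg_rating = (
    df.groupby(["movieId", "year"])["rating"]
    .agg(Count="size", Mean="mean")
    .sort_values(by="Mean", ascending=False)
)
print(avg_rating)
```

The result is indexed by (movieId, year), so `Count` and `Mean` are the only columns at this point.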
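Some movies may have only a few ratings but a high mean, while others have many ratings and a high mean, so the analysis can be skewed. I therefore want to compute a weighted rating, and I defined a function for that.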

    # R = average rating for the movie (mean)
    # v = number of ratings/reviews for the movie (votes)
    # m = minimum reviews required to be listed in the Top 250 movie list
    # C = the mean rating across the whole report
    def weighted_rating(R, v, m, C):
        return (v / (v + m)) * R + (m / (v + m)) * C

    avg_rating = avg_rating.assign(wr=weighted_rating(mean, count, 500, mean(mean)))

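When I run the last line of the code above, I get an error: name 'mean' is not defined
The same error occurs for the count column: name 'count' is not defined
How can I fix this error so that my final output contains the columns movieId, year, count, mean, and wr?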
A uj5u.com user replied:
You need to reference the columns through your DataFrame instead of using bare names:

    avg_rating = avg_rating.assign(
        wr=weighted_rating(avg_rating['Mean'], avg_rating['Count'], 500, avg_rating['Mean'].mean()))

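The NameError arises because `mean` and `count` in the failing call are looked up as ordinary Python variables, and no such variables exist; a column name only means something when it is accessed through the DataFrame. As a hedged alternative sketch (the sample numbers and the threshold `m=10` are invented for this example), `assign` also accepts a callable that receives the DataFrame, which avoids repeating the variable name:

```python
import pandas as pd

def weighted_rating(R, v, m, C):
    # Shrinks each movie's mean R toward the overall mean C;
    # the larger v is relative to m, the closer the result stays to R.
    return (v / (v + m)) * R + (m / (v + m)) * C

# Hypothetical per-movie summary, shaped like avg_rating in the question.
avg_rating = pd.DataFrame(
    {"Count": [10, 30], "Mean": [5.0, 3.0]},
    index=pd.Index([1, 2], name="movieId"),
)

# m=10 is an arbitrary threshold chosen for this small example.
# The lambda receives the DataFrame as d, so columns are read from it.
result = avg_rating.assign(
    wr=lambda d: weighted_rating(d["Mean"], d["Count"], 10, d["Mean"].mean())
)
print(result)
```

Here the movie with 10 ratings is pulled halfway toward the overall mean, while the one with 30 ratings keeps three quarters of its own mean.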
So the final code might look like this:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from __future__ import (division, absolute_import, print_function,
                            unicode_literals)
    import pandas as pd


    def weighted_rating(R, v, m, C):
        # Blend the movie's own mean R with the overall mean C, weighted by votes v
        return (v / (v + m)) * R + (m / (v + m)) * C


    def main():
        df = pd.read_csv('movies.csv')
        # Number of ratings and mean rating per movie
        avg_rating = df.groupby(['movieId', 'year'])['rating'].agg(
            [('Count', 'size'), ('Mean', 'mean')]).sort_values(by='Mean', ascending=False)
        # Pass columns of avg_rating, not bare names
        avg_rating = avg_rating.assign(
            wr=weighted_rating(avg_rating['Mean'], avg_rating['Count'], 500, avg_rating['Mean'].mean()))
        print(avg_rating)


    if __name__ == '__main__':
        main()

Using this CSV (movies.csv):
movieId,title,rating,year
1,abc,3.5,1995
1,abc,3,1995
1,abc,4,1995
1,abc,3,1995
1,abc,5,1995
1,abc,3.5,1995
1,abc,4.5,1995
1,abc,0.5,1995
1,abc,3.5,1995
1,abc,4.5,1995
1,abc,4,1995
1,abc,5,1995
1,abc,4.5,1995
1,abc,4,1995
1,abc,4,1995
1,abc,4,1995
1,abc,4,1995
1,abc,3,1995
1,abc,4,1995
1,abc,3.5,1995
1,abc,3,1995
1,abc,4,1995
1,abc,5,1995
1,abc,4.5,1995
1,abc,5,1995
2,xyz,3,2000
2,xyz,2,2000
2,xyz,3.5,2000
2,xyz,4,2000
2,xyz,3.5,2000
2,xyz,5,2000
2,xyz,3.5,2000
2,xyz,3,2000
2,xyz,3,2000
2,xyz,2,2000
2,xyz,3.5,2000
2,xyz,3,2000
2,xyz,3,2000
2,xyz,4,2000
2,xyz,2,2000
2,xyz,3.5,2000
2,xyz,1,2000
3,pqr,3,1997
3,pqr,2,1997
3,pqr,3.5,1997
3,pqr,3.5,1997
3,pqr,3,1997
3,pqr,3,1997
3,pqr,3,1997
3,pqr,3,1997
3,pqr,4.5,1997
3,pqr,3.5,1997
3,pqr,4,1997
3,pqr,1.5,1997
3,pqr,2,1997
3,pqr,2,1997
3,pqr,2.5,1997
4,def,3,1999
4,def,2.5,1999
4,def,2.5,1999
4,def,0.5,1999
4,def,2,1999
4,def,3,1999
5,movie12,4,2006
5,movie12,4,2006
5,movie12,3,2006
5,movie12,1.5,2006
5,movie12,3,2006
5,movie12,2,2006
5,movie12,2.5,2006
5,movie12,3,2006
5,movie12,4,2006
5,movie12,0.5,2006
5,movie12,1,2006
5,movie12,3.5,2006
5,movie12,2,2006
5,movie12,3,2006
5,movie12,1.5,2006
5,movie12,2.5,2006
5,movie12,4,2006
5,movie12,4,2006
5,movie12,3.5,2006
5,movie12,3,2006
6,movie45,4,2012
6,movie45,3.5,2012
6,movie45,3.5,2012
6,movie45,4,2012
6,movie45,5,2012
6,movie45,2.5,2012
6,movie45,4,2012
6,movie45,4,2012
6,movie45,3.5,2012
6,movie45,5,2012
6,movie45,2,2012
6,movie45,4,2012
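
To get movieId and year back as ordinary columns, as the question asks, the grouped result can be passed through `reset_index()`. The sketch below rests on a few assumptions: it reads only the first rows for movies 1 and 4 from an inline string instead of `movies.csv`, it names the columns `count` and `mean` to match the requested output, and the threshold `m=2` is picked only to suit so few rows (with `m=500` and counts this small, every `wr` would sit very close to `C`).

```python
import io
import pandas as pd

# A few rows taken from the CSV above, inlined so the example is self-contained.
CSV_TEXT = """movieId,title,rating,year
1,abc,3.5,1995
1,abc,3,1995
4,def,3,1999
4,def,2.5,1999
4,def,2.5,1999
4,def,0.5,1999
"""

def weighted_rating(R, v, m, C):
    return (v / (v + m)) * R + (m / (v + m)) * C

df = pd.read_csv(io.StringIO(CSV_TEXT))

# reset_index() turns the (movieId, year) index back into regular columns.
summary = (
    df.groupby(["movieId", "year"])["rating"]
    .agg(count="size", mean="mean")
    .reset_index()
)

# C is the mean of the per-movie means, as in the answer's code.
C = summary["mean"].mean()
summary["wr"] = weighted_rating(summary["mean"], summary["count"], 2, C)
print(summary)
```

The printed frame has exactly the columns movieId, year, count, mean, and wr.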
