How to use zip in Python to read values from lists and average them - 有解無憂

I have a set of lists (in Python) containing the following data:

['425842', '2008', 'Monday', '23:30:00', '10']
['425843', '2008', 'Tuesday', '23:30:00', '9']
['425844', '2009', 'Monday', '23:30:00', '2']
['425845', '2009', 'Monday', '23:30:00', '3']
['425846', '2010', 'Monday', '23:30:00', '2']
['425847', '2010', 'Monday', '23:30:00', '10']
['425848', '2010', 'Tuesday', '23:30:00', '10']

I want to compute the average of the values in the last column (index 4), grouped by year, for example:

[2008, 9.5]
[2009, 2.5]
[2010, 7.3]

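I tried to do this with Python's built-in zip function, but zip returns an iterator rather than a list. Can you help me solve this?

A helpful uj5u.com user replied:

Use pandas to group the data by year, then take the mean of the values in the fifth column.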

data = [
    ['425842', '2008', 'Monday', '23:30:00', '10'], 
    ['425843', '2008', 'Tuesday', '23:30:00', '9'], 
    ['425844', '2009', 'Monday', '23:30:00', '2'], 
    ['425845', '2009', 'Monday', '23:30:00', '3'], 
    ['425846', '2010', 'Monday', '23:30:00', '2'], 
    ['425847', '2010', 'Monday', '23:30:00', '10'], 
    ['425848', '2010', 'Tuesday', '23:30:00', '10'],
]
import pandas as pd

# Name the columns, convert "value" from strings to numbers,
# then average it within each year.
df = pd.DataFrame(data, columns=["id", "year", "day", "time", "value"])
df["value"] = pd.to_numeric(df["value"])
print(df.groupby("year")["value"].mean())

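A helpful uj5u.com user replied:

zip doesn't help here at all; you probably want to build a dictionary that collects the values for each year so that you can average them.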

data = [
    ['425842', '2008', 'Monday', '23:30:00', '10'], 
    ['425843', '2008', 'Tuesday', '23:30:00', '9'], 
    ['425844', '2009', 'Monday', '23:30:00', '2'], 
    ['425845', '2009', 'Monday', '23:30:00', '3'], 
    ['425846', '2010', 'Monday', '23:30:00', '2'], 
    ['425847', '2010', 'Monday', '23:30:00', '10'], 
    ['425848', '2010', 'Tuesday', '23:30:00', '10'],
]

year_totals = {year: [] for year in set(year for _, year, _, _, _ in data)}
for _, year, _, _, value in data:
    year_totals[year].append(int(value))

averages = {y: sum(t) / len(t) for y, t in year_totals.items()}

# Key order may vary between runs because the years come from a set.
print(averages)  # e.g. {'2010': 7.333333333333333, '2008': 9.5, '2009': 2.5}
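
As a side note on the iterator issue raised in the question, here is a minimal sketch of what zip actually does with rows like these. The name `rows` and the two-row sample are only for illustration; the point is that zip transposes rows into columns and has to be materialised (for example with `list`) before indexing, but it does not group anything by year.

```python
# Two rows copied from the data above keep the sketch short.
rows = [
    ['425842', '2008', 'Monday', '23:30:00', '10'],
    ['425843', '2008', 'Tuesday', '23:30:00', '9'],
]

# zip(*rows) pairs up the i-th field of every row (rows -> columns).
columns = zip(*rows)
print(type(columns).__name__)  # zip: a lazy iterator, not a list

# Materialise the iterator, then unpack the five columns.
ids, years, days, times, values = list(columns)
print(years)   # ('2008', '2008')
print(values)  # ('10', '9')

# Averaging one whole column is easy, but nothing here splits it per year.
overall_average = sum(int(v) for v in values) / len(values)
print(overall_average)  # 9.5
```

This is why a dictionary keyed by year is still needed for the per-year averages.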

A helpful uj5u.com user replied:

This should work:

data = [['425842', '2008', 'Monday', '23:30:00', '10'],
['425843', '2008', 'Tuesday', '23:30:00', '9'],
['425844', '2009', 'Monday', '23:30:00', '2'],
['425845', '2009', 'Monday', '23:30:00', '3'],
['425846', '2010', 'Monday', '23:30:00', '2'], 
['425847', '2010', 'Monday', '23:30:00', '10'], 
['425848', '2010', 'Tuesday', '23:30:00', '10']]
sums = {}
for i in data:
    if i[1] not in sums:
        sums[i[1]] = [int(i[-1])]
    else:
        sums[i[1]].append(int(i[-1]))
sums = {i: sum(sums[i]) / len(sums[i]) for i in sums}
output = [[i, sums[i]] for i in sums]

The value of output:

[['2008', 9.5], ['2009', 2.5], ['2010', 7.333333333333333]]
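
If the goal is output shaped exactly like the example in the question (integer years, one decimal place), a variant along these lines might do. It is only a sketch: it swaps the if/else for `collections.defaultdict` and uses `statistics.mean`, and the rounding to one decimal is an assumption based on the `7.3` shown in the question.

```python
from collections import defaultdict
from statistics import mean

data = [
    ['425842', '2008', 'Monday', '23:30:00', '10'],
    ['425843', '2008', 'Tuesday', '23:30:00', '9'],
    ['425844', '2009', 'Monday', '23:30:00', '2'],
    ['425845', '2009', 'Monday', '23:30:00', '3'],
    ['425846', '2010', 'Monday', '23:30:00', '2'],
    ['425847', '2010', 'Monday', '23:30:00', '10'],
    ['425848', '2010', 'Tuesday', '23:30:00', '10'],
]

# defaultdict(list) creates the empty list the first time a year is seen.
values_by_year = defaultdict(list)
for _id, year, _day, _time, value in data:
    values_by_year[int(year)].append(int(value))

# Sort by year and round each mean to one decimal place (assumed format).
output = [[year, round(mean(values), 1)]
          for year, values in sorted(values_by_year.items())]
print(output)  # [[2008, 9.5], [2009, 2.5], [2010, 7.3]]
```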

A helpful uj5u.com user replied:

You can use itertools.groupby to group the list by year and compute the average of each group:

import itertools

data = [['425842', '2008', 'Monday', '23:30:00', '10'],
        ['425843', '2008', 'Tuesday', '23:30:00', '9'],
        ['425844', '2009', 'Monday', '23:30:00', '2'],
        ['425845', '2009', 'Monday', '23:30:00', '3'],
        ['425846', '2010', 'Monday', '23:30:00', '2'],
        ['425847', '2010', 'Monday', '23:30:00', '10'],
        ['425848', '2010', 'Tuesday', '23:30:00', '10']]

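# groupby only groups consecutive rows, so this relies on data already
# being ordered by year (index 1); the values are at index 4.
groups = {int(key): list(map(lambda x: int(x[4]), value)) for key, value in
          itertools.groupby(data, lambda x: x[1])}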

averages = {key: sum(value) / len(value) for key, value in groups.items()}
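
One caveat worth illustrating: itertools.groupby only merges adjacent rows with the same key, so unsorted input would produce several separate groups per year. The sketch below uses the same rows in a deliberately shuffled order (the shuffling is made up for the example) and sorts by year before grouping.

```python
from itertools import groupby
from operator import itemgetter

# Same rows as above, shuffled so that equal years are not adjacent.
data = [
    ['425846', '2010', 'Monday', '23:30:00', '2'],
    ['425842', '2008', 'Monday', '23:30:00', '10'],
    ['425844', '2009', 'Monday', '23:30:00', '2'],
    ['425847', '2010', 'Monday', '23:30:00', '10'],
    ['425843', '2008', 'Tuesday', '23:30:00', '9'],
    ['425845', '2009', 'Monday', '23:30:00', '3'],
    ['425848', '2010', 'Tuesday', '23:30:00', '10'],
]

by_year = itemgetter(1)  # the year is the second field of each row

averages = {}
# Sorting first makes rows of the same year adjacent, which groupby needs.
for year, rows in groupby(sorted(data, key=by_year), key=by_year):
    values = [int(row[4]) for row in rows]
    averages[int(year)] = sum(values) / len(values)

print(averages)  # {2008: 9.5, 2009: 2.5, 2010: 7.333333333333333}
```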

Please credit the source when reposting. Link to this article: https://www.uj5u.com/houduan/445822.html

Tags: Python, list, loop, dictionary
