回圈創建多個串列，具有不同的命名-有解無憂

我想創建一個創建多個串列的回圈，名稱不同。我有一個資料框，其中包含一個我試圖根據月份過濾的 excel 檔案。（理想情況下，串列應該是 1、2、3 等。）每個月都應該創建一個串列最后我需要再次遍歷這些串列以計算平均值并計算 len 如果您有任何問題，請告訴我。

import pandas as pd

#read data
excel = 'short data.xlsx'
data = pd.read_excel(excel, parse_dates=['Closed Date Time'])
df = pd.DataFrame(data)
# data.info()
#Format / delete time from date column
data['Closed Date Time'] = pd.to_datetime(data['Closed Date Time'])
df['Close_Date'] = data['Closed Date Time'].dt.date
df['Close_Date'] = pd.to_datetime(data['Close_Date'])

#loop to create multiple lists
times = 12

for _ in range(times):
    if times <= 9:
        month = df[df['Close_Date'].dt.strftime('%Y-%m') == f'2018-0{times}']
    month = df[df['Close_Date'].dt.strftime('%Y-%m') == f'2018-{times}']

示例資料 [ 回圈創建多個串列，具有不同的命名

uj5u.com熱心網友回復：

創建具有不同名稱的串列可能是完全錯誤的想法。

您應該使用 sublistst（和索引而不是名稱）創建單個串列或使用名稱作為鍵的單個字典。或者更好的是，您應該DataFrame使用所有值（在行或列中）創建單個。它將對接下來的計算更有用。

而這一切可能都不需要for-loop。

但我認為你可以用不同的方式來做。您可以使用創建列Year Month

df['Year_Month'] = df['Close_Date'].dt.strftime('%Y-%m')

之后使用groupby()每個月執行函式而不使用for-loops。

averages = df.groupby('Year_Month').mean()

sizes = df.groupby('Year_Month').size()

帶有示例資料的最小作業代碼：

import pandas as pd

#df = pd.read_excel('short data.xlsx', parse_dates=['Closed Date Time'])

data = {
    'Closed Date Time': ['2022.10.25 01:00', '2022.10.24 01:00', '2018.10.25 01:00', '2018.10.24 01:00', '2018.10.23 01:00'],
    'Price': [1, 2, 3, 4, 5],
    'User': ['A','A','A','B','C'],
}

df = pd.DataFrame(data)
print(df)

df['Closed Date Time'] = pd.to_datetime(df['Closed Date Time'])
df['Close_Date'] = df['Closed Date Time'].dt.date
df['Close_Date'] = pd.to_datetime(df['Close_Date'])

df['Year_Month'] = df['Close_Date'].dt.strftime('%Y-%m')
print(df)

print('\n--- averages ---\n')
averages = df.groupby('Year_Month').mean()
print(averages)

print('\n--- sizes ---\n')
sizes = df.groupby('Year_Month').size()
print(sizes)

結果：

   Closed Date Time  Price
0  2022.10.25 01:00      1
1  2022.10.24 01:00      2
2  2018.10.25 01:00      3
3  2018.10.24 01:00      4
4  2018.10.23 01:00      5

     Closed Date Time  Price Close_Date Year_Month
0 2022-10-25 01:00:00      1 2022-10-25    2022-10
1 2022-10-24 01:00:00      2 2022-10-24    2022-10
2 2018-10-25 01:00:00      3 2018-10-25    2018-10
3 2018-10-24 01:00:00      4 2018-10-24    2018-10
4 2018-10-23 01:00:00      5 2018-10-23    2018-10

--- averages ---

            Price
Year_Month       
2018-10       4.0
2022-10       1.5

--- sizes ---

Year_Month
2018-10    3
2022-10    2
dtype: int64

編輯：

data = df.groupby('Year_Month').agg({'Price':['mean','size']})
print(data)

結果：

           Price     
            mean size
Year_Month           
2018-10      4.0    3
2022-10      1.5    2

編輯：

.groupby()具有和.apply()執行更復雜功能的示例。

后來它使用.to_dict()和.plot()

import pandas as pd

#df = pd.read_excel('short data.xlsx', parse_dates=['Closed Date Time'])

data = {
    'Closed Date Time': ['2022.10.25 01:00', '2022.10.24 01:00', '2018.10.25 01:00', '2018.10.24 01:00', '2018.10.23 01:00'],
    'Price': [1, 2, 3, 4, 5],
    'User': ['A','A','A','B','C'],
}

df = pd.DataFrame(data)
#print(df)

df['Closed Date Time'] = pd.to_datetime(df['Closed Date Time'])
df['Close_Date'] = df['Closed Date Time'].dt.date
df['Close_Date'] = pd.to_datetime(df['Close_Date'])

df['Year_Month'] = df['Close_Date'].dt.strftime('%Y-%m')
#print(df)

def calculate(group):
    #print(group)
    #print(group['Price'].mean())
    #print(group['User'].unique().size)
    result = {
        'Mean':  group['Price'].mean(),
        'Users': group['User'].unique().size,
        'Div':   group['Price'].mean()/group['User'].unique().size
    }
    return pd.Series(result)
    
data = df.groupby('Year_Month').apply(calculate)

print(data)

print('--- dict ---')
print(data.to_dict())
#print(data.to_dict('dict'))

print('--- records ---')
print(data.to_dict('records'))

print('--- list ---')
print(data.to_dict('list'))

print('--- index ---')
print(data.to_dict('index'))

import matplotlib.pyplot as plt
data.plot(kind='bar', rot=0)
plt.show()

結果：

            Mean  Users       Div
Year_Month                       
2018-10      4.0    3.0  1.333333
2022-10      1.5    1.0  1.500000
--- dict ---
{'Mean': {'2018-10': 4.0, '2022-10': 1.5}, 'Users': {'2018-10': 3.0, '2022-10': 1.0}, 'Div': {'2018-10': 1.3333333333333333, '2022-10': 1.5}}
--- records ---
[{'Mean': 4.0, 'Users': 3.0, 'Div': 1.3333333333333333}, {'Mean': 1.5, 'Users': 1.0, 'Div': 1.5}]
--- list ---
{'Mean': [4.0, 1.5], 'Users': [3.0, 1.0], 'Div': [1.3333333333333333, 1.5]}
--- index ---
{'2018-10': {'Mean': 4.0, 'Users': 3.0, 'Div': 1.3333333333333333}, '2022-10': {'Mean': 1.5, 'Users': 1.0, 'Div': 1.5}}

回圈創建多個串列，具有不同的命名

轉載請註明出處，本文鏈接：https://www.uj5u.com/qiye/519803.html

標籤：Python熊猫列表循环条件语句

上一篇：求和遞回函式將值回傳給變數

下一篇：在R中撰寫回圈函式