How do I sum up timestamps with pandas.groupby()?

I have a log of the classes detected by my script (detection.csv):

HP,0:00:08 
Kellogs,0:02:03 
Rayban,0:00:25 
Skechers,0:00:09 
Rayban,0:04:26 
Skechers,0:02:34 
HP,0:00:57 
Rayban,0:00:14 
HP,0:00:02 
HP,0:00:08 
Kellogs,0:02:06 
Rayban,0:00:26 
Skechers,0:00:10

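Is there a way to sum the durations for each detected class, using pandas.groupby() or any other method?

Note: both columns are stored as strings.

When I use pandas.groupby(), the durations are not summed.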
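
The question does not show the unsummed output, so the following is only a likely explanation, not a confirmed diagnosis: because the timestamp column holds strings, adding two values joins their text instead of adding durations. A minimal sketch of the difference, using two of the HP rows from the log (the variable names are illustrative):

```python
import pandas as pd

# Two HP durations from the log, still stored as text
a, b = '0:00:08', '0:00:57'

# Adding strings concatenates them; no arithmetic happens
concatenated = a + b

# Converting to Timedelta first makes the addition numeric
summed = pd.to_timedelta(a) + pd.to_timedelta(b)

print(concatenated)
print(summed)
```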


Full code:

import numpy as np
import pandas as pd


# Read the raw log line by line
csvdata = []
with open('result2.txt', 'r') as myfile:
    for line in myfile:
        # Swap the field separator for a comma
        # (assumed to be a tab, matching sep='\t' in the answer)
        csvdata.append(line.replace('\t', ','))

#print(csvdata)

# Write the converted lines to detection.csv; the with block closes the file
with open('detection.csv', 'w') as newfile:
    for line in csvdata:
        newfile.write(line)

df = pd.read_csv('detection.csv', names=['class', 'timestamp'], header=None)

#ndf=df.groupby(['class'])['timestamp'].sum()
#print(ndf)


# Convert the 'H:MM:SS' strings to timedeltas so they can be added
df['timestamp'] = pd.to_timedelta(df['timestamp'])

def format_timedelta(x):
    # Turn a Timedelta back into an 'H:MM:SS' string
    ts = x.total_seconds()
    hours, remainder = divmod(ts, 3600)
    minutes, seconds = divmod(remainder, 60)
    return '{}:{:02d}:{:02d}'.format(int(hours), int(minutes), int(seconds))

# Sum the durations per class, then format each total
df1 = df.groupby('class')['timestamp'].sum().apply(format_timedelta).reset_index()
print(df1)

A uj5u.com community member replied:

Yes. Convert the column to timedeltas with to_timedelta, then aggregate with sum (this example names the columns company and time):

df['time'] = pd.to_timedelta(df['time'])

df1 = df.groupby('company', as_index=False)['time'].sum()
print (df1)
    company            time
0        HP 0 days 00:01:15
1   Kellogs 0 days 00:04:09
2    Rayban 0 days 00:05:31
3  Skechers 0 days 00:02:53
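
For readers who want to run this end to end, here is a self-contained sketch of the same approach. It is an illustration rather than the answerer's exact code: it keeps the question's column names (class, timestamp) and, as an assumption made for convenience, loads the sample rows from an in-memory string via io.StringIO instead of detection.csv.

```python
import io
import pandas as pd

# Sample rows copied from the detection.csv log in the question
raw = """HP,0:00:08
Kellogs,0:02:03
Rayban,0:00:25
Skechers,0:00:09
Rayban,0:04:26
Skechers,0:02:34
HP,0:00:57
Rayban,0:00:14
HP,0:00:02
HP,0:00:08
Kellogs,0:02:06
Rayban,0:00:26
Skechers,0:00:10
"""

# io.StringIO stands in for the real file (illustrative only)
df = pd.read_csv(io.StringIO(raw), names=['class', 'timestamp'], header=None)

# Strings cannot be added as durations, so convert them first
df['timestamp'] = pd.to_timedelta(df['timestamp'])

# One row per class with the summed duration
totals = df.groupby('class', as_index=False)['timestamp'].sum()
print(totals)
```

The per-class totals should agree with the table printed above.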

To get back the original format, use a custom function:

df['time'] = pd.to_timedelta(df['time'])

def format_timedelta(x):
    ts = x.total_seconds()
    hours, remainder = divmod(ts, 3600)
    minutes, seconds = divmod(remainder, 60)
    return ('{}:{:02d}:{:02d}').format(int(hours), int(minutes), int(seconds)) 
        
df1 = df.groupby('company')['time'].sum().apply(format_timedelta).reset_index()
print (df1)
    company     time
0        HP  0:01:15
1   Kellogs  0:04:09
2    Rayban  0:05:31
3  Skechers  0:02:53
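
One property of format_timedelta worth noting: it works from total_seconds(), so the hour field keeps counting past 24 rather than rolling over into days. A small check of that behaviour follows; the 25-hour value is made up for illustration and does not come from the log.

```python
import pandas as pd

def format_timedelta(x):
    # Same logic as the function above: split total seconds into h, m, s
    ts = x.total_seconds()
    hours, remainder = divmod(ts, 3600)
    minutes, seconds = divmod(remainder, 60)
    return '{}:{:02d}:{:02d}'.format(int(hours), int(minutes), int(seconds))

# HP total from the log (75 seconds)
short = format_timedelta(pd.Timedelta(seconds=75))

# Hypothetical total longer than one day
over_a_day = format_timedelta(pd.Timedelta(hours=25, minutes=3, seconds=4))

print(short)
print(over_a_day)
```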

Edit: you can simplify your code:

# Read the raw log line by line
csvdata = []
with open('result2.txt', 'r') as myfile:
    for line in myfile:
        # Swap the field separator for a comma
        # (assumed to be a tab, matching sep='\t' below)
        csvdata.append(line.replace('\t', ','))

#print(csvdata)

# Write the converted lines to detection.csv; the with block closes the file
with open('detection.csv', 'w') as newfile:
    for line in csvdata:
        newfile.write(line)

df = pd.read_csv('detection.csv', names=['class', 'timestamp'], header=None)

to:

#convert txt with tab separator
df=pd.read_csv('result2.txt',names=['class', 'timestamp'],header=None, sep='\t')
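
A runnable sketch of this shortcut combined with the aggregation, assuming result2.txt holds one tab-separated class and duration per line. An in-memory buffer stands in for the file here, and the three rows are taken from the log in the question:

```python
import io
import pandas as pd

# Stand-in for result2.txt: tab-separated, no header row (assumed layout)
fake_file = io.StringIO("HP\t0:00:08\nKellogs\t0:02:03\nHP\t0:00:57\n")

# sep='\t' lets read_csv parse the text file directly, no intermediate CSV
df = pd.read_csv(fake_file, names=['class', 'timestamp'], header=None, sep='\t')
df['timestamp'] = pd.to_timedelta(df['timestamp'])

# Summed duration per class
totals = df.groupby('class')['timestamp'].sum()
print(totals)
```

With a real file, passing the path 'result2.txt' in place of the buffer should behave the same way, provided the file really is tab-separated.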


Tags: Python python-3.x pandas dataframe python-2.7
