我想在實際故障 (1) 前 3 天將健康標簽 (0) 重新標記為故障標簽 (1),就像他們在附加鏈接中所做的那樣:
uj5u.com熱心網友回復:
這是一個有效的解決方案。注意:為了清楚起見,我更愿意完全重寫它。
import pandas as pd
import numpy as np
import datetime
from datetime import date, timedelta
df = pd.read_excel('content/failure.xlsx')
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(by="date").reindex_like(df) # We need to rebuild the index after sorting
d = datetime.timedelta(days = 3)
failed=df[df['failure']==1] # disks that actually failed
for hdd in failed['serial_number']: # for each hdd
ind=df.index[df['serial_number']==hdd] # look for that patricular hdd in df
# Note: the last element of ind corresponds to the failure date for hdd
failure_date=df.iloc[ind[-1],1] # [rows,column==1] --> (:,'date')
for i in ind[:-1]:
if (failure_date - df.iloc[i,1]).days <= d.days:
df.iloc[i,3]=1 # set failure to 1
print('hdd: A')
print(df[df['serial_number']=='A'])
print('hdd: C')
print(df[df['serial_number']=='C'])
print('hdd: H')
print(df[df['serial_number']=='H'])
導致:
hdd: A
model date serial_number failure smart5 smart187
1 M 2014-01-01 A 0 0 60
5 M 2014-01-02 A 0 0 140
7 M 2014-01-03 A 1 0 180
11 M 2014-01-04 A 1 0 260
12 M 2014-01-05 A 1 0 280
16 M 2014-01-06 A 1 0 360
hdd: C
model date serial_number failure smart5 smart187
2 M 2014-01-01 C 0 0 80
4 M 2014-01-02 C 1 0 120
8 M 2014-01-03 C 1 0 200
10 M 2014-01-04 C 1 0 240
13 M 2014-01-05 C 1 0 300
hdd: H
model date serial_number failure smart5 smart187
0 M 2014-01-01 H 0 0 40
3 M 2014-01-02 H 0 0 100
6 M 2014-01-03 H 0 0 160
9 M 2014-01-04 H 1 0 220
14 M 2014-01-05 H 1 0 320
15 M 2014-01-06 H 1 0 340
17 M 2014-01-07 H 1 0 400
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/363572.html
上一篇:按每分鐘的最佳值搜索
