How can I convert the values of Years_in_service to their corresponding decimal/float values?
For example, '5 year(s), 7 month(s), 3 day(s)' has a decimal value of 5.59.
import pandas as pd
import numpy as np
data = {'ID':['A1001','A5001','B1001','D5115','K4910'],
'Years_in_service': ['5 year(s), 7 month(s), 3 day(s)', '16 year(s), 0 month(s), 25 day(s)',
'7 year(s), 0 month(s), 2 day(s)', '0 year(s), 11 month(s), 23 day(s)','1 year(s), 0 month(s), 6 day(s)'],
'Age': [45, 59,21,18,35]}
df = pd.DataFrame(data)
df
So far I can only extract the years (see my attempt below):
df['Years_in_service'].str[:2].astype(float)
Please show your full code. Thanks for any attempt.
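Reply from a uj5u.com user:
Here is one way:
def convert_dates(y, m, d):
    # Years, plus months as a fraction of a year, plus days as a fraction of a year.
    return round(int(y) + int(m)/12 + int(d)/365.25, 2)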
df['date_float'] = df['Years_in_service'].apply(lambda x: convert_dates(*[int(i) for i in x.split(' ') if i.isnumeric()]))
print(df)
ID Years_in_service Age date_float
0 A1001 5 year(s), 7 month(s), 3 day(s) 45 5.59
1 A5001 16 year(s), 0 month(s), 25 day(s) 59 16.07
2 B1001 7 year(s), 0 month(s), 2 day(s) 21 7.01
3 D5115 0 year(s), 11 month(s), 23 day(s) 18 0.98
4 K4910 1 year(s), 0 month(s), 6 day(s) 35 1.02
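Note:
*[int(i) for i in x.split(' ') if i.isnumeric()] <- this expression builds a list of the numeric tokens, unpacks it, and passes the numbers as positional arguments to the convert_dates function.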
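To make the unpacking step concrete, here is a minimal sketch that runs the same expression on a single string, outside of pandas. The intermediate names text, tokens, and numbers are only illustrative and do not appear in the answer.

```python
def convert_dates(y, m, d):
    # Years, plus months as a fraction of a year, plus days as a fraction of a year.
    return round(int(y) + int(m) / 12 + int(d) / 365.25, 2)

text = '5 year(s), 7 month(s), 3 day(s)'

# Splitting on spaces gives:
# ['5', 'year(s),', '7', 'month(s),', '3', 'day(s)']
tokens = text.split(' ')

# Only the purely numeric tokens survive the isnumeric() filter: [5, 7, 3]
numbers = [int(t) for t in tokens if t.isnumeric()]

# The * operator spreads the list into the three positional arguments y, m, d.
result = convert_dates(*numbers)
print(result)  # 5.59
```

This relies on each string containing exactly three space-separated numeric tokens; any other count would make the call to convert_dates fail with a TypeError.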
Reply from a uj5u.com user:
How about this?
After setting up the data:
import pandas as pd
import numpy as np
data = {'ID':['A1001','A5001','B1001','D5115','K4910'],
'Years_in_service': ['5 year(s), 7 month(s), 3 day(s)', '16 year(s), 0 month(s), 25 day(s)',
'7 year(s), 0 month(s), 2 day(s)', '0 year(s), 11 month(s), 23 day(s)','1 year(s), 0 month(s), 6 day(s)'],
'Age': [45, 59,21,18,35]}
df = pd.DataFrame(data)
Do this:
returnlist = []
for each in df['Years_in_service']:
    # Take the leading number from each comma-separated part.
    years, months, days = [float(i.strip().split(' ')[0]) for i in each.split(',')]
    returnlist.append(years + months/12 + days/365.25)
for each in returnlist:
    print(f'Years in service: {each:.2f}')
# Result:
# Years in service: 5.59
# Years in service: 16.07
# Years in service: 7.01
# Years in service: 0.98
# Years in service: 1.02
You can make it more compact (but less readable) like this. I don't think there is any computational benefit, but here is the idea anyway:
for each in df['Years_in_service']:
    returnlist.append(np.sum(np.array([1, 1/12, 1/365.25])*np.array([float(i.strip().split(' ')[0]) for i in each.split(',')])))
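The compact form is a weighted sum of the three parsed numbers. A minimal sketch of the same idea without NumPy follows, assuming every string has exactly three comma-separated parts in year, month, day order. WEIGHTS and to_decimal_years are illustrative names, not part of the answer.

```python
# Conversion weights: one year, one month (1/12 year), one day (1/365.25 year).
WEIGHTS = (1, 1 / 12, 1 / 365.25)

def to_decimal_years(text):
    # Take the leading number of each comma-separated part,
    # e.g. '7' from ' 7 month(s)'.
    values = [float(part.strip().split(' ')[0]) for part in text.split(',')]
    # Weighted sum: years * 1 + months / 12 + days / 365.25.
    return sum(w * v for w, v in zip(WEIGHTS, values))

print(f"{to_decimal_years('16 year(s), 0 month(s), 25 day(s)'):.2f}")  # 16.07
```

Naming the weights keeps the conversion factors in one place, which is the main readability gain over the one-liner.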
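Reply from a uj5u.com user:
If you don't care about year/month precision, and year, month, and day are always present and in this order, you can use extractall to pull out the 3 numbers, divide them by the average conversion factors, and sum:
df['Total'] = (pd.to_numeric(df['Years_in_service'].str.extractall(r'(\d+)')[0])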
.unstack().div([1, 12, 365.25]).sum(axis=1)
.round(2) # optional
)
Output:
ID Years_in_service Age Total
0 A1001 5 year(s), 7 month(s), 3 day(s) 45 5.59
1 A5001 16 year(s), 0 month(s), 25 day(s) 59 16.07
2 B1001 7 year(s), 0 month(s), 2 day(s) 21 7.01
3 D5115 0 year(s), 11 month(s), 23 day(s) 18 0.98
4 K4910 1 year(s), 0 month(s), 6 day(s) 35 1.02
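The approach above depends on all three components being present. If some strings might omit a component (an assumption; the sample data never does), a plain-Python parser with optional named groups is one possible fallback. PATTERN and service_years are hypothetical names introduced for this sketch.

```python
import re

# Hypothetical pattern: each component is optional, but the order must
# still be year, month, day.
PATTERN = re.compile(
    r'^\s*(?:(?P<years>\d+)\s*year\(s\))?'
    r'(?:,?\s*(?P<months>\d+)\s*month\(s\))?'
    r'(?:,?\s*(?P<days>\d+)\s*day\(s\))?\s*$'
)

def service_years(text):
    match = PATTERN.match(text)
    # Reject strings that do not fit the pattern or contain no component at all.
    if match is None or not any(match.groups()):
        raise ValueError(f'unrecognized duration: {text!r}')
    # Missing components count as zero.
    parts = {k: int(v) if v is not None else 0
             for k, v in match.groupdict().items()}
    return round(parts['years'] + parts['months'] / 12 + parts['days'] / 365.25, 2)

print(service_years('5 year(s), 7 month(s), 3 day(s)'))  # 5.59
print(service_years('7 month(s)'))                       # 0.58
```

It could be applied to the column with df['Years_in_service'].apply(service_years), at the cost of being slower than the vectorized extractall version on large frames.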
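Reply from a uj5u.com user:
i = 1
for name in ['month', 'day']:
    # Part i of the comma-separated string looks like ' 7 month(s)'; token [1] is the number.
    df[name] = [date.split(',')[i].split(' ')[1] for date in df['Years_in_service']]
    df[name] = df[name].astype('float64')
    i += 1
df['years'] = [date.split(',')[0].split(' ')[0] for date in df['Years_in_service']]
df['years'] = df['years'].astype('float64')
# Convert months and days to fractions of a year.
i = 12
for name in ['month', 'day']:
    df[name] = [x/i for x in df[name]]
    i = 365.25
l = []
for i in range(len(df.index)):
    # Columns 3, 4, 5 are month, day, years (assumes df started with only ID, Years_in_service, Age).
    l.append(round(df.iloc[i, 3] + df.iloc[i, 4] + df.iloc[i, 5], 2))
df['decimal_date'] = l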
Please credit the source when reposting. Link to this article: https://www.uj5u.com/yidong/532777.html
