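I'm just wondering whether there is another way to extract the year from a column and assign two new columns, one for the season and one for the year?
I tried this approach and it seems to work, but only for the year and only for selected rows: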
year = df['premiered'].str.findall(r'(\d{4})').str.get(0)
df1 = df.assign(year = year.values)
Output:
|premiered|year|
|---|---|
|Spring 1998|1998|
|Spring 2001|2001|
|Fall 2016|NaN|
|Fall 2016|NaN|
Answer from a uj5u.com user:
Use Series.str.split with the expand option:
expand: expand the split strings into separate columns.
df[['season', 'year']] = df['premiered'].str.split(expand=True)
# premiered season year
# 0 Spring 1998 Spring 1998
# 1 Spring 2001 Spring 2001
# 2 Fall 2016 Fall 2016
# 3 Fall 2016 Fall 2016
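To make the effect of expand concrete, here is a minimal sketch on the sample data from the question. The names as_lists and as_columns are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'premiered': ['Spring 1998', 'Spring 2001', 'Fall 2016', 'Fall 2016']})

# expand=False (the default): a single Series whose elements are lists.
as_lists = df['premiered'].str.split()

# expand=True: a DataFrame with one column per piece,
# which is why it can be assigned to two new columns at once.
as_columns = df['premiered'].str.split(expand=True)

df[['season', 'year']] = as_columns
print(as_lists.iloc[0])   # ['Spring', '1998']
print(as_columns.shape)   # (4, 2)
```

Note that the two-column assignment only works when every row splits into exactly two pieces.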
Or use Series.str.extract with a regular expression:
(\w+) -- capture 1 or more word characters
\s* -- 0 or more whitespace characters
(\d+) -- capture 1 or more digits
df[['season', 'year']] = df['premiered'].str.extract(r'(\w+)\s*(\d+)')
# premiered season year
# 0 Spring 1998 Spring 1998
# 1 Spring 2001 Spring 2001
# 2 Fall 2016 Fall 2016
# 3 Fall 2016 Fall 2016
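As a variation on the extract approach, named groups let pandas name the resulting columns for you, and rows that do not match come back as NaN instead of raising. This is only a sketch: the 'unknown' row is a made-up value to show the non-matching case, and \d{4} assumes four-digit years as in the question.

```python
import pandas as pd

# 'unknown' is a hypothetical row that does not follow the "<season> <year>" pattern.
df = pd.DataFrame({'premiered': ['Spring 1998', 'Fall 2016', 'unknown']})

# Named groups (?P<name>...) become the column names of the extracted DataFrame.
parts = df['premiered'].str.extract(r'(?P<season>\w+)\s*(?P<year>\d{4})')

# Attach the new columns to the original frame by index.
df = df.join(parts)
print(df)
```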
It is also a good idea to convert the new year column to a numeric type:
df['year'] = df['year'].astype(int)
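One caveat: astype(int) fails if the column still contains missing values, like the NaN rows in the question's output. A possible workaround, sketched here on a standalone Series with a made-up missing entry, is pd.to_numeric with errors='coerce' followed by the nullable 'Int64' dtype.

```python
import pandas as pd

# None stands in for a row where no year could be extracted (hypothetical).
year = pd.Series(['1998', '2001', None, '2016'])

# errors='coerce' turns anything unparseable into NaN;
# 'Int64' (capital I) is pandas' nullable integer dtype, so missing values are kept as <NA>.
year_int = pd.to_numeric(year, errors='coerce').astype('Int64')
print(year_int)
```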
Answer from a uj5u.com user:
You can use the split function:
import pandas as pd

data = {'premiered': ['Spring 1998', 'Spring 2001', 'Fall 2016', 'Fall 2016']}
df = pd.DataFrame(data)
df['year'] = df['premiered'].apply(lambda x : x.split(' ')[1])
df
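The same split idea can also be written with the .str accessor instead of apply. A rough sketch follows; the None row is a made-up missing value, added because the lambda version would raise on it while the .str version just yields NaN. It also fills the season column, which the snippet above leaves out.

```python
import pandas as pd

# None is a hypothetical missing value, not part of the original data.
data = {'premiered': ['Spring 1998', 'Spring 2001', 'Fall 2016', None]}
df = pd.DataFrame(data)

# Split once, then index into the resulting lists element-wise.
pieces = df['premiered'].str.split(' ')
df['season'] = pieces.str[0]
df['year'] = pieces.str[1]
print(df)
```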
