我有一個巨大的熊貓資料框,如下所示。
d = {'id1': ['85643', '85644','discription','8564312','8564314','85645','discription','85646','8564318','85647','85648','85649','85655','56731','34566','78931','78931'],'ID': ['G-00001', 'G-00001','G-00002','G-00002','G-00002','G-00001','G-00001','G-00001','G-00001','G-00001','G-00002','G-00002','G-00002','G-00002','G-00003','G-00003','G-00003'],'col1': [np.nan, np.nan,5,np.nan,4,np.nan,60,np.nan,0,6,3,2,4,32,3,1,23],'Goal': [np.nan, 56,78,np.nan,89,73,np.nan ,np.nan ,np.nan, np.nan, np.nan, 34,np.nan, 7, 84,np.nan,5 ],'col2': [793, np.nan,8,np.nan,43,np.nan,610,np.nan,0,16,23,72,48,3,28,5,3],'col3': [500, np.nan,89,np.nan,44,np.nan,60,np.nan,5,6,3,2,4,13,12,14,98],'Date': ['2021-06-13', '2021-06-13','2021-06-14','2021-06-13','2021-06-14','2021-06-15','2021-06-15','2021-06-13','2021-06-16','2021-06-13','2021-06-13','2021-06-13','2021-06-16','2021-05-23','2021-05-13','2021-03-26','2021-05-13']}
df = pd.DataFrame(data=d)
df
id1 ID col1 Goal col2 col3 Date
0 85643 G-00001 NaN NaN 793.000 500.000 2021-06-13
1 85644 G-00001 NaN 56.0000 NaN NaN 2021-06-13
2 discription G-00002 5.0000 78.0000 8.0000 89.0000 2021-06-14
3 8564312 G-00002 NaN NaN NaN NaN 2021-06-13
4 8564314 G-00002 4.0000 89.0000 43.0000 44.0000 2021-06-14
5 85645 G-00001 NaN 73.0000 NaN NaN 2021-06-15
6 discription G-00001 60.0000 NaN 610.000 60.0000 2021-06-15
7 85646 G-00001 NaN NaN NaN NaN 2021-06-13
8 8564318 G-00001 0.0000 NaN 0.0000 5.0000 2021-06-16
9 85647 G-00001 6.0000 NaN 16.0000 6.0000 2021-06-13
10 85648 G-00002 3.0000 NaN 23.0000 3.0000 2021-06-13
11 85649 G-00002 2.0000 34.0000 72.0000 2.0000 2021-06-13
12 85655 G-00002 4.0000 NaN 48.0000 4.0000 2021-06-16
13 56731 G-00002 32.0000 7.0000 3.0000 13.0000 2021-05-23
14 34566 G-00003 3.0000 84.0000 28.0000 12.0000 2021-05-13
15 78931 G-00003 1.0000 NaN 5.0000 14.0000 2021-03-26
16 78931 G-00003 23.0000 5.0000 3.0000 98.0000 2021-05-13
所以我想在滿足以下條件時填寫“col1”、“col2”和“col3”列。當“id1”列有單詞“discription”時,我想使用“id1”“discription”行中的值填充“col1”、“col2”和“col3”的上行值。
例如索引 2 有單詞“描述”,我想使用“描述”中的“col1”、“col2”和“col3”中的值為“col1”、“col2”和“col3”填充索引 1 中的 na 值線。我不想為“col1”、“col2”和“col3”填寫所有其他空值。
所以最終的資料框應該如下所示。
id1 ID col1 Goal col2 col3 Date
0 85643 G-00001 NaN NaN 793.000 500.000 2021-06-13
1 85644 G-00001 5.0000 56.0000 8.0000 89.0000 2021-06-13
2 discription G-00002 5.0000 78.0000 8.0000 89.0000 2021-06-14
3 8564312 G-00002 NaN NaN NaN NaN 2021-06-13
4 8564314 G-00002 4.0000 89.0000 43.0000 44.0000 2021-06-14
5 85645 G-00001 60.0000 73.0000 610.000 60.0000 2021-06-15
6 discription G-00001 60.0000 NaN 610.000 60.0000 2021-06-15
7 85646 G-00001 NaN NaN NaN NaN 2021-06-13
8 8564318 G-00001 0.0000 NaN 0.0000 5.0000 2021-06-16
9 85647 G-00001 6.0000 NaN 16.0000 6.0000 2021-06-13
10 85648 G-00002 3.0000 NaN 23.0000 3.0000 2021-06-13
11 85649 G-00002 2.0000 34.0000 72.0000 2.0000 2021-06-13
12 85655 G-00002 4.0000 NaN 48.0000 4.0000 2021-06-16
13 56731 G-00002 32.0000 7.0000 3.0000 13.0000 2021-05-23
14 34566 G-00003 3.0000 84.0000 28.0000 12.0000 2021-05-13
15 78931 G-00003 1.0000 NaN 5.0000 14.0000 2021-03-26
16 78931 G-00003 23.0000 5.0000 3.0000 98.0000 2021-05-13
有可能在python中做到這一點嗎?任何建議將不勝感激。提前致謝!
uj5u.com熱心網友回復:
discription屏蔽in 中沒有的行id1,然后shift將屏蔽的資料幀向上一個單位并填充所需列中的 nan 值
c = ['col1', 'col2', 'col3']
mask = df['id1'].eq('discription')
df[c] = df[c].fillna(df[c].where(mask).shift(-1))
id1 ID col1 Goal col2 col3 Date
0 85643 G-00001 NaN NaN 793.0 500.0 2021-06-13
1 85644 G-00001 5.0 56.0 8.0 89.0 2021-06-13
2 discription G-00002 5.0 78.0 8.0 89.0 2021-06-14
3 8564312 G-00002 NaN NaN NaN NaN 2021-06-13
4 8564314 G-00002 4.0 89.0 43.0 44.0 2021-06-14
5 85645 G-00001 60.0 73.0 610.0 60.0 2021-06-15
6 discription G-00001 60.0 NaN 610.0 60.0 2021-06-15
7 85646 G-00001 NaN NaN NaN NaN 2021-06-13
8 8564318 G-00001 0.0 NaN 0.0 5.0 2021-06-16
9 85647 G-00001 6.0 NaN 16.0 6.0 2021-06-13
10 85648 G-00002 3.0 NaN 23.0 3.0 2021-06-13
11 85649 G-00002 2.0 34.0 72.0 2.0 2021-06-13
12 85655 G-00002 4.0 NaN 48.0 4.0 2021-06-16
13 56731 G-00002 32.0 7.0 3.0 13.0 2021-05-23
14 34566 G-00003 3.0 84.0 28.0 12.0 2021-05-13
15 78931 G-00003 1.0 NaN 5.0 14.0 2021-03-26
16 78931 G-00003 23.0 5.0 3.0 98.0 2021-05-13
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/426271.html
