Suppose my pandas DataFrame looks like this:
lst = [45.45454545454545, 45.45454545454545, 45.45454545454545, 45.45454545454545, 45.45454545454545, 36.36363636363637, 36.36363636363637, 36.36363636363637, 27.27272727272727, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 27.27272727272727, 0.0, 0.0, 27.27272727272727, 0.0, 0.0, 0.0, 0.0, 27.27272727272727, 0.0, 0.0, 0.0, 36.36363636363637, 0.0, 27.27272727272727, 0.0, 27.27272727272727, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 27.27272727272727, 27.27272727272727, 54.54545454545454, 27.27272727272727, 36.36363636363637, 36.36363636363637, 54.54545454545454, 36.36363636363637, 45.45454545454545, 45.45454545454545, 36.36363636363637, 36.36363636363637, 45.45454545454545, 45.45454545454545, 36.36363636363637, 45.45454545454545, 36.36363636363637, 45.45454545454545, 36.36363636363637, 45.45454545454545, 36.36363636363637, 36.36363636363637, 36.36363636363637, 0.0, 36.36363636363637, 27.27272727272727, 0.0, 36.36363636363637, 0.0, 36.36363636363637, 36.36363636363637, 0.0, 0.0, 27.27272727272727, 0.0, 36.36363636363637, 0.0, 0.0, 0.0, 0.0, 36.36363636363637, 36.36363636363637, 0.0, 36.36363636363637, 36.36363636363637, 27.27272727272727, 27.27272727272727, 36.36363636363637, 36.36363636363637, 36.36363636363637, 36.36363636363637, 0.0, 27.27272727272727, 0.0, 0.0, 0.0, 27.27272727272727, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 27.27272727272727, 36.36363636363637, 0.0, 0.0, 0.0, 0.0, 0.0]
import pandas as pd
df = pd.DataFrame(lst, columns=['%'])
df.index.name = 'Time/ps'
df
Now I want to find the first "Time/ps" at which "%" drops to 0.0, but only if the zeros continue for at least 5 consecutive rows. I tried the following code, which partly works:
for k, v in df[df['%'] == 0.000000].groupby((df['%'] != 0.000000).cumsum()):
    print(f'[group {k}]')
    print(v)
    print('\n')
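The problem is that I don't know how to pick out only the runs where "%" is 0.0 for at least 5 consecutive rows. This code prints every occurrence; I can scroll through the output, but I'd like to automate it. The output I want looks like this: Time/ps: 9
Thanks for any suggestions.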
Answer from a uj5u.com user:
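A naive answer would be to take your code and do the following:
for k, v in df[df['%'] == 0.000000].groupby((df['%'] != 0.000000).cumsum()):
    if len(v) >= 5:  # "at least 5" rows, so >= rather than >
        # k is only the group key (a running count of non-zero rows); the time is in the index
        print("Time/ps:", v.index[0])
        break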
A better approach might be this:
df[df['%'] == 0.000000].groupby((df['%'] != 0.000000).cumsum()).filter(lambda x: len(x) >= 5)
I took your groupby code and then used filter to drop the groups with fewer than 5 rows.
This gives the following DataFrame:
           %
Time/ps
9        0.0
10       0.0
11       0.0
12       0.0
13       0.0
14       0.0
15       0.0
16       0.0
17       0.0
35       0.0
36       0.0
37       0.0
38       0.0
39       0.0
40       0.0
41       0.0
99       0.0
100      0.0
101      0.0
102      0.0
103      0.0
104      0.0
105      0.0
106      0.0
109      0.0
110      0.0
111      0.0
112      0.0
113      0.0
You said you want the first occurrence:
index = df[df['%'] == 0.000000].groupby((df['%'] != 0.000000).cumsum()).filter(lambda x: len(x) >= 5).iloc[0].name
print('Time/ps:', index)
# Time/ps: 9
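To make the grouping trick easier to reuse, here is a minimal sketch that wraps it in a function. The name first_zero_run_start and the min_len parameter are my own and not part of the original answer, and the demo series is made-up data. The idea is the same: the cumulative count of non-zero rows acts as a run id, so every row of one uninterrupted run of zeros gets the same id.

```python
import pandas as pd

def first_zero_run_start(series, min_len=5):
    """Return the index label where the first run of >= min_len zeros starts, or None.

    Hypothetical helper for illustration; not from the original answer.
    """
    is_zero = series.eq(0)
    # The counter increases at every non-zero row, so all rows of one
    # uninterrupted run of zeros share the same run id.
    run_id = (~is_zero).cumsum()
    zeros = series[is_zero]
    # For each zero row, the length of the run it belongs to.
    run_len = zeros.groupby(run_id[is_zero]).transform("size")
    long_runs = zeros[run_len >= min_len]
    return None if long_runs.empty else long_runs.index[0]

# Made-up demo data: a run of 2 zeros, then a run of 3 zeros.
demo = pd.Series([3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0])
print(first_zero_run_start(demo, min_len=3))  # 4
print(first_zero_run_start(demo, min_len=4))  # None
```

With the question's data you would call it as first_zero_run_start(df['%'], 5).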
Answer from a uj5u.com user:
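If all values are non-negative, you can take a rolling-window sum of length 5 and find the first zero with argmin:
k = 5
df.index[df['%'].rolling(k).sum().argmin() - k + 1]
(If negative values are possible, call .abs() before rolling().)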
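A possible variation on the rolling idea, sketched under my own assumptions: instead of summing the values themselves, sum a 0/1 "is zero" flag and look for a window whose sum equals k. This way the result does not depend on the sign of the non-zero values, and it returns None when no such window exists. The function name first_zero_window and the demo data are hypothetical.

```python
import pandas as pd

def first_zero_window(series, k=5):
    """Index label of the first row that starts k consecutive zeros, or None.

    Hypothetical helper for illustration; not from the original answer.
    """
    # 1 where the value is exactly zero, 0 elsewhere.
    is_zero = series.eq(0).astype(int)
    # A window sum of k means all k rows in the window are zero.
    # The first k-1 windows are NaN and therefore compare as False.
    full = is_zero.rolling(k).sum().eq(k)
    if not full.any():
        return None
    end_pos = full.to_numpy().argmax()  # position of the window's last row
    return series.index[end_pos - k + 1]

# Made-up demo data; -1.0 and 1.0 would cancel in a plain rolling sum.
demo = pd.Series([-1.0, 1.0, 0.0, 0.0, 0.0, 2.0], index=list("abcdef"))
print(first_zero_window(demo, k=3))  # c
print(first_zero_window(demo, k=4))  # None
```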
Answer from a uj5u.com user:
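import pandas as pd
import numpy as np
number = 0.0  # value to look for (the original also tried 45.45454545454545)
df = pd.DataFrame(lst, columns=['p'])  # use a more convenient column name
# dif is 0 only where the current value equals each of the 4 previous values
df['dif'] = np.abs(df.p.diff(1)) + np.abs(df.p.diff(2)) + np.abs(df.p.diff(3)) + np.abs(df.p.diff(4))
positions = df.index[(df.p == number) & (df.dif == 0.0)]
first_index = positions[0] - 4  # step back from the 5th row of the run to its first row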
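The diff trick above is written out for exactly 5 rows. As a rough sketch of how it might be generalised to a run length k, the lagged differences can be summed in a loop. The function name first_constant_run and its parameters are my own invention, the demo list is made-up data, and the code assumes a default integer index so that subtracting k - 1 from the label makes sense.

```python
import numpy as np
import pandas as pd

def first_constant_run(values, target=0.0, k=5):
    """Index where the first run of k rows equal to target starts, or None.

    Hypothetical helper for illustration; assumes a default RangeIndex.
    """
    s = pd.Series(values)
    # Sum of |s[i] - s[i-j]| for j = 1..k-1. It is zero only if the k-1
    # previous rows all equal the current row; early rows give NaN and
    # therefore never match.
    dif = sum(np.abs(s.diff(j)) for j in range(1, k))
    positions = s.index[(s == target) & (dif == 0.0)]
    # positions[0] is the last row of the first run, so step back k-1 rows.
    return None if len(positions) == 0 else positions[0] - (k - 1)

# Made-up demo data: a run of 2 zeros, then a run of 3 zeros.
demo = [1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
print(first_constant_run(demo, k=3))  # 4
print(first_constant_run(demo, k=4))  # None
```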
Tags: Python python-3.x pandas dataframe
