Python pandas: selecting rows based on a datetime condition

This is code that generates sample mock data. The real data may have different start and end dates.

import pandas as pd
import numpy as np

dates = pd.date_range("20100121", periods=3653)
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=list("A"))
# take the last daily value in each business-day bin
dfb = df.resample('B').last()

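From dfb, I want to select only the rows that belong to months for which every date is present. In dfb, the data for January 2010 and January 2020 is incomplete, so I want the data from February 2010 through December 2019.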

For this particular dataset, I can do

df_out=dfb['2010-02':'2019-12']

but please help me find a better solution.

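Edit: there seems to be a lot of confusion about this question. I want to omit the rows of a month that does not start on the first day of the month, and the rows of a month that does not end on the last day of the month. I hope that is clear.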

A uj5u.com user replied:

When you say a "better" solution, I assume you mean making the range dynamic based on the input data.

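OK, since you mention that your data is continuous after the start date, it is safe to assume the dates are sorted in ascending order. With that in mind, consider the following code: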

import pandas as pd
import numpy as np
from datetime import timedelta

dates = pd.date_range("20100121", periods=3653)
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=list("A"))
print(df)
dfb = df.resample('B').last()

# fd is the first index in your dataframe
fd = df.index[0]
# checks if the first month's data is incomplete, i.e. does not start on day 1
if fd.day != 1:
    # move to day 1 of the next month, rolling the year over after December
    if fd.month == 12:
        first_day_of_next_month = fd.replace(year=fd.year + 1, month=1, day=1)
    else:
        first_day_of_next_month = fd.replace(month=fd.month + 1, day=1)
else:
    first_day_of_next_month = fd

# ld is the last index in your dataframe
ld = df.index[-1]
# computes the next day
next_day = ld + timedelta(days=1)
# "!=" rather than ">" so that 31 December -> 1 January counts as a month change
if next_day.month != ld.month:
    last_day_of_prev_month = ld  # keeps the index if the month changes
else:
    last_day_of_prev_month = ld.replace(day=1) - timedelta(days=1)

df_out = dfb[first_day_of_next_month:last_day_of_prev_month]

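There is another way to do this, using dateutil.relativedelta, which requires the python-dateutil module (pandas itself depends on it, so it is normally already installed alongside pandas). The solution above tries to do the job without using any extra modules.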
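For comparison, here is a minimal sketch of what the relativedelta variant might look like. The helper name complete_month_bounds is made up for illustration, and the sketch assumes the daily index is sorted and has no gaps, as in the sample data.

```python
import numpy as np
import pandas as pd
from dateutil.relativedelta import relativedelta


def complete_month_bounds(index):
    """Return (start, end) spanning only the fully present calendar months.

    Hypothetical helper; assumes `index` is a sorted, gap-free daily
    DatetimeIndex.
    """
    first, last = index[0], index[-1]
    # If the data does not begin on day 1, jump to day 1 of the next month.
    # relativedelta handles the December -> January rollover by itself.
    start = first if first.day == 1 else first + relativedelta(months=1, day=1)
    # If the day after the last date is still in the same month, the final
    # month is incomplete: step back to the last day of the previous month.
    next_day = last + relativedelta(days=1)
    if next_day.month != last.month:
        end = last
    else:
        end = last + relativedelta(day=1, days=-1)
    return start, end


dates = pd.date_range("20100121", periods=3653)
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=list("A"))
dfb = df.resample("B").last()

start, end = complete_month_bounds(df.index)
df_out = dfb.loc[start:end]
```

For the sample data this should give a range from 2010-02-01 to 2019-12-31. The bounds are computed from the daily frame df, not from dfb, because a business-day index rarely starts on day 1 of a month.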

A uj5u.com user replied:

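I assume that, in the general case, the table is in chronological order (if it is not, use .sort_index). The idea is to extract the year and month from the dates and select only the rows whose (year, month) differs from that of both the first row and the last row.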

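dfb['year'] = dfb.index.year
dfb['month'] = dfb.index.month

# rows sharing (year, month) with the first row, and with the last row;
# columns are looked up by name so the code does not depend on column order
first_month = (dfb['year'] == dfb['year'].iloc[0]) & (dfb['month'] == dfb['month'].iloc[0])
last_month = (dfb['year'] == dfb['year'].iloc[-1]) & (dfb['month'] == dfb['month'].iloc[-1])

dfb = dfb.loc[(~first_month) & (~last_month)]
dfb = dfb.drop(['year', 'month'], axis=1)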
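Note that the code above always drops the first and the last month, even when they happen to be complete. A possible variation, sketched below, counts the daily rows in each month and keeps only the months whose count matches the calendar length of the month. It assumes the daily frame df has at most one row per day; the names rows_per_month and complete are illustrative only.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20100121", periods=3653)
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=list("A"))
dfb = df.resample("B").last()

# Number of daily rows observed in each calendar month of the raw data.
rows_per_month = df.groupby(df.index.to_period("M")).size()

# A month counts as complete when it has one row for every calendar day.
complete = [m for m, n in rows_per_month.items() if n == m.days_in_month]

# Keep only the business-day rows that fall inside a complete month.
df_out = dfb[dfb.index.to_period("M").isin(complete)]
```

Unlike the boundary-based approaches, this also leaves out a month in the middle of the data if some of its days are missing.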

Please credit the source when reposting. Link to this article: https://www.uj5u.com/shujuku/432177.html

Tags: Python, pandas, dataframe, datetime
