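I have code that checks several date columns for values where i >= "2022-03-01" and i <= "2024-12-31", then appends them to a list, ext = [].
What I want is to be able to extract more information located on the same row.
My code: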
from pandas import *
data = read_csv("Book1.csv")
# converting column data to list
D_EXT_1 = data['D_EXT_1'].tolist()
D_INT_1 = data['D_INT_1'].tolist()
D_EXT_2 = data['D_EXT_2'].tolist()
D_INT_2 = data['D_INT_2'].tolist()
D_EXT_3 = data['D_EXT_3'].tolist()
D_INT_3 = data['D_INT_3'].tolist()
D_EXT_4 = data['D_EXT_4'].tolist()
D_INT_4 = data['D_INT_4'].tolist()
D_EXT_5 = data['D_EXT_5'].tolist()
D_INT_5 = data['D_INT_5'].toList()
D_EXT_6 = data['D_EXT_6'].toList()
D_INT_6 = data['D_INT_6'].toList()
ext = []
ext = [i for i in D_INT_1 D_INT_2 D_INT_3 D_INT_4 D_INT_5 D_INT_6 if i >= "2022-03-01" and i <= "2024-12-31"]
print(*ext, sep="\n")
Sample data:
NAME,ADRESS,D_INT_1,D_EXT_1,D_INT_2,D_EXT_2
ALEX,h4n1p8,2020-01-01,2024-01-01,2023-02-02,2020-01-01
What my code prints with this data:
2024-01-01
Expected output:
Alex, 2024-01-01
As requested by not_speshal, the output of data.head().to_dict():
{'EMPL. NO': {0: 5}, "NOM A L'EMPLACEMENT": {0: 'C010 - HOPITAL REGIONAL DE RIMOUSKI/CENTRE SERVEUR OPTILAB'}, 'ADRESSE': {0: '150 AVENUE ROULEAU'}, 'VILLE': {0: 'RIMOUSKI'}, 'PROV': {0: 'QC'}, 'OBJET NO': {0: 67}, "EMPLACEMENT DE L'APPAREIL": {0: 'CHAUFFERIE'}, 'RBQ 2018': {0: nan}, "DESCRIPTION DE L'APPAREIL": {0: 'CHAUDIERE AQUA. A VAPEUR'}, 'MANUFACTURIER': {0: 'MIURA'}, 'DIMENSIONS': {0: nan}, 'MAWP': {0: 170}, 'SVP': {0: 150}, 'DERNIERE INSP. EXT.': {0: '2019-05-29'}, 'FREQ. EXT.': {0: 12}, 'DERNIERE INSP. INT.': {0: '2020-06-03'}, 'FREQ. INT.': {0: 12}, 'D_EXT_1': {0: '2020-05-29'}, 'D_INT_1': {0: '2021-06-03'}, 'D_EXT_2': {0: '2021-05-29'}, 'D_INT_2': {0: '2022-06-03'}, 'D_EXT_3': {0: '2022-05-29'}, 'D_INT_3': {0: '2023-06-03'}, 'D_EXT_4': {0: '2023-05-29'}, 'D_INT_4': {0: '2024-06-03'}, 'D_EXT_5': {0: '2024-05-29'}, 'D_INT_5': {0: '2025-06-03'}, 'D_EXT_6': {0: '2025-05-29'}, 'D_INT_6': {0: '2026-06-03'}}
A uj5u.com user replied:
Start with
import pandas as pd
cols = [prefix + str(i) for prefix in ['D_EXT_', 'D_INT_'] for i in range(1, 7)]
data = pd.read_csv("Book1.csv")
for col in cols:
    # assign the converted column back so its dtype becomes datetime64
    data[col] = pd.to_datetime(data[col])
Then use
ext = data[
    (
        data.loc[:, cols].ge(pd.to_datetime("2022-03-01"))
        & data.loc[:, cols].le(pd.to_datetime("2024-12-31"))
    ).any(axis=1)
]
Edit: it is not clear whether you want only the dates that fall within the range, but to get what (as I understand it) you are asking for, use
# assuming
import numpy as np
import pandas as pd
# and
cols = [prefix + str(i) for prefix in ['D_EXT_', 'D_INT_'] for i in range(1, 7)]
ext = data[
    np.concatenate(
        (
            np.setdiff1d(data.columns, cols),
            np.array(
                (
                    data.loc[:, cols].gt(pd.to_datetime("2022-03-01"))
                    & data.loc[:, cols].lt(pd.to_datetime("2024-12-31"))
                ).idxmax(axis=1)
            ),
        ),
        axis=None,
    )
]
where cols is as above.
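The edit above relies on idxmax(axis=1) applied to a boolean mask. A minimal sketch of what that call returns, using an invented two-row frame (the SAM row and the reduced column list are assumptions for illustration, not part of the original data):

```python
import pandas as pd

# Hypothetical frame: row 0 has an in-range date, row 1 has none.
data = pd.DataFrame(
    {
        "NAME": ["ALEX", "SAM"],
        "D_INT_1": pd.to_datetime(["2020-01-01", "2019-01-01"]),
        "D_EXT_1": pd.to_datetime(["2024-01-01", "2019-06-01"]),
    }
)
cols = ["D_INT_1", "D_EXT_1"]

mask = data[cols].gt(pd.Timestamp("2022-03-01")) & data[cols].lt(pd.Timestamp("2024-12-31"))

# idxmax gives the label of the first True in each row ...
first_hit = mask.idxmax(axis=1)
# ... but it also gives the first column for rows that are all False,
# so blank those rows out using any().
first_hit = first_hit.where(mask.any(axis=1))

print(first_hit.tolist())
```

Without the any() guard, a row with no in-range date would still be reported as matching its first date column, which is worth keeping in mind when using idxmax this way.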
A uj5u.com user replied:
IIUC, try:
columns = ['D_EXT_1', 'D_EXT_2', 'D_EXT_3', 'D_EXT_4', 'D_EXT_5', 'D_EXT_6', 'D_INT_1', 'D_INT_2', 'D_INT_3', 'D_INT_4', 'D_INT_5', 'D_INT_6']
data[columns] = data[columns].apply(pd.to_datetime)
output = data[((data[columns]>="2022-03-01")&(data[columns]<="2024-12-31")).any(axis=1)]
This returns all rows in which any date in the listed columns falls between 2022-03-01 and 2024-12-31.
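Note that the sample data in the question only has D_INT_1 through D_EXT_2, so indexing with all twelve hard-coded names would raise a KeyError on it. One possible way around that, sketched here under the assumption that every date column follows the D_EXT_n / D_INT_n naming pattern, is to pick the columns up with DataFrame.filter. The CSV text is inlined so the sketch runs without Book1.csv:

```python
import io

import pandas as pd

# Sample row from the question, inlined in place of Book1.csv.
csv_text = """NAME,ADRESS,D_INT_1,D_EXT_1,D_INT_2,D_EXT_2
ALEX,h4n1p8,2020-01-01,2024-01-01,2023-02-02,2020-01-01
"""
data = pd.read_csv(io.StringIO(csv_text))

# Select whichever D_EXT_n / D_INT_n columns are actually present.
columns = data.filter(regex=r"^D_(EXT|INT)_\d+$").columns.tolist()
data[columns] = data[columns].apply(pd.to_datetime)

start, end = pd.Timestamp("2022-03-01"), pd.Timestamp("2024-12-31")
in_range = (data[columns] >= start) & (data[columns] <= end)
output = data[in_range.any(axis=1)]

print(columns)
print(output["NAME"].tolist())
```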
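A uj5u.com user replied:
It seems you only want the rows in which at least one date falls within the range ["2022-03-01", "2024-12-31"], right?
First, convert all the date columns to datetime by using DataFrame.apply with pandas.to_datetime.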
import pandas as pd
date_cols = ['D_EXT_1', 'D_EXT_2', 'D_EXT_3', 'D_EXT_4', 'D_EXT_5', 'D_EXT_6', 'D_INT_1', 'D_INT_2', 'D_INT_3', 'D_INT_4', 'D_INT_5', 'D_INT_6']
data[date_cols] = data[date_cols].apply(pd.to_datetime)
Then create a 2D boolean mask that marks every date within the desired range
is_between_dates = (data[date_cols] >= "2022-03-01") & (data[date_cols] <= "2024-12-31")
# print(is_between_dates) to clearly understand what it represents
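Finally, select the rows that contain at least one True value, meaning that at least one date in the row falls within the date range. This can be done by calling DataFrame.any with axis=1 on the 2D boolean mask, is_between_dates.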
# again, print(is_between_dates.any(axis=1)) to see
data = data[is_between_dates.any(axis=1)]
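To make the mask and the row selection concrete, here is a small sketch of the three steps on an invented two-row frame. The SAM row and the shortened date_cols list are assumptions added purely to show one row being kept and one being dropped:

```python
import pandas as pd

# Hypothetical frame: part of the question's ALEX row plus an invented out-of-range row.
data = pd.DataFrame(
    {
        "NAME": ["ALEX", "SAM"],
        "D_INT_1": ["2020-01-01", "2019-01-01"],
        "D_EXT_1": ["2024-01-01", "2021-12-31"],
    }
)
date_cols = ["D_INT_1", "D_EXT_1"]
data[date_cols] = data[date_cols].apply(pd.to_datetime)

start, end = pd.Timestamp("2022-03-01"), pd.Timestamp("2024-12-31")
is_between_dates = (data[date_cols] >= start) & (data[date_cols] <= end)

# One boolean per row: True when any date column is in range.
row_has_match = is_between_dates.any(axis=1)

print(is_between_dates)
print(row_has_match.tolist())
print(data.loc[row_has_match, "NAME"].tolist())
```

Only the ALEX row survives here, because its D_EXT_1 value is the only date inside the range.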
A uj5u.com user replied:
Use melt to reshape your dataframe so that it is easier to search:
df = pd.read_csv('Book1.csv').melt(['NAME', 'ADRESS']) \
    .astype({'value': 'datetime64[ns]'}) \
    .query("'2022-03-01' <= value & value <= '2024-12-31'")
At this point your dataframe looks like this:
>>> df
   NAME  ADRESS variable      value
1  ALEX  h4n1p8  D_EXT_1 2024-01-01
2  ALEX  h4n1p8  D_INT_2 2023-02-02
Now it is easy to get the NAME for a given date:
>>> df.loc[df['value'] == '2024-01-01', 'NAME']
1 ALEX
Name: NAME, dtype: object
# OR
>>> df.loc[df['value'] == '2024-01-01', 'NAME'].tolist()
['ALEX']
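Since the question asks for output of the form "name, date", the melted frame can also be turned into such lines directly. The following is only a sketch under the assumption that the CSV has the NAME and ADRESS columns from the sample; the data is inlined in place of Book1.csv, and Series.between is swapped in for the query call:

```python
import io

import pandas as pd

# Sample row from the question, inlined in place of Book1.csv.
csv_text = """NAME,ADRESS,D_INT_1,D_EXT_1,D_INT_2,D_EXT_2
ALEX,h4n1p8,2020-01-01,2024-01-01,2023-02-02,2020-01-01
"""
df = pd.read_csv(io.StringIO(csv_text)).melt(["NAME", "ADRESS"])
df["value"] = pd.to_datetime(df["value"])

# Keep only the dates inside the range (both bounds inclusive).
start, end = pd.Timestamp("2022-03-01"), pd.Timestamp("2024-12-31")
df = df[df["value"].between(start, end)]

# One "NAME, date" line per matching date.
lines = [f"{name}, {value:%Y-%m-%d}" for name, value in zip(df["NAME"], df["value"])]
print(*lines, sep="\n")
```

Because melt keeps NAME on every row, any other identifying column passed as an id variable (ADRESS here) is available in the same way.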
