主頁 > 作業系統 > 基于布爾列計算具有滾動最大值的列

基于布爾列計算具有滾動最大值的列

2021-11-28 09:33:58 作業系統

我正在嘗試在不使用for回圈(使用 OHLC 股票資料)的情況下執行新的列計算,該回圈表示基于True另一列中的滾動最高點我已經將“交易的第一小時”時間視窗表示為 9:30 和 10:30 之間。在此期間,當我們遍歷行時,如果"first hour of trading" == True,并且當前最高價 >= 前一個最高價,則使用新的最高價。如果我們不在“交易的第一小時”內,只需將之前的值提前。

因此,使用這個可重現的資料集:

import numpy as np
import pandas as pd
from datetime import datetime

# Create a reproducible, static dataframe.
# 1 minute SPY data. Skip to the bottom...
df = pd.DataFrame([
    {
        "time": "2021-10-26 9:30",
        "open": "457.2",
        "high": "457.29",
        "low": "456.78",
        "close": "456.9383",
        "volume": "594142"
    },
    {
        "time": "2021-10-26 9:31",
        "open": "456.94",
        "high": "457.07",
        "low": "456.8",
        "close": "456.995",
        "volume": "194061"
    },
    {
        "time": "2021-10-26 9:32",
        "open": "456.99",
        "high": "457.22",
        "low": "456.84",
        "close": "457.21",
        "volume": "186114"
    },
    {
        "time": "2021-10-26 9:33",
        "open": "457.22",
        "high": "457.45",
        "low": "457.2011",
        "close": "457.308",
        "volume": "294158"
    },
    {
        "time": "2021-10-26 9:34",
        "open": "457.31",
        "high": "457.4",
        "low": "457.25",
        "close": "457.32",
        "volume": "172574"
    },
    {
        "time": "2021-10-26 9:35",
        "open": "457.31",
        "high": "457.48",
        "low": "457.18",
        "close": "457.44",
        "volume": "396668"
    },
    {
        "time": "2021-10-26 9:36",
        "open": "457.48",
        "high": "457.6511",
        "low": "457.44",
        "close": "457.57",
        "volume": "186777"
    },
    {
        "time": "2021-10-26 9:37",
        "open": "457.5699",
        "high": "457.73",
        "low": "457.5699",
        "close": "457.69",
        "volume": "187596"
    },
    {
        "time": "2021-10-26 9:38",
        "open": "457.7",
        "high": "457.73",
        "low": "457.54",
        "close": "457.63",
        "volume": "185570"
    },
    {
        "time": "2021-10-26 9:39",
        "open": "457.63",
        "high": "457.64",
        "low": "457.31",
        "close": "457.59",
        "volume": "164707"
    },
    {
        "time": "2021-10-26 9:40",
        "open": "457.59",
        "high": "457.72",
        "low": "457.46",
        "close": "457.7199",
        "volume": "167438"
    },
    {
        "time": "2021-10-26 9:41",
        "open": "457.72",
        "high": "457.8",
        "low": "457.68",
        "close": "457.72",
        "volume": "199951"
    },
    {
        "time": "2021-10-26 9:42",
        "open": "457.73",
        "high": "457.74",
        "low": "457.6",
        "close": "457.62",
        "volume": "152134"
    },
    {
        "time": "2021-10-26 9:43",
        "open": "457.6",
        "high": "457.65",
        "low": "457.45",
        "close": "457.5077",
        "volume": "142530"
    },
    {
        "time": "2021-10-26 9:44",
        "open": "457.51",
        "high": "457.64",
        "low": "457.4001",
        "close": "457.61",
        "volume": "122575"
    },
    {
        "time": "2021-10-26 9:45",
        "open": "457.61",
        "high": "457.76",
        "low": "457.58",
        "close": "457.75",
        "volume": "119886"
    },
    {
        "time": "2021-10-26 9:46",
        "open": "457.74",
        "high": "457.75",
        "low": "457.37",
        "close": "457.38",
        "volume": "183157"
    },
    {
        "time": "2021-10-26 9:47",
        "open": "457.42",
        "high": "457.49",
        "low": "457.37",
        "close": "457.44",
        "volume": "128542"
    },
    {
        "time": "2021-10-26 9:48",
        "open": "457.43",
        "high": "457.49",
        "low": "457.33",
        "close": "457.44",
        "volume": "154181"
    },
    {
        "time": "2021-10-26 9:49",
        "open": "457.43",
        "high": "457.5898",
        "low": "457.42",
        "close": "457.47",
        "volume": "163063"
    },
    {
        "time": "2021-10-26 9:50",
        "open": "457.45",
        "high": "457.59",
        "low": "457.44",
        "close": "457.555",
        "volume": "96229"
    },
    {
        "time": "2021-10-26 9:51",
        "open": "457.56",
        "high": "457.61",
        "low": "457.31",
        "close": "457.4217",
        "volume": "110380"
    },
    {
        "time": "2021-10-26 9:52",
        "open": "457.42",
        "high": "457.56",
        "low": "457.42",
        "close": "457.47",
        "volume": "107518"
    },
    {
        "time": "2021-10-26 9:53",
        "open": "457.475",
        "high": "457.51",
        "low": "457.4",
        "close": "457.48",
        "volume": "78062"
    },
    {
        "time": "2021-10-26 9:54",
        "open": "457.49",
        "high": "457.57",
        "low": "457.42",
        "close": "457.46",
        "volume": "133883"
    },
    {
        "time": "2021-10-26 9:55",
        "open": "457.47",
        "high": "457.56",
        "low": "457.45",
        "close": "457.51",
        "volume": "98998"
    },
    {
        "time": "2021-10-26 9:56",
        "open": "457.51",
        "high": "457.54",
        "low": "457.43",
        "close": "457.43",
        "volume": "110237"
    },
    {
        "time": "2021-10-26 9:57",
        "open": "457.43",
        "high": "457.65",
        "low": "457.375",
        "close": "457.65",
        "volume": "98794"
    },
    {
        "time": "2021-10-26 9:58",
        "open": "457.66",
        "high": "457.69",
        "low": "457.35",
        "close": "457.45",
        "volume": "262154"
    },
    {
        "time": "2021-10-26 9:59",
        "open": "457.45",
        "high": "457.47",
        "low": "457.33",
        "close": "457.4",
        "volume": "74685"
    },
    {
        "time": "2021-10-26 10:00",
        "open": "457.41",
        "high": "457.48",
        "low": "457.18",
        "close": "457.38",
        "volume": "166617"
    },
    {
        "time": "2021-10-26 10:01",
        "open": "457.39",
        "high": "457.7",
        "low": "457.39",
        "close": "457.5",
        "volume": "265649"
    },
    {
        "time": "2021-10-26 10:02",
        "open": "457.51",
        "high": "457.57",
        "low": "457.39",
        "close": "457.53",
        "volume": "131947"
    },
    {
        "time": "2021-10-26 10:03",
        "open": "457.53",
        "high": "457.54",
        "low": "457.4",
        "close": "457.51",
        "volume": "80111"
    },
    {
        "time": "2021-10-26 10:04",
        "open": "457.51",
        "high": "457.62",
        "low": "457.5",
        "close": "457.6101",
        "volume": "117174"
    },
    {
        "time": "2021-10-26 10:05",
        "open": "457.621",
        "high": "457.64",
        "low": "457.51",
        "close": "457.58",
        "volume": "168758"
    },
    {
        "time": "2021-10-26 10:06",
        "open": "457.58",
        "high": "457.64",
        "low": "457.46",
        "close": "457.61",
        "volume": "84076"
    },
    {
        "time": "2021-10-26 10:07",
        "open": "457.62",
        "high": "457.7401",
        "low": "457.62",
        "close": "457.66",
        "volume": "125156"
    },
    {
        "time": "2021-10-26 10:08",
        "open": "457.665",
        "high": "457.69",
        "low": "457.5",
        "close": "457.67",
        "volume": "116919"
    },
    {
        "time": "2021-10-26 10:09",
        "open": "457.69",
        "high": "457.72",
        "low": "457.5",
        "close": "457.57",
        "volume": "102551"
    },
    {
        "time": "2021-10-26 10:10",
        "open": "457.56",
        "high": "457.75",
        "low": "457.56",
        "close": "457.7",
        "volume": "109165"
    },
    {
        "time": "2021-10-26 10:11",
        "open": "457.7",
        "high": "457.725",
        "low": "457.63",
        "close": "457.66",
        "volume": "146209"
    },
    {
        "time": "2021-10-26 10:12",
        "open": "457.665",
        "high": "457.88",
        "low": "457.64",
        "close": "457.86",
        "volume": "210620"
    },
    {
        "time": "2021-10-26 10:13",
        "open": "457.855",
        "high": "457.96",
        "low": "457.83",
        "close": "457.95",
        "volume": "159975"
    },
    {
        "time": "2021-10-26 10:14",
        "open": "457.95",
        "high": "458.02",
        "low": "457.93",
        "close": "457.95",
        "volume": "152042"
    },
    {
        "time": "2021-10-26 10:15",
        "open": "457.96",
        "high": "458.15",
        "low": "457.96",
        "close": "458.08",
        "volume": "146047"
    },
    {
        "time": "2021-10-26 10:16",
        "open": "458.085",
        "high": "458.17",
        "low": "457.99",
        "close": "458.15",
        "volume": "100732"
    },
    {
        "time": "2021-10-26 10:17",
        "open": "458.17",
        "high": "458.33",
        "low": "458.155",
        "close": "458.245",
        "volume": "235072"
    },
    {
        "time": "2021-10-26 10:18",
        "open": "458.25",
        "high": "458.29",
        "low": "458.14",
        "close": "458.16",
        "volume": "422002"
    },
    {
        "time": "2021-10-26 10:19",
        "open": "458.17",
        "high": "458.2801",
        "low": "458.1699",
        "close": "458.28",
        "volume": "114611"
    },
    {
        "time": "2021-10-26 10:20",
        "open": "458.29",
        "high": "458.39",
        "low": "458.24",
        "close": "458.37",
        "volume": "241797"
    },
    {
        "time": "2021-10-26 10:21",
        "open": "458.37",
        "high": "458.42",
        "low": "458.31",
        "close": "458.345",
        "volume": "124824"
    },
    {
        "time": "2021-10-26 10:22",
        "open": "458.33",
        "high": "458.49",
        "low": "458.33",
        "close": "458.47",
        "volume": "132125"
    }
])

...以及可以低于上述df的代碼:

# Convert df to numeric and time to datetime re: the .csv to .json
# converter tool I used online...
df[['open','high','low','close','volume']] = df[['open','high','low','close','volume']].apply(pd.to_numeric)
df['time'] = pd.to_datetime(df['time'])

# When will the first hour of trading be?
startOfFirstHourTrading = datetime.strptime("0930", "%H%M").time()
endOfFirstHourTrading = datetime.strptime("1030", "%H%M").time()

# ...then using those times, define a boolean column denoting whether
# we're currently in the first hour of trading or not
df['isFirstHourTrading'] = np.where((df['time'].dt.time >= startOfFirstHourTrading) & (df['time'].dt.time < endOfFirstHourTrading), True, False)

df['FirstHourHigh'] = 0

# WORKING EXAMPLE
# Iterate through the df
for i in range(len(df)):

    # If we're not currently within the first hour of trading, just
    # bring forward the last value
    if df['isFirstHourTrading'].iloc[i] == False:
        df['FirstHourHigh'].iloc[i] = df['FirstHourHigh'].iloc[i-1]
        continue

    # ... otherwise if we ARE in the first hour of trading, keep track
    # of the rolling highest high during the first hour
    df['FirstHourHigh'].iloc[i] = max(df['high'].iloc[i], df['high'].iloc[i-1])

# Export the correct answer dataset to compare the next function
# to below
df.to_csv("correct_answers.csv", index=False)

# NON-WORKING EXAMPLE
# What I'd like to do is NOT use a for loop to do the above. I envision
# we can use np.where() and maybe a groupby() here? But I don't know how yet. 
# Psuedo might look something like:
df['FirstHourHigh'] = df['high'].rolling(window=DYNAMIC_START_OF_FIRST_HOUR_TRADING_ROW?).max()
# ... but obviously doesn't work, and doesn't take into account if if df['isFirstHourTrading'] == True which it should

# Cross check for matches with correct_answers.csv
df.to_csv("correct_answers2.csv", index=False)

...你會看到我已經使用for回圈實作了我想要的東西,然后比較我正在尋找的東西。我希望實作與for回圈相同的功能,但不使用 for 回圈,僅使用 1 行列計算。有任何想法嗎?謝謝!

uj5u.com熱心網友回復:

首先將時間作為時間,然后在第一個交易小時屏蔽值,獲得它們的累積最大值,并將最高值向前移動:

>>> df['time'] = pd.to_datetime(df['time'])
>>> first_hour = (df['time'].dt.hour * 60   df['time'].dt.minute).between(9.5 * 60, 10.5 * 60)
>>> df['high'].where(first_hour).cummax().ffill()

這是它在一個較小的例子中所做的,它實際上也有第一個訂單之外的幾天和日期:

>>> df
                 time  high
0 2021-10-26 09:30:00   525
1 2021-10-26 10:00:00   504
2 2021-10-26 10:30:00   550
3 2021-10-26 11:00:00   567
4 2021-10-27 09:30:00   520
5 2021-10-27 10:00:00   576
6 2021-10-27 10:30:00   508
7 2021-10-27 11:00:00   532
>>> df['high'].where((df['time'].dt.hour * 60   df['time'].dt.minute).between(9.5 * 60, 10.5 * 60)).cummax().ffill()
0    525.0
1    525.0
2    550.0
3    550.0
4    550.0
5    576.0
6    576.0
7    576.0

uj5u.com熱心網友回復:

首先,您可能有很多不符合 標準的資料"first hour of trading" == True,對吧?如果是這種情況,我們可以先過濾掉這些資料,這樣我們就不需要處理其他情況了。

firstHourTrading = df[df["isFirstHourTrading"] == True]

如果這不是一個選項,那也沒關系。我們可以通過使用 apply 并使用與您在 for 回圈中使用的演算法相同的前一行來立即提高回圈的性能。

df.apply(lambda row: your_business_logic(row, row.shift()))

或者cummax,如果滿足您的需要,您可以嘗試使用pandas 的函式:https : //pandas.pydata.org/docs/reference/api/pandas.DataFrame.cummax.html

uj5u.com熱心網友回復:

pandas cummax 函式可用于 isFirstHourTrading 高列,如下所示:

df["FirstHourHigh"] = (df["isFirstHourTrading"]*df["high"]).cummax()

轉載請註明出處,本文鏈接:https://www.uj5u.com/caozuo/367973.html

標籤:Python 熊猫

上一篇:從.csv讀取URL串列以使用Python、BeautifulSoup、Pandas進行抓取

下一篇:如何在訓練模型時修復記憶體錯誤?

標籤雲
其他(157675) Python(38076) JavaScript(25376) Java(17977) C(15215) 區塊鏈(8255) C#(7972) AI(7469) 爪哇(7425) MySQL(7132) html(6777) 基礎類(6313) sql(6102) 熊猫(6058) PHP(5869) 数组(5741) R(5409) Linux(5327) 反应(5209) 腳本語言(PerlPython)(5129) 非技術區(4971) Android(4554) 数据框(4311) css(4259) 节点.js(4032) C語言(3288) json(3245) 列表(3129) 扑(3119) C++語言(3117) 安卓(2998) 打字稿(2995) VBA(2789) Java相關(2746) 疑難問題(2699) 细绳(2522) 單片機工控(2479) iOS(2429) ASP.NET(2402) MongoDB(2323) 麻木的(2285) 正则表达式(2254) 字典(2211) 循环(2198) 迅速(2185) 擅长(2169) 镖(2155) 功能(1967) .NET技术(1958) Web開發(1951) python-3.x(1918) HtmlCss(1915) 弹簧靴(1913) C++(1909) xml(1889) PostgreSQL(1872) .NETCore(1853) 谷歌表格(1846) Unity3D(1843) for循环(1842)

熱門瀏覽
  • CA和證書

    1、在 CentOS7 中使用 gpg 創建 RSA 非對稱密鑰對 gpg --gen-key #Centos上生成公鑰/密鑰對(存放在家目錄.gnupg/) 2、將 CentOS7 匯出的公鑰,拷貝到 CentOS8 中,在 CentOS8 中使用 CentOS7 的公鑰加密一個檔案 gpg -a ......

    uj5u.com 2020-09-10 00:09:53 more
  • Kubernetes K8S之資源控制器Job和CronJob詳解

    Kubernetes的資源控制器Job和CronJob詳解與示例 ......

    uj5u.com 2020-09-10 00:10:45 more
  • VMware下安裝CentOS

    VMware下安裝CentOS 一、軟硬體準備 1 Centos鏡像準備 1.1 CentOS鏡像下載地址 下載地址 1.2 CentOS鏡像下載程序 點擊下載地址進入如下圖的網站,選擇需要下載的版本,這里選擇的是Centos8,點擊如圖所示。 決定選擇Centos8后,選擇想要的鏡像源進行下載,此 ......

    uj5u.com 2020-09-10 00:12:10 more
  • 如何使用Grep命令查找多個字串

    如何使用Grep 命令查找多個字串 大家好,我是良許! 今天向大家介紹一個非常有用的技巧,那就是使用 grep 命令查找多個字串。 簡單介紹一下,grep 命令可以理解為是一個功能強大的命令列工具,可以用它在一個或多個輸入檔案中搜索與正則運算式相匹配的文本,然后再將每個匹配的文本用標準輸出的格式 ......

    uj5u.com 2020-09-10 00:12:28 more
  • git配置http代理

    git配置http代理 經常遇到克隆 github 慢的問題,這里記錄一下幾種配置 git 代理的方法,解決 clone github 過慢。 目錄 git配置代理 git單獨配置github代理 git配置全域代理 配置終端環境變數 git配置代理 主要使用 git config 命令 git單獨 ......

    uj5u.com 2020-09-10 00:12:33 more
  • Linux npm install 裝包時提示Error EACCES permission denied解

    npm install 裝包時提示Error EACCES permission denied解決辦法 ......

    uj5u.com 2020-09-10 00:12:53 more
  • Centos 7下安裝nginx,使用yum install nginx,提示沒有可用的軟體包

    Centos 7下安裝nginx,使用yum install nginx,提示沒有可用的軟體包。 18 (flaskApi) [root@67 flaskDemo]# yum -y install nginx 19 已加載插件:fastestmirror, langpacks 20 Loading ......

    uj5u.com 2020-09-10 00:13:13 more
  • Linux查看服務器暴力破解ssh IP

    在公網的服務器上經常遇到別人爆破你服務器的22埠,用來挖礦或者干其他嘿嘿嘿的事情~ 這種情況下正確的做法是: 修改默認ssh的22埠 使用設定密鑰登錄或者白名單ip登錄 建議服務器密碼為復雜密碼 創建普通用戶登錄服務器(root權限過大) 建立堡壘機,實作統一管理服務器 統計爆破IP [root ......

    uj5u.com 2020-09-10 00:13:17 more
  • CentOS 7系統常見快捷鍵操作方式

    Linux系統中一些常見的快捷方式,可有效提高操作效率,在某些時刻也能避免操作失誤帶來的問題。 ......

    uj5u.com 2020-09-10 00:13:31 more
  • CentOS 7作業系統目錄結構介紹

    作業系統存在著大量的資料檔案資訊,相應檔案資訊會存在于系統相應目錄中,為了更好的管理資料資訊,會將系統進行一些目錄規劃,不同目錄存放不同的資源。 ......

    uj5u.com 2020-09-10 00:13:35 more
最新发布
  • vim的常用命令

    Vim的6種基本模式 1. 普通模式在普通模式中,用的編輯器命令,比如移動游標,洗掉文本等等。這也是Vim啟動后的默認模式。這正好和許多新用戶期待的操作方式相反(大多數編輯器默認模式為插入模式)。 2. 插入模式在這個模式中,大多數按鍵都會向文本緩沖中插入文本。大多數新用戶希望文本編輯器編輯程序中一 ......

    uj5u.com 2023-04-20 08:43:21 more
  • vim的常用命令

    Vim的6種基本模式 1. 普通模式在普通模式中,用的編輯器命令,比如移動游標,洗掉文本等等。這也是Vim啟動后的默認模式。這正好和許多新用戶期待的操作方式相反(大多數編輯器默認模式為插入模式)。 2. 插入模式在這個模式中,大多數按鍵都會向文本緩沖中插入文本。大多數新用戶希望文本編輯器編輯程序中一 ......

    uj5u.com 2023-04-20 08:42:36 more
  • docker學習

    ###Docker概述 真實專案部署環境可能非常復雜,傳統發布專案一個只需要一個jar包,運行環境需要單獨部署。而通過Docker可將jar包和相關環境(如jdk,redis,Hadoop...)等打包到docker鏡像里,將鏡像發布到Docker倉庫,部署時下載發布的鏡像,直接運行發布的鏡像即可。 ......

    uj5u.com 2023-04-19 09:26:53 more
  • 設定Windows主機的瀏覽器為wls2的默認瀏覽器

    這里以Chrome為例。 1. 準備作業 wsl是可以使用Windows主機上安裝的exe程式,出于安全考慮,默認情況下改功能是無法使用。要使用的話,終端需要以管理員權限啟動。 我這里以Windows Terminal為例,介紹如何默認使用管理員權限打開終端,具體操作如下圖所示: 2. 操作 wsl ......

    uj5u.com 2023-04-19 09:25:49 more
  • docker學習

    ###Docker概述 真實專案部署環境可能非常復雜,傳統發布專案一個只需要一個jar包,運行環境需要單獨部署。而通過Docker可將jar包和相關環境(如jdk,redis,Hadoop...)等打包到docker鏡像里,將鏡像發布到Docker倉庫,部署時下載發布的鏡像,直接運行發布的鏡像即可。 ......

    uj5u.com 2023-04-19 09:19:04 more
  • Linux學習筆記

    IP地址和主機名 IP地址 ifconfig可以用來查詢本機的IP地址,如果不能使用,可以通過install net-tools安裝。 Centos系統下ens33表示主網卡;inet后表示IP地址;lo表示本地回環網卡; 127.0.0.1表示代指本機;0.0.0.0可以用于代指本機,同時在放行設 ......

    uj5u.com 2023-04-18 06:52:01 more
  • 解決linux系統的kdump服務無法啟動的問題

    問題:專案麒麟系統服務器的kdump服務無法啟動,沒有相關日志無法定位問題。 1、查看服務狀態是關閉的,重啟系統也無法啟動 systemctl status kdump 2、修改grub引數,修改“crashkernel”為“512M(有的機器數值太大太小都會導致報錯,建議從128M開始試,或者加個 ......

    uj5u.com 2023-04-12 09:59:50 more
  • 解決linux系統的kdump服務無法啟動的問題

    問題:專案麒麟系統服務器的kdump服務無法啟動,沒有相關日志無法定位問題。 1、查看服務狀態是關閉的,重啟系統也無法啟動 systemctl status kdump 2、修改grub引數,修改“crashkernel”為“512M(有的機器數值太大太小都會導致報錯,建議從128M開始試,或者加個 ......

    uj5u.com 2023-04-12 09:59:01 more
  • 你是不是暴露了?

    作者:袁首京 原創文章,轉載時請保留此宣告,并給出原文連接。 如果您是計算機相關從業人員,那么應該經歷不止一次網路安全專項檢查了,你肯定是收到過資訊系統技術檢測報告,要求你加強風險監測,確保你提供的系統服務堅實可靠了。 沒檢測到問題還好,檢測到問題的話,有些處理起來還是挺麻煩的,尤其是線上正在運行的 ......

    uj5u.com 2023-04-05 16:52:56 more
  • 細節拉滿,80 張圖帶你一步一步推演 slab 記憶體池的設計與實作

    1. 前文回顧 在之前的幾篇記憶體管理系列文章中,筆者帶大家從宏觀角度完整地梳理了一遍 Linux 記憶體分配的整個鏈路,本文的主題依然是記憶體分配,這一次我們會從微觀的角度來探秘一下 Linux 內核中用于零散小記憶體塊分配的記憶體池 —— slab 分配器。 在本小節中,筆者還是按照以往的風格先帶大家簡單 ......

    uj5u.com 2023-04-05 16:44:11 more