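My current table looks like the one below. Each patient has a start date and an end date for their hospital visit, and they are given medication between admin_startdate and admin_enddate. For example, the first two rows show that patient PT1 had two drug administrations during the visit from 01/01 to 01/31: one between 01/08 and 01/10, and another between 01/12 and 01/23.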
ptid visit_start_date visit_end_date admin_startdate admin_enddate
PT1 2018-01-01 2018-01-31 2018-01-08 2018-01-10
PT1 2018-01-01 2018-01-31 2018-01-12 2018-01-23
PT2 2018-01-02 2018-01-18 2018-01-06 2018-01-11
PT2 2018-01-02 2018-01-18 2018-01-14 2018-01-17
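To make the problem reproducible, the sample rows above could be loaded as follows. This is only a sketch: the table name tbl is borrowed from the answers further down, and the column types (varchar and date) are assumptions, since the question does not state them.

```sql
-- Assumed schema: the question does not give column types.
CREATE TABLE tbl (
    ptid             varchar(10),
    visit_start_date date,
    visit_end_date   date,
    admin_startdate  date,
    admin_enddate    date
);

-- The four sample rows from the question.
INSERT INTO tbl (ptid, visit_start_date, visit_end_date, admin_startdate, admin_enddate) VALUES
    ('PT1', '2018-01-01', '2018-01-31', '2018-01-08', '2018-01-10'),
    ('PT1', '2018-01-01', '2018-01-31', '2018-01-12', '2018-01-23'),
    ('PT2', '2018-01-02', '2018-01-18', '2018-01-06', '2018-01-11'),
    ('PT2', '2018-01-02', '2018-01-18', '2018-01-14', '2018-01-17');
```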
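What I want to achieve is to group together drug administrations that are too close to each other, i.e. where the next administration starts within 2 days of the previous one's end date, and call the result one whole episode, like this: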
ptid visit_start_date visit_end_date admin_startdate admin_enddate episode_startdate episode_enddate
PT1 2018-01-01 2018-01-31 2018-01-08 2018-01-10 2018-01-08 2018-01-23
PT1 2018-01-01 2018-01-31 2018-01-12 2018-01-23 2018-01-08 2018-01-23
PT2 2018-01-02 2018-01-18 2018-01-06 2018-01-11 2018-01-06 2018-01-11
PT2 2018-01-02 2018-01-18 2018-01-14 2018-01-17 2018-01-14 2018-01-17
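You can see that PT1's two administrations are lumped together with the same episode_startdate and episode_enddate, while PT2's two administrations are treated as two separate episodes.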
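The arithmetic behind that grouping can be checked directly. The sketch below uses PostgreSQL date subtraction, which returns a whole number of days; on Redshift, DATEDIFF(day, start, end) is the documented way to get the same number, and whether plain date subtraction behaves identically there is not verified here.

```sql
SELECT DATE '2018-01-12' - DATE '2018-01-10' AS pt1_gap_days  -- 2 days: within the threshold, same episode
     , DATE '2018-01-14' - DATE '2018-01-11' AS pt2_gap_days; -- 3 days: over the threshold, new episode
```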
I'm having a hard time figuring out how to do this in PostgreSQL (Redshift).
Answer from a uj5u.com user:
This works in Postgres 14. Not tested on Redshift.
SELECT ptid, visit_start_date, visit_end_date, admin_startdate, admin_enddate
     , min(admin_startdate) OVER (PARTITION BY visit_id, admin) AS episode_startdate
     , max(admin_enddate)   OVER (PARTITION BY visit_id, admin) AS episode_enddate
FROM  (
   SELECT *, count(*) FILTER (WHERE gap) OVER (PARTITION BY visit_id ORDER BY admin_startdate) AS admin
   FROM  (
      SELECT *, admin_startdate - lag(admin_enddate) OVER (PARTITION BY visit_id ORDER BY admin_startdate) > 2 AS gap
      FROM  (
         SELECT *, dense_rank() OVER (ORDER BY ptid, visit_start_date, visit_end_date) AS visit_id  -- optional, to simplify
         FROM   tbl
         ) sub1
      ) sub2
   ) sub3;
db<>fiddle here
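The innermost subquery sub1 just computes a unique visit_id. That column should really be in your table, instead of repeating (ptid, visit_start_date, visit_end_date) over and over. Consider at least normalizing your design.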
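One possible shape for such a normalized design is sketched below. The table names visit and administration, the integer key, and the column types are all hypothetical; the answer only suggests that visit_id should live in the table. Note that Redshift accepts primary and foreign key declarations but treats them as informational rather than enforcing them.

```sql
-- Hypothetical normalized layout: one row per visit, one row per administration.
CREATE TABLE visit (
    visit_id         integer PRIMARY KEY,
    ptid             varchar(10) NOT NULL,
    visit_start_date date NOT NULL,
    visit_end_date   date NOT NULL
);

CREATE TABLE administration (
    visit_id        integer NOT NULL REFERENCES visit (visit_id),
    admin_startdate date NOT NULL,
    admin_enddate   date NOT NULL
);
```

With this layout the window functions can partition by visit_id directly, and the dense_rank() step is no longer needed.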
The next subquery sub2 checks whether there is a gap of more than two days to the previous row in the same partition.
Then subquery sub3 counts those gaps to identify distinct administration periods (admin).
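To see what these two steps produce before the final aggregation, the inner part of the query can be run on its own. This sketch assumes tbl holds exactly the four sample rows from the question, and it uses FILTER, so it targets PostgreSQL rather than Redshift.

```sql
SELECT ptid, admin_startdate, admin_enddate, gap
     , count(*) FILTER (WHERE gap) OVER (PARTITION BY visit_id ORDER BY admin_startdate) AS admin
FROM  (
   SELECT *, admin_startdate - lag(admin_enddate) OVER (PARTITION BY visit_id ORDER BY admin_startdate) > 2 AS gap
   FROM  (
      SELECT *, dense_rank() OVER (ORDER BY ptid, visit_start_date, visit_end_date) AS visit_id
      FROM   tbl
      ) sub1
   ) sub2
ORDER  BY ptid, admin_startdate;

-- Result for the sample rows:
-- ptid | admin_startdate | admin_enddate | gap   | admin
-- PT1  | 2018-01-08      | 2018-01-10    | NULL  | 0     (first row of the visit: no previous row)
-- PT1  | 2018-01-12      | 2018-01-23    | false | 0     (2-day gap: stays in period 0)
-- PT2  | 2018-01-06      | 2018-01-11    | NULL  | 0
-- PT2  | 2018-01-14      | 2018-01-17    | true  | 1     (3-day gap: starts period 1)
```

Rows that share the same (visit_id, admin) pair belong to the same episode, which is why the outer query partitions by both columns.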
In the outer SELECT, min(admin_startdate) and max(admin_enddate) per administration period produce the desired episode dates.
See (with various links to more):
- How to group timestamps into islands (based on arbitrary gap)?
Answer from a uj5u.com user:
CREATE TABLE tb1 AS (
   SELECT *, admin_startdate - lag(admin_enddate) OVER (PARTITION BY visit_id ORDER BY admin_startdate) > 2 AS gap
   FROM  (
      SELECT *, dense_rank() OVER (ORDER BY ptid, visit_start_date, visit_end_date) AS visit_id -- optional, to simplify
      FROM   tbl
      ) sub1
) ;
-- Running count of gaps over ALL rows of a visit, so that a row following a gap row
-- keeps that row's counter. Counting only the rows WHERE gap is True and defaulting the
-- rest to 0 would put every non-gap row of a visit into period 0, which merges
-- non-adjacent administrations as soon as a visit has three or more of them.
CREATE TABLE tb3 AS (
   SELECT *, sum(CASE WHEN gap THEN 1 ELSE 0 END) OVER (PARTITION BY visit_id ORDER BY admin_startdate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS admin
   FROM   tb1
)
;
CREATE TABLE tb4 AS (
   SELECT ptid, visit_start_date, visit_end_date, admin_startdate, admin_enddate
        , min(admin_startdate) OVER (PARTITION BY visit_id, admin) AS episode_startdate
        , max(admin_enddate)   OVER (PARTITION BY visit_id, admin) AS episode_enddate
   FROM   tb3
)
;
This is an uglier version adapted from Erwin's answer for Redshift, which does not support the FILTER clause. At least it tested correctly on db fiddle.
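A common substitute for FILTER is a running sum over a CASE expression, which also avoids the intermediate tables. The following is a minimal sketch of that idea as a single query. It assumes the table tbl from the question; the CTE names flagged and numbered and the column gap_flag are made up for illustration. The explicit ROWS frame is included because Redshift requires a frame clause for aggregate window functions that have an ORDER BY. Like the original answer, it has not been tested on Redshift.

```sql
WITH flagged AS (
   -- 1 when this administration starts more than 2 days after the previous one ended, else 0.
   -- The first row of each visit has no previous row, so the comparison is NULL and falls to 0.
   SELECT *
        , CASE WHEN admin_startdate
                    - lag(admin_enddate) OVER (PARTITION BY ptid, visit_start_date, visit_end_date
                                               ORDER BY admin_startdate) > 2
               THEN 1 ELSE 0 END AS gap_flag
   FROM   tbl
)
, numbered AS (
   -- Running total of gap flags = administration period number within the visit.
   SELECT *
        , sum(gap_flag) OVER (PARTITION BY ptid, visit_start_date, visit_end_date
                              ORDER BY admin_startdate
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS admin
   FROM   flagged
)
SELECT ptid, visit_start_date, visit_end_date, admin_startdate, admin_enddate
     , min(admin_startdate) OVER (PARTITION BY ptid, visit_start_date, visit_end_date, admin) AS episode_startdate
     , max(admin_enddate)   OVER (PARTITION BY ptid, visit_start_date, visit_end_date, admin) AS episode_enddate
FROM   numbered
ORDER  BY ptid, admin_startdate;
```

Partitioning by (ptid, visit_start_date, visit_end_date) directly stands in for the computed visit_id, at the cost of repeating the three columns in every window definition.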
