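My problem is how to find the distinct time periods among multiple overlapping time periods in Teradata ANSI SQL.
For example, the table below contains several overlapping time periods. How can I merge them into 3 unique time periods in Teradata SQL?
I think I could do this in Python with a loop, but I do not know how to do it in SQL.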
| ID | Start_Date | End_Date |
|---|---|---|
| 001 | 2005-01-01 | 2006-01-01 |
| 001 | 2005-01-01 | 2007-01-01 |
| 001 | 2008-01-01 | 2008-06-01 |
| 001 | 2008-04-01 | 2008-12-01 |
| 001 | 2010-01-01 | 2010-05-01 |
| 001 | 2010-04-01 | 2010-12-01 |
| 001 | 2010-11-01 | 2012-01-01 |
My expected result is:
| ID | Start_Date | End_Date |
|---|---|---|
| 001 | 2005-01-01 | 2007-01-01 |
| 001 | 2008-01-01 | 2008-12-01 |
| 001 | 2010-01-01 | 2012-01-01 |
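For reference, the loop-based approach the question alludes to could look roughly like the following in Python. This is only a minimal sketch: the function name `merge_periods` and the tuple layout `(id, start, end)` are assumptions made for illustration, not something from the original post.

```python
from datetime import date


def merge_periods(rows):
    """Merge overlapping (id, start, end) rows into distinct periods per id."""
    merged = []
    # Sorting by id and then start date puts overlapping periods next to each other.
    for pid, start, end in sorted(rows):
        if merged and merged[-1][0] == pid and start <= merged[-1][2]:
            # The row overlaps the period being built: extend its end if needed.
            last = merged[-1]
            merged[-1] = (pid, last[1], max(last[2], end))
        else:
            # No overlap (or a new id): start a new period.
            merged.append((pid, start, end))
    return merged


rows = [
    ("001", date(2005, 1, 1), date(2006, 1, 1)),
    ("001", date(2005, 1, 1), date(2007, 1, 1)),
    ("001", date(2008, 1, 1), date(2008, 6, 1)),
    ("001", date(2008, 4, 1), date(2008, 12, 1)),
    ("001", date(2010, 1, 1), date(2010, 5, 1)),
    ("001", date(2010, 4, 1), date(2010, 12, 1)),
    ("001", date(2010, 11, 1), date(2012, 1, 1)),
]

for pid, start, end in merge_periods(rows):
    print(pid, start, end)
```

With the sample rows this yields the three periods listed in the expected result. The SQL answers that follow express the same idea with window functions instead of an explicit loop.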
uj5u.com user reply:
From Oracle 12 onwards, you can use MATCH_RECOGNIZE to perform a row-by-row comparison:
SELECT *
FROM table_name
MATCH_RECOGNIZE(
PARTITION BY id
ORDER BY start_date
MEASURES
FIRST(start_date) AS start_date,
MAX(end_date) AS end_date
ONE ROW PER MATCH
PATTERN (overlapping_ranges* last_range)
DEFINE overlapping_ranges AS NEXT(start_date) <= MAX(end_date)
)
Which, for the sample data:
CREATE TABLE table_name (ID, Start_Date, End_Date) AS
SELECT '001', DATE '2005-01-01', DATE '2006-01-01' FROM DUAL UNION ALL
SELECT '001', DATE '2005-01-01', DATE '2007-01-01' FROM DUAL UNION ALL
SELECT '001', DATE '2008-01-01', DATE '2008-06-01' FROM DUAL UNION ALL
SELECT '001', DATE '2008-04-01', DATE '2008-12-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-01-01', DATE '2010-05-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-04-01', DATE '2010-12-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-11-01', DATE '2012-01-01' FROM DUAL;
Outputs:
| ID | Start_Date | End_Date |
|---|---|---|
| 001 | 2005-01-01 00:00:00 | 2007-01-01 00:00:00 |
| 001 | 2008-01-01 00:00:00 | 2008-12-01 00:00:00 |
| 001 | 2010-01-01 00:00:00 | 2012-01-01 00:00:00 |
db<>fiddle here
Update: an alternative query
SELECT id,
start_date,
end_date
FROM (
SELECT id,
dt,
SUM(cnt) OVER (PARTITION BY id ORDER BY dt) AS grp,
cnt
FROM (
SELECT ID,
dt,
SUM(type) OVER (PARTITION BY id ORDER BY dt, ROWNUM) * type AS cnt
FROM table_name
UNPIVOT (dt FOR type IN (start_date AS 1, end_date AS -1))
)
WHERE cnt IN (1,0)
)
PIVOT (MAX(dt) FOR cnt IN (1 AS start_date, 0 AS end_date))
Or, an equivalent that does not use UNPIVOT, PIVOT or ROWNUM and works in both Oracle and PostgreSQL:
SELECT id,
MAX(CASE cnt WHEN 1 THEN dt END) AS start_date,
MAX(CASE cnt WHEN 0 THEN dt END) AS end_date
FROM (
SELECT id,
dt,
SUM(cnt) OVER (PARTITION BY id ORDER BY dt) AS grp,
cnt
FROM (
SELECT ID,
dt,
SUM(type) OVER (PARTITION BY id ORDER BY dt, rn) * type AS cnt
FROM (
SELECT r.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt ASC, type DESC) AS rn
FROM (
SELECT id, 1 AS type, start_date AS dt FROM table_name
UNION ALL
SELECT id, -1 AS type, end_date AS dt FROM table_name
) r
) p
) s
WHERE cnt IN (1,0)
) t
GROUP BY id, grp
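The two queries above turn every period into a +1 event (its start) and a -1 event (its end) and keep a running count of open periods: a merged period begins when the count reaches 1 on a start event and ends when it drops back to 0 on an end event. The same idea can be sketched in Python for a single ID. The function name `merge_by_events` is hypothetical, and this is an illustration of the technique rather than a translation of the exact SQL.

```python
from datetime import date


def merge_by_events(periods):
    """Merge (start, end) periods by sweeping over +1/-1 events."""
    events = []
    for start, end in periods:
        events.append((start, 1))   # a period opens
        events.append((end, -1))    # a period closes
    # Order by date; on ties, starts (+1) come before ends (-1),
    # mirroring ORDER BY dt ASC, type DESC in the query.
    events.sort(key=lambda e: (e[0], -e[1]))

    result = []
    open_count = 0
    current_start = None
    for dt, kind in events:
        open_count += kind
        if kind == 1 and open_count == 1:
            # First open period: a merged period begins here.
            current_start = dt
        elif kind == -1 and open_count == 0:
            # Last open period closed: the merged period ends here.
            result.append((current_start, dt))
    return result


periods = [
    (date(2005, 1, 1), date(2006, 1, 1)),
    (date(2005, 1, 1), date(2007, 1, 1)),
    (date(2008, 1, 1), date(2008, 6, 1)),
    (date(2008, 4, 1), date(2008, 12, 1)),
    (date(2010, 1, 1), date(2010, 5, 1)),
    (date(2010, 4, 1), date(2010, 12, 1)),
    (date(2010, 11, 1), date(2012, 1, 1)),
]

for start, end in merge_by_events(periods):
    print(start, end)
```

Sorting starts ahead of ends on the same date means that a period starting on the day another one ends is treated as overlapping it.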
Update 2: another alternative
SELECT id,
MIN(start_date) AS start_date,
MAX(end_Date) AS end_date
FROM (
SELECT t.*,
SUM(CASE WHEN start_date <= prev_max THEN 0 ELSE 1 END)
OVER (PARTITION BY id ORDER BY start_date) AS grp
FROM (
SELECT t.*,
MAX(end_date) OVER (
PARTITION BY id ORDER BY start_date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
) AS prev_max
FROM table_name t
) t
) t
GROUP BY id, grp
db<>fiddle Oracle PostgreSQL
uj5u.com user reply:
This is a gaps-and-islands problem. Try this:
with u as
(select ID, start_date, end_date,
case
when start_date <= lag(end_date) over(partition by ID order by start_date, end_date) then 0
else 1 end as grp
from table_name),
v as
(select ID, start_date, end_date,
sum(grp) over(partition by ID order by start_date, end_date) as island
from u)
select ID, min(start_date) as start_Date, max(end_date) as end_date
from v
group by ID, island;
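Fiddle
Basically, you identify the "islands" by comparing the start_date of the current row with the end_date of the previous row (ordered by start_date, end_date): if the start_date falls on or before that end_date, the row belongs to the same island. You can then take a running sum() to get the island number. Finally, select min(start_date) and max(end_date) from each island to get the desired output.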
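The three steps described above (flag, running sum, aggregate) can be sketched in Python as follows. The function name `islands` and the `use_running_max` switch are hypothetical and exist only for illustration. One point worth noting: comparing against the immediately preceding row alone can split an island when an earlier, longer period contains several later ones, whereas comparing against the largest end_date seen so far (as the "Update 2" query in the first answer does) keeps them together. The sketch shows both behaviours.

```python
from itertools import accumulate, groupby


def islands(periods, use_running_max=True):
    """Group (start, end) periods into islands and return (min start, max end) per island."""
    rows = sorted(periods)  # ORDER BY start_date, end_date
    flags = []
    prev_end = None
    for start, end in rows:
        # Step 1: flag 0 if the row overlaps what came before, else 1 (new island).
        flags.append(0 if prev_end is not None and start <= prev_end else 1)
        if use_running_max and prev_end is not None:
            prev_end = max(prev_end, end)  # largest end_date seen so far
        else:
            prev_end = end                 # plain LAG(end_date)
    # Step 2: a running sum of the flags numbers the islands.
    island_ids = accumulate(flags)
    # Step 3: aggregate each island to its min start and max end.
    result = []
    for _, group in groupby(zip(island_ids, rows), key=lambda pair: pair[0]):
        members = [row for _, row in group]
        result.append((min(s for s, _ in members), max(e for _, e in members)))
    return result


# ISO date strings compare correctly as plain text.
sample = [
    ("2005-01-01", "2006-01-01"), ("2005-01-01", "2007-01-01"),
    ("2008-01-01", "2008-06-01"), ("2008-04-01", "2008-12-01"),
    ("2010-01-01", "2010-05-01"), ("2010-04-01", "2010-12-01"),
    ("2010-11-01", "2012-01-01"),
]
nested = [
    ("2005-01-01", "2009-01-01"),  # long period containing the next two
    ("2006-01-01", "2006-06-01"),
    ("2007-01-01", "2007-06-01"),
]

print(islands(sample))
print(islands(nested, use_running_max=True))
print(islands(nested, use_running_max=False))
```

On the sample data from the question both variants give the same three islands; they differ only on inputs such as `nested`, where the previous-row comparison reports the third period as a separate island.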
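uj5u.com user reply:
This might work, with slightly different functionality: for each year of Start_Date it keeps the row with the latest End_Date instead of merging the periods. I tried it in DBeaver: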
select ID,Start_Date,End_Date
from
(
select t.*,
dense_rank () over(partition by ID, extract (year from Start_Date) order BY End_Date desc) drnk
from testing_123 t
) temp
where temp.drnk = 1
ORDER BY Start_Date;
uj5u.com user reply:
Try this:
WITH a as (
SELECT
ID,
LEFT(Start_Date, 4) as Year,
MIN(Start_Date) as New_Start_Date
FROM
TAB1
GROUP BY
ID,
LEFT(Start_Date, 4)
), b as (
SELECT
a.ID,
Year,
New_Start_Date,
End_Date
FROM
a
LEFT JOIN
TAB1
ON a.ID = TAB1.ID AND LEFT(a.New_Start_Date, 4) = LEFT(TAB1.Start_Date, 4)
)
select
ID,
New_Start_Date as Start_Date,
MAX(End_Date) as End_Date
from
b
GROUP BY
ID,
New_Start_Date;
Example: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=97f91b68c635aebfb752538cdd752ace
