目錄
- 1、開窗函式的格式
- 2、視窗范圍圖例
- 3、連續登錄問題:
- 4、 限制時間段內登錄次數問題:
1、開窗函式的格式
FUNCTION_NAME([argument_list])
OVER (
[PARTITION BY window_partition,…]
[ORDER BY window_ordering, … [ASC|DESC]])
[ { ROWS | RANGE } BETWEEN frame_start AND frame_end ] );
FUNCTION_NAME:函式名稱,如row_number()、sum()、first_value()等,
argument_list:函式的引數串列,
PARTITION BY:根據window_partition(磁區欄位)進行磁區,該子句也被稱為查詢磁區子句,類似于group by,都是將資料按照邊界值進行分組,而OVER之前的函式在每一個分組之內進行,如果超出了分組,則函式會重新計算,
ORDER BY:將各個磁區內的資料,根據window_ordering(排序欄位)進行排序,ORDER BY子句會對輸入的資料強制排序(視窗函式是SQL陳述句最后執行的函式,因此可以把SQL結果集想象成輸入資料),ORDER BY子句對于諸如row_number(),lead(),lag()等函式是必須的,如果資料無序,這些函式的結果就沒有意義,
ROWS和RANGE分別表示選擇前后幾行、選擇資料范圍,
2、視窗范圍圖例

注釋:
PRECEDING:往前
FOLLOWING:往后
CURRENT ROW:當前行
UNBOUNDED:起點
UNBOUNDED PRECEDING 表示從前面的起點
UNBOUNDED FOLLOWING:表示到后面的終點
常用的范圍:
01:ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
02:ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
03:ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING,n PRECEDING m FOLLOWING:表示視窗的范圍是[(當前行的行數)- n, (當前行的行數)+ m] row,
04:ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
3、連續登錄問題:
資料準備:
±–±---------+
|id |login_date|
±–±---------+
|01 |2021-02-28|
|01 |2021-03-01|
|01 |2021-03-02|
|01 |2021-03-04|
|01 |2021-03-05|
|01 |2021-03-06|
|01 |2021-03-08|
|02 |2021-03-01|
|02 |2021-03-02|
|02 |2021-03-03|
|02 |2021-03-06|
|03 |2021-03-06|
±–±---------+
統計連續登錄天數超過3天的用戶,輸出資訊包括:用戶id,登錄天數,起始時間,結束時間;
方法1:
SELECT
t2.id,
count(1) as login_times,
min(t2.login_date) as start_date,
max(t2.login_date) as end_date
FROM
(
SELECT
t1.id,
t1.login_date,
date_sub(t1.login_date,rn) as diff_date
FROM
(
SELECT
id,
login_date,
row_number() over(partition by id order by login_date asc) as rn
FROM data
) t1
) t2
group by t2.id, t2.diff_date
having login_times >= 3;
+---+-----------+----------+----------+
|id |login_times|start_date|end_date |
+---+-----------+----------+----------+
| 01|3 |2021-02-28|2021-03-02|
| 01|3 |2021-03-04|2021-03-06|
| 02|3 |2021-03-01|2021-03-03|
+---+-----------+----------+----------+
方法2:
SELECT
id,
lag_login_date,
login_date,lead_login_date
FROM
(SELECT
id,
login_date,
lag(login_date,1,login_date) over(partition by id order by login_date) as lag_login_date,
lead(login_date,1,login_date) over(partition by id order by login_date) as lead_login_date
FROM data
) t1
where datediff(login_date,lag_login_date) =1 and datediff(lead_login_date,login_date) =1;
+---+--------------+----------+---------------+
|id |lag_login_date|login_date|lead_login_date|
+---+--------------+----------+---------------+
|01 |2018-02-28 |2018-03-01|2018-03-02 |
|01 |2018-03-04 |2018-03-05|2018-03-06 |
|02 |2018-03-01 |2018-03-02|2018-03-03 |
+---+--------------+----------+---------------+
方法3:
SELECT
id,
lag_login_date,
login_date
FROM
(SELECT
id,
login_date,
lag(login_date,2,login_date) over(partition by id order by login_date) as lag_login_date,
--lead(login_date,1,login_date) over(partition by id order by login_date) as lead_login_date
FROM data
) t1
where datediff(login_date,lag_login_date) =2
+---+--------------+----------+
|id |lag_login_date|login_date|
+---+--------------+----------+
|01 |2018-02-28 |2018-03-02|
|01 |2018-03-04 |2018-03-06|
|02 |2018-03-01 |2018-03-03|
+---+--------------+----------+
4、 限制時間段內登錄次數問題:
有一張用戶購買記錄表.現在我們需要找出所有的特殊用戶.特殊用戶的定義如下:
在當前購買時間的近7天內(含當天)購買次數超過3次(含),且近7天的購買總金額超過1000的用戶即為特殊用戶.
資料準備:
±–±---------+
|user_id |lbuy_date|amount|
±–±---------±-----------+
|101 |2021-01-01|1000|
|101 |2021-01-02|2000|
|102|2021-10-01|10|
|102 |2021-10-02|700|
|102 |2021-10-07|200|
|103 |2021-11-07|500|
|103 |2021-11-08|500|
|103 |2021-11-20|500|
|104|2021-03-01|10|
|104|2021-03-05|200|
|104|2021-03-09|800|
|104 |2021-03-09|800|
±–±---------+
實作:
---sql邏輯
select
distinct user_id
from
(
select
user_id
,buy_date
,count(1) over(PARTITION by user_id order by datediff(buy_date,'2021-01-01') RANGE between 6 PRECEDING and CURRENT row) as cnt
,sum(amount) over(PARTITION by user_id order by datediff(buy_date,'2021-01-01') RANGE between 6 PRECEDING and CURRENT row) as amount
from test.aaa001
)t1
where cnt>=3 and amount>1000
;
user_id
104
106
轉載請註明出處,本文鏈接:https://www.uj5u.com/qita/433387.html
標籤:其他
