我正在嘗試在此表上添加一列并停留了一會兒
| ID | 第一類 | 日期 | 資料1 |
|---|---|---|---|
| 一個 | 1 | 2022-05-30 | 21 |
| 乙 | 2 | 2022-05-21 | 15 |
| 一個 | 2 | 2022-05-02 | 33 |
| 一個 | 1 | 2022-02-11 | 3 |
| 乙 | 2 | 2022-05-01 | 19 |
| 一個 | 1 | 2022-05-15 | 無效的 |
| 一個 | 1 | 2022-05-20 | 11 |
| 一個 | 2 | 2022-04-20 | 22 |
至
| ID | 第一類 | 日期 | 資料1 | Picked_Data |
|---|---|---|---|---|
| 一個 | 1 | 2022-05-30 | 21 | 11 |
| 乙 | 2 | 2022-05-21 | 15 | 19 |
| 一個 | 2 | 2022-05-02 | 33 | 22 |
| 一個 | 1 | 2022-02-11 | 3 | 一些數字或空 |
| 乙 | 2 | 2022-05-01 | 19 | 一些數字或空 |
| 一個 | 1 | 2022-05-15 | 無效的 | 一些數字或空 |
| 一個 | 1 | 2022-05-20 | 11 | 一些數字或空 |
| 一個 | 2 | 2022-04-20 | 22 | 一些數字或空 |
邏輯是按 Category1 和 ID 進行磁區,然后選擇過去 28 天內最新的 none null 值。如果不存在資料,則為空
對于第一行,ID = A 和類別 1,它將選擇第 7 行,因為它們屬于同一類別,ID 并且日期差為 <= 28。它跳過第 4 行和第 6 行,因為日期太遠且為空價值。
我試過查詢這個
select first_value(Data1) over (partition bty Category1 order by case when Data1 is not null and Date between Date - Inteverval 28 DAY and Date then 1 else 2) as Picked_Data
但它選擇了不正確的行,我猜是這個查詢
Date between Date - Inteverval 28 DAY and Date
沒有選擇正確的日期.. 誰能給我建議/建議我如何處理這個查詢?
uj5u.com熱心網友回復:
考慮以下方法
select *,
first_value(data1 ignore nulls) over past_28_days as picked_data
from your_table
window past_28_days as (
partition by id, category_1
order by unix_date(date)
range between 29 preceding and 1 preceding
)
如果應用于您問題中的樣本資料 - 輸出是

uj5u.com熱心網友回復:
考慮以下方法:
with sample_data as (
select 'A' as ID, 1 as category_1, date('2022-05-30') as date, 21 as data1,
union all select 'B' as ID, 2 as category_1, date('2022-05-21') as date, 15 as data1,
union all select 'A' as ID, 2 as category_1, date('2022-05-02') as date, 33 as data1,
union all select 'A' as ID, 1 as category_1, date('2022-02-11') as date, 3 as data1,
union all select 'B' as ID, 2 as category_1, date('2022-05-01') as date, 19 as data1,
union all select 'A' as ID, 1 as category_1, date('2022-05-15') as date, NULL as data1,
union all select 'A' as ID, 1 as category_1, date('2022-05-20') as date, 11 as data1,
union all select 'A' as ID, 2 as category_1, date('2022-04-20') as date, 22 as data1,
),
with_next_data as (
select *,
lag(date) over (partition by ID,category_1 order by date) as next_date,
lag(data1) over (partition by ID,category_1 order by date) as next_data,
from sample_data
)
select
id,
category_1,
date,
data1,
if(date_diff(date, next_date,day) <= 28, next_data, null) as picked_data
from with_next_data
輸出:

轉載請註明出處,本文鏈接:https://www.uj5u.com/ruanti/483437.html
