I have a time-series table, for example in SQL, with the following format.
Note: each customer_ID can have multiple transaction_IDs.
| customer_ID | transaction_ID | Timestamp | Value1 | Value2 |
|---|---|---|---|---|
| 1 | 1 | 01/01/2022 17:00:00 | 1 | NULL |
| 1 | 1 | 01/01/2022 17:05:00 | NULL | foo |
| 1 | 1 | 01/01/2022 17:10:00 | NULL | bar |
| 1 | 1 | 01/01/2022 17:15:00 | 2 | NULL |
| 1 | 2 | 01/01/2022 17:20:00 | NULL | wolf |
I want to create a view on this data that follows this format:
| customer_ID | transaction_ID | Timestamp | Value1 | Value2 |
|---|---|---|---|---|
| 1 | 1 | 01/01/2022 17:00:00 | 1 | NULL |
| 1 | 1 | 01/01/2022 17:05:00 | 1 | foo |
| 1 | 1 | 01/01/2022 17:10:00 | 1 | bar |
| 1 | 1 | 01/01/2022 17:15:00 | 2 | bar |
| 1 | 2 | 01/01/2022 17:20:00 | NULL | wolf |
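Essentially, I want to "roll up" the data so that Value1 and Value2 hold the most recent value for that transaction_id as of the given timestamp.
I have tried things like an aggregate with OVER (PARTITION BY ...), but that concatenates the values into a list (for strings) or returns their sum (for numeric values) instead of giving the most recent value.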
    SELECT *
    FROM (
        SELECT
            transaction_id,
            timestamp,
            STRING_AGG(Value1) OVER (PARTITION BY transaction_id) AS Value1,
            STRING_AGG(Value2) OVER (PARTITION BY transaction_id) AS Value2
        FROM Database
    ) AS t;
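To make the symptom concrete, here is a minimal sketch that reproduces it with Python's built-in sqlite3 module. It is not the asker's environment: the table name `your_table`, the ISO-formatted timestamps, the string values `foo`, `bar`, `wolf`, and the use of SQLite's `group_concat` in place of `STRING_AGG` are all assumptions made so the example runs anywhere. It needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Sample rows from the question. ISO timestamps are an assumption so that
# text ordering matches chronological ordering in SQLite.
ROWS = [
    (1, 1, "2022-01-01 17:00:00", 1, None),
    (1, 1, "2022-01-01 17:05:00", None, "foo"),
    (1, 1, "2022-01-01 17:10:00", None, "bar"),
    (1, 1, "2022-01-01 17:15:00", 2, None),
    (1, 2, "2022-01-01 17:20:00", None, "wolf"),
]

conn = sqlite3.connect(":memory:")
# `your_table` is a hypothetical name used only for this sketch.
conn.execute(
    "CREATE TABLE your_table (customer_id INTEGER, transaction_id INTEGER,"
    " timestamp TEXT, value1 INTEGER, value2 TEXT)"
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?, ?)", ROWS)

# With no ORDER BY inside OVER, the window covers the whole partition, so
# every row of a transaction receives the same aggregate.
whole_partition = conn.execute(
    """
    SELECT transaction_id,
           timestamp,
           sum(value1)          OVER (PARTITION BY transaction_id) AS value1,
           group_concat(value2) OVER (PARTITION BY transaction_id) AS value2
    FROM your_table
    ORDER BY transaction_id, timestamp
    """
).fetchall()

for row in whole_partition:
    print(row)
```

Every row of transaction 1 gets the same sum and the same concatenated list, which is why a plain partitioned aggregate cannot express "latest value so far": the window needs an ordering by timestamp and a way to pick a single value.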
uj5u.com user replied:
The traditional way (using subqueries as column expressions):
    select
        t1.Customer_Id
      , t1.transaction_id
      , t1.Timestamp
      , (select t2.value1
         from testable t2
         where t2.Customer_Id = t1.Customer_Id
           and t2.transaction_Id = t1.transaction_Id
           and t2.TimeStamp = (select max(t3.TimeStamp) as V1_TS
                               from testable t3
                               where t3.Customer_Id = t2.Customer_Id
                                 and t3.transaction_Id = t2.transaction_Id
                                 -- only look at rows up to the current row's timestamp
                                 and t3.TimeStamp <= t1.TimeStamp
                                 and t3.Value1 is not null
                              )
        ) LastVal1
      , (select t2.value2
         from testable t2
         where t2.Customer_Id = t1.Customer_Id
           and t2.transaction_Id = t1.transaction_Id
           and t2.TimeStamp = (select max(t3.TimeStamp) as V2_TS
                               from testable t3
                               where t3.Customer_Id = t2.Customer_Id
                                 and t3.transaction_Id = t2.transaction_Id
                                 -- only look at rows up to the current row's timestamp
                                 and t3.TimeStamp <= t1.TimeStamp
                                 and t3.Value2 is not null
                              )
        ) LastVal2
    from testable t1
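As a sanity check of the subquery-per-column idea, the sketch below runs it against the question's sample rows using Python's sqlite3 module. Several things here are assumptions rather than part of the answer: the SQLite engine, the ISO timestamp format, the string values `foo`, `bar`, `wolf`, and in particular the `t3.timestamp <= t1.timestamp` bound, which restricts the lookup to rows at or before the current row so that later values are not pulled backwards.

```python
import sqlite3

# Sample rows from the question (ISO timestamps assumed for correct ordering).
ROWS = [
    (1, 1, "2022-01-01 17:00:00", 1, None),
    (1, 1, "2022-01-01 17:05:00", None, "foo"),
    (1, 1, "2022-01-01 17:10:00", None, "bar"),
    (1, 1, "2022-01-01 17:15:00", 2, None),
    (1, 2, "2022-01-01 17:20:00", None, "wolf"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testable (customer_id INTEGER, transaction_id INTEGER,"
    " timestamp TEXT, value1 INTEGER, value2 TEXT)"
)
conn.executemany("INSERT INTO testable VALUES (?, ?, ?, ?, ?)", ROWS)

# For each row, find the latest timestamp (not after the current row) at
# which the column was non-null, then read the column value at that time.
last_known = conn.execute(
    """
    SELECT t1.customer_id,
           t1.transaction_id,
           t1.timestamp,
           (SELECT t2.value1
              FROM testable t2
             WHERE t2.customer_id = t1.customer_id
               AND t2.transaction_id = t1.transaction_id
               AND t2.timestamp = (SELECT max(t3.timestamp)
                                     FROM testable t3
                                    WHERE t3.customer_id = t1.customer_id
                                      AND t3.transaction_id = t1.transaction_id
                                      AND t3.timestamp <= t1.timestamp
                                      AND t3.value1 IS NOT NULL)) AS last_val1,
           (SELECT t2.value2
              FROM testable t2
             WHERE t2.customer_id = t1.customer_id
               AND t2.transaction_id = t1.transaction_id
               AND t2.timestamp = (SELECT max(t3.timestamp)
                                     FROM testable t3
                                    WHERE t3.customer_id = t1.customer_id
                                      AND t3.transaction_id = t1.transaction_id
                                      AND t3.timestamp <= t1.timestamp
                                      AND t3.value2 IS NOT NULL)) AS last_val2
      FROM testable t1
     ORDER BY t1.customer_id, t1.transaction_id, t1.timestamp
    """
).fetchall()

for row in last_known:
    print(row)
```

When no earlier non-null value exists, `max()` returns NULL, the equality matches no row, and the scalar subquery yields NULL, which is what the desired output shows for the first row of each transaction.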
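uj5u.com user replied:
To give each row a common value that it shares with the row its replacement should come from, first compute a running count with the window version of sum(). Sum over value1 IS NOT NULL (or value2 IS NOT NULL): as a Boolean expression in a numeric context, MySQL implicitly converts it to 0 or 1. (In other flavors of SQL, and in MySQL itself, the same conversion can be written explicitly with a CASE expression.)
You can then partition by this common value and use the first_value() window function to fetch the replacement: the value from the first row, in timestamp order, among the rows that share the common value, that is, within the partition.
Something like: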
    SELECT customer_id,
           transaction_id,
           timestamp,
           first_value(value1) OVER (PARTITION BY transaction_id,
                                                  cv1
                                     ORDER BY timestamp ASC) AS value1,
           first_value(value2) OVER (PARTITION BY transaction_id,
                                                  cv2
                                     ORDER BY timestamp ASC) AS value2
    FROM (SELECT customer_id,
                 transaction_id,
                 timestamp,
                 value1,
                 value2,
                 sum(value1 IS NOT NULL) OVER (PARTITION BY transaction_id
                                               ORDER BY timestamp ASC) AS cv1,
                 sum(value2 IS NOT NULL) OVER (PARTITION BY transaction_id
                                               ORDER BY timestamp ASC) AS cv2
          FROM elbat) x;
(Untested, because you did not provide the sample as consumable DDL and DML, as you should have.)
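Since the query above is marked untested, here is a small sketch that exercises the same idea on the question's sample rows with Python's sqlite3 module. The assumptions: SQLite 3.25 or later instead of MySQL, ISO-formatted timestamps, the string values `foo`, `bar`, `wolf`, and the explicit CASE form of the running count (the portable variant mentioned above). The helper columns `cv1` and `cv2` are also returned so the grouping key is visible.

```python
import sqlite3

# Sample rows from the question (ISO timestamps assumed for correct ordering).
ROWS = [
    (1, 1, "2022-01-01 17:00:00", 1, None),
    (1, 1, "2022-01-01 17:05:00", None, "foo"),
    (1, 1, "2022-01-01 17:10:00", None, "bar"),
    (1, 1, "2022-01-01 17:15:00", 2, None),
    (1, 2, "2022-01-01 17:20:00", None, "wolf"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE elbat (customer_id INTEGER, transaction_id INTEGER,"
    " timestamp TEXT, value1 INTEGER, value2 TEXT)"
)
conn.executemany("INSERT INTO elbat VALUES (?, ?, ?, ?, ?)", ROWS)

# Inner query: running count of non-null values per transaction. The count
# increases on each non-null row, so a non-null row and the null rows that
# follow it share the same cv value.
# Outer query: within each (transaction, cv) group the first row by
# timestamp carries the value to propagate.
filled = conn.execute(
    """
    SELECT customer_id,
           transaction_id,
           timestamp,
           first_value(value1) OVER (PARTITION BY transaction_id, cv1
                                     ORDER BY timestamp) AS value1,
           first_value(value2) OVER (PARTITION BY transaction_id, cv2
                                     ORDER BY timestamp) AS value2,
           cv1,
           cv2
    FROM (SELECT customer_id, transaction_id, timestamp, value1, value2,
                 sum(CASE WHEN value1 IS NOT NULL THEN 1 ELSE 0 END)
                     OVER (PARTITION BY transaction_id
                           ORDER BY timestamp) AS cv1,
                 sum(CASE WHEN value2 IS NOT NULL THEN 1 ELSE 0 END)
                     OVER (PARTITION BY transaction_id
                           ORDER BY timestamp) AS cv2
          FROM elbat) x
    ORDER BY customer_id, transaction_id, timestamp
    """
).fetchall()

for row in filled:
    print(row)
```

Rows that precede the first non-null value of a transaction fall into the group with count 0, whose first value is NULL, so they stay NULL as in the desired output.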
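uj5u.com user replied:
An old approach, which also works in MySQL versions before 8.0 (where window functions are not available), is to use variables.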
    select Customer_ID, transaction_ID, Timestamp, Value1, Value2
    from (
        select Customer_ID
             , Timestamp
             , case when (value1 is not null or transaction_ID != @transId)
                     and @val1 := value1
                    then value1 else @val1 end as Value1
             , case when (value2 is not null or transaction_ID != @transId)
                     and @val2 := value2
                    then value2 else @val2 end as Value2
             , @transId := transaction_ID as transaction_ID
        from your_table
        cross join (select @transId := 0, @val1 := 0, @val2 := '') vals
        order by Customer_ID, transaction_ID, Timestamp
    ) q
| Customer_ID | transaction_ID | Timestamp | Value1 | Value2 |
|---|---|---|---|---|
| 1 | 1 | 2022-01-01 17:00:00 | 1 | NULL |
| 1 | 1 | 2022-01-01 17:05:00 | 1 | foo |
| 1 | 1 | 2022-01-01 17:10:00 | 1 | bar |
| 1 | 1 | 2022-01-01 17:15:00 | 2 | bar |
| 1 | 2 | 2022-01-01 17:20:00 | NULL | wolf |
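Side note: the sort order in the subquery matters for this approach, because the variables carry state from one row to the next.
Another approach is to use correlated subqueries: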
    select Customer_ID, transaction_ID, Timestamp
         , coalesce(Value1, (select t2.Value1
                             from your_table t2
                             where t2.transaction_ID = t.transaction_ID
                               and t2.Value1 is not null
                               and t2.Timestamp < t.Timestamp
                             order by t2.Timestamp desc
                             limit 1
                            )) as Value1
         , coalesce(Value2, (select t2.Value2
                             from your_table t2
                             where t2.transaction_ID = t.transaction_ID
                               and t2.Value2 is not null
                               and t2.Timestamp < t.Timestamp
                             order by t2.Timestamp desc
                             limit 1
                            )) as Value2
    from your_table t
    order by Customer_ID, transaction_ID, Timestamp;
A demo is available on db<>fiddle.
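For readers without access to that demo, the correlated-subquery variant can also be tried locally. The following is a sketch using Python's sqlite3 module rather than MySQL. The ISO timestamp format and the string values `foo`, `bar`, `wolf` are assumptions, and SQLite is assumed to accept the same `ORDER BY ... LIMIT 1` scalar subquery.

```python
import sqlite3

# Sample rows from the question (ISO timestamps assumed for correct ordering).
ROWS = [
    (1, 1, "2022-01-01 17:00:00", 1, None),
    (1, 1, "2022-01-01 17:05:00", None, "foo"),
    (1, 1, "2022-01-01 17:10:00", None, "bar"),
    (1, 1, "2022-01-01 17:15:00", 2, None),
    (1, 2, "2022-01-01 17:20:00", None, "wolf"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (customer_id INTEGER, transaction_id INTEGER,"
    " timestamp TEXT, value1 INTEGER, value2 TEXT)"
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?, ?)", ROWS)

# coalesce keeps the row's own value when present; otherwise the subquery
# returns the most recent earlier non-null value in the same transaction.
carried = conn.execute(
    """
    SELECT customer_id, transaction_id, timestamp,
           coalesce(value1, (SELECT t2.value1
                               FROM your_table t2
                              WHERE t2.transaction_id = t.transaction_id
                                AND t2.value1 IS NOT NULL
                                AND t2.timestamp < t.timestamp
                              ORDER BY t2.timestamp DESC
                              LIMIT 1)) AS value1,
           coalesce(value2, (SELECT t2.value2
                               FROM your_table t2
                              WHERE t2.transaction_id = t.transaction_id
                                AND t2.value2 IS NOT NULL
                                AND t2.timestamp < t.timestamp
                              ORDER BY t2.timestamp DESC
                              LIMIT 1)) AS value2
      FROM your_table t
     ORDER BY customer_id, transaction_id, timestamp
    """
).fetchall()

for row in carried:
    print(row)
```

The subquery only runs its lookup for rows whose own value is NULL, and an index on (transaction_id, timestamp) would likely help on larger tables, though that is not measured here.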
