How to roll up a time-series-like table in MySQL to get the latest non-null value?

I have a time-series table in SQL with data in the following format:

Note: each customer_ID can have multiple transaction_IDs.

Customer_ID	transaction_ID	Timestamp	Value1	Value2
1	1	01/01/2022 17:00:00	1	NULL
1	1	01/01/2022 17:05:00	NULL	Foo
1	1	01/01/2022 17:10:00	NULL	Bar
1	1	01/01/2022 17:15:00	2	NULL
1	2	01/01/2022 17:20:00	NULL	Wolf

Based on this data, I want to create a view with the following format:

Customer_ID	transaction_ID	Timestamp	Value1	Value2
1	1	01/01/2022 17:00:00	1	NULL
1	1	01/01/2022 17:05:00	1	Foo
1	1	01/01/2022 17:10:00	1	Bar
1	1	01/01/2022 17:15:00	2	Bar
1	2	01/01/2022 17:20:00	NULL	Wolf

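Essentially, I want to "roll" the data forward so that Value1 and Value2 hold the latest non-null value for that transaction_id as of each row's timestamp.

I have tried things like an aggregate with OVER (PARTITION BY ...), but it concatenates the values into a list instead of giving the most recent value (for strings) or their sum (for numeric values).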

SELECT * FROM
(SELECT
transaction_id,
timestamp,
STRING_AGG(Value1) OVER(PARTITION BY transaction_id) AS Value1,
STRING_AGG(Value2) OVER(PARTITION BY transaction_id) AS Value2
FROM Database) AS t;

uj5u.com user reply:

The traditional way (using subqueries as column expressions):

select
  t1.Customer_Id
  , t1.transaction_id
  , t1.Timestamp
  -- latest non-null value1 of this transaction as of the current row
  , (select t2.value1
        from testable t2
        where t2.Customer_Id=t1.Customer_Id
        and t2.transaction_Id=t1.transaction_Id
        and t2.TimeStamp=(select max(t3.TimeStamp)
                         from testable t3
                         where t3.Customer_Id=t2.Customer_Id
                         and t3.transaction_Id=t2.transaction_Id
                         and t3.Value1 is not null
                         -- ignore rows later than the current one
                         and t3.TimeStamp<=t1.TimeStamp
                         )
       ) LastVal1
  -- latest non-null value2 of this transaction as of the current row
  , (select t2.value2
        from testable t2
        where t2.Customer_Id=t1.Customer_Id
        and t2.transaction_Id=t1.transaction_Id
        and t2.TimeStamp=(select max(t3.TimeStamp)
                         from testable t3
                         where t3.Customer_Id=t2.Customer_Id
                         and t3.transaction_Id=t2.transaction_Id
                         and t3.Value2 is not null
                         -- ignore rows later than the current one
                         and t3.TimeStamp<=t1.TimeStamp
                         )
       ) LastVal2
  from testable t1

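uj5u.com user reply:

To give each row a common value that it shares with the row its replacement should come from, you can first compute a running sum with the window version of sum(). Sum over value1 IS NOT NULL (or value2 IS NOT NULL): as a Boolean expression in a numeric context, MySQL implicitly converts it to 0 or 1. (In other flavors of SQL, and in MySQL itself, this can be done explicitly with a CASE expression.)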
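
As a rough sketch of the explicit CASE variant mentioned above, the following computes the running count for value1 only. It assumes MySQL 8.0 or later (needed for the CTE and the window function). The inline CTE merely stands in for the real table, using the sample rows from the question with the string values written as Foo, Bar and Wolf; cv1 is the name the query further down uses for this counter.

```sql
-- Stand-in for the real table: the five sample rows from the question.
WITH elbat (customer_id, transaction_id, `timestamp`, value1, value2) AS (
  SELECT 1, 1, '2022-01-01 17:00:00', 1,    NULL   UNION ALL
  SELECT 1, 1, '2022-01-01 17:05:00', NULL, 'Foo'  UNION ALL
  SELECT 1, 1, '2022-01-01 17:10:00', NULL, 'Bar'  UNION ALL
  SELECT 1, 1, '2022-01-01 17:15:00', 2,    NULL   UNION ALL
  SELECT 1, 2, '2022-01-01 17:20:00', NULL, 'Wolf'
)
SELECT transaction_id,
       `timestamp`,
       value1,
       -- Count the non-null value1 rows seen so far within the transaction.
       SUM(CASE WHEN value1 IS NOT NULL THEN 1 ELSE 0 END)
         OVER (PARTITION BY transaction_id ORDER BY `timestamp`) AS cv1
FROM elbat
ORDER BY transaction_id, `timestamp`;
```

For transaction 1 the counter comes out as 1, 1, 1, 2, so the two NULL rows share cv1 = 1 with the 17:00 row that holds the value they should inherit. Transaction 2 gets 0 because it has no non-null value1 at all.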

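You can then partition by this common value and use the first_value() window function to get the replacement value: the value of the first row, in timestamp order, among the rows that share the common value, i.e. within the partition.

Something like: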

SELECT customer_id,
       transaction_id,
       timestamp,
       first_value(value1) OVER (PARTITION BY transaction_id,
                                              cv1
                                 ORDER BY timestamp ASC) AS value1,
       first_value(value2) OVER (PARTITION BY transaction_id,
                                              cv2
                                 ORDER BY timestamp ASC) AS value2
       FROM (SELECT customer_id,
                    transaction_id,
                    timestamp,
                    value1,
                    value2,
                    sum(value1 IS NOT NULL) OVER (PARTITION BY transaction_id
                                                  ORDER BY timestamp ASC) AS cv1,
                    sum(value2 IS NOT NULL) OVER (PARTITION BY transaction_id
                                                  ORDER BY timestamp ASC) AS cv2
                    FROM elbat) x;

(Untested, because you did not provide the sample data as consumable DDL and DML, as you should have.)
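
Since the question asks for a view, here is a minimal sketch of how the same idea could be wrapped in one, together with DDL and DML for the sample rows. It assumes MySQL 8.0 or later. The column types and the view name elbat_filled are assumptions, not something the question specifies, and customer_id is added to the partitions on the assumption that transaction IDs might repeat across customers.

```sql
-- Hypothetical schema: column names follow the question, types are guesses.
CREATE TABLE elbat (
  customer_id    INT NOT NULL,
  transaction_id INT NOT NULL,
  `timestamp`    DATETIME NOT NULL,
  value1         INT NULL,
  value2         VARCHAR(20) NULL
);

INSERT INTO elbat (customer_id, transaction_id, `timestamp`, value1, value2) VALUES
  (1, 1, '2022-01-01 17:00:00', 1,    NULL),
  (1, 1, '2022-01-01 17:05:00', NULL, 'Foo'),
  (1, 1, '2022-01-01 17:10:00', NULL, 'Bar'),
  (1, 1, '2022-01-01 17:15:00', 2,    NULL),
  (1, 2, '2022-01-01 17:20:00', NULL, 'Wolf');

-- elbat_filled is a hypothetical view name.
CREATE VIEW elbat_filled AS
SELECT customer_id,
       transaction_id,
       `timestamp`,
       -- First row of each (transaction, counter) group carries the value to reuse.
       first_value(value1) OVER (PARTITION BY customer_id, transaction_id, cv1
                                 ORDER BY `timestamp`) AS value1,
       first_value(value2) OVER (PARTITION BY customer_id, transaction_id, cv2
                                 ORDER BY `timestamp`) AS value2
FROM (SELECT customer_id, transaction_id, `timestamp`, value1, value2,
             -- Running count of non-null values seen so far, per column.
             sum(value1 IS NOT NULL) OVER (PARTITION BY customer_id, transaction_id
                                           ORDER BY `timestamp`) AS cv1,
             sum(value2 IS NOT NULL) OVER (PARTITION BY customer_id, transaction_id
                                           ORDER BY `timestamp`) AS cv2
      FROM elbat) x;

SELECT * FROM elbat_filled
ORDER BY customer_id, transaction_id, `timestamp`;
```

On the five sample rows this should reproduce the table the question asks for: value1 stays 1 until it becomes 2 at 17:15, and value2 carries Bar forward from 17:10 to 17:15.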

uj5u.com user reply:

An older method, which also works in MySQL versions before 8.0 (these have no window functions), is to use variables.

select Customer_ID, transaction_ID, Timestamp, Value1, Value2
from
(
  select Customer_ID
  , Timestamp
  , case 
    when (value1 is not null or transaction_ID != @transId) 
     and @val1 := value1
    then value1
    else @val1
    end as Value1
  , case 
    when (value2 is not null or transaction_ID != @transId) 
     and @val2 := value2
    then value2
    else @val2
    end as Value2
  , @transId := transaction_ID as transaction_ID
  from your_table
  cross join (select @transId:=0, @val1:=0, @val2:='') vals
  order by Customer_ID, transaction_ID, Timestamp
) q

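Customer_ID	transaction_ID	Timestamp	Value1	Value2
1	1	2022-01-01 17:00:00	1	NULL
1	1	2022-01-01 17:05:00	1	Foo
1	1	2022-01-01 17:10:00	1	Bar
1	1	2022-01-01 17:15:00	2	Bar
1	2	2022-01-01 17:20:00	NULL	Wolf

Side note: the sort order in the subquery matters for this method.

Another approach is to use correlated subqueries: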

select Customer_ID, transaction_ID, Timestamp
, coalesce(Value1, (select t2.Value1 
   from your_table t2 
   where t2.transaction_ID = t.transaction_ID
   and t2.Value1 is not null
   and t2.Timestamp < t.Timestamp
   order by t2.Timestamp desc
   limit 1
  )) as Value1
, coalesce(Value2, (select t2.Value2
   from your_table t2 
   where t2.transaction_ID = t.transaction_ID
   and t2.Value2 is not null
   and t2.Timestamp < t.Timestamp
   order by t2.Timestamp desc
   limit 1
  )) as Value2
from your_table t
order by Customer_ID, transaction_ID, Timestamp;
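
Each correlated subquery above looks up one transaction and then walks its timestamps backwards, so a composite index on those two columns is likely to help on larger tables. The sketch below shows one way to declare it. Only the table and column names come from the query; the column types and the index name idx_tx_ts are assumptions.

```sql
-- Assumed schema: names follow the query above, types are guesses.
CREATE TABLE your_table (
  Customer_ID    INT NOT NULL,
  transaction_ID INT NOT NULL,
  `Timestamp`    DATETIME NOT NULL,
  Value1         INT NULL,
  Value2         VARCHAR(20) NULL,
  -- Lets each subquery seek to the transaction and scan timestamps in order.
  INDEX idx_tx_ts (transaction_ID, `Timestamp`)
);
```

On an existing table the same index could be added with CREATE INDEX instead of recreating the table.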

Demo on db<>fiddle here

