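I have a table in SQL Server that contains customer transactions from 2022-02-10 to 2022-03-10.
I want to find the customers who have at least 5 transactions within at most three consecutive days.
For example, for the table below the output should be CustomerId = 2 and CustomerId = 3: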
| ID | CustomerId | TransactionDate |
|---|---|---|
| 1 | 1 | 2022-03-01 |
| 2 | 1 | 2022-03-01 |
| 3 | 1 | 2022-03-05 |
| 4 | 1 | 2022-03-07 |
| 5 | 1 | 2022-03-07 |
| 6 | 2 | 2022-03-05 |
| 7 | 2 | 2022-03-05 |
| 8 | 2 | 2022-03-06 |
| 9 | 2 | 2022-03-06 |
| 10 | 2 | 2022-03-07 |
| 1 | 3 | 2022-03-01 |
| 2 | 3 | 2022-03-01 |
| 3 | 3 | 2022-03-01 |
| 4 | 3 | 2022-03-03 |
| 5 | 3 | 2022-03-03 |
I tried this query, but it performs poorly on large tables:
select distinct p1.customerid
from trntbl p1
join trntbl p2 on p2.id <> p1.id
and p2.customerid = p1.customerid
and p2.TransactionDate >= p1.TransactionDate
and p2.TransactionDate < dateadd(day, 3, p1.TransactionDate)
group by p1.customerid, p1.id
having count(*) >= 4
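As a reference for what any solution should return, here is a minimal Python sketch of the requirement. It is not part of the original query. It assumes the dates carry no time component and reads "three consecutive days" as a window of three calendar days, matching the `>=` / `< dateadd(day, 3, ...)` bounds above. The names `ROWS`, `busy_customers`, `min_count` and `window_days` are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question: (CustomerId, TransactionDate).
ROWS = [
    (1, date(2022, 3, 1)), (1, date(2022, 3, 1)), (1, date(2022, 3, 5)),
    (1, date(2022, 3, 7)), (1, date(2022, 3, 7)),
    (2, date(2022, 3, 5)), (2, date(2022, 3, 5)), (2, date(2022, 3, 6)),
    (2, date(2022, 3, 6)), (2, date(2022, 3, 7)),
    (3, date(2022, 3, 1)), (3, date(2022, 3, 1)), (3, date(2022, 3, 1)),
    (3, date(2022, 3, 3)), (3, date(2022, 3, 3)),
]


def busy_customers(rows, min_count=5, window_days=3):
    """Return sorted ids of customers with at least min_count transactions
    inside some window of window_days consecutive calendar days."""
    by_customer = defaultdict(list)
    for customer_id, day in rows:
        by_customer[customer_id].append(day)

    result = []
    for customer_id, days in by_customer.items():
        days.sort()
        left = 0
        for right, day in enumerate(days):
            # Advance the left edge until the window spans fewer than window_days days.
            while day - days[left] >= timedelta(days=window_days):
                left += 1
            if right - left + 1 >= min_count:
                result.append(customer_id)
                break
    return sorted(result)


print(busy_customers(ROWS))  # [2, 3]
```

After sorting, each customer's rows are scanned once, which is the kind of work a window-function query can do instead of joining every transaction to every other transaction of the same customer.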
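A uj5u.com user replied:
If the customer must have transactions on each of three consecutive days (so 5 transactions on one day followed by two days with none does not count), this can be done with two self joins: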
with cte as
(select CustomerId, Transactiondate, count(*) ct
from table_name
group by CustomerId, Transactiondate)
select distinct t1.CustomerId
from cte t1 inner join cte t2
on t1.Transactiondate = dateadd(day, 1, t2.Transactiondate)
and t1.CustomerId = t2.CustomerId
inner join cte t3
on t2.Transactiondate = dateadd(day, 1, t3.Transactiondate)
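and t3.CustomerId = t2.CustomerId
-- ct is the per-day count; require at least 5 transactions over the three days
where t1.ct + t2.ct + t3.ct >= 5
;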
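A uj5u.com user replied:
Although this is a gaps-and-islands problem, you can take a shortcut.
Group by customer and date, use LAG to fetch the date two rows back, and keep only the rows where that date is exactly two days earlier.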
SELECT DISTINCT
CustomerId
FROM (
SELECT
t.CustomerId,
v.Date,
Prev2 = LAG(v.Date, 2) OVER (PARTITION BY t.CustomerId ORDER BY v.Date)
FROM YourTable t
CROSS APPLY (VALUES( CAST(Transactiondate AS date) )) v(Date)
GROUP BY
t.CustomerId,
v.Date
) t
WHERE DATEDIFF(day, t.Prev2, t.Date) = 2
If the base table has at most one row per customer per date, you can drop the GROUP BY.
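To see why looking two rows back is enough, here is a small Python sketch of the same idea for a single customer. Once the dates are distinct and sorted, a date that is exactly two days after the date two rows earlier must have the day in between as its immediate predecessor. The function name `has_three_day_run` is hypothetical. Like the query, the sketch tests only for a run of three dates, not for a transaction count.

```python
from datetime import date, timedelta


def has_three_day_run(days):
    """True if the given dates include three consecutive calendar days."""
    distinct = sorted(set(days))              # mirrors GROUP BY CustomerId, Date
    for i in range(2, len(distinct)):
        prev2 = distinct[i - 2]               # mirrors LAG(Date, 2)
        if distinct[i] - prev2 == timedelta(days=2):  # DATEDIFF(day, Prev2, Date) = 2
            return True
    return False


customer_2 = [date(2022, 3, 5), date(2022, 3, 5), date(2022, 3, 6),
              date(2022, 3, 6), date(2022, 3, 7)]
customer_3 = [date(2022, 3, 1), date(2022, 3, 1), date(2022, 3, 1),
              date(2022, 3, 3), date(2022, 3, 3)]

print(has_three_day_run(customer_2))  # True
print(has_three_day_run(customer_3))  # False
```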
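A uj5u.com user replied:
This is really a gaps-and-islands problem. You can solve it by subtracting a consecutive row_number (an analytic window function) from the consecutive dates and then grouping, after first filling in any gaps with the help of a numbers table.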
-- numbers holds 0..19, enough to fill date ranges of up to 20 days; raise TOP for longer ranges
with numbers as (select top(20) Row_Number() over(order by (select null))-1 n from master.dbo.spt_values),
dRanges as (
select customerId,
Min(Transactiondate) CustStartDate,
Max(Transactiondate) CustEndDate
from t
group by CustomerId
), dates as (
select *
from dranges r
outer apply (
select DateAdd(day,n,r.CustStartDate) SeqDate
from numbers n
where DateAdd(day,n,r.CustStartDate) <= r.CustEndDate
)d
), q as (
select customerId, transactiondate, Count(*) qty
from t
group by CustomerId, Transactiondate
), g as (
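-- dGrp = SeqDate minus its row number, which is equal for dates that are consecutive.
-- Because the gaps were filled above, all of a customer's dates are consecutive,
-- so dGrp is constant per customer and the final GROUP BY covers the customer's whole date range.
select d.CustomerId, d.SeqDate, IsNull(q.qty,0) Qty,
DateAdd(day, - row_number() over (partition by d.customerid order by d.SeqDate), d.SeqDate) as dGrp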
from dates d
left join q on q.Transactiondate = d.SeqDate and q.CustomerId = d.CustomerId
)
select customerId
from g
group by CustomerId, dGrp
having Count(*) <= 3 and Sum(qty) >= 5
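The row_number subtraction is easier to follow on a tiny example. The Python sketch below applies it to one customer's distinct dates and deliberately skips the gap-filling step, so it shows the general trick rather than reproducing the query above. The name `islands` is hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby


def islands(days):
    """Split dates into runs of consecutive calendar days."""
    distinct = sorted(set(days))
    # Consecutive dates advance in step with their row number, so
    # (date minus row_number days) is the same for every date in one run.
    keyed = [(d - timedelta(days=rn), d) for rn, d in enumerate(distinct, start=1)]
    return [[d for _, d in group]
            for _, group in groupby(keyed, key=lambda pair: pair[0])]


customer_1 = [date(2022, 3, 1), date(2022, 3, 1), date(2022, 3, 5),
              date(2022, 3, 7), date(2022, 3, 7)]
customer_2 = [date(2022, 3, 5), date(2022, 3, 5), date(2022, 3, 6),
              date(2022, 3, 6), date(2022, 3, 7)]

print([len(run) for run in islands(customer_1)])  # [1, 1, 1]
print([len(run) for run in islands(customer_2)])  # [3]
```

Customer 1's dates fall into three single-day islands, while customer 2's dates form one island of three days.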
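A uj5u.com user replied:
You can use the DATEDIFF function and check that the sum of the day differences is between 3 and 5 (assuming the largest difference is 1). The dates may all be distinct (for example, customerid 2 could have transactions on March 5, 6, 7, 8 and 9, 2022), and that case should be handled as well.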
declare @tbl table(id int identity,customerid int,transactiondate date)
insert into @tbl(customerid,transactiondate)
values(1,'2022-03-01')
,(1,'2022-03-01')
,(1,'2022-03-05')
,(1,'2022-03-07')
,(1,'2022-03-07')
,(2,'2022-03-05')
,(2,'2022-03-05')
,(2,'2022-03-06')
,(2,'2022-03-06')
,(2,'2022-03-07')
select customerid from (
select *
,SUM(datediff)over(partition by customerid order by transactiondate)[sum]
,max(datediff)over(partition by customerid order by transactiondate)[max]
from(
select customerid , transactiondate,
DATEDIFF(DAY
,
case when LEAD(transactiondate,1)over(partition by customerid order by transactiondate)
is null then
LAG(transactiondate,1,transactiondate)
over(partition by customerid order by transactiondate)
else
transactiondate end
, case when LEAD(transactiondate,1)over(partition by customerid order by transactiondate)
is null then
transactiondate
else
LEAD(transactiondate,1,transactiondate)
over(partition by customerid order by transactiondate)end) as [datediff]
,ROW_NUMBER()over(partition by customerid order by transactiondate)rownum
from @tbl
)t
)t1
where t1.rownum = 5
and t1.max = 1
and t1.sum between 3 and 5
