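I am trying to write a query that groups runs of consecutive rows for later processing. The grouping rules are:
- Each row has a segment identifier and a corresponding weight.
- As many consecutive segments as possible must be grouped together, provided their total weight does not exceed a specified threshold.
- If a single segment's weight exceeds the threshold, the resulting group consists of that segment alone.
Here is an example: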
SET NOCOUNT ON;
DROP TABLE IF EXISTS #tmpIncoming;
CREATE TABLE #tmpIncoming
(
[Segment] int NOT NULL,
[Weight] int NOT NULL
);
INSERT INTO #tmpIncoming VALUES
( 1, 25),
( 2, 45),
( 3, 20),
( 4, 30),
( 5, 50),
( 6, 21),
( 7, 110);
DECLARE @nMaxChunkSize int = 100;
-- BEGIN: suboptimal
DROP TABLE IF EXISTS #tmpResult;
CREATE TABLE #tmpResult
(
[MinSegment] int NOT NULL,
[MaxSegment] int NOT NULL,
[Weight] int NOT NULL
);
DECLARE cur CURSOR LOCAL READ_ONLY FORWARD_ONLY STATIC FOR
SELECT
[Segment],
[Weight]
FROM
#tmpIncoming
ORDER BY
[Segment];
OPEN cur;
DECLARE @nMinSegment int = 0, @nMaxSegment int = 0;
DECLARE @nWeightSoFar int = 0;
WHILE (1=1)
BEGIN
DECLARE @nSegment int, @nWeight int;
FETCH NEXT FROM cur INTO @nSegment, @nWeight;
IF (@@FETCH_STATUS <> 0)
BREAK;
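-- Flush only a non-empty group; otherwise an oversized first segment would emit an empty (0, 0, 0) row.
IF (@nWeightSoFar > 0 AND @nWeightSoFar + @nWeight > @nMaxChunkSize)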
BEGIN
INSERT INTO #tmpResult ([MinSegment], [MaxSegment], [Weight])
VALUES (@nMinSegment, @nMaxSegment, @nWeightSoFar);
SET @nMinSegment = @nSegment;
SET @nMaxSegment = @nSegment;
SET @nWeightSoFar = @nWeight;
END
ELSE
BEGIN
IF (@nMinSegment = 0)
SET @nMinSegment = @nSegment;
SET @nMaxSegment = @nSegment;
SET @nWeightSoFar = @nWeightSoFar + @nWeight;
END;
END;
CLOSE cur;
DEALLOCATE cur;
IF (@nWeightSoFar > 0)
INSERT INTO #tmpResult ([MinSegment], [MaxSegment], [Weight])
VALUES (@nMinSegment, @nMaxSegment, @nWeightSoFar);
SELECT * FROM #tmpResult;
DROP TABLE IF EXISTS #tmpResult;
-- END: suboptimal
DROP TABLE IF EXISTS #tmpIncoming;
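The only thing I could come up with is this suboptimal implementation using a cursor and variables. Can anyone recommend a better approach, ideally a single SELECT with some CTEs?
uj5u.com user reply:
You can use recursion to walk through the weights in segment order and reset the running total as soon as adding the next weight would exceed the chunk size.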
DECLARE @nMaxChunkSize int = 100;
;WITH x AS
(
SELECT Segment,
Weight,
rn = ROW_NUMBER() OVER (ORDER BY Segment)
FROM #tmpIncoming
),
cte AS
(
SELECT Segment, Weight, rn, total = Weight, flip = 0
FROM x
WHERE rn = 1
UNION ALL
SELECT x.Segment, x.Weight, x.rn, total = CASE
WHEN x.Weight + cte.total > @nMaxChunkSize
THEN x.Weight ELSE x.Weight + cte.total END,
flip = flip + CASE
WHEN x.Weight + cte.total > @nMaxChunkSize
THEN 1 ELSE 0 END
FROM x JOIN cte
ON x.rn = cte.rn + 1
)
SELECT MinSegment = MIN(Segment),
MaxSegment = MAX(Segment),
Weight = MAX(total)
FROM cte
GROUP BY flip
ORDER BY MinSegment
OPTION (MAXRECURSION 0);
Result:
| MinSegment | MaxSegment | Weight |
|---|---|---|
| 1 | 3 | 90 |
| 4 | 5 | 80 |
| 6 | 6 | 21 |
| 7 | 7 | 110 |
- Example db<>fiddle
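To see how the reset works, it can help to look at the raw rows of the recursive CTE before they are grouped. The following is only a sketch of the query above with the final aggregation swapped for a plain SELECT; it assumes #tmpIncoming from the question still exists in the current session and holds the sample data.

```sql
-- Assumption: #tmpIncoming from the question exists in this session.
DECLARE @nMaxChunkSize int = 100;

;WITH x AS
(
    -- Number the rows so the recursion can step from one row to the next.
    SELECT Segment, Weight,
           rn = ROW_NUMBER() OVER (ORDER BY Segment)
    FROM #tmpIncoming
),
cte AS
(
    -- Anchor: the first segment starts the first group.
    SELECT Segment, Weight, rn, total = Weight, flip = 0
    FROM x
    WHERE rn = 1
    UNION ALL
    -- Step: either extend the running total or restart it at this row's weight.
    SELECT x.Segment, x.Weight, x.rn,
           total = CASE WHEN x.Weight + cte.total > @nMaxChunkSize
                        THEN x.Weight ELSE x.Weight + cte.total END,
           flip  = cte.flip + CASE WHEN x.Weight + cte.total > @nMaxChunkSize
                                   THEN 1 ELSE 0 END
    FROM x
    JOIN cte ON x.rn = cte.rn + 1
)
-- Ungrouped view: one row per segment with its running total and group counter.
SELECT Segment, Weight, total, flip
FROM cte
ORDER BY Segment
OPTION (MAXRECURSION 0);
```

For the sample data, total should run 25, 70, 90, 30, 80, 21, 110 while flip runs 0, 0, 0, 1, 1, 2, 3, so grouping by flip and taking MAX(total) gives the four rows shown in the result table.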
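Another approach that produces the same result but may be easier to break down and follow (although, arguably, the code quickly becomes as verbose as the original):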
DECLARE @nMaxChunkSize int = 100;
;WITH x AS
(
SELECT Segment,
Weight,
rn = ROW_NUMBER() OVER (ORDER BY Segment)
FROM #tmpIncoming
),
cte AS
(
SELECT Segment, Weight, rn, Total = Weight
FROM x
WHERE rn = 1
UNION ALL
SELECT x.Segment, x.Weight, x.rn, Total = CASE
WHEN x.Weight + cte.Total > @nMaxChunkSize
THEN x.Weight ELSE x.Weight + cte.Total END
FROM x JOIN cte ON x.rn = cte.rn + 1
)
SELECT MinSegment = MIN(Segment),
MaxSegment = MAX(Segment),
Weight = MAX(Total)
FROM
(
SELECT Segment, Total,
NewGroup = SUM(CASE WHEN Weight = Total THEN 1 ELSE 0 END)
OVER (ORDER BY Segment ROWS UNBOUNDED PRECEDING) FROM cte
) AS y
GROUP BY NewGroup
ORDER BY MinSegment
OPTION (MAXRECURSION 0);
- That fiddle is here: db<>fiddle
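In this second variant a row is treated as the start of a new group whenever its Weight equals its running Total. A quick way to check that is to list the NewGroup value per segment before aggregating. This is a sketch only: it assumes #tmpIncoming from the question still exists in the session and that every weight is greater than zero (a zero weight at the start of a group would make the following row look like a group start too).

```sql
-- Assumptions: #tmpIncoming exists in this session; all weights are > 0.
DECLARE @nMaxChunkSize int = 100;

;WITH x AS
(
    SELECT Segment, Weight,
           rn = ROW_NUMBER() OVER (ORDER BY Segment)
    FROM #tmpIncoming
),
cte AS
(
    SELECT Segment, Weight, rn, Total = Weight
    FROM x
    WHERE rn = 1
    UNION ALL
    SELECT x.Segment, x.Weight, x.rn,
           Total = CASE WHEN x.Weight + cte.Total > @nMaxChunkSize
                        THEN x.Weight ELSE x.Weight + cte.Total END
    FROM x
    JOIN cte ON x.rn = cte.rn + 1
)
-- Running count of "group start" rows gives each segment its group number.
SELECT Segment, Weight, Total,
       NewGroup = SUM(CASE WHEN Weight = Total THEN 1 ELSE 0 END)
                  OVER (ORDER BY Segment ROWS UNBOUNDED PRECEDING)
FROM cte
ORDER BY Segment
OPTION (MAXRECURSION 0);
```

With the sample data, NewGroup should come out as 1, 1, 1, 2, 2, 3, 4 for segments 1 through 7, which matches the groups in the result table.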
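uj5u.com user reply:
Because of the nature of the calculation (the running total resets depending on the rows before it), I don't think this can be done with window functions. I would be inclined to do it on the application server, or in CLR, with a .NET DataReader. If you really want to use SQL, you could try the Quirky Update: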
https://www.sqlservercentral.com/articles/solving-the-running-total-and-ordinal-rank-problems-rewritten
Be aware, though, that it is non-relational and could be broken by a Microsoft patch.
DROP TABLE IF EXISTS #t;
CREATE TABLE #t
(
Segment int NOT NULL PRIMARY KEY
,[Weight] int NOT NULL
,MinSegment int NULL
,WeightSoFar int NULL
);
INSERT INTO #t (Segment, [Weight])
SELECT Segment, [Weight]
FROM #tmpIncoming;
DECLARE @nMaxChunkSize int = 100
,@WeightSoFar int = 0
,@Break int = 0
,@MinSegment int = 1
,@Check int
,@Anchor int;
UPDATE #t
SET @Break =
CASE
WHEN [Weight] + @WeightSoFar > @nMaxChunkSize
THEN 1
ELSE 0
END
,@WeightSoFar =
CASE
WHEN @Break = 1
THEN [Weight]
ELSE [Weight] + @WeightSoFar
END
,@MinSegment =
CASE
WHEN @Break = 1
THEN Segment
ELSE @MinSegment
END
,WeightSoFar = @WeightSoFar
,MinSegment = @MinSegment
-- Double check running in segment order
,@Check = CASE WHEN Segment > ISNULL(@Anchor, -1) THEN 1 ELSE 1/0 END
,@Anchor = Segment
FROM #t WITH (TABLOCKX)
OPTION (MAXDOP 1);
SELECT MinSegment
,MAX(Segment) AS MaxSegment
,MAX(WeightSoFar) AS [Weight]
FROM #t
GROUP BY MinSegment
ORDER BY MinSegment;
--DROP TABLE IF EXISTS #t;
