我有一張表格,記錄了在數臺生產機器上生產的每一件作品,可以追溯到幾年前。定期(例如每周一次)我想檢查此表以建立每個機器和產品組合的“最佳性能”記錄,并根據以下規則將它們存盤在新表中;
- 機器必須在 10 天內至少生產 10,000 個零件 - 如果在 10 天內僅生產 9000 個零件,則此記錄無效
- 機器必須在整個期間都在運行相同的產品而沒有更換,即如果在第 5 天更換了產品,這是一個無效記錄
性能資料表如下所示 [VisionMachineResults]
| ID | MCSAP | 約會時間 | 產品名稱 | 檢查結果 |
|---|---|---|---|---|
| 1 | 123456 | 2020-01-01 08:29:34:456 | 產品A | 0 |
| 2 | 123456 | 2020-01-01 08:45:50:456 | 產品B | 1 |
| 3 | 844214 | 2020-01-01 08:34:48:456 | 產品A | 2 |
| 4 | 978415 | 2020-01-02 09:29:26:456 | 產品C | 0 |
| 5 | 985633 | 2020-01-04 23:29:11:456 | 產品A | 2 |
我能夠生成一個結果,其中列出了每個 SAP / 產品組合的各個天的性能串列,但是我需要在 SQL 之外的復雜回圈中處理資料以建立 10 天組。
我目前的查詢是:
SELECT CAST(DateTime AS date) AS InputDate,
MCSAP,
ZAssetRegister.LocalName,
ProductName,
SUM(CASE WHEN InspectionResult = 0 THEN 1 END) AS OKParts,
COUNT(CASE WHEN InspectionResult > 0 THEN 1 END) AS NGParts
FROM [VisionMachineResults]
INNER JOIN ZAssetRegister ON VisionMachineResults.MCSAP = ZAssetRegister.SAP_Number
GROUP BY CAST(DateTime AS date),
MCSAP,
ProductName,
ZAssetRegister.LocalName
ORDER BY InputDate,
ZAssetRegister.LocalName;
是否有可能讓 SQL 查詢以 10 天組的形式給出結果,而不是每天,即
01-01-2021 to 11-01-2021 | Machine 1 | Product 1 | 20,000 | 5,000
02-01-2021 to 12-01-2021 | Machine 1 | Product 1 | 22,000 | 1,000
03-01-2021 to 13-01-2021 | Machine 1 | Product 1 | 18,000 | 4,000
etc...
然后,我將遍歷這些行以找到 OK 部分百分比最高的行。任何想法表示贊賞!
uj5u.com熱心網友回復:
我想到的一個選項是使用數字表(谷歌 Jeff Moden 在 SQL Server Central 上了解更多背景資訊)。
然后,數字表使用開始日期(來自要調查的日期范圍),除了生成要鏈接到的日期外,還會生成一個“桶”,之后根據該“桶”進行分組。
如同:
-- generate date frame from and to
DECLARE
@date_start date = Convert( date, '20211110', 112 ),
@date_end date = Convert( date, '20220110', 112 )
;
WITH
cteN
(
Number
)
AS
( -- build a list of 10 single digit numbers
SELECT Cast( 0 AS int ) AS Number UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
)
,
cteNumbers
(
Number
)
AS
( -- splice single digit numbers to list from 0 to 99999
SELECT
cN10000.Number * 10000 cN1000.Number * 1000 cN100.Number * 100 cN10.Number * 10 cN1.Number
FROM
cteN AS cN10000
CROSS JOIN cteN AS cN1000
CROSS JOIN cteN AS cN100
CROSS JOIN cteN AS cN10
CROSS JOIN cteN AS cN1
)
,
cteBucketOffset
(
DatediffNum,
Offset
)
AS
( -- determine the offset in datediffs to number buckets later correctly
SELECT
Cast( Datediff( dd, @date_start, @date_end ) AS int ) - 1 AS DatediffNum,
Cast( Datediff( dd, @date_start, @date_end ) % 10 AS tinyint ) - 1 AS Offset
)
,
cteDates
(
Dated,
Bucket,
BucketNumber,
BucketOffset,
DatediffNum
)
AS
( -- generate list of dates with bucket batches and numbers
SELECT
Dateadd( dd, cN.Number * -1, @date_end ) AS Dated,
Cast( ( cBO.Offset cN.Number ) / 10 AS int ) AS Bucket,
Cast( ( cBO.Offset cN.Number ) % 10 AS tinyint ) AS BucketNumber,
cBO.Offset,
cBO.DatediffNum
FROM
cteNumbers AS cN
CROSS JOIN cteBucketOffset AS cBO
WHERE
cN.Number <= Datediff( dd, @date_start, @date_end )
)
SELECT
*
FROM
cteDates AS cD
ORDER BY
cD.Dated ASC
;
因展示每一步而啰嗦。結果是一個可用于連接原始資料的動態表。然后可以使用“桶”而不是日期本身來對原始資料進行分組。
一旦構建了這些資料,就可以根據分組條件做出決策,例如具有最少的行數。
uj5u.com熱心網友回復:
這個程序需要從多個層面考慮。首先,您提到連續 10 天。我們不知道那些日子是否包括周末,機器是否 24/7 運行。如果運行的日期也可以跳過假期?所以,10 天可能是 1 月 1 日到 1 月 10 日。但如果你跳過周末,你只有 6 個實際的 WEEKDAYS。
接下來,考慮一臺機器處理多個產品,例如在日期之間切換,甚至在一天內切換。
正如評論者所指出的,列名與保留字(例如 DateTime)相同,這是一種不好的做法,并嘗試查看是否有任何新列是可能引起混淆的常見關鍵字并避免使用它們。
您還提到您必須進行復雜的回圈檢查,以及如何處理加入到 10 天、拆分等。我想我有一個有點優雅的方法來做這件事,應該證明在事情的計劃中相當簡單。
您正在使用 SQL-Server,因此我將通過“#”表名使用 TEMP 表來執行此操作。這樣,當您完成連接或呼叫使其成為存盤程序時,您不必繼續洗掉和重新創建它們。也就是說,讓我一步一步地帶你走。
首先,我正在創建一個與您的結構相匹配的簡單表,即使使用 DateTime 背景關系也是如此。
CREATE TABLE VisionMachineResults
(
ID int IDENTITY(1,1) NOT NULL,
MCSAP nvarchar(6) NOT NULL,
DateTime datetime NOT NULL,
ProductName nvarchar(10) NOT NULL,
InspectionResult int NOT NULL,
CONSTRAINT ID PRIMARY KEY CLUSTERED
(
[ID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
現在,我正在插入資料,類似于您擁有的資料,但不是數百萬行。你提到你正在尋找 10 天的時間,所以我只是在結尾處添加了一些額外的東西來模擬它。我還在 1 月 5 日明確強制一臺機器對產品進行間隙更改。此外,我在 1 月 7 日添加了一項產品更改,以在您的 10 天考慮中觸發此“中斷”。稍后您會看到結果。
insert into VisionMachineResults
(MCSAP, [DateTime], ProductName, InspectionResult )
values
( '123456', '2020-01-01 08:29:34.456', 'Product A', 0 ),
( '123456', '2020-01-01 08:29:34.456', 'Product B', 1 ),
( '844214', '2020-01-01 08:29:34.456', 'Product A', 2 ),
( '978415', '2020-01-02 08:29:34.456', 'Product C', 0 ),
( '985633', '2020-01-04 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-05 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-05 08:29:34.456', 'Product B', 0 ),
( '985633', '2020-01-06 08:29:34.456', 'Product A', 2 ),
( '985633', '2020-01-07 08:29:34.456', 'Product B', 0 ),
( '985633', '2020-01-08 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-09 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-10 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-11 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-12 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-13 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-14 08:29:34.456', 'Product A', 1 ),
( '985633', '2020-01-15 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-16 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-17 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-18 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-19 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-20 08:29:34.456', 'Product A', 0 )
go
所以現在,將其視為您的生產資料的基線。我的第一個查詢將做很多事情,但將預聚合存盤到 #tmpPartDailyCounts 結果表中。通過這種方式,您可以在不同階段查看它們,以便對我的方法進行健全性檢查。
在這里,在每臺機器 (MCSAP) 和日期(沒有時間部分)上,我正在抓取某些聚合,并將它們按機器和日期分組。
select
VMR.MCSAP,
cast(VMR.DateTime as Date) as InputDate,
min( VMR.ProductName ) ProductName,
max( VMR.ProductName ) LastProductName,
count( distinct VMR.ProductName ) as MultipleProductsSameDay,
sum( case when VMR.InspectionResult = 0 then 1 else 0 end ) OKParts,
sum( case when NOT VMR.InspectionResult = 0 then 1 else 0 end ) BadParts,
count(*) TotalParts
into
#tmpPartDailyCounts
from
VisionMachineResults VMR
group by
VMR.MCSAP,
cast(VMR.DateTime as Date)
您正在加入資產表,但不認為您真的需要它。如果機器制造了產品,最終組裝是否完成是否重要?不知道,你會知道的更好。
現在,聚合和為什么。min( VMR.ProductName ) ProductName和,這 max( VMR.ProductName ) LastProductName只是為了結轉在相關日期創建的產品名稱以用于任何最終輸出。如果在某一天,只生產一種產品,無論如何它都是一樣的,只要選擇一種。但是,如果在任何一天有多個產品,則 MIN() 和 MAX() 將具有不同的值。如果所有構建的產品都是相同的,那么兩個值將是相同的——在任何一個給定的日期。
The rest are simple aggregates of OK parts, BAD parts (something was wrong), but also the TOTAL Parts created, regardless of any inspection failure. This is the primary qualifier for you to hit you 10,000, but if you wanted to change to 10,000 GOOD parts, change accordingly.
select
VMR.MCSAP,
cast(VMR.DateTime as Date) as InputDate,
min( VMR.ProductName ) ProductName,
max( VMR.ProductName ) LastProductName,
count( distinct VMR.ProductName ) as MultipleProductsSameDay,
sum( case when VMR.InspectionResult = 0 then 1 else 0 end ) OKParts,
sum( case when NOT VMR.InspectionResult = 0 then 1 else 0 end ) BadParts,
count(*) TotalParts
into
#tmpPartDailyCounts
from
VisionMachineResults VMR
group by
VMR.MCSAP,
cast(VMR.DateTime as Date)
Now, at this point, I have a pre-aggregation done on a per machine and date basis. Now, I want to get some counter that is sequentially applied on a per date that a product was done. I will pull this result into a temp table #tmpPartDays. By using the over/partition, this will create a result that first puts the records in order of MCSAP, then by the date and dumps an output with whatever the ROW_NUMBER() is to that. So, if there is no activity for a given machine such as over a weekend or holiday that the machine is not running, the SEQUENTIAL counter via OVER/PARTITION will keep them sequentially 1 through however many days... Again, query the result of this table and you'll see it.
By querying against the pre-aggregated table, that may account for 500k records and results down to say 450 via per machine/day, This query is now only querying against the 450 and will be very quick.
SELECT
PDC.MCSAP,
PDC.InputDate,
MultipleProductsSameDay,
ROW_NUMBER() OVER(PARTITION BY MCSAP
ORDER BY [InputDate] )
AS CapDay
into
#tmpPartDays
FROM
#tmpPartDailyCounts PDC
ORDER BY
PDC.MCSAP;
Now, is the kicker, tying this all together. I'm starting with just the #tmpPartDays JOINED to itself on the same MCSAP AND a MUST-HAVE matching record 10 days out... So this resolves issues of weekend / holidays since serial consecutive.
This now give me the begin/end date range such as 1-10, 2-11, 3-12, 4-13, etc.
I then join to the tmpPartDailyCounts result on the same part AND the date is at the respective begin (PD.InputDate) and END (PD2.InputDate). I re-apply the same aggregates to get the total counts WITHIN EACH Part 10 day period. Run this query WITHOUT the "HAVING" clause to see what is coming out.
select
PD.MCSAP,
PD.InputDate BeginDate,
PD2.InputDate EndDate,
SUM( PDC.MultipleProductsSameDay ) as TotalProductsMade,
sum( PDC.OKParts ) OKParts,
sum( PDC.BadParts ) BadParts,
sum( PDC.TotalParts ) TotalParts,
min( PDC.ProductName ) ProductName,
max( PDC.LastProductName ) LastProductName
from
#tmpPartDays PD
-- join again to get 10 days out for the END cycle
JOIN #tmpPartDays PD2
on PD.MCSAP = PD2.MCSAP
AND PD.CapDay 9 = PD2.CapDay
-- Now join to daily counts for same machine and within the 10 day period
JOIN #tmpPartDailyCounts PDC
on PD.MCSAP = PDC.MCSAP
AND PDC.InputDate >= PD.InputDate
AND PDC.InputDate <= PD2.InputDate
group by
PD.MCSAP,
PD.InputDate,
PD2.InputDate
having
SUM( PDC.MultipleProductsSameDay ) = 10
AND min( PDC.ProductName ) = max( PDC.LastProductName )
AND SUM( PDC.TotalParts ) >= 10
Finally, the elimination of the records you DONT want. Since I dont have millions of records to simulate, just follow along. I am doing a HAVING on
SUM( PDC.TotalParts ) >= 10
- SUM( PDC.MultipleProductsSameDay ) = 10
If on ANY day there are MORE than 1 product created, the count would be 11 or more, thus indicating not the same product, so that would cause an exclusion. But also, if at the tail-end of data such as only 7 days of production, it would never HIT 10 which was your 10-day qualifier also.
2. AND min( PDC.ProductName ) = max( PDC.LastProductName )
Here, since we are spanning back to the DAILY context, if ANY product changes on any date, the Product Name (via min) and LastProductName (via max) will change, regardless of the day, and regardless of the name context. So, by making sure both the min() and max() are the same, you know it is the same product across the entire span.
3. AND SUM( PDC.TotalParts ) >= 10
Finally, the count of things made. In this case, I did >= 10 because I was only testing with 1 item per day, thus 10 days = 10 items. In your scenario, you may have 987 in one day, but 1100 in another, thus balancing low and high production days to get to that 10,000, but for sample of data, just change YOUR context to the 10,000 limit minimum.
This SQLFiddle shows the results as it gets down to the per machine/day and showing the sequential activity. The last MCSAP machine starts on Jan 4th, but has a sequential day row assignment starting at 1 to give proper context to the 1-10, 2-11, etc.
First SQL Fiddle showing machine/day
第二個小提琴顯示了沒有 HAVING 子句的最終查詢,您可以看到 TotalProductsMade 的前幾行是 11,這意味著有問題的任何一天跨度上的 SOMETHING 創建了不同的產品,并將被排除在最終結果之外。對于 1 月 6 日至 15 日和 1 月 7 日至 16 日的開始和結束日期,您將看到顯示產品 A 和產品 B 的 MIN/MAX 產品,因此表明產品在其 10 天跨度內的某個地方切換了......這些也是將被排除在外。
FINAL 查詢 此查詢顯示應用了 HAVING 子句的結果。
uj5u.com熱心網友回復:
似乎只是按年份和日期除以 10 分組的問題。
SELECT
CONCAT(CONVERT(VARCHAR(10),MIN([DateTime]),105), ' to ', CONVERT(VARCHAR(10), MAX([DateTime]), 105)) AS InputDateRange
, MCSAP
, MAX(ZAssetRegister.LocalName) AS LocalName
, ProductName
, SUM(CASE WHEN InspectionResult = 0 THEN 1 END) AS OKParts
, COUNT(CASE WHEN InspectionResult > 0 THEN 1 END) AS NGParts
, COUNT(DISTINCT CAST([Datetime] AS DATE)) AS total_days
FROM VisionMachineResults
JOIN ZAssetRegister
ON VisionMachineResults.MCSAP = ZAssetRegister.SAP_Number
GROUP BY
DATEPART(YEAR, [DateTime]),
CEILING(DATEPART(DAYOFYEAR, [DateTime])/10.0),
MCSAP,
ProductName
ORDER BY
MIN([DateTime]),
MAX(ZAssetRegister.LocalName);
db<>fiddle的簡化測驗在這里
轉載請註明出處,本文鏈接:https://www.uj5u.com/net/410531.html
標籤:
