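I have a Website(Id) table, and each record may have multiple associated CheckLog entries (FK WebsiteId). CheckLog also has a composite index [WebsiteId, CreatedTime]. Website has only about 20 records, but CheckLog grows over time and had 3 million records when I ran into this problem. (See the schema, defined with EF Core, at the end of the question.)
A query I run frequently lists all Websites, each with zero or one latest CheckLog record: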
return await this.ctx.Websites.AsNoTracking()
.Select(q => new WebsiteListItem()
{
Website = q,
LatestCheckLog = q.CheckLogs
.OrderByDescending(q => q.CreatedTime)
.FirstOrDefault(),
})
.ToListAsync();
I believed the [WebsiteId, CreatedTime] index should help. However, the query takes about 11 seconds to run. Here is the translated query, followed by its EXPLAIN QUERY PLAN output:
SELECT "w"."Id", "t0"."Id", "t0"."CreatedTime", "t0"."WebsiteId"
FROM "Websites" AS "w"
LEFT JOIN (
SELECT "t"."Id", "t"."CreatedTime", "t"."WebsiteId"
FROM (
SELECT "c"."Id", "c"."CreatedTime", "c"."WebsiteId", ROW_NUMBER() OVER(PARTITION BY "c"."WebsiteId" ORDER BY "c"."CreatedTime" DESC) AS "row"
FROM "CheckLogs" AS "c"
) AS "t"
WHERE "t"."row" <= 1
) AS "t0" ON "w"."Id" = "t0"."WebsiteId"

MATERIALIZE 1
CO-ROUTINE 4
SCAN TABLE CheckLogs AS c USING INDEX IX_CheckLogs_WebsiteId_CreatedTime
USE TEMP B-TREE FOR RIGHT PART OF ORDER BY
SCAN SUBQUERY 4
SCAN TABLE Websites AS w
SEARCH SUBQUERY 1 AS t USING AUTOMATIC COVERING INDEX (WebsiteId=?)
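Can this be solved with an index? If not, is there an efficient way to query this without creating N+1 queries? I tried to think of a way to do it with 2 queries, but could not come up with anything better than what EF Core translates it to.
Also, I believe this is a very common problem, but I do not know which keywords to use to find solutions for it. I am fine with a general solution for this kind of problem (e.g. getting the latest Product for a list of Categories). Thank you.
I use EF Core for the database schema: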
public class Website
{
public int Id { get; set; }
// Other properties
public ICollection<CheckLog> CheckLogs { get; set; }
}
[Index(nameof(CreatedTime))]
[Index(nameof(WebsiteId), nameof(CreatedTime))]
public class CheckLog
{
public int Id { get; set; }
public DateTime CreatedTime { get; set; }
public int WebsiteId { get; set; }
public Website Website { get; set; }
// Other properties
}
Reply from a uj5u.com user:
If what you want is the row with the latest CreatedTime for each WebsiteId, you do not need any join.
Just aggregate and set the condition:
HAVING MAX(CreatedTime)
This is not standard SQL; it relies on SQLite's bare columns:
SELECT *
FROM CheckLogs
GROUP BY WebsiteId
HAVING MAX(CreatedTime);
If you want to join it to Websites:
SELECT w.Id, t.Id, t.CreatedTime, t.WebsiteId
FROM Websites AS w
LEFT JOIN (
SELECT *
FROM CheckLogs
GROUP BY WebsiteId
HAVING MAX(CreatedTime)
) AS t ON w.Id = t.WebsiteId;
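The bare-column behaviour can be tried on a tiny in-memory database. The following is only a sketch using Python's standard sqlite3 module; the sample rows are made up, and it uses a variant of the query above that puts MAX(CreatedTime) in the select list, which is the form SQLite's documentation describes for bare columns.

```python
import sqlite3

# Miniature of the schema from the question; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Websites (Id INTEGER PRIMARY KEY);
CREATE TABLE CheckLogs (
    Id INTEGER PRIMARY KEY,
    CreatedTime TEXT NOT NULL,
    WebsiteId INTEGER NOT NULL REFERENCES Websites(Id)
);
CREATE INDEX IX_CheckLogs_WebsiteId_CreatedTime
    ON CheckLogs (WebsiteId, CreatedTime);
INSERT INTO Websites (Id) VALUES (1), (2), (3);
INSERT INTO CheckLogs (Id, CreatedTime, WebsiteId) VALUES
    (10, '2021-10-01 08:00:00', 1),
    (11, '2021-10-03 08:00:00', 1),
    (12, '2021-10-02 08:00:00', 1),
    (20, '2021-10-05 08:00:00', 2);
""")

# With a single MAX() in the select list, SQLite takes the bare column Id
# from the row that holds the maximum CreatedTime of each group.
latest_rows = conn.execute("""
SELECT w.Id, t.Id, t.CreatedTime
FROM Websites AS w
LEFT JOIN (
    SELECT Id, WebsiteId, MAX(CreatedTime) AS CreatedTime
    FROM CheckLogs
    GROUP BY WebsiteId
) AS t ON w.Id = t.WebsiteId
ORDER BY w.Id
""").fetchall()

for row in latest_rows:
    print(row)
```

Website 3 has no logs in this sample, so the LEFT JOIN keeps it with NULLs in the log columns.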
Reply from a uj5u.com user:
Thanks to this answer, I found out how to rewrite the query:
SELECT L.*
FROM CheckLogs L
INNER JOIN
(SELECT WebsiteId, Max(CreatedTime) AS CreatedTime
FROM CheckLogs
GROUP BY WebsiteId) L2
ON L.WebsiteId = L2.WebsiteId AND L.CreatedTime = L2.CreatedTime
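Since EF Core cannot translate this JOIN query, I rewrote the code as follows, with 2 round trips to the database server. Note that interpolating the WebsiteIds into the SQL is safe here only because they are numbers; if you pass in string arguments, you need to sanitize them.
// `query` is assumed to be the Websites query, e.g. this.ctx.Websites.AsNoTracking()
var websites = await query.ToListAsync();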
var websiteIds = websites.Select(q => q.Id).ToList();
var websiteIdsString = string.Join(",", websiteIds);
var logQuery = this.ctx.CheckLogs.FromSqlRaw(@$"
SELECT L.*
FROM CheckLogs L
INNER JOIN
(SELECT WebsiteId, Max(CreatedTime) AS CreatedTime
FROM CheckLogs
GROUP BY WebsiteId) L2
ON L.WebsiteId = L2.WebsiteId AND L.CreatedTime = L2.CreatedTime
WHERE L.WebsiteId IN ({websiteIdsString})");
var logs = await logQuery.AsNoTracking().ToListAsync();
var logDict = logs.ToLookup(q => q.WebsiteId);
return websites.Select(q => new WebsiteListItem()
{
Website = q,
LatestCheckLog = logDict[q.Id].FirstOrDefault(),
}).ToList();
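The same two-round-trip idea can be sketched outside EF Core. This is a minimal illustration with Python's standard sqlite3 module and invented sample rows; instead of joining the ids into the SQL text, it generates one `?` placeholder per id, which avoids the sanitizing concern mentioned above. The names `latest_by_site` and `result` are hypothetical.

```python
import sqlite3

# Miniature of the schema from the question; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Websites (Id INTEGER PRIMARY KEY);
CREATE TABLE CheckLogs (
    Id INTEGER PRIMARY KEY,
    CreatedTime TEXT NOT NULL,
    WebsiteId INTEGER NOT NULL REFERENCES Websites(Id)
);
CREATE INDEX IX_CheckLogs_WebsiteId_CreatedTime
    ON CheckLogs (WebsiteId, CreatedTime);
INSERT INTO Websites (Id) VALUES (1), (2), (3);
INSERT INTO CheckLogs (Id, CreatedTime, WebsiteId) VALUES
    (10, '2021-10-01 08:00:00', 1),
    (11, '2021-10-03 08:00:00', 1),
    (12, '2021-10-02 08:00:00', 1),
    (20, '2021-10-05 08:00:00', 2);
""")

# Round trip 1: load the websites.
website_ids = [row[0] for row in conn.execute("SELECT Id FROM Websites ORDER BY Id")]

# Round trip 2: latest log per website, with the ids bound as parameters.
placeholders = ",".join("?" * len(website_ids))  # e.g. "?,?,?"
logs = conn.execute(f"""
SELECT L.Id, L.CreatedTime, L.WebsiteId
FROM CheckLogs L
INNER JOIN
    (SELECT WebsiteId, MAX(CreatedTime) AS CreatedTime
     FROM CheckLogs
     GROUP BY WebsiteId) L2
ON L.WebsiteId = L2.WebsiteId AND L.CreatedTime = L2.CreatedTime
WHERE L.WebsiteId IN ({placeholders})
""", website_ids).fetchall()

# Two logs may share the same latest CreatedTime; keep the first one seen.
latest_by_site = {}
for log_id, created_time, site_id in logs:
    latest_by_site.setdefault(site_id, (log_id, created_time))

# Websites without any log map to None.
result = [(site_id, latest_by_site.get(site_id)) for site_id in website_ids]
print(result)
```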
Reply from a uj5u.com user:
First, try this query:
SELECT MAX("c"."CreatedTime"), "c"."WebsiteId" FROM "CheckLogs" AS "c" GROUP BY "c"."WebsiteId"
It is hard to optimize your query to be faster than the one above. If that one is fast, then there is hope.
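One way to check how SQLite intends to run that baseline query is to ask for its plan. The sketch below uses Python's standard sqlite3 module on a tiny invented dataset, so it shows only how to obtain the plan and the result, not realistic timings; the plan wording differs between SQLite versions.

```python
import sqlite3

# Miniature of the CheckLogs table from the question; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CheckLogs (
    Id INTEGER PRIMARY KEY,
    CreatedTime TEXT NOT NULL,
    WebsiteId INTEGER NOT NULL
);
CREATE INDEX IX_CheckLogs_WebsiteId_CreatedTime
    ON CheckLogs (WebsiteId, CreatedTime);
INSERT INTO CheckLogs (Id, CreatedTime, WebsiteId) VALUES
    (10, '2021-10-01 08:00:00', 1),
    (11, '2021-10-03 08:00:00', 1),
    (20, '2021-10-05 08:00:00', 2);
""")

sql = ('SELECT MAX("c"."CreatedTime"), "c"."WebsiteId" '
       'FROM "CheckLogs" AS "c" GROUP BY "c"."WebsiteId"')

# The human-readable description is the fourth column of each plan row.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
for detail in plan:
    print(detail)

rows = conn.execute(sql + ' ORDER BY "c"."WebsiteId"').fetchall()
print(rows)
```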
You can try:
with logs as (
-- Assumes "Id" grows together with "CreatedTime", so max("Id") identifies the latest row.
select max("c"."Id") "Id", max("c"."CreatedTime") "CreatedTime", "c"."WebsiteId"
from "CheckLogs" AS "c"
group by "c"."WebsiteId")
select "w"."Id", "l"."Id", "l"."CreatedTime", "l"."WebsiteId"
from "Websites" AS "w"
join logs as "l" on "w"."Id" = "l"."WebsiteId"
union all
select "w"."Id", null, null, null
from "Websites" AS "w"
where not exists (
select 'x'
from "CheckLogs" AS "l"
where "l"."WebsiteId" = "w"."Id")
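You can also run the two halves of the union all above separately to see how fast each one is. If it is still slow, you can add a new table that holds the results you are looking for and create a trigger on "CheckLogs" to populate it, so that the data is always ready.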
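The trigger idea could look roughly like the sketch below, run here through Python's standard sqlite3 module. The table LatestCheckLogs and the trigger name are hypothetical, and the sketch only handles inserts; updates or deletes on CheckLogs would need triggers of their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CheckLogs (
    Id INTEGER PRIMARY KEY,
    CreatedTime TEXT NOT NULL,
    WebsiteId INTEGER NOT NULL
);

-- Hypothetical summary table: one row per website, pointing at its latest log.
CREATE TABLE LatestCheckLogs (
    WebsiteId INTEGER PRIMARY KEY,
    CheckLogId INTEGER NOT NULL,
    CreatedTime TEXT NOT NULL
);

CREATE TRIGGER TR_CheckLogs_AfterInsert AFTER INSERT ON CheckLogs
BEGIN
    -- First log of a website: create its summary row.
    INSERT OR IGNORE INTO LatestCheckLogs (WebsiteId, CheckLogId, CreatedTime)
    VALUES (NEW.WebsiteId, NEW.Id, NEW.CreatedTime);
    -- Later logs: overwrite the summary row only if the new log is more recent.
    UPDATE LatestCheckLogs
    SET CheckLogId = NEW.Id, CreatedTime = NEW.CreatedTime
    WHERE WebsiteId = NEW.WebsiteId AND CreatedTime < NEW.CreatedTime;
END;
""")

# Invented sample rows, deliberately inserted out of chronological order.
conn.executemany(
    "INSERT INTO CheckLogs (Id, CreatedTime, WebsiteId) VALUES (?, ?, ?)",
    [
        (10, "2021-10-01 08:00:00", 1),
        (11, "2021-10-03 08:00:00", 1),
        (12, "2021-10-02 08:00:00", 1),
        (20, "2021-10-05 08:00:00", 2),
    ],
)

summary = conn.execute(
    "SELECT WebsiteId, CheckLogId, CreatedTime FROM LatestCheckLogs ORDER BY WebsiteId"
).fetchall()
print(summary)
```

Reading the latest log per website then becomes a lookup in a table with one row per website, at the cost of a little extra work on every insert.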
