I have the following database structure:
workspaces:
| Key | Field | Index |
|---|---|---|
| PK | id | id |
| | content | |
projects:
| Key | Field | Index |
|---|---|---|
| PK | id | id |
| FK | workspace | workspace_1 |
| | deleted | deleted_1 |
| | content | |
items:
| Key | Field | Index |
|---|---|---|
| PK | id | id |
| FK | project | project_1 |
| | type | _type_1 |
| | deleted | deleted_1 |
| | content | |
I need to count the items of each type for every project in a workspace. The expected output looks like this:
[
{ _id: 'projectId1', itemType1Count: 100, itemType2Count: 50, itemType3Count: 200 },
{ _id: 'projectId2', itemType1Count: 40, itemType2Count: 100, itemType3Count: 300 },
....
]
After several attempts and some debugging, I came up with a query that produces the output I need:
const pipeline = [
  { $match: { workspace: 'workspaceId1' } },
  {
    $lookup: {
      from: 'items',
      let: { id: '$_id' },
      pipeline: [
        {
          $match: {
            $expr: {
              $eq: ['$project', '$$id'],
            },
          },
        },
        // Project only the fields needed by later stages, to avoid overloading
        // memory and hitting the `exceeded memory limit for $group` error.
        { $project: { _id: 1, type: 1, deleted: 1 } },
      ],
      as: 'items',
    },
  },
  // Use $unwind here to optimize the aggregation pipeline, see:
  // https://stackoverflow.com/questions/45724785/aggregate-lookup-total-size-of-documents-in-matching-pipeline-exceeds-maximum-d
  // Without $unwind we may get a `matching pipeline exceeds maximum document size` error.
  // The error does not appear on every request, which makes it strange and hard to debug.
  { $unwind: '$items' },
  { $match: { 'items.deleted': { $eq: false } } },
  {
    $group: {
      _id: '$_id',
      items: { $push: '$items' },
    },
  },
  {
    $project: {
      _id: 1,
      // Note: I have only 3 possible item types, so it's OK that their names are hardcoded.
      itemType1Count: {
        $size: {
          $filter: {
            input: '$items',
            cond: { $eq: ['$$this.type', 'type1'] },
          },
        },
      },
      itemType2Count: {
        $size: {
          $filter: {
            input: '$items',
            cond: { $eq: ['$$this.type', 'type2'] },
          },
        },
      },
      itemType3Count: {
        $size: {
          $filter: {
            input: '$items',
            cond: { $eq: ['$$this.type', 'type3'] },
          },
        },
      },
    },
  },
]
const counts = await Project.aggregate(pipeline)
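The query works as expected, but it is very slow. With about 1000 items in a workspace, it takes about 8 seconds to complete. Any ideas on how to make it faster are appreciated.
Thanks.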
Answer from a uj5u.com user:
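Assuming your collections are properly indexed and the indexes cover the "right" fields, we can still make some tweaks to the query itself.
Approach 1: Keep the existing collection schema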
db.projects.aggregate([
  {
    $match: {
      workspace: "workspaceId1"
    }
  },
  {
    $lookup: {
      from: "items",
      let: {id: "$_id"},
      pipeline: [
        {
          // Filter out deleted items here, so fewer documents leave the $lookup
          $match: {
            $expr: {
              $and: [
                {$eq: ["$project", "$$id"]},
                {$eq: ["$deleted", false]}
              ]
            }
          }
        },
        // Project only the fields needed by later stages, to avoid overloading
        // memory and hitting the `exceeded memory limit for $group` error.
        {
          $project: {
            _id: 1,
            type: 1,
            deleted: 1
          }
        }
      ],
      as: "items"
    }
  },
  // Use $unwind here to optimize the aggregation pipeline, see:
  // https://stackoverflow.com/questions/45724785/aggregate-lookup-total-size-of-documents-in-matching-pipeline-exceeds-maximum-d
  // Without $unwind we may get a `matching pipeline exceeds maximum document size` error.
  // The error does not appear on every request, which makes it strange and hard to debug.
  {
    $unwind: "$items"
  },
  {
    // Count each type with a conditional sum instead of $push + $filter + $size
    $group: {
      _id: "$_id",
      itemType1Count: {
        $sum: {
          "$cond": {
            "if": {$eq: ["$items.type", "type1"]},
            "then": 1,
            "else": 0
          }
        }
      },
      itemType2Count: {
        $sum: {
          "$cond": {
            "if": {$eq: ["$items.type", "type2"]},
            "then": 1,
            "else": 0
          }
        }
      },
      itemType3Count: {
        $sum: {
          "$cond": {
            "if": {$eq: ["$items.type", "type3"]},
            "then": 1,
            "else": 0
          }
        }
      }
    }
  }
])
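There are 2 main changes:
- Move the `items.deleted: false` condition into the `$lookup` sub-pipeline, so that fewer `items` documents are looked up.
- Skip `items: { $push: '$items' }`. Instead, do a conditional sum in the later `$group` stage.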
Here is the Mongo Playground for your reference (at least to check the correctness of the new query).
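Writing out one `$cond` block per type by hand is easy to get wrong (a copy-pasted type name, for example). As a minimal sketch, the conditional-sum `$group` stage could instead be generated from a list of types. The helper name `buildTypeCountGroup` and its parameters are hypothetical and not part of the original answer; the output field names simply follow the `itemTypeNCount` pattern from the question.

```javascript
// Hypothetical helper (an illustration, not part of the original answer):
// builds a $group stage that counts documents per type with $sum + $cond.
function buildTypeCountGroup(types, typePath = '$items.type', groupKey = '$_id') {
  const group = { _id: groupKey };
  types.forEach((type, i) => {
    // Field names follow the question's expected output: itemType1Count, itemType2Count, ...
    group[`itemType${i + 1}Count`] = {
      // Adds 1 when the document's type matches, 0 otherwise
      $sum: { $cond: { if: { $eq: [typePath, type] }, then: 1, else: 0 } },
    };
  });
  return { $group: group };
}

// The stage for the three hardcoded types from the question
const groupStage = buildTypeCountGroup(['type1', 'type2', 'type3']);
```

The resulting `groupStage` could replace the hand-written `$group` stage at the end of the pipeline in approach 1.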
Approach 2: If the collection schema can be modified, we can denormalize `projects.workspace` into the `items` collection like this:
{
  "_id": "i1",
  "project": "p1",
  "workspace": "workspaceId1",
  "type": "type1",
  "deleted": false
}
This way, you can skip the `$lookup`. A simple `$match` followed by a `$group` is enough.
db.items.aggregate([
  {
    $match: {
      "deleted": false,
      "workspace": "workspaceId1"
    }
  },
  {
    $group: {
      _id: "$project",
      itemType1Count: {
        $sum: {
          "$cond": {
            "if": {$eq: ["$type", "type1"]},
            "then": 1,
            "else": 0
          }
        }
      },
      ...
Here is the Mongo Playground with the denormalized schema for your reference.
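For illustration, the whole two-stage pipeline for the denormalized schema might be built like this. This is only a sketch under assumptions: the function name `buildWorkspaceCountsPipeline` is hypothetical, the remaining counters are assumed to follow the same pattern as `itemType1Count`, and every item document is assumed to carry an up-to-date `workspace` field.

```javascript
// Hypothetical builder (an illustration, not the original answer's code).
// Assumes each item document stores `workspace`, `project`, `type` and `deleted`.
function buildWorkspaceCountsPipeline(workspaceId, types) {
  const group = { _id: '$project' }; // one output document per project
  types.forEach((type, i) => {
    group[`itemType${i + 1}Count`] = {
      // Array form of $cond: [condition, value if true, value if false]
      $sum: { $cond: [{ $eq: ['$type', type] }, 1, 0] },
    };
  });
  return [
    // Filter first, so that only live items of one workspace reach $group
    { $match: { workspace: workspaceId, deleted: false } },
    { $group: group },
  ];
}

const pipeline = buildWorkspaceCountsPipeline('workspaceId1', ['type1', 'type2', 'type3']);
// With a Mongoose model for items (assumed here to be called `Item`):
//   const counts = await Item.aggregate(pipeline)
```

A compound index on `workspace` and `deleted` in the `items` collection would likely help the `$match` stage. Note that projects without any non-deleted items do not appear in the result, and that the copied `workspace` value has to be kept in sync if a project ever moves to another workspace.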
