我有 1 個檔案的以下結構:
{
"uid": uid,
"at": at,
"url": url,
"members": members
}
如果檔案已經存在,我想批量創建/更新檔案,uid因為我在一個請求中傳遞了許多像上面這樣的物件。
我想要看起來像(在python中)的東西:
ids = []
for item in items:
ids.append(item["uid"])
db.collection.update_many({"_id": {"$id": ids}}, {"$set": items}, upsert=True)
但是它會根據正確的檔案更新正確的檔案ids并插入uid不存在的檔案嗎?
如果存在與否,我想執行insertOR ,它將根據.updateobjectuid
提前致謝
uj5u.com熱心網友回復:
沒有直接的方法來upsert獲得帶有內容串列的鍵串列,例如(這是偽代碼,無效的 MQL):
keys = ['A','B'];
content = [{key:'A',name:'X',pos:1},{key:'B',name:'Y',pos:2}];
db.collection.updateMany(keys, {$set: content});
該updateMany功能旨在過濾出相對較少的檔案并應用(大部分)恒定更新模式,這不是此處所需的操作。
可以通過兩步方式輕松完成此操作:
- 將您的更新和潛在插入加載到臨時集合中。
- 使用
$merge運算子將??材料疊加到目標集合上。
考慮這個現有的目標集合foo:
db.foo.insert([
{uid: 0, at: "A", members: [1,2]},
{uid: 1, at: "B", members: [1,2]},
{uid: 2, at: "C", members: [1,2]},
{uid: 3, at: "D", members: [1,2]}
]);
假設我們構建了一個類似于 OP 發布的專案陣列:
var items = [
{uid:1, at:"XX"}, // change
{uid:3, at:"XX", members:[3,4]}, // change
{uid:7, at:"Q", members:[3,4]}, // new
{uid:8, at:"P", members:[3,4]} // new
];
以下管道將這些合并到目標集合中:
db.TEMP1.drop(); // OK if does not exist
// Blast items into the TEMP1 collection. You could use the bulk
// insert functions https://docs.mongodb.com/manual/reference/method/Bulk.insert/ if you prefer,
// or in theory even use mongoimport outside of this script to bulk
// load material into TEMP1.
db.TEMP1.insert(items);
c=db.TEMP1.aggregate([
{$project: {_id:false}}, // must exclude _id to prevent conflict with target 'foo'
{$merge: {
into: "foo", // ah HA!
on: [ "uid" ],
whenMatched: "replace",
whenNotMatched: "insert"
}}
]);
db.foo.find();
{ "_id" : 0, "uid" : 0, "at" : "A", "members" : [ 1, 2 ] } // unchanged
{ "_id" : 1, "uid" : 1, "at" : "XX" } // modified and members gone
{ "_id" : 2, "uid" : 2, "at" : "C", "members" : [ 1, 2 ] } // unchanged
{ "_id" : 3, "uid" : 3, "at" : "XX", "members" : [ 3, 4 ] } // modified
{ "_id" : ObjectId("6228be517dec6ed6fa40cbe3"), "uid" : 7, "at" : "Q", "members" : [ 3, 4 ] } // inserted
{ "_id" : ObjectId("6228be517dec6ed6fa40cbe4"), "uid" : 8, "at" : "P", "members" : [ 3, 4 ] } // inserted
需要注意的是,目標集合中必須存在唯一索引uid,但這可能是您無論如何都希望這樣的操作具有合理性能的索引。
這種方法更面向客戶端資料,即它假設客戶端正在構建大而完整的資料結構以覆寫或插入,而不是update在其選擇性地接觸盡可能少的欄位的能力方面更加“外科手術”可能的。
For quantitative comparison, 2 million docs sized roughly as above were loaded into foo. A script that uses bulk operations loaded 50,000 items into TEMP1 of which half had matching uid (for update) and half did not (for insert), followed by the running of the $merge script. On a MacBookPro w/16GB RAM and SSD disk and a v4.4.2 server, the times are
1454 millis to insert 50013; 34396 ins/sec
7144 millis to merge; 7000 upd/sec
8598 total millis; 5816 ops/sec
If done in a simple loop:
33452 millis for loop update; 1495 upd/sec
So the temp collection and $merge approach is nearly 4X faster.
轉載請註明出處,本文鏈接:https://www.uj5u.com/houduan/440596.html
