如何在一個查詢MongoDB中根據條件更新/插入多個檔案-有解無憂

我有 1 個檔案的以下結構：

{
    "uid": uid,
    "at": at,
    "url": url,
    "members": members
}

如果檔案已經存在，我想批量創建/更新檔案，uid因為我在一個請求中傳遞了許多像上面這樣的物件。

我想要看起來像（在python中）的東西：

ids = []
for item in items:
  ids.append(item["uid"])
db.collection.update_many({"_id": {"$id": ids}}, {"$set": items}, upsert=True)

但是它會根據正確的檔案更新正確的檔案ids并插入uid不存在的檔案嗎？

如果存在與否，我想執行insertOR ，它將根據.updateobjectuid

提前致謝

uj5u.com熱心網友回復：

沒有直接的方法來upsert獲得帶有內容串列的鍵串列，例如（這是偽代碼，無效的 MQL）：

keys = ['A','B'];
content = [{key:'A',name:'X',pos:1},{key:'B',name:'Y',pos:2}];
db.collection.updateMany(keys, {$set: content});

該updateMany功能旨在過濾出相對較少的檔案并應用（大部分）恒定更新模式，這不是此處所需的操作。

可以通過兩步方式輕松完成此操作：

將您的更新和潛在插入加載到臨時集合中。
使用$merge運算子將??材料疊加到目標集合上。

考慮這個現有的目標集合foo：

db.foo.insert([
    {uid: 0, at: "A", members: [1,2]},
    {uid: 1, at: "B", members: [1,2]},
    {uid: 2, at: "C", members: [1,2]},
    {uid: 3, at: "D", members: [1,2]}
]);

假設我們構建了一個類似于 OP 發布的專案陣列：

var items = [
    {uid:1, at:"XX"},   // change
    {uid:3, at:"XX", members:[3,4]},  // change
    {uid:7, at:"Q", members:[3,4]},   // new
    {uid:8, at:"P", members:[3,4]}    // new
];

以下管道將這些合并到目標集合中：


db.TEMP1.drop(); // OK if does not exist

// Blast items into the TEMP1 collection.  You could use the bulk
// insert functions https://docs.mongodb.com/manual/reference/method/Bulk.insert/ if you prefer,
// or in theory even use mongoimport outside of this script to bulk
// load material into TEMP1.
db.TEMP1.insert(items); 

c=db.TEMP1.aggregate([
    {$project: {_id:false}}, // must exclude _id to prevent conflict with target 'foo'
    {$merge: {
        into: "foo",  // ah HA!  
        on: [ "uid" ],
        whenMatched: "replace",
        whenNotMatched: "insert"
    }}
]);

db.foo.find();
{ "_id" : 0, "uid" : 0, "at" : "A", "members" : [ 1, 2 ] }  // unchanged
{ "_id" : 1, "uid" : 1, "at" : "XX" }     // modified and members gone
{ "_id" : 2, "uid" : 2, "at" : "C", "members" : [ 1, 2 ] }  // unchanged
{ "_id" : 3, "uid" : 3, "at" : "XX", "members" : [ 3, 4 ] }  // modified
{ "_id" : ObjectId("6228be517dec6ed6fa40cbe3"), "uid" : 7, "at" : "Q", "members" : [ 3, 4 ] }   // inserted
{ "_id" : ObjectId("6228be517dec6ed6fa40cbe4"), "uid" : 8, "at" : "P", "members" : [ 3, 4 ] }   // inserted

需要注意的是，目標集合中必須存在唯一索引uid，但這可能是您無論如何都希望這樣的操作具有合理性能的索引。

這種方法更面向客戶端資料，即它假設客戶端正在構建大而完整的資料結構以覆寫或插入，而不是update在其選擇性地接觸盡可能少的欄位的能力方面更加“外科手術”可能的。

For quantitative comparison, 2 million docs sized roughly as above were loaded into foo. A script that uses bulk operations loaded 50,000 items into TEMP1 of which half had matching uid (for update) and half did not (for insert), followed by the running of the $merge script. On a MacBookPro w/16GB RAM and SSD disk and a v4.4.2 server, the times are

1454 millis to insert 50013; 34396 ins/sec
7144 millis to merge; 7000 upd/sec
8598 total millis; 5816 ops/sec

If done in a simple loop:

33452 millis for loop update; 1495 upd/sec

So the temp collection and $merge approach is nearly 4X faster.

轉載請註明出處，本文鏈接：https://www.uj5u.com/houduan/440596.html

標籤：mongodb pymongo

上一篇：如何在mongoDB中用值0填充缺失的檔案？

下一篇：如何在SpringDataJPA中映射具有2列參考同一個表的表