How do I save images from URLs into MongoDB using Python? - 有解無憂

I have used the wikipedia package to get a list of image URLs from a Wikipedia page:

import wikipedia
et_page = wikipedia.page("Summer")
images = et_page.images

Now I want to save all the images referenced by the images variable into MongoDB, in a collection named images.

import pymongo
from PIL import Image
import io

client = pymongo.MongoClient("mongodb+srv://<>:<>@cluster0.lfrg6.mongodb.net/myFirstDatabase?retryWrites=true&w=majority")

database_name = 'test'
database = client[database_name]

collection = 'images'
image_collection = database[collection]

Is there a way to do this? Since there are multiple images, can they be saved as a list?

A uj5u.com user replied:

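It is best not to use MongoDB as arbitrary blob storage, especially for large images. Thumbnails and small infographics are fine. But since the OP wants to understand how it can be done, the best approach is to use GridFS. The gridfs module is part of the pymongo environment, so if you can import pymongo you can also import gridfs. Here is a working example: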

import wikipedia
import pymongo
import gridfs
from urllib.request import urlopen

connstr = "mongodb://yourInfoHere"
client = pymongo.MongoClient(connstr)

database = client.testX

# This will create two collections that are under control                                           
# of the gridfs object, images.chunks and images.files.  Do                                         
# not go to these collections directly; use the gridfs                                              
# methods instead. The choice of "images" is arbitrary; you
# can use any name you wish.  gridfs will add .chunks and .files
# to the real collection names.
#  Docs are here
#  https://pymongo.readthedocs.io/en/stable/api/gridfs/index.html#module-gridfs                                                                                  
gfs = gridfs.GridFS(database, collection="images")

page_name = "Summer"
print("capturing URLs to images on page",page_name)
et_page = wikipedia.page(page_name)
images = et_page.images

n = 0
for ii in images:
    print("processing",ii)
    f = urlopen(ii)
    # put() "inserts" the file-like object into the gfs subsystem                                   
    # and returns an ID.                                                                            
    file_id = gfs.put(f)

    # Make up a name and capture it AND the gridfs ID in a                                          
    # regular collection, called imageMeta here but it is                                           
    # any name you like.  It is not strictly necessary to do this
    # and it is completely separate from gridFS but you will almost 
    # always have a need to capture some metadata around the pix.                                                                            
    name = "IMAGE_" + str(n)
    database.imageMeta.insert_one({"name":name, "fileId":file_id})
    n += 1

# Here is an alternate solution where only 1 imageMeta doc is written                               
# but with arrays of image info.  You STILL need to push each image                                 
# individually into gridfs:                                                                         
n = 0
info = []
for ii in images:
    print("processing",ii)
    f = urlopen(ii)
    # put() "inserts" the file-like object into the gfs subsystem                                   
    # and returns an ID.                                                                            
    file_id = gfs.put(f)

    # Make up a name and collect it AND the gridfs ID in the
    # info list; the whole list is written to imageMeta as a
    # single doc after the loop.
    name = "IMAGE_" + str(n)
    info.append({"name":name, "fileId":file_id})
    n += 1

database.imageMeta.insert_one({"page":page_name, "imageInfo":info})


# Here is how you can get your images out.  Let's pick                                             
# IMAGE_0 for example but obviously any query criteria on the                                       
# imageMeta docs is valid:                                                                          
doc = database.imageMeta.find_one({"name":"IMAGE_0"})
gg = gfs.get(doc['fileId'])

with open('foo.jpg', 'wb') as wf:
    wf.write(gg.read())  # Nice read/write slurp
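
As the comments above note, the separate imageMeta collection is optional. One possible variation, sketched here under assumptions, is to attach a filename and some metadata to the GridFS file itself by passing keyword arguments to put(). The helper names describe_image and store_image are hypothetical and not part of pymongo or the original answer, and the URL in the usage comment is made up. The gfs argument is assumed to be the gridfs.GridFS object created earlier.

```python
import mimetypes
import posixpath
from urllib.parse import unquote, urlparse


def describe_image(url):
    """Derive a filename and a guessed MIME type from an image URL.

    Hypothetical helper: it only inspects the URL text and does not
    download anything.
    """
    path = unquote(urlparse(url).path)          # decode %20 and similar escapes
    filename = posixpath.basename(path)         # last path segment of the URL
    content_type, _ = mimetypes.guess_type(filename)  # None if unknown
    return {"filename": filename,
            "content_type": content_type,
            "source_url": url}


def store_image(gfs, url, data):
    """Store image data in GridFS together with descriptive fields.

    `gfs` is assumed to be a gridfs.GridFS instance. `data` may be bytes
    or a file-like object such as the result of urlopen(url).
    Returns whatever ID put() returns.
    """
    info = describe_image(url)
    return gfs.put(
        data,
        filename=info["filename"],
        metadata={"source_url": info["source_url"],
                  "content_type": info["content_type"]},
    )


# Usage sketch (assumes `gfs` and `images` from the example above):
#
#   for url in images:
#       file_id = store_image(gfs, url, urlopen(url))
```

With the filename stored on the GridFS file, an image can later be looked up by name, for example with gfs.get_last_version(filename="..."), without a separate metadata collection. A separate collection is still the more flexible choice when several images need to be grouped under one document, as in the second loop above.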

Please credit the source when reposting. Link to this article: https://www.uj5u.com/shujuku/401473.html

Tags: Python MongoDB pymongo
