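I'm trying to loop over some file paths so that I can gzip each file individually. Each item in testList is a string (a path) like this: /tmp/File.
After gzipping them, I want to upload each gzipped file to S3: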
import boto3
import gzip
import shutil
import zipfile

s3 = boto3.client('s3')
s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket('testunzipping')

with zipfile.ZipFile('/tmp/DataPump_10000838.zip', 'r') as zip_ref:
    testList = []
    for i in zip_ref.namelist():
        if (i.startswith("__MACOSX/") == False):
            val = '/tmp/' + i
            testList.append(val)
    testList.remove(testList[0])

for i in testList:
    fileName = i.replace("/tmp/DataPump_10000838/", "")
    fileName2 = i + '.gz'
    with open(i, 'rb') as f_in:
        with gzip.open(fileName2, 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
            gzip_object = gzip.compress(f_out)
    # This is the call that raises the RuntimeError shown below
    bucket.upload_fileobj(f_out, fileName, ExtraArgs={'ContentType': "text/plain", 'ContentEncoding':'gzip'})
However, the last line currently gives me this error:
Response
{
"errorMessage": "Input <gzip on 0x7fd53bc53fa0> of type: <class 'gzip.GzipFile'> is not supported.",
"errorType": "RuntimeError",
"requestId": "",
"stackTrace": [
" File \"/var/lang/lib/python3.9/importlib/__init__.py\", line 127, in import_module\n return _bootstrap._gcd_import(name[level:], package, level)\n",
" File \"<frozen importlib._bootstrap>\", line 1030, in _gcd_import\n",
" File \"<frozen importlib._bootstrap>\", line 1007, in _find_and_load\n",
" File \"<frozen importlib._bootstrap>\", line 986, in _find_and_load_unlocked\n",
" File \"<frozen importlib._bootstrap>\", line 680, in _load_unlocked\n",
" File \"<frozen importlib._bootstrap_external>\", line 850, in exec_module\n",
" File \"<frozen importlib._bootstrap>\", line 228, in _call_with_frames_removed\n",
" File \"/var/task/lambda_function.py\", line 50, in <module>\n bucket.upload_fileobj(f_out, fileName, ExtraArgs={'ContentType': \"text/plain\", 'ContentEncoding':'gzip'})\n",
" File \"/var/runtime/boto3/s3/inject.py\", line 579, in bucket_upload_fileobj\n return self.meta.client.upload_fileobj(\n",
" File \"/var/runtime/boto3/s3/inject.py\", line 539, in upload_fileobj\n return future.result()\n",
" File \"/var/runtime/s3transfer/futures.py\", line 106, in result\n return self._coordinator.result()\n",
" File \"/var/runtime/s3transfer/futures.py\", line 265, in result\n raise self._exception\n",
" File \"/var/runtime/s3transfer/tasks.py\", line 255, in _main\n self._submit(transfer_future=transfer_future, **kwargs)\n",
" File \"/var/runtime/s3transfer/upload.py\", line 545, in _submit\n upload_input_manager = self._get_upload_input_manager_cls(\n",
" File \"/var/runtime/s3transfer/upload.py\", line 521, in _get_upload_input_manager_cls\n raise RuntimeError(\n"
]
}
How else can I upload my f_out object to the S3 bucket? Does S3/boto not support gzip? I also tried ExtraArgs={'ContentType': "application/gzip"} but got the same error.
Answer from a uj5u.com user:
Assuming each file fits in memory, you can simply compress the data in memory and wrap it in a BytesIO for the S3 API to read.
import boto3
import gzip
import io

s3_resource = boto3.resource("s3")
bucket = s3_resource.Bucket("testunzipping")

for i in testList:
    # Object key: the path relative to the extracted archive directory
    fileName = i.replace("/tmp/DataPump_10000838/", "")
    with open(i, "rb") as f_in:
        # Compress the whole file in memory
        gzipped_content = gzip.compress(f_in.read())
    # BytesIO gives boto3 a readable, seekable file-like object
    bucket.upload_fileobj(
        io.BytesIO(gzipped_content),
        fileName,
        ExtraArgs={"ContentType": "text/plain", "ContentEncoding": "gzip"},
    )
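As background on why the original call failed: as far as I can tell, the transfer layer behind upload_fileobj only accepts objects that report themselves as readable, and a GzipFile opened with mode 'wb' does not. The sketch below illustrates that difference without touching S3. The function name describe_streams and the temporary file are hypothetical, introduced only for this illustration.

```python
import gzip
import io
import os
import tempfile


def describe_streams(payload: bytes) -> dict:
    """Compare a write-mode GzipFile with a BytesIO holding compressed bytes."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.open(path, "wb") as f_out:
            f_out.write(payload)
            # A GzipFile opened for writing reports that it cannot be read.
            write_mode_readable = f_out.readable()
        # Compressed bytes wrapped in BytesIO behave like a readable binary file.
        buffer = io.BytesIO(gzip.compress(payload))
        return {
            "write_mode_gzip_readable": write_mode_readable,
            "bytesio_readable": buffer.readable(),
            "round_trip_ok": gzip.decompress(buffer.read()) == payload,
        }
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(describe_streams(b"example data " * 10))
```

So S3 and boto3 handle gzipped data fine; what matters is handing upload_fileobj a stream it can read from, which is what the BytesIO wrapper above provides.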
If that is not the case, you can use a temporary file to compress the data to disk first:
import boto3
import gzip
import shutil
import tempfile

s3_resource = boto3.resource("s3")
bucket = s3_resource.Bucket("testunzipping")

for i in testList:
    fileName = i.replace("/tmp/DataPump_10000838/", "")
    with tempfile.TemporaryFile() as tmpf:
        # Stream the source file through gzip into the temporary file
        with open(i, "rb") as f_in, gzip.GzipFile(mode="wb", fileobj=tmpf) as gzf:
            shutil.copyfileobj(f_in, gzf)
        # Closing gzf wrote the gzip trailer; rewind so the upload starts at byte 0
        tmpf.seek(0)
        bucket.upload_fileobj(
            tmpf,
            fileName,
            ExtraArgs={"ContentType": "text/plain", "ContentEncoding": "gzip"},
        )
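A variant closer to the code in the question is to keep writing the .gz file next to the source file and then reopen it in plain binary read mode before uploading. The following is only a sketch under that assumption: gzip_then_upload and its upload parameter are hypothetical names, and the uploader is passed in as a callable so the logic can be tried without AWS credentials.

```python
import gzip
import shutil
from typing import BinaryIO, Callable


def gzip_then_upload(
    src_path: str,
    key: str,
    upload: Callable[[BinaryIO, str], None],
) -> str:
    """Gzip src_path to src_path + '.gz', then pass a readable handle to upload.

    Returns the path of the .gz file that was written.
    """
    gz_path = src_path + ".gz"
    # Leaving this block closes the GzipFile, which flushes the gzip trailer.
    with open(src_path, "rb") as f_in, gzip.open(gz_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    # Reopen the finished file as an ordinary binary file opened for reading.
    with open(gz_path, "rb") as f_gz:
        upload(f_gz, key)
    return gz_path


# With boto3, the callable could wrap the call used above, for example:
# upload = lambda f, key: bucket.upload_fileobj(
#     f, key, ExtraArgs={"ContentType": "text/plain", "ContentEncoding": "gzip"})
```

Unlike the TemporaryFile approach, this leaves the .gz file on disk, so it may need to be deleted afterwards if space under /tmp is limited.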
