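I'm trying to scrape this web page with Python every day for a school project:
I'm trying to mimic the same POST request in Python so that I can get the txt file that the request generates.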
from urllib import request, parse
data_dict = {
    'Data':'Stamp_1',
    'Title':'Retired Offset Credits',
    'Exclude':',rhid,ftType,Other Attributes here,Make Public,ahid,',
    'Columns':'all,Account Holder,Quantity of Offset Credits,FacilityName,Email,Status Effective',
    'Masks':'|||||MM/DD/YYYY',
    'ClassMasks':',,#.0,,,',
    'Headings':',,,Project Name,,',
    'FormatType':'txt'
}
data = parse.urlencode(data_dict).encode()
req = request.Request('https://thereserve2.apx.com/myModule/include/rptdownload.asp', data=data_dict)
resp = request.urlretrieve(req, 'download.txt')
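This doesn't work: I get "TypeError: expected string or bytes-like object". I feel like I'm getting close, but I can't seem to turn the POST request into the file download or table pull I want. Any help would be greatly appreciated.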
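A note on the error itself: the TypeError most likely comes from passing a Request object to urlretrieve, which expects a URL string as its first argument. The Request is also built with data_dict rather than the encoded bytes in data. The following is a minimal standard-library sketch of one way around both points, not a tested fix for this site; the helper names build_request and download are hypothetical, and the server may still refuse the request without a valid session cookie.

```python
from urllib import parse, request

URL = 'https://thereserve2.apx.com/myModule/include/rptdownload.asp'


def build_request(form_fields):
    """Return a POST Request whose body is the URL-encoded form, as bytes."""
    body = parse.urlencode(form_fields).encode()
    # Supplying `data` makes urllib send a POST instead of a GET.
    return request.Request(URL, data=body)


def download(form_fields, path='download.txt'):
    """Send the POST with urlopen (which accepts a Request) and save the body."""
    with request.urlopen(build_request(form_fields)) as resp, open(path, 'wb') as f:
        f.write(resp.read())
```

Calling download(data_dict) would then replace the last three lines of the code above. Nothing is sent over the network until download is called.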
Answer from a uj5u.com user:
You also need cookies to make it work:
import requests
from io import StringIO
import pandas as pd
data = {
    'myFilter': '',
    'Data': 'Stamp_0',
    'Title': 'Retired Offset Credits',
    'Exclude': ',rhid,ftType,Other Attributes here,Make Public,ahid,',
    'Columns': 'all,Account Holder,Quantity of Offset Credits,FacilityName,Email,Status Effective',
    'Masks': '|||||MM/DD/YYYY',
    'ClassMasks': ',,#.0,,,',
    'Headings': ',,,Project Name,,',
    'Parameters': '',
    'ParametersOriginal': '',
    'SortORder': '',
    'FormatType': 'txt',
    'ReplaceExpression': '',
    'ReplaceValue': '',
}
cookies = {
    'ASPSESSIONIDCGTRQSDS': 'DFDMDAFDFEPACLKJAAPHHBDH',
}
# Get the file
response = requests.post('https://thereserve2.apx.com/myModule/include/rptdownload.asp', cookies=cookies, data=data)
# Look at the file
df = pd.read_table(StringIO(response.text), sep=',', on_bad_lines='warn')
print(df.head())
# Write the file
with open('download.txt', 'wb') as f:
    f.write(response.content)
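The cookie value hard-coded above is likely tied to one browser session and will probably expire, which matters for a script that runs every day. One possible alternative, sketched here with the standard library only, is to let a cookie jar collect whatever cookie the server sets before posting the form. This is an assumption rather than something confirmed for this site: it presumes that visiting the report page first causes the server to issue the session cookie. The names make_opener, fetch_report and report_page_url are hypothetical, and report_page_url has to be supplied by the reader.

```python
from http.cookiejar import CookieJar
from urllib import parse, request


def make_opener():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = CookieJar()
    opener = request.build_opener(request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_report(opener, report_page_url, download_url, form_fields):
    """Visit the report page, then POST the form with the same cookies."""
    # Assumption: this first request makes the server set the session cookie.
    opener.open(report_page_url).close()
    body = parse.urlencode(form_fields).encode()
    # Passing `data` makes this a POST; cookies in the jar are sent along.
    with opener.open(download_url, data=body) as resp:
        return resp.read()
```

The bytes returned by fetch_report can be written to download.txt in the same way as response.content above. If the server only issues the cookie after some other step, this sketch will not be enough.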
Output:
Vintage Offset Credit Serial Numbers Quantity of Offset Credits Status Effective Project ID Project Name Project Type Protocol Version Project Site Location Project Site State Project Site Country Additional Certification(s) CORSIA Eligible Account Holder Retirement Reason Retirement Reason Details Unnamed: 16
0 2021 CAR-1-US-888-4-666-TX-2021-6665-1 to 17444 17444 12/09/2021 CAR888 Angelina County Landfill Landfill Gas Capture/Combustion Version 3.0 Lufkin TEXAS US NaN No Element Markets Emissions, LLC On Behalf of Third Party NaN NaN
1 2021 CAR-1-US-1247-37-234-MT-2021-6653-1 to 110 110 04/20/2022 CAR1247 Bluesource - Carroll Avoided Grassland Convers... Avoided Grassland Conversion Version 1.0 Valley County, MT MONTANA US NaN No Cool Effect Environmental Benefit NaN NaN
2 2021 CAR-1-MX-1282-42-938-PU-2021-6736-1 to 1604 1604 02/17/2022 CAR1282 Captura de carbono en San Rafael Ixtapalucan Forestry - MX Version 1.5 San Rafael Ixtapalucan PUEBLA MX NaN No Cultivo Land PBC On Behalf of Third Party Meta / Facebook Sustainability Goals NaN
3 2021 CAR-1-MX-1282-42-938-PU-2021-6734-1 to 5 5 02/17/2022 CAR1282 Captura de carbono en San Rafael Ixtapalucan Forestry - MX Version 1.5 San Rafael Ixtapalucan PUEBLA MX NaN No Cultivo Land PBC On Behalf of Third Party Meta / Facebook Sustainability Goals NaN
4 2021 CAR-1-MX-1415-42-938-OA-2021-6719-1 to 213 213 12/06/2021 CAR1415 Carbono, Agua y Biodiversidad Indígena Capulálpam Forestry - MX Version 2.0 Capulálpam de Méndez, Oaxaca OAXACA MX NaN No Cool Effect Environmental Benefit NaN NaN
