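I'm trying to make a correct GET request to this URL using Python and the requests library, with a few filters applied:
https://www.efast.dol.gov/5500search/
I only need a few filters to get the right data from the search page: planyear, ein and pn. When I run the request I get the wrong data, because the dict I pass under "q" loses its values.
Here is an example: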
import requests
args = {'q.parser': 'lucene', 'q': {'ein': '814699012', 'planyear': '2020', 'pn': '001'}}
url = "https://www.efast.dol.gov/services/afs"
response = requests.get(url, params=args)
When I check response.url, I get:
https://www.efast.dol.gov/services/afs?q.parser=lucene&q=ein&q=planyear&q=pn
Every key has lost its value.
This is the closest I've gotten:
args = {"q.parser":"lucene","q":{"ein":"814699012"}, "planyear":"2020","pn":"001"}
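But when I check response.url I get:
https://www.efast.dol.gov/services/afs?q.parser=lucene&q=ein&planyear=2020&pn=001
The ein value is gone. It makes no difference whether I also put planyear or pn inside q; the result is the same.
What am I doing wrong?
The correct result is the data for 2020 with the right ein and pn numbers. It doesn't matter whether I get several results or just one.
The correct request looks like this: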
https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname asc&q=(((planyear:2020)) AND%20((ein:814699012)) AND%20((pn:001)))&facet.planyear={size:30}&facet.plancode={size:100}&facet.plancode={size:100}&facet.assetseoy={buckets:[%22{,100000]%22,%22[100001,500000]%22,%22[500001,1000000]%22,%22[1000001,10000000]%22,%22[10000001,}%22]}&facet.plantype={size:20}&facet.businesscodecat={size:30}&facet.businesscode={size:30}&facet.state={size:100}&facet.countrycode={buckets:["CA%22,"GB%22,"BM%22,"KY%22]}&facet.formyear={size:30}
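Answer:
You seem to be confusing the request with the response. In your case, you should use the long URL as your request and then parse the JSON data in the response. The code below should work for you; you still need to parse the response: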
import requests
url = 'https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname asc&q=(((planyear:2020)) AND ((ein:814699012)) AND ((pn:001)))&facet.planyear={size:30}&facet.plancode={size:100}&facet.plancode={size:100}&facet.assetseoy={buckets:["{,100000]","[100001,500000]","[500001,1000000]","[1000001,10000000]","[10000001,}"]}&facet.plantype={size:20}&facet.businesscodecat={size:30}&facet.businesscode={size:30}&facet.state={size:100}&facet.countrycode={buckets:["CA","GB","BM","KY"]}&facet.formyear={size:30}'
payload={}
headers = {}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
Or, with the parser and the query pulled out into variables:
import requests
parser='lucene'
query='(((planyear:2020)) AND ((ein:814699012)) AND ((pn:001)))'
url = 'https://www.efast.dol.gov/services/afs?q.parser=' + parser + '&size=200&sort=planname asc&q=' + query + '&facet.planyear={size:30}&facet.plancode={size:100}&facet.plancode={size:100}&facet.assetseoy={buckets:["{,100000]","[100001,500000]","[500001,1000000]","[1000001,10000000]","[10000001,}"]}&facet.plantype={size:20}&facet.businesscodecat={size:30}&facet.businesscode={size:30}&facet.state={size:100}&facet.countrycode={buckets:["CA","GB","BM","KY"]}&facet.formyear={size:30}'
payload={}
headers = {}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
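Answer:
Python's requests package does not support nested dicts as parameter values. The values in the params dictionary must be strings or lists of strings. From the documentation:
"Requests allows you to provide these arguments as a dictionary of strings, using the params keyword argument."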
https://docs.python-requests.org/en/latest/user/quickstart/#passing-parameters-in-urls
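To see why the values vanish, it helps to reproduce the encoding without contacting the server. The sketch below uses the standard library's urlencode rather than requests itself, on the assumption that requests handles a dict value in the same way (it iterates the value, and iterating a dict yields only its keys). The variable names are just for illustration.

```python
from urllib.parse import urlencode

# The nested dict from the question: the filters sit inside the value of "q".
nested = {
    "q.parser": "lucene",
    "q": {"ein": "814699012", "planyear": "2020", "pn": "001"},
}

# With doseq=True a non-string value is iterated element by element.
# Iterating a dict yields its keys only, so the filter values are lost.
nested_qs = urlencode(nested, doseq=True)
print(nested_qs)

# A list of strings is the supported way to repeat a key.
listed_qs = urlencode({"q": ["a", "b"]}, doseq=True)
print(listed_qs)
```

The first query string matches the tail of the response.url shown in the question, which suggests the nested dict is the culprit rather than anything on the server side.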
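Your site uses its own, non-standard encoding (a Lucene-style query string, as q.parser=lucene suggests) to pack the dictionary into a single URL parameter.
If you look at your example: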
(((planyear:2020)) AND%20((ein:814699012)) AND%20((pn:001)))
we can infer that the format is:
(((KEY:VALUE)) AND ((KEY:VALUE)) AND <...>)
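So the whole expression is wrapped in (), each key:value pair is wrapped in (()), the pairs are joined with AND, and spaces are URL-quoted to %20.
We can replicate this encoding ourselves in our code: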
>>> params = {"planyear": "2020", "ein": 814699012, "pn": "001"}
>>> encoded = ' AND '.join(f"(({k}:{v}))" for k, v in params.items())
>>> f"({encoded})"
'(((planyear:2020)) AND ((ein:814699012)) AND ((pn:001)))'
Then pass that string as your q parameter.
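As a sanity check, the inferred format can be wrapped in a small function and compared against the q value from the working URL in the question. This is only a sketch: build_query is a hypothetical helper name, and the format is the one inferred from a single example, not a documented specification of the site.

```python
from urllib.parse import unquote


def build_query(filters):
    """Join field:value filters into the nested-parentheses form inferred above.

    build_query is a hypothetical helper, not part of requests or the site.
    """
    clauses = [f"(({field}:{value}))" for field, value in filters.items()]
    return "(" + " AND ".join(clauses) + ")"


# q value copied from the working URL in the question, still percent-encoded.
known_good = "(((planyear:2020)) AND%20((ein:814699012)) AND%20((pn:001)))"

query = build_query({"planyear": "2020", "ein": "814699012", "pn": "001"})
print(query)
print(query == unquote(known_good))
```

The clause order follows the insertion order of the dict, so passing the filters in the same order as the browser URL gives an identical string.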
Edit: I've put this together for your specific case:
import requests
q = {'ein': '814699012', 'planyear': '2020', 'pn': '001'}
# convert Q parameter to website's encoding:
q = ' AND '.join(f"(({k}:{v}))" for k, v in q.items())
q = f"({q})"
# put all params together: normal key:value parameters + special Q parameter:
params = {'q.parser': 'lucene', 'size': 200, 'sort': 'planname asc', 'q': q}
url = "https://www.efast.dol.gov/services/afs"
response = requests.get(url, params=params)
print(response.status_code)
print(response.url)
# 200
# requests percent-encodes the parentheses, colons and spaces; decoded, the URL reads:
# 'https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname asc&q=(((ein:814699012)) AND ((planyear:2020)) AND ((pn:001)))'
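The URL you actually see in response.url will look different from the readable form, because the parentheses, colons and spaces in q are percent-encoded on the way out. The sketch below shows that encoding with the standard library only, assuming requests encodes flat string parameters the same way; it does not contact the server, so it says nothing about how the site responds.

```python
from urllib.parse import urlencode, parse_qs

# Flat parameters: every value is a plain string or number, no nested dicts.
params = {
    "q.parser": "lucene",
    "size": 200,
    "sort": "planname asc",
    "q": "(((ein:814699012)) AND ((planyear:2020)) AND ((pn:001)))",
}

# Encode the parameters into a query string.
encoded = urlencode(params)
print(encoded)

# Decoding the query string gives back the original values unchanged.
decoded = parse_qs(encoded)
print(decoded["q"][0])
```

Because the decoded q is identical to the string that was passed in, the encoded form carries the same query; only its appearance in the URL changes.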
