Long-time Python requests user here. I'm trying to make a simple call to this endpoint:
https://www.overstock.com/api/product.json?prod_id=10897789
My current code:
import requests
headers = { 'User-Agent': 'Mozilla/5.0', 'Accept': 'application/json' }
url = 'https://www.overstock.com/api/product.json?prod_id=10897789'
r = requests.get( url, headers=headers )
result = r.json()
print( result )
Expected result (shortened):
{'categoryId': 244, 'subCategoryId': 31446, 'altSubCategoryId': 0, 'taxonomy': {'store': {'id': 1, 'name': 'Rugs', 'apiUrl': 'https://www.overstock.com/api/search.json?taxonomy=sto1', 'htmlUrl': 'https://www.overstock.com/Home-Garden/1/store.html'}, 'department': {'id': 3, 'name': 'Casual Rugs'...
Unfortunately, the same script does not give me the same result when run on Linux. So far I'm stumped as to why this happens...
Here is the ugly Linux error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/root/.local/share/virtualenvs/online-project-7j1lNF7P/lib/python3.6/site-packages/requests/models.py", line 900, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)
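What could the problem be? Here is what else I have tried...
1. Linux might not be running Python 3.6 but 2.7.x to perform the request.
2. Adding 'Accept': 'application/json' to the headers will surely fix this.
3. Decode the data variable first with data = response.decode() (link to SO post). Failed: "AttributeError: 'Response' object has no attribute 'decode'"
4. Use requests.Response.json (link to SO post). Failed: gives the same error as above.
5. Upgrading to Python 3.9.9 will probably fix it. Nope! It still fails for me.
6. Maybe it's your firewall. Nope, I checked ufw and it reports Status: inactive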
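Attempts 1 and 5 both depend on which interpreter actually runs the script. A minimal sketch for checking that from inside the script, using only the standard library; the helper names `interpreter_summary` and `supports_walrus` are made up for illustration and are not part of the original question:

```python
import sys


def interpreter_summary():
    """Return the interpreter path and version that is running this script."""
    v = sys.version_info
    return f"{sys.executable} (Python {v.major}.{v.minor}.{v.micro})"


def supports_walrus(version_info=sys.version_info):
    """The := operator was added in Python 3.8."""
    return tuple(version_info[:2]) >= (3, 8)


print(interpreter_summary())
print("walrus operator available:", supports_walrus())
```

Printing this at the top of the test script would show whether `python3` resolves to the interpreter you expect, for example inside a virtualenv or under sudo.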
Error for #5 (on a fresh Linux machine, with Python upgraded to 3.9.9):
$ python3 test.py
Traceback (most recent call last):
File "/home/user/test.py", line 13, in <module>
print(r.json())
File "/usr/lib/python3/dist-packages/requests/models.py", line 892, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
@balmy - here is the output I get after confirming requests version 2.26.0 and Python 3.9...
$ python3 test3.py
Traceback (most recent call last):
File "/home/user/test_scripts/test3.py", line 13, in <module>
print(r.json())
File "/home/eric/.local/lib/python3.9/site-packages/requests/models.py", line 910, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)
@JCaesar - here is the text (shortened to the part I think is relevant; this may be bot detection kicking in)
<div id="bd">
<div class="ohNoRedBar">
There was an error processing your request.
</div>
<span class="ohNoText"></span>
</div>
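The error page above suggests the server answered with HTML instead of JSON, which would explain the traceback: `json.loads` on a body that starts with a newline followed by `<` reports "Expecting value: line 2 column 1 (char 1)". Below is a rough sketch of a check that could run before `r.json()`. The function name `diagnose_body` is hypothetical, and it takes plain values rather than a Response object so it stays independent of the HTTP client:

```python
import json


def diagnose_body(status_code, content_type, text):
    """Return a list of reasons why a response body may not be usable as JSON.

    An empty list means no problem was detected.
    """
    problems = []
    if status_code >= 400:
        problems.append(f"HTTP status {status_code}")
    if "json" not in content_type.lower():
        problems.append(f"Content-Type is {content_type!r}, not JSON")
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        problems.append(f"body is not valid JSON ({exc.msg} at char {exc.pos})")
    return problems
```

With requests, this could be called as `diagnose_body(r.status_code, r.headers.get("Content-Type", ""), r.text)`, printing the first few hundred characters of `r.text` whenever the returned list is not empty.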
@Philippe - here is the result in response to your comment "Can you change the print statement to print(r.text) and run python3 test3.py | jq ."...
$ sudo python3 test3.py | jq .
Traceback (most recent call last):
File "/usr/lib/command-not-found", line 28, in <module>
from CommandNotFound import CommandNotFound
File "/usr/lib/python3/dist-packages/CommandNotFound/CommandNotFound.py", line 19, in <module>
from CommandNotFound.db.db import SqliteDatabase
File "/usr/lib/python3/dist-packages/CommandNotFound/db/db.py", line 5, in <module>
import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'
Traceback (most recent call last):
File "/home/eric/test_scripts/test3.py", line 13, in <module>
print(r.text)
BrokenPipeError: [Errno 32] Broken pipe
@Philippe - in reply to your next comment:
$ sudo python3 test3.py | jq .
parse error: Invalid numeric literal at line 2, column 10
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
BrokenPipeError: [Errno 32] Broken pipe
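If you have a solution, please let me know. Thanks!
Answer from a uj5u.com user:
Running requests 2.26.0 on macOS 12.0.1 with Python 3.9.9, I found that the site requires Accept-Encoding in the headers. This works as expected for me: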
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15',
'Accept': 'application/json',
'Connection': 'keep-alive',
'Accept-Encoding': 'gzip, deflate, br'
}
with requests.Session() as session:
    # The walrus operator (:=) requires Python 3.8 or later.
    (r := session.get('https://www.overstock.com/api/product.json?prod_id=10897789', headers=headers)).raise_for_status()
    print(r.json())
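One caveat about the `br` entry in Accept-Encoding: as far as I know, requests (through urllib3) can only decompress Brotli responses when the optional `brotli` or `brotlicffi` package is installed. If neither is present and the server replies with Brotli, `r.json()` could fail on the still-compressed body. A small sketch that only advertises `br` when a decoder appears to be importable; `build_accept_encoding` is a hypothetical helper, not part of the answer above:

```python
import importlib.util


def build_accept_encoding():
    """Build an Accept-Encoding value, adding 'br' only if a Brotli module is installed."""
    encodings = ["gzip", "deflate"]
    # Assumption: having one of these modules installed is what enables Brotli decoding.
    if importlib.util.find_spec("brotli") or importlib.util.find_spec("brotlicffi"):
        encodings.append("br")
    return ", ".join(encodings)


print(build_accept_encoding())
```

The result could replace the hard-coded value, for example `headers['Accept-Encoding'] = build_accept_encoding()`.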
Answer from a uj5u.com user:
It all came down to being blocked by IP.
Here is the script that finally saved the day...
import requests
url = "https://www.overstock.com/api/product.json?prod_id=10897789"
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15',
'Accept': 'application/json',
'Connection': 'keep-alive',
'Accept-Encoding': 'gzip, deflate, br'
}
http_proxy = "http://ip:port"
https_proxy = "http://ip:port"
proxyDict = {
"http" : http_proxy,
"https" : https_proxy
}
r = requests.get(url, headers=headers, proxies=proxyDict)
result = r.json()
print(result)
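Thanks everyone for the collective effort!
After seeing that this worked for @JCaesar, @diggusbickus, @balmy and @Philippe, I realized the only thing left unaddressed was the IP address. After adding a rotating residential proxy IP, I made the request and got the data right away.
Thanks to @JCaesar for pointing out 'Accept-Encoding'; without it, it does not work at all. Thanks to @diggusbickus for the comment about the walrus notation :=, otherwise I would have assumed Python 3.9.x was running and upgraded it.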
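For the rotating proxy mentioned above, the script shows a single placeholder "http://ip:port". A minimal sketch of cycling through several proxies in the dict format that the `proxies=` argument of requests expects; the generator name `proxy_rotation` and the addresses (taken from a documentation-only IP range) are assumptions for illustration:

```python
from itertools import cycle


def proxy_rotation(proxy_urls):
    """Yield requests-style proxies dicts, cycling through proxy_urls forever."""
    for proxy_url in cycle(proxy_urls):
        # The same proxy is used for both plain HTTP and HTTPS traffic.
        yield {"http": proxy_url, "https": proxy_url}


# Hypothetical proxy addresses; replace them with real ones from your provider.
rotation = proxy_rotation([
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
])
print(next(rotation))
```

Each request could then take the next entry, for example `requests.get(url, headers=headers, proxies=next(rotation))`. Many residential proxy providers rotate the exit IP behind a single endpoint, in which case the single-proxy dict from the script is enough.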
