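Please note that I am new to programming. This is a problem I ran into while learning web scraping with Python. The site I am using is https://www.mobikwik.com/ (an online recharge and payment site for mobile, DTH, and electricity bills), but all I get when scraping it is a 403 response. I later realized this is probably because the site uses AJAX. The goal of my program is to take a mobile number from the user, pass it to the site's mobile operator lookup so that the page loads the current operator and circle, and then display those in my program. The Python phone numbers module is of no use when a number has been ported to another operator. Any help is appreciated. Thanks.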
Answer from a uj5u.com user:
There are two XHR requests, and I wasn't sure which one you wanted, so I did both. All you need to do is recreate the request.
getconnectiondetails:
scrapy shell
In [1]: phone_number = '9820123456'
In [2]: url = 'https://rapi.mobikwik.com/recharge/infobip/getconnectiondetails?cn='
In [3]: headers = {
...: "Accept": "application/json, text/plain, */*",
...: "Accept-Encoding": "gzip, deflate, br",
...: "Accept-Language": "en-US,en;q=0.5",
...: "Cache-Control": "no-cache",
...: "Connection": "keep-alive",
...: "DNT": "1",
...: "Host": "rapi.mobikwik.com",
...: "Origin": "https://www.mobikwik.com",
...: "Pragma": "no-cache",
...: "Referer": "https://www.mobikwik.com/",
...: "Sec-Fetch-Dest": "empty",
...: "Sec-Fetch-Mode": "cors",
...: "Sec-Fetch-Site": "same-site",
...: "Sec-GPC": "1",
...: "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
...: "X-MClient": "0"
...: }
In [4]: req = scrapy.Request(url=url + phone_number, headers=headers)
In [5]: fetch(req)
[scrapy.core.engine] INFO: Spider opened
[scrapy.core.engine] DEBUG: Crawled (200) <GET https://rapi.mobikwik.com/recharge/infobip/getconnectiondetails?cn=9820123456> (referer: https://www.mobikwik.com/)
In [6]: json_data = response.json()
In [7]: json_data['data']['operatorId']
Out[7]: 338
In [8]: json_data['data']['circleId']
Out[8]: 15
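Outside the Scrapy shell, the same lookup can be sketched as a plain script. This is only an illustration under assumptions: the helper names (`build_connection_url`, `extract_operator_and_circle`, `fetch_connection_details`) are made up for this example, the header set is a reduced copy of the one above and the site may require more of them, and the response is assumed to keep the `data.operatorId` / `data.circleId` shape shown in the session.

```python
import json
import urllib.request

# Endpoint copied from the shell session above.
CONNECTION_URL = "https://rapi.mobikwik.com/recharge/infobip/getconnectiondetails?cn="

# Reduced header set (assumption: the server may insist on more of the
# headers listed above). "Accept-Encoding" is left out on purpose because
# urllib does not decompress gzip/br responses automatically.
HEADERS = {
    "Accept": "application/json, text/plain, */*",
    "Origin": "https://www.mobikwik.com",
    "Referer": "https://www.mobikwik.com/",
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"
    ),
    "X-MClient": "0",
}


def build_connection_url(phone_number):
    """Append the phone number to the query string, as in In [4] above."""
    return CONNECTION_URL + phone_number


def extract_operator_and_circle(json_data):
    """Return (operatorId, circleId) from a decoded response dict."""
    data = json_data["data"]
    return data["operatorId"], data["circleId"]


def fetch_connection_details(phone_number):
    """Hypothetical helper: perform the GET request (needs network access)."""
    request = urllib.request.Request(
        build_connection_url(phone_number), headers=HEADERS
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `extract_operator_and_circle(fetch_connection_details(phone_number))` with a number typed in by the user would then give the two ids to display, provided the request is accepted.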
recommendedplans:
scrapy shell
In [1]: phone_number = '9820123456'
In [2]: url = 'https://rapi.mobikwik.com/recharge/v1/rechargePlansAPI/recommendedplans/338/15?cn='
In [3]: headers = {
...: "Accept": "application/json, text/plain, */*",
...: "Accept-Encoding": "gzip, deflate, br",
...: "Accept-Language": "en-US,en;q=0.5",
...: "Cache-Control": "no-cache",
...: "Connection": "keep-alive",
...: "DNT": "1",
...: "Host": "rapi.mobikwik.com",
...: "Origin": "https://www.mobikwik.com",
...: "Pragma": "no-cache",
...: "Referer": "https://www.mobikwik.com/",
...: "Sec-Fetch-Dest": "empty",
...: "Sec-Fetch-Mode": "cors",
...: "Sec-Fetch-Site": "same-site",
...: "Sec-GPC": "1",
...: "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
...: "X-MClient": "0"
...: }
In [4]: req = scrapy.Request(url=url + phone_number, headers=headers)
In [5]: fetch(req)
[scrapy.core.engine] INFO: Spider opened
[scrapy.core.engine] DEBUG: Crawled (200) <GET https://rapi.mobikwik.com/recharge/v1/rechargePlansAPI/recommendedplans/338/15?cn=9820123456> (referer: https://www.mobikwik.com/)
In [6]: json_data = response.json()
In [7]: for item in json_data['data']['plans']:
...:     print(item['id'])
...:
1104293
1155779
1155937
1164885
1156067
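The `338/15` in the recommendedplans URL matches the `operatorId` and `circleId` returned by getconnectiondetails, so the two requests can probably be chained. The sketch below assumes that reading of the URL is correct; `build_plans_url` and `extract_plan_ids` are hypothetical names, and the response is assumed to keep the `data.plans[].id` shape shown above.

```python
# URL pattern taken from the session above, with the operator id, circle id,
# and phone number turned into placeholders (assumption: the two path
# segments are operatorId and circleId, in that order).
PLANS_URL = (
    "https://rapi.mobikwik.com/recharge/v1/rechargePlansAPI/"
    "recommendedplans/{operator_id}/{circle_id}?cn={phone_number}"
)


def build_plans_url(operator_id, circle_id, phone_number):
    """Fill in the ids obtained from the getconnectiondetails response."""
    return PLANS_URL.format(
        operator_id=operator_id,
        circle_id=circle_id,
        phone_number=phone_number,
    )


def extract_plan_ids(json_data):
    """Collect the id of every plan in a decoded recommendedplans response."""
    return [item["id"] for item in json_data["data"]["plans"]]
```

The resulting URL can be requested with the same headers as before, either through `scrapy.Request` in the shell or from a standalone script.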
