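I'm writing a Python script to do some web scraping. I can extract most of the text I need when it sits in a div or a class, but I'm having trouble with <script> tags.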
url1 = "https://www.wowhead.com/"
soup1 = BeautifulSoup(html1, "html.parser")
test = soup1.find_all('div', attrs={"featured-content-block type-today-in-wow today-in-wow"})[0].find_all('script')[-1]
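This returns: <script>new WH.News.TodayInWoW( [{"assaults":{"duration":302400,"expansion":8,...."zone":10288}}}] );</script>
What I need is the WH.News.TodayInWoW part (I'm not sure what that part of the script is called), but I don't know how to extract it. Whether it comes out as a list or a dictionary, I plan to filter it down to the keys/values I want.
Any help would be appreciated. I've looked at some other BeautifulSoup and email-extraction questions, but none of them worked for me.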
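A uj5u.com user replied:
You can try taking a substring of the tag's text (note the str(...) wrapper and the [31:-11] at the end; a Tag object itself cannot be sliced):
test = str(soup1.find_all('div', attrs={"featured-content-block type-today-in-wow today-in-wow"})[0].find_all('script')[-1])[31:-11]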
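To get the actual Python data (a list of dictionaries), parse the string as JSON. eval would fail here because the data contains JSON literals such as false and null, and it is unsafe on downloaded text:
import json
d = json.loads(test)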
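The fixed offsets 31 and -11 only work while the tag is laid out exactly as shown in the question. A minimal sketch of a less brittle variant follows. It assumes you already have the script tag's text as a string (for example via str(tag)). The helper name extract_call_argument and the shortened sample string are hypothetical, made up for illustration.

```python
import json

def extract_call_argument(script_text, call_name="WH.News.TodayInWoW"):
    """Return the parsed JSON argument passed to call_name(...) in script_text."""
    start = script_text.find(call_name)
    if start == -1:
        raise ValueError(f"{call_name} not found in script text")
    # The argument sits between the first "(" after the name and the last ")".
    open_paren = script_text.find("(", start)
    close_paren = script_text.rfind(")")
    if open_paren == -1 or close_paren <= open_paren:
        raise ValueError("could not locate the call's parentheses")
    # json.loads tolerates the surrounding whitespace left by the slice.
    return json.loads(script_text[open_paren + 1:close_paren])

# Hypothetical, shortened stand-in for the real tag contents.
sample = '<script>new WH.News.TodayInWoW( [{"id": "US", "name": null, "ok": false}] );</script>'
data = extract_call_argument(sample)
print(data[0]["id"])
```

Because the parentheses are located by searching rather than by counting characters, extra spaces around the argument do not matter.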
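A uj5u.com user replied:
If I understand this correctly, you can't get the function itself through web scraping. What you are scraping is only the function's arguments. Even assuming the site exposes the script, you would need to go to Inspect Element > Sources > script and then try to find the JS file where the function is defined!
A uj5u.com user replied:
You can extract the tag with BeautifulSoup as you did, but then you need some string/data manipulation plus the json package to read the result into Python as lists and dictionaries. Or, more simply, what I did was use a regular expression to pull the JSON straight out of the HTML response. I'm not sure exactly what you're after, but I converted it to a table with pandas to show you the output.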
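import json
import re
import requests
url1 = "https://www.wowhead.com/"
# Raw string so the backslashes reach the regex engine; dots are escaped to match literally.
pattern = r'WH\.News\.TodayInWoW\(\s*(\[.*\])'
html1 = requests.get(url1).text
# Capture the JSON array passed to WH.News.TodayInWoW(...) and parse it.
jsonStr = re.search(pattern, html1).group(1)
jsonData = json.loads(jsonStr)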
Output:
print(jsonData)
[{'assaults': [{'duration': 302400, 'expansion': 8, 'iconActive': 'ui_sigil_kyrian', 'iconInactive': 'item_rep_deathsadvance_01', 'id': 'covenant-assault', 'label': 'Covenant Assault', 'name': 'Kyrian Assault', 'requiresExpansion': False, 'type': 'covenant-assault', 'upcoming': [1644634800, 1644937200, 1645239600, 1645542000, 1645844400], 'url': '/guides/covenant-assaults-maw-invasions'}, {'duration': 21600, 'expansion': 6, 'iconActive': 'inv_misc_map08', 'iconInactive': 'achievement_faction_legionfall', 'id': 'legion-assaults', 'label': 'Legion Assaults', 'name': None, 'requiresExpansion': True, 'type': 'legion-assaults', 'upcoming': [1644984000, 1645050600, 1645117200, 1645183800, 1645250400], 'url': '/legion-assaults-guide'}, {'duration': 25200, 'expansion': 7, 'iconActive': 'inv_misc_map08', 'iconInactive': 'inv_misc_map08', 'id': 'faction-assaults', 'label': 'Faction Assaults', 'name': None, 'requiresExpansion': True, 'type': 'faction-assaults', 'upcoming': [1644955200, 1645023600, 1645092000, 1645160400, 1645228800], 'url': '/guides/battle-for-azeroth-incursions'}, {'duration': 604800, 'expansion': 7, 'iconActive': 'achievement_zone_uldum', 'iconInactive': 'inv_eyeofnzothpet', 'id': 'nzoth-assaults-major', 'label': "N'Zoth Assaults - Major", 'name': 'Uldum', 'requiresExpansion': True, 'type': 'nzoth-assaults-major', 'upcoming': [1644332400, 1644937200, 1645542000, 1646146800, 1646751600], 'url': '/guides/visions-of-nzoth-assaults'}, {'duration': 302400, 'expansion': 7, 'iconActive': 'achievement_zone_valeofeternalblossoms', 'iconInactive': 'inv_eyeofnzothpet', 'id': 'nzoth-assaults-minor', 'label': "N'Zoth Assaults - Minor", 'name': 'Vale of Eternal Blossoms', 'requiresExpansion': True, 'type': 'nzoth-assaults-minor', 'upcoming': [1644634800, 1644937200, 1645239600, 1645542000, 1645844400], 'url': '/guides/visions-of-nzoth-assaults'}], 'id': 'US', 'name': 'NA', 'progression': {'defeatedBosses': 10, 'icon': 'achievement_raid_torghastraid', 'id': 'progression', 
'name': 'Sanctum of Domination', 'topGuildCount': 1683, 'totalBosses': 10, 'type': 'progression', 'url': 'https://www.wowhead.com/guides/sanctum-of-domination-raid-overview-strategy-boss-guides-rewards'}, 'warfronts': {'9734': {'expansion': 7, 'contributions': None, 'icon': 'achievement_zone_arathihighlands_01', 'name': 'Arathi Highlands', 'progress': 17.5049603, 'requiresExpansion': True, 'side': 2, 'stateId': 'attacking', 'stateName': 'Horde Attacking', 'url': '/how-to-win-battle-for-stromgarde-warfront-strategy-guide', 'zone': 9734}, '10288': {'expansion': 7, 'contributions': None, 'icon': 'achievement_zone_darkshore_01', 'name': 'Darkshore', 'progress': 86.73307299613953, 'requiresExpansion': True, 'side': 1, 'stateId': 'contributing', 'stateName': 'Alliance Contributing', 'url': '/guides/launching-a-warfront-through-contributions', 'zone': 10288}}}, {'assaults': [{'duration': 302400, 'expansion': 8, 'iconActive': 'ui_sigil_kyrian', 'iconInactive': 'item_rep_deathsadvance_01', 'id': 'covenant-assault', 'label': 'Covenant Assault', 'name': 'Kyrian Assault', 'requiresExpansion': False, 'type': 'covenant-assault', 'upcoming': [1644692400, 1644994800, 1645297200, 1645599600, 1645902000], 'url': '/guides/covenant-assaults-maw-invasions'}, {'duration': 21600, 'expansion': 6, 'iconActive': 'inv_misc_map08', 'iconInactive': 'achievement_faction_legionfall', 'id': 'legion-assaults', 'label': 'Legion Assaults', 'name': None, 'requiresExpansion': True, 'type': 'legion-assaults', 'upcoming': [1644955200, 1645021800, 1645088400, 1645155000, 1645221600], 'url': '/legion-assaults-guide'}, {'duration': 25200, 'expansion': 7, 'iconActive': 'inv_tiragardesound', 'iconInactive': 'inv_misc_map08', 'id': 'faction-assaults', 'label': 'Faction Assaults', 'name': 'Tiragarde Sound', 'requiresExpansion': True, 'type': 'faction-assaults', 'upcoming': [1644922800, 1644991200, 1645059600, 1645128000, 1645196400], 'url': '/guides/battle-for-azeroth-incursions'}, {'duration': 604800, 
'expansion': 7, 'iconActive': 'achievement_zone_uldum', 'iconInactive': 'inv_eyeofnzothpet', 'id': 'nzoth-assaults-major', 'label': "N'Zoth Assaults - Major", 'name': 'Uldum', 'requiresExpansion': True, 'type': 'nzoth-assaults-major', 'upcoming': [1644390000, 1644994800, 1645599600, 1646204400, 1646809200], 'url': '/guides/visions-of-nzoth-assaults'}, {'duration': 302400, 'expansion': 7, 'iconActive': 'achievement_zone_valeofeternalblossoms', 'iconInactive': 'inv_eyeofnzothpet', 'id': 'nzoth-assaults-minor', 'label': "N'Zoth Assaults - Minor", 'name': 'Vale of Eternal Blossoms', 'requiresExpansion': True, 'type': 'nzoth-assaults-minor', 'upcoming': [1644692400, 1644994800, 1645297200, 1645599600, 1645902000], 'url': '/guides/visions-of-nzoth-assaults'}], 'id': 'EU', 'name': 'EU', 'progression': {'defeatedBosses': 10, 'icon': 'achievement_raid_torghastraid', 'id': 'progression', 'name': 'Sanctum of Domination', 'topGuildCount': 1683, 'totalBosses': 10, 'type': 'progression', 'url': 'https://www.wowhead.com/guides/sanctum-of-domination-raid-overview-strategy-boss-guides-rewards'}, 'warfronts': {'9734': {'expansion': 7, 'contributions': None, 'icon': 'achievement_zone_arathihighlands_01', 'name': 'Arathi Highlands', 'progress': 0.9389017708599567, 'requiresExpansion': True, 'side': 2, 'stateId': 'contributing', 'stateName': 'Horde Contributing', 'url': '/guides/launching-a-warfront-through-contributions', 'zone': 9734}, '10288': {'expansion': 7, 'contributions': None, 'icon': 'achievement_zone_darkshore_01', 'name': 'Darkshore', 'progress': 49.4943783, 'requiresExpansion': True, 'side': 2, 'stateId': 'attacking', 'stateName': 'Horde Attacking', 'url': '/guides/rewards-from-the-darkshore-warfront', 'zone': 10288}}}]
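The greedy `.*` in the pattern relies on the array sitting on a single line with no other `]` after it on that line. As a hedged alternative, the sketch below uses the regex only to find where the array starts and lets json.JSONDecoder.raw_decode read exactly one JSON value from that position. The function name parse_today_in_wow and the sample HTML are hypothetical and only serve the illustration.

```python
import json
import re

def parse_today_in_wow(html):
    """Parse the JSON value passed to WH.News.TodayInWoW(...) in html."""
    match = re.search(r"WH\.News\.TodayInWoW\(\s*", html)
    if match is None:
        raise ValueError("WH.News.TodayInWoW call not found")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it, so a trailing ");" or a later "]" is harmless.
    value, _end = json.JSONDecoder().raw_decode(html, match.end())
    return value

# Hypothetical sample with a second "]" later on the same line.
sample_html = (
    '<div><script>new WH.News.TodayInWoW([{"id": "EU", "assaults": []}]);'
    ' var other = [1, 2];</script></div>'
)
regions = parse_today_in_wow(sample_html)
print(regions)
```

With this sample, a greedy capture would run on to the `]` of the second array and hand json.loads an invalid string, whereas raw_decode stops at the end of the first array.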
Loading it into a DataFrame:
import pandas as pd
df = pd.json_normalize(jsonData, record_path=['assaults'])
Output:
print(df.to_string())
duration expansion iconActive iconInactive id label name requiresExpansion type upcoming url
0 302400 8 ui_sigil_kyrian item_rep_deathsadvance_01 covenant-assault Covenant Assault Kyrian Assault False covenant-assault [1644634800, 1644937200, 1645239600, 1645542000, 1645844400] /guides/covenant-assaults-maw-invasions
1 21600 6 inv_misc_map08 achievement_faction_legionfall legion-assaults Legion Assaults None True legion-assaults [1644984000, 1645050600, 1645117200, 1645183800, 1645250400] /legion-assaults-guide
2 25200 7 inv_misc_map08 inv_misc_map08 faction-assaults Faction Assaults None True faction-assaults [1644955200, 1645023600, 1645092000, 1645160400, 1645228800] /guides/battle-for-azeroth-incursions
3 604800 7 achievement_zone_uldum inv_eyeofnzothpet nzoth-assaults-major N'Zoth Assaults - Major Uldum True nzoth-assaults-major [1644332400, 1644937200, 1645542000, 1646146800, 1646751600] /guides/visions-of-nzoth-assaults
4 302400 7 achievement_zone_valeofeternalblossoms inv_eyeofnzothpet nzoth-assaults-minor N'Zoth Assaults - Minor Vale of Eternal Blossoms True nzoth-assaults-minor [1644634800, 1644937200, 1645239600, 1645542000, 1645844400] /guides/visions-of-nzoth-assaults
5 302400 8 ui_sigil_kyrian item_rep_deathsadvance_01 covenant-assault Covenant Assault Kyrian Assault False covenant-assault [1644692400, 1644994800, 1645297200, 1645599600, 1645902000] /guides/covenant-assaults-maw-invasions
6 21600 6 inv_misc_map08 achievement_faction_legionfall legion-assaults Legion Assaults None True legion-assaults [1644955200, 1645021800, 1645088400, 1645155000, 1645221600] /legion-assaults-guide
7 25200 7 inv_tiragardesound inv_misc_map08 faction-assaults Faction Assaults Tiragarde Sound True faction-assaults [1644922800, 1644991200, 1645059600, 1645128000, 1645196400] /guides/battle-for-azeroth-incursions
8 604800 7 achievement_zone_uldum inv_eyeofnzothpet nzoth-assaults-major N'Zoth Assaults - Major Uldum True nzoth-assaults-major [1644390000, 1644994800, 1645599600, 1646204400, 1646809200] /guides/visions-of-nzoth-assaults
9 302400 7 achievement_zone_valeofeternalblossoms inv_eyeofnzothpet nzoth-assaults-minor N'Zoth Assaults - Minor Vale of Eternal Blossoms True nzoth-assaults-minor [1644692400, 1644994800, 1645297200, 1645599600, 1645902000] /guides/visions-of-nzoth-assaults
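Since the question's end goal is to filter the parsed structure down to particular keys and values, here is a small sketch in plain Python. It assumes the layout shown in the printed output above: a list of region dictionaries, each with an 'assaults' list whose entries carry a 'label' and a list of 'upcoming' Unix timestamps. The function name next_assaults and the trimmed sample data are illustrative only.

```python
from datetime import datetime, timezone

def next_assaults(regions):
    """Map (region id, assault label) to the first listed upcoming start time (UTC)."""
    result = {}
    for region in regions:
        for assault in region.get("assaults", []):
            upcoming = assault.get("upcoming") or []
            if upcoming:
                # Timestamps are treated as seconds since the Unix epoch.
                first = datetime.fromtimestamp(upcoming[0], tz=timezone.utc)
                result[(region["id"], assault["label"])] = first
    return result

# Trimmed sample in the same shape as the parsed data.
sample = [
    {"id": "US", "assaults": [
        {"label": "Legion Assaults", "upcoming": [1644984000, 1645050600]},
    ]},
    {"id": "EU", "assaults": [
        {"label": "Legion Assaults", "upcoming": [1644955200, 1645021800]},
    ]},
]
schedule = next_assaults(sample)
for key, start in schedule.items():
    print(key, start.isoformat())
```

The same loop can be adapted to pick any other key, such as 'name' or 'url', from each assault entry.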
