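I am trying to create a list of tuples, where the first element is the download URL and the second element is the file name taken from the URL string. The code is as follows: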
import urllib
import requests
from bs4 import BeautifulSoup
import pandas as pd
import io
url = r"https://www.ers.usda.gov/data-products/livestock-meat-domestic-data"
my_bytes = urllib.request.urlopen(url)
my_bytes = my_bytes.read().decode("utf8")
parsed_html = BeautifulSoup(my_bytes, features = "lxml")
table_data = parsed_html.body.find('table', attrs = {'id':'data_table'})
download_url = "https://www.ers.usda.gov"
full_download_url = [tuple(download_url,i["href"]) for i in table_data.find_all('a')]
But I keep running into TypeError: must be str, not list, and I don't know how to fix it. Could you please help? Thanks!
Reply from a uj5u.com user:
This is what I needed:
import urllib
import requests
from bs4 import BeautifulSoup
import pandas as pd
import io
url = r"https://www.ers.usda.gov/data-products/livestock-meat-domestic-data"
my_bytes = urllib.request.urlopen(url)
my_bytes = my_bytes.read().decode("utf8")
parsed_html = BeautifulSoup(my_bytes, features = "lxml")
table_data = parsed_html.body.find('table', attrs = {'id':'data_table'})
download_url = "https://www.ers.usda.gov"
def convertTuple(tup):
    # Concatenate every item of the tuple into a single string.
    result = ''
    for item in tup:
        result = result + item
    return result
full_download_url = [convertTuple(tuple(download_url + i["href"])) for i in table_data.find_all('a')]
Thanks to GeeksforGeeks and to everyone who tried to help :)
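A side note on why the helper above hands back a plain string: tuple() applied to a string splits it into single characters, and concatenating those characters rebuilds the same string. The sketch below illustrates this. The helper name join_items and the href value are made up for the illustration and are not part of the original answer.

```python
def join_items(items):
    """Concatenate all string items of an iterable into one string."""
    return "".join(items)


base = "https://www.ers.usda.gov"
href = "/files/example/report.xlsx"  # hypothetical path, for illustration only

# tuple() on a string yields one element per character.
chars = tuple(base + href)

# Joining the characters gives back the original string...
round_trip = join_items(chars)

# ...which is the same value as plain concatenation.
direct = base + href
```

Under these assumptions the tuple round trip adds nothing over download_url + i["href"], and each entry of the resulting list is a string rather than a tuple.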
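Reply from a uj5u.com user:
You are indexing download_url incorrectly.
For example, when i is 0, Python interprets your code as creating a one-element list [0] and then trying to access ["href"] on it; "href" is a string, not a valid index.
If you specify download_url before accessing the index, it will work as expected: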
full_download_url = [(download_url, download_url[i]["href"]) for i in table_data.find_all('a')]
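Note that download_url is a plain string, while each i returned by find_all('a') is a tag object, so the (URL, file name) pair the question asks for can also be built directly from i["href"]. Below is a minimal sketch of that idea. It assumes each href is a site-relative path; the function name build_download_pairs and the sample hrefs are hypothetical, and the list of strings stands in for [i["href"] for i in table_data.find_all('a')].

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def build_download_pairs(base_url, hrefs):
    """Return a list of (full_url, file_name) tuples, one per href."""
    pairs = []
    for href in hrefs:
        full_url = urljoin(base_url, href)
        # Take the last path segment, ignoring any query string.
        file_name = posixpath.basename(urlsplit(href).path)
        # Parentheses build the tuple; tuple(a, b) would raise a TypeError
        # because tuple() accepts at most one (iterable) argument.
        pairs.append((full_url, file_name))
    return pairs


download_url = "https://www.ers.usda.gov"
# Hypothetical hrefs, standing in for the values scraped from the table.
sample_hrefs = ["/files/example/report.xlsx", "/files/example/data.zip?v=1"]
full_download_url = build_download_pairs(download_url, sample_hrefs)
```

With real page data, the same function could be called as build_download_pairs(download_url, [i["href"] for i in table_data.find_all('a')]).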
