I wrote a small parser to fetch the content of 5 web pages. When I run it, it executes without errors and gives me Process finished with exit code 0, but nothing happens. The script has a csv writer that is supposed to create a .csv file and write the data into it, yet there is no output at all: the script just runs and silently terminates with exit code 0. I've never run into anything like this before. Any ideas?
The pages are structured like this: a div with class row mb-2
contains all the div tags for the listings:
<div class="row mb-2">
    <div class="grid-block col-12 col-sm-6 col-md-4 col-lg-3 py-2 px-0 px-sm-2"></div>
    <div class="grid-block col-12 col-sm-6 col-md-4 col-lg-3 py-2 px-0 px-sm-2"></div>
    <div class="grid-block col-12 col-sm-6 col-md-4 col-lg-3 py-2 px-0 px-sm-2"></div>
    <div class="grid-block col-12 col-sm-6 col-md-4 col-lg-3 py-2 px-0 px-sm-2"></div>
</div>
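A side note on matching this markup (a minimal self-contained check, not from the original post): when `class_` is given a string containing a space, BeautifulSoup matches it against the exact class attribute value, so `class_='row mb-2'` does match the div above, but only in that class order. A CSS selector like `select('div.row.mb-2')` is order-independent and usually safer:

```python
from bs4 import BeautifulSoup

# Trimmed-down version of the markup from the question
html = """
<div class="row mb-2">
  <div class="grid-block col-12">A</div>
  <div class="grid-block col-12">B</div>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

# Exact-string match on the full class attribute finds the row div
rows = soup.find_all('div', class_='row mb-2')
print(len(rows))                                          # 1

# CSS selector matches regardless of class order
print(len(soup.select('div.row.mb-2')))                   # 1

# The listing blocks inside the row
print(len(rows[0].find_all('div', class_='grid-block')))  # 2
```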
The script's code looks like this:
import requests
import csv
from bs4 import BeautifulSoup as bs

u_list = ['https://www.landcentury.com/search/page-1?categories[0]=commercial-and-industrial-land&options[0]=for-sale',
          'https://www.landcentury.com/search/page-2?categories[0]=commercial-and-industrial-land&options[0]=for-sale',
          'https://www.landcentury.com/search/page-3?categories[0]=commercial-and-industrial-land&options[0]=for-sale',
          'https://www.landcentury.com/search/page-4?categories[0]=commercial-and-industrial-land&options[0]=for-sale',
          'https://www.landcentury.com/search/page-5?categories[0]=commercial-and-industrial-land&options[0]=for-sale']

for url in range(0, 5):
    page = requests.get(u_list[url])
    soup = bs(page.content, 'html.parser')
    landplots = soup.find_all('div', class_='row mb-2')
    for l in landplots:
        row = []
        try:
            plot_price = l.find('div', class_='price ').find_next(text=True).get_text(strip=True)
            plot_location = l.find('div', class_='card-title').find_next(text=True).text
            plot_square = l.find('div', class_='card-title').find_next(text=True).get_text(strip=True)
            row.append(plot_price)
            row.append(plot_location)
            row.append(plot_square)
            print(plot_price)
            print(plot_square)
            print(plot_location)
            print()
        except AttributeError:
            continue
        with open("parsing_second.csv", 'a', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(row)
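The silent exit with code 0 is consistent with the bare `except AttributeError: continue` in the loop above: `class_='price '` has a trailing space, so it never matches an element whose class is `price`, `find()` returns `None`, calling `.find_next()` on `None` raises `AttributeError` on every row, and both the prints and the CSV writer are skipped. A minimal check of that matching behaviour (my own sketch, not from the thread):

```python
from bs4 import BeautifulSoup

# Hypothetical one-card snippet just to demonstrate the class lookup
card = BeautifulSoup('<div class="price">$10,000</div>', 'html.parser')

# Trailing space: no class equals 'price ', so find() returns None
broken = card.find('div', class_='price ')
print(broken)  # None

# Without the stray space the lookup succeeds
fixed = card.find('div', class_='price')
print(fixed.get_text(strip=True))  # $10,000

# Calling .find_next() on the None above is what raises AttributeError
# on every iteration, which the bare except then silently swallows.
```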
A forum user replied:
As I understand it, you need the fields price, square (area), and location. I'd suggest using pandas to save to a file; that gives you more flexibility in choosing the output format later.
import requests
from bs4 import BeautifulSoup
import pandas as pd

results = []
for page in range(1, 6):  # pages 1 through 5, as in the original URL list
    url = f'https://www.landcentury.com/search/page-{page}?categories[0]=commercial-and-industrial-land&options[0]=for-sale'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    for card in soup.find_all('div', class_='listing-item'):
        price = card.find('div', class_='price').getText(separator=' ').strip()
        square, location = card.find('div', class_='card-title').getText(separator='/').split('/')
        data = {
            'Price': price,
            'Square': square,
            'Location': location
        }
        results.append(data)

df = pd.DataFrame(results)
df.to_csv('landcentury.csv', index=False)
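One caveat on this approach (my own hedged addition, assuming the same `listing-item`/`price`/`card-title` markup): any card that lacks a price div, or whose title doesn't contain the expected separator, will raise and abort the whole run. A defensive per-card parser can skip such cards instead:

```python
from bs4 import BeautifulSoup

def parse_card(card):
    """Return a dict for one listing card, or None if expected fields are missing."""
    price_div = card.find('div', class_='price')
    title_div = card.find('div', class_='card-title')
    if price_div is None or title_div is None:
        return None
    title = title_div.getText(separator='/')
    if '/' not in title:
        return None
    square, location = title.split('/', 1)
    return {'Price': price_div.getText(separator=' ').strip(),
            'Square': square.strip(),
            'Location': location.strip()}

# Tiny demo with hypothetical markup: the second card has no price and is skipped
html = ('<div class="listing-item">'
        '<div class="price">$25,000</div>'
        '<div class="card-title"><span>1.2 acres</span><span>Texas</span></div>'
        '</div>'
        '<div class="listing-item"><div class="card-title">no price here</div></div>')
soup = BeautifulSoup(html, 'html.parser')
parsed = [parse_card(c) for c in soup.find_all('div', class_='listing-item')]
rows = [p for p in parsed if p is not None]
print(rows)  # [{'Price': '$25,000', 'Square': '1.2 acres', 'Location': 'Texas'}]
```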
The result is written to landcentury.csv with Price, Square, and Location columns.