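I am trying to scrape product images. The site returns up to 23 images per product, but I want to apply a limit so that I get only 10 images per product. Can you help me with this?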
import requests
from bs4 import BeautifulSoup
import pandas as pd
baseurl = 'https://twillmkt.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}
r = requests.get('https://twillmkt.com/collections/denim')
soup = BeautifulSoup(r.content, 'html.parser')
tra = soup.find_all('div', class_='ProductItem__Wrapper')
productlinks = []
for links in tra:
    for link in links.find_all('a', href=True):
        comp = baseurl + link['href']
        productlinks.append(comp)
data = []
for link in set(productlinks):
    r = requests.get(link, headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    up = soup.find('div', class_='Product__SlideshowNavScroller')
    for e, pro in enumerate(up):
        t = pro.find('img').get('src')
        data.append({'id': t.split('=')[-1], 'image': 'Image ' + str(e) + ' UI', 'link': t})
df = pd.DataFrame(data)
df.image = pd.Categorical(df.image, categories=df.image.unique(), ordered=True)
df = df.pivot(index='id', columns='image', values='link').reset_index().fillna('')
df.to_csv('kj.csv')
Answer from a uj5u.com user:
Slice the result set of images with [:10]
...
up = soup.select('div.Product__SlideshowNavScroller img')[:10]
for e, pro in enumerate(up):
    t = pro.get('src')
    data.append({'id': t.split('=')[-1], 'image': 'Image ' + str(e) + ' UI', 'link': t})
...
If you want the image names to start from 1 instead of 0:
...
up = soup.select('div.Product__SlideshowNavScroller img')[:10]
for e, pro in enumerate(up, start=1):
    t = pro.get('src')
    data.append({'id': t.split('=')[-1], 'image': 'Image ' + str(e) + ' UI', 'link': t})
...
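A detail worth checking before relying on the slice: products with fewer than 10 images should not cause an error. The sketch below runs without network access and uses made-up URLs on a hypothetical cdn.example.com host, plus a hypothetical helper named label_images, to show that slicing simply returns whatever is available.

```python
# Hypothetical thumbnail URLs standing in for the scraped <img src> values.
few_images = [f"//cdn.example.com/p1-{n}.jpg" for n in range(1, 4)]    # 3 images
many_images = [f"//cdn.example.com/p2-{n}.jpg" for n in range(1, 24)]  # 23 images


def label_images(srcs, limit=10):
    """Return (column label, url) pairs for at most `limit` images."""
    # Slicing past the end of a list is safe: it yields the items that exist.
    return [(f"Image {e} UI", src) for e, src in enumerate(srcs[:limit], start=1)]


print(len(label_images(few_images)))     # 3
print(len(label_images(many_images)))    # 10
print(label_images(many_images)[-1][0])  # Image 10 UI
```

The same holds for the list returned by soup.select(...), since it behaves like an ordinary Python list when sliced.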
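EDIT
Comment from the asker: "Basically, in the Excel file, after 9 entries it stores 5 images in one row and the next 5 images in another row. The problem is that it does not store all 10 images in one row."
OK, got it. The behavior does not depend on the number of images. The problem is that the id is not unique per product: it is taken from the end of the image URL and is not the product's id/sku.
How to fix it?
Select the sku from the product page and use it as the id in the DataFrame: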
sku = soup.select_one('.oos_sku').text.strip().split(' ')[-1]
for e, pro in enumerate(up, start=1):
    t = pro.get('src')
    data.append({'id': sku, 'image': 'Image ' + str(e) + ' UI', 'link': t})
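To see why the original key split one product across several rows, here is a small offline sketch. The records, the SKU "ABC-1", and the helper rows_by are all hypothetical; the grouping only mimics what the pivot does with its index, using plain dictionaries. The situation it models does appear in the output of this thread, where a size-chart image carries a different v= value than the product photos.

```python
from collections import defaultdict

# Hypothetical records: one product whose thumbnails carry two different
# "v=" version values at the end of their URLs.
records = [
    {"sku": "ABC-1", "image": "Image 1 UI", "link": "//cdn.example.com/a_1.jpg?v=111"},
    {"sku": "ABC-1", "image": "Image 2 UI", "link": "//cdn.example.com/a_2.jpg?v=111"},
    {"sku": "ABC-1", "image": "Image 3 UI", "link": "//cdn.example.com/a_3.png?v=222"},
]


def rows_by(records, key_func):
    """Group records into rows shaped like {row key: {column label: link}}."""
    rows = defaultdict(dict)
    for rec in records:
        rows[key_func(rec)][rec["image"]] = rec["link"]
    return dict(rows)


# Key used in the question: whatever follows the last "=" in the image URL.
by_version = rows_by(records, lambda r: r["link"].split("=")[-1])
# Key used in the fix: the product SKU.
by_sku = rows_by(records, lambda r: r["sku"])

print(len(by_version))  # 2 rows for a single product
print(len(by_sku))      # 1 row holding all three images
```

Keying on the SKU keeps every image of a product in the same row, which is the behavior the asker expected.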
Example
import requests
from bs4 import BeautifulSoup
import pandas as pd
baseurl = 'https://twillmkt.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}
# Collect the product links from the collection page
r = requests.get('https://twillmkt.com/collections/denim')
soup = BeautifulSoup(r.content, 'html.parser')
tra = soup.find_all('div', class_='ProductItem__Wrapper')
productlinks = []
for links in tra:
    for link in links.find_all('a', href=True):
        comp = baseurl + link['href']
        productlinks.append(comp)
data = []
# Visit each product page once and record its thumbnail images
for link in set(productlinks):
    r = requests.get(link, headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    up = soup.select('div.Product__SlideshowNavScroller img')
    # Use the product sku as the row id so all images of a product share one row
    sku = soup.select_one('.oos_sku').text.strip().split(' ')[-1]
    for e, pro in enumerate(up, start=1):
        t = pro.get('src')
        data.append({'id': sku, 'image': 'Image ' + str(e) + ' UI', 'link': t})
df = pd.DataFrame(data)
df.image = pd.Categorical(df.image, categories=df.image.unique(), ordered=True)
df = df.pivot(index='id', columns='image', values='link').reset_index().fillna('')
df  # .to_excel('test.xlsx')
Output
|  | id | Image 1 UI | Image 2 UI | Image 3 UI | Image 4 UI | Image 5 UI | Image 6 UI | Image 7 UI | Image 8 UI | Image 9 UI | Image 10 UI | Image 11 UI | Image 12 UI | Image 13 UI | Image 14 UI | Image 15 UI | Image 16 UI | Image 17 UI | Image 18 UI | Image 19 UI | Image 20 UI | Image 21 UI | Image 22 UI | Image 23 UI | Image 24 UI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | LOTFEELPJ023-30 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-2_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-3_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-4_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-5_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-6_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-7_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-8_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-9_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-10_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-11_160x.jpg?v=1631812617 | |||||||||||||
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 47 | LOTFEELPJ564-S-BRN | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_16_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_17_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_22_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_15_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_6_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_9_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/sizechart-stretch-pants_3_ec7e0b0c-1043-4306-a766-33f7e0b3edc8_160x.png?v=166994677 |
