I am trying to scrape Google Images using unirest and Cheerio, but I am stuck because the parsing is not working correctly. Here is my current code:
const unirest = require("unirest");
const cheerio = require("cheerio");

const getData = async () => {
  let count = [], page_url = [];
  let url =
    "https://www.google.com/search?q=india&oq=india&tbm=isch&asearch=ichunk&async=_id:rg_s,_pms:s,_fmt:pc&sourceid=chrome&ie=UTF-8";

  const response = await unirest
    .get(url)
    .headers({
      "User-Agent":
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36",
    })
    .proxy("proxy");

  const $ = cheerio.load(response.body);
  console.log(response.body); // HTML is returned successfully

  let title = [], link = [];
  $(".vbC6V").each((i, el) => {
    title[i] = $(el).find(".iKjWAf .mVDMnf").text(); // not parsing
    link[i] = $(el).find(".rg_l .rg_ic").attr("src"); // not parsing
  });

  console.log(title); // returns an empty array
  console.log(link); // returns an empty array
};

getData();
Answer:
So yes, I found that the parent class to select when parsing is rg_bx, not vbC6V. The updated code is:
$(".rg_bx").each((i, el) => {
  title[i] = $(el).find(".iKjWAf .mVDMnf").text();
  link[i] = $(el).find(".rg_l .rg_ic").attr("src");
});
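
The answer does not say how the right parent class was found. One way to check, sketched below, is to count which class names actually occur in the HTML that came back, since a selector for a class that is absent will always match nothing. The helper name `countClassNames` and the sample markup are assumptions made up for illustration; they are not part of the original question or answer, and the regex is only a rough attribute matcher, not a full HTML parser.

```javascript
// Hypothetical helper (not from the original post): counts how often each
// CSS class name appears in a raw HTML string.
function countClassNames(html) {
  const counts = new Map();
  // Rough match for class="..." or class='...' attributes.
  const attrPattern = /\sclass\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  for (const match of html.matchAll(attrPattern)) {
    const value = match[1] ?? match[2] ?? "";
    for (const name of value.split(/\s+/).filter(Boolean)) {
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return counts;
}

// Made-up sample markup that reuses the class names from the answer.
const sampleHtml = `
<div class="rg_bx">
  <a class="rg_l"><img class="rg_ic" src="https://example.com/a.jpg"></a>
  <div class="iKjWAf"><span class="mVDMnf">First image</span></div>
</div>
<div class="rg_bx">
  <a class="rg_l"><img class="rg_ic" src="https://example.com/b.jpg"></a>
  <div class="iKjWAf"><span class="mVDMnf">Second image</span></div>
</div>`;

const counts = countClassNames(sampleHtml);
console.log(counts.has("vbC6V")); // false: this class never appears
console.log(counts.get("rg_bx")); // 2: one per image container
```

In the script from the question, the same check could be run on `response.body` before choosing a selector. Google's class names are generated and may change over time, so the names found this way should be treated as a snapshot.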
