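I'm working on a scraping project, but I've run into a problem:
I want to use Nokogiri to get all the data from https://coinmarketcap.com/all/views/all/, but I only get 20 crypto names out of the 200 on the page.
The code: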
    require 'nokogiri'
    require 'open-uri'
    require 'rubygems'

    def scrapper
      return doc = Nokogiri::HTML(URI.open('https://coinmarketcap.com/all/views/all/'))
    end

    def fusiontab(tab1,tab2)
      return Hash[tab1.zip(tab2)]
    end

    def crypto(page)
      array_name=[]
      array_value=[]
      name_of_crypto=page.xpath('//tr//td[3]')
      value_of_crypto=page.xpath('//tr//td[5]')
      hash={}
      name_of_crypto.each{ |name|
        array_name<<name.text
      }
      value_of_crypto.each{|price|
        array_value << price.text
      }
      hash=fusiontab(array_name,array_value)
      return hash
    end

    puts crypto(scrapper)
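Can you help me get all the cryptocurrencies?
Answer from a uj5u.com user:
The URL you are using does not deliver all of the data as HTML; much of it is rendered after the page loads.
Looking at the page source, the data appears to be rendered from a JSON script embedded in the page.
It took some digging through the object to find which part of the JSON data holds the content you want: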
- The JSON object in the HTML, as a String object:
      page.css('script[type="application/json"]').first.inner_html
- The JSON String converted to a real JSON Hash:
      my_json = JSON.parse(page.css('script[type="application/json"]').first.inner_html)
- The location in the JSON of the Array of crypto Hashes:
      cryptos = my_json["props"]["initialState"]["cryptocurrency"]["listingLatest"]["data"]
- Pretty-printing the first "crypto":
2.7.2 :142 > pp cryptos.first
{"id"=>1,
"name"=>"Bitcoin",
"symbol"=>"BTC",
"slug"=>"bitcoin",
"tags"=>
["mineable",
"pow",
"sha-256",
"store-of-value",
"state-channel",
"coinbase-ventures-portfolio",
"three-arrows-capital-portfolio",
"polychain-capital-portfolio",
"binance-labs-portfolio",
"blockchain-capital-portfolio",
"boostvc-portfolio",
"cms-holdings-portfolio",
"dcg-portfolio",
"dragonfly-capital-portfolio",
"electric-capital-portfolio",
"fabric-ventures-portfolio",
"framework-ventures-portfolio",
"galaxy-digital-portfolio",
"huobi-capital-portfolio",
"alameda-research-portfolio",
"a16z-portfolio",
"1confirmation-portfolio",
"winklevoss-capital-portfolio",
"usv-portfolio",
"placeholder-ventures-portfolio",
"pantera-capital-portfolio",
"multicoin-capital-portfolio",
"paradigm-portfolio"],
"cmcRank"=>1,
"marketPairCount"=>9158,
"circulatingSupply"=>18960043,
"selfReportedCirculatingSupply"=>0,
"totalSupply"=>18960043,
"maxSupply"=>21000000,
"isActive"=>1,
"lastUpdated"=>"2022-02-16T14:26:00.000Z",
"dateAdded"=>"2013-04-28T00:00:00.000Z",
"quotes"=>
[{"name"=>"USD",
"price"=>43646.858047604175,
"volume24h"=>20633664171.70021,
"marketCap"=>827546305397.4712,
"percentChange1h"=>-0.86544168,
"percentChange24h"=>-1.6482985,
"percentChange7d"=>-0.73945082,
"lastUpdated"=>"2022-02-16T14:26:00.000Z",
"percentChange30d"=>2.18336134,
"percentChange60d"=>-6.84146969,
"percentChange90d"=>-26.08073361,
"fullyDilluttedMarketCap"=>916584018999.69,
"marketCapByTotalSupply"=>827546305397.4712,
"dominance"=>42.1276,
"turnover"=>0.02493355,
"ytdPriceChangePercentage"=>-8.4718}],
"isAudited"=>false,
"rank"=>1,
"hasFilters"=>false,
"quote"=>
{"USD"=>
{"name"=>"USD",
"price"=>43646.858047604175,
"volume24h"=>20633664171.70021,
"marketCap"=>827546305397.4712,
"percentChange1h"=>-0.86544168,
"percentChange24h"=>-1.6482985,
"percentChange7d"=>-0.73945082,
"lastUpdated"=>"2022-02-16T14:26:00.000Z",
"percentChange30d"=>2.18336134,
"percentChange60d"=>-6.84146969,
"percentChange90d"=>-26.08073361,
"fullyDilluttedMarketCap"=>916584018999.69,
"marketCapByTotalSupply"=>827546305397.4712,
"dominance"=>42.1276,
"turnover"=>0.02493355,
"ytdPriceChangePercentage"=>-8.4718}}
}
- The value of the first "crypto":
      cryptos.first["quote"]["USD"]["price"]
- The key you used in your Hash, for the first "crypto":
      cryptos.first["symbol"]
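The lookups above can be tried offline before pointing them at the live page. What follows is only a minimal sketch: the JSON string is a hand-made stand-in (the "ExampleCoin" entry and its price are invented for illustration), and `extract_cryptos` is a hypothetical helper name, not something from the original code. It uses `Hash#dig` so that a missing key along the path gives an empty Array instead of a `NoMethodError` on `nil`.

```ruby
require 'json'

# Hand-made stand-in for the page's embedded JSON. Only the key path
# mirrors the walkthrough; "ExampleCoin" is an invented entry.
sample_json = <<~JSON
  {"props": {"initialState": {"cryptocurrency": {"listingLatest": {"data": [
    {"name": "Bitcoin", "symbol": "BTC", "quote": {"USD": {"price": 43646.858047604175}}},
    {"name": "ExampleCoin", "symbol": "EXC", "quote": {"USD": {"price": 1.5}}}
  ]}}}}}
JSON

# Hypothetical helper: returns the Array of crypto Hashes, or [] when any
# key on the path is missing.
def extract_cryptos(json_string,
                    path = %w[props initialState cryptocurrency listingLatest data])
  JSON.parse(json_string).dig(*path) || []
end

cryptos = extract_cryptos(sample_json)
puts cryptos.first["symbol"]
puts cryptos.first.dig("quote", "USD", "price")
puts extract_cryptos('{"props": {}}').inspect
```

If the site changes the layout of its JSON, the helper returns an empty Array rather than raising, which makes the failure easier to spot in a scraper that runs unattended.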
Put it all together and you get the following code (looping through each "crypto" with each_with_object):
    require 'json'
    require 'nokogiri'
    require 'open-uri'
    ...
    def crypto(page)
      # Parse the JSON embedded in the page's first application/json script tag.
      my_json = JSON.parse(page.css('script[type="application/json"]').first.inner_html)
      # The Array of crypto Hashes.
      cryptos = my_json["props"]["initialState"]["cryptocurrency"]["listingLatest"]["data"]
      # Build a Hash of name => USD price.
      cryptos.each_with_object({}) do |crypto, hsh|
        hsh[crypto["name"]] = crypto["quote"]["USD"]["price"]
      end
    end

    puts crypto(scrapper)
