Beautiful Soup: finding by class doesn't work, it only returns the "Table of contents" heading?

2022-11-05 02:13:45 Enterprise Development

The idea is to scrape this particular page (its URL is in the code below).

But with the following code:

import requests
from bs4 import BeautifulSoup

url = "https://www.perlego.com/book/921329/getting-started-with-python-understand-key-data-structures-and-use-python-in-objectoriented-programming-pdf?queryID=9315f2c9285af80efdc99eaa9c5621bc&index=prod_BOOKS&gridPosition=2"

r = requests.get(url)

print(r.status_code)

soup = BeautifulSoup(r.content, 'html.parser')

# The trailing number increments for the next element (sc-b81fc1ca-1, and so on)
print(soup.find_all(attrs={'class': 'sc-b81fc1ca-0'}))

What this prints is:

[<div class="sc-b81fc1ca-0 eqkOXa" data-testid="table-of-contents"><h2 class="sc-b81fc1ca-1 OnMGm">Table of contents</h2></div>]

What I want is every tag under the class sc-b81fc1ca-2, but when I search for it with find_all, I only get an empty list back.

A uj5u.com user replied:

The content you are looking for is only loaded after some JavaScript runs on the page. This tutorial should help you run that JavaScript before scraping:

https://pythonprogramming.net/javascript-dynamic-scraping-parsing-beautiful-soup-tutorial/
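
One way to see this for yourself is to check which classes actually exist in the HTML that requests receives. The sketch below is only an illustration: it parses the markup quoted in the question instead of fetching the live page, and the helper name classes_present is made up for this example.

```python
from bs4 import BeautifulSoup

# The markup the question reports getting back from requests (no JavaScript has run).
STATIC_HTML = (
    '<div class="sc-b81fc1ca-0 eqkOXa" data-testid="table-of-contents">'
    '<h2 class="sc-b81fc1ca-1 OnMGm">Table of contents</h2></div>'
)


def classes_present(html, class_names):
    """Map each class name to True if at least one tag in `html` carries it."""
    soup = BeautifulSoup(html, "html.parser")
    return {name: bool(soup.find_all(class_=name)) for name in class_names}


result = classes_present(
    STATIC_HTML, ["sc-b81fc1ca-0", "sc-b81fc1ca-1", "sc-b81fc1ca-2"]
)
print(result)
```

A class that maps to False here is simply absent from the static HTML, so no Beautiful Soup selector can find it until the page's JavaScript has run or the data is fetched some other way.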

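A uj5u.com user replied:

The "Table of contents" tab is dynamic and clickable, and the content you need becomes available once it is clicked. You can perform that click with an automation tool called Selenium.

Example: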

import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

webdriver_service = Service("./chromedriver")  # Your chromedriver path
driver = webdriver.Chrome(service=webdriver_service)

driver.get('https://www.perlego.com/book/921329/getting-started-with-python-understand-key-data-structures-and-use-python-in-objectoriented-programming-pdf?queryID=9315f2c9285af80efdc99eaa9c5621bc&index=prod_BOOKS&gridPosition=2')
driver.maximize_window()
time.sleep(6)  # give the page time to load

# Click the first element whose text contains "Table of contents"
driver.find_element(By.XPATH, "(//*[contains(text(),'Table of contents')])[1]").click()
time.sleep(1)  # give the TOC time to render

# Parse the rendered page, which now includes the TOC entries
soup = BeautifulSoup(driver.page_source, "html.parser")

txt = soup.select_one('div.sc-b81fc1ca-2.kydYov:-soup-contains("Contributors")').text
print(txt)

Output:

Contributors

A uj5u.com user replied:

The TOC data is loaded via JavaScript from an external URL. You can use the requests and json modules to load it directly:

import json
import requests

# the number at the end of the URL is the ID of the book:
ajax_url = "https://api.perlego.com/metadata/v2/metadata/books/toc/921329"
data = requests.get(ajax_url).json()

# uncomment to print all data:
# print(json.dumps(data, indent=4))

def print_title(t, tabs=0):
    # print this title, indented one tab per nesting level
    print("\t" * tabs, t["element_title"])
    for s in t.get("subchapters") or []:
        print_title(s, tabs + 1)


for t in data["data"]["book_toc"]:
    print_title(t)

Prints:

 
 Title Page
 Copyright and Credits
     Getting Started with Python
 About Packt
     Why subscribe?
     Packt.com
 Contributors
     About the authors
     Packt is searching for authors like you
         
 Preface
     Who this book is for
     What this book covers
     To get the most out of this book
         Download the example code files
         Conventions used
     Get in touch
         Reviews
 A Gentle Introduction to Python
     A proper introduction
     Enter the Python
     About Python
         Portability
         Coherence
         Developer productivity
         An extensive library
         Software quality
         Software integration
         Satisfaction and enjoyment
     What are the drawbacks?
     Who is using Python today?
     Setting up the environment
         Python 2 versus Python 3
     Installing Python
         Setting up the Python interpreter
         About virtualenv
         Your first virtual environment
         Your friend, the console
     How you can run a Python program
         Running Python scripts
         Running the Python interactive shell
         Running Python as a service
         Running Python as a GUI application
     How is Python code organized?
         How do we use modules and packages?
     Python's execution model
         Names and namespaces
         Scopes
         Objects and classes
     Guidelines on how to write good code
     The Python culture
     A note on IDEs
     Summary
 Built-in Data Types
     Everything is an object
     Mutable or immutable? That is the question
     Numbers
         Integers
         Booleans
         Real numbers
         Complex numbers
         Fractions and decimals
     Immutable sequences
         Strings and bytes
             Encoding and decoding strings
             Indexing and slicing strings
             String formatting
         Tuples
     Mutable sequences
         Lists
         Byte arrays
     Set types
     Mapping types – dictionaries
     The collections module
         namedtuple
         defaultdict
         ChainMap
     Enums
     Final considerations
         Small values caching
         How to choose data structures
         About indexing and slicing
         About the names
     Summary
 Iterating and Making Decisions
     Conditional programming
         A specialized else – elif
         The ternary operator
     Looping
         The for loop
             Iterating over a range
             Iterating over a sequence
         Iterators and iterables
         Iterating over multiple sequences
         The while loop
         The break and continue statements
         A special else clause
     Putting all this together
         A prime generator
         Applying discounts
     A quick peek at the itertools module
         Infinite iterators
         Iterators terminating on the shortest input sequence
         Combinatoric generators
     Summary
 Functions, the Building Blocks of Code
     Why use functions?
         Reducing code duplication
         Splitting a complex task
         Hiding implementation details
         Improving readability
         Improving traceability
     Scopes and name resolution
         The global and nonlocal statements
     Input parameters
         Argument passing
         Assignment to argument names doesn't affect the caller
         Changing a mutable affects the caller
         How to specify input parameters
             Positional arguments
             Keyword arguments and default values
             Variable positional arguments
             Variable keyword arguments
             Keyword-only arguments
             Combining input parameters
             Additional unpacking generalizations
             Avoid the trap! Mutable defaults
     Return values
         Returning multiple values
     A few useful tips
     Recursive functions
     Anonymous functions
     Function attributes
     Built-in functions
     One final example
     Documenting your code
     Importing objects
         Relative imports
     Summary
 Files and Data Persistence
     Working with files and directories
         Opening files
             Using a context manager to open a file
         Reading and writing to a file
             Reading and writing in binary mode
             Protecting against overriding an existing file
         Checking for file and directory existence
         Manipulating files and directories
             Manipulating pathnames
         Temporary files and directories
         Directory content
         File and directory compression
     Data interchange formats
         Working with JSON
             Custom encoding/decoding with JSON
     IO, streams, and requests
         Using an in-memory stream
         Making HTTP requests
     Persisting data on disk
         Serializing data with pickle
         Saving data with shelve
         Saving data to a database
     Summary
 Principles of Algorithm Design
     Algorithm design paradigms
     Recursion and backtracking
         Backtracking
         Divide and conquer - long multiplication
         Can we do better? A recursive approach
     Runtime analysis
         Asymptotic analysis
         Big O notation
             Composing complexity classes
             Omega notation (Ω)
             Theta notation (ϴ)
     Amortized analysis
     Summary
 Lists and Pointer Structures
     Arrays
     Pointer structures
     Nodes
     Finding endpoints
         Node
             Other node types
     Singly linked lists
         Singly linked list class
         Append operation
     A faster append operation
     Getting the size of the list
     Improving list traversal
     Deleting nodes
         List search
     Clearing a list
     Doubly linked lists
         A doubly linked list node
             Doubly linked list
         Append operation
         Delete operation
         List search
     Circular lists
         Appending elements
         Deleting an element
             Iterating through a circular list
     Summary
 Stacks and Queues
     Stacks
         Stack implementation
         Push operation
         Pop operation
             Peek
         Bracket-matching application
     Queues
         List-based queue
             Enqueue operation
             Dequeue operation
         Stack-based queue
             Enqueue operation
             Dequeue operation
         Node-based queue
             Queue class
             Enqueue operation
             Dequeue operation
         Application of queues
             Media player queue
     Summary
 Trees
     Terminology
     Tree nodes
     Binary trees
         Binary search trees
         Binary search tree implementation
         Binary search tree operations
             Finding the minimum and maximum nodes
         Inserting nodes
         Deleting nodes
         Searching the tree
         Tree traversal
             Depth-first traversal
                 In-order traversal and infix notation
                 Pre-order traversal and prefix notation
                 Post-order traversal and postfix notation.
             Breadth-first traversal
         Benefits of a binary search tree
         Expression trees
             Parsing a reverse Polish expression
         Balancing trees
         Heaps
     Summary
 Hashing and Symbol Tables
     Hashing
         Perfect hashing functions
     Hash table
         Putting elements
         Getting elements
         Testing the hash table
         Using [] with the hash table
         Non-string keys
         Growing a hash table
         Open addressing
             Chaining
         Symbol tables
     Summary
 Graphs and Other Algorithms
     Graphs
     Directed and undirected graphs
     Weighted graphs
     Graph representation
         Adjacency list
         Adjacency matrix
     Graph traversal
         Breadth-first search
         Depth-first search
     Other useful graph methods
     Priority queues and heaps
         Inserting
         Pop
         Testing the heap
     Selection algorithms
     Summary
 Searching
     Linear Search
         Unordered linear search
         Ordered linear search
     Binary search
     Interpolation search
         Choosing a search algorithm
     Summary
 Sorting
     Sorting algorithms
     Bubble sort
     Insertion sort
     Selection sort
     Quick sort
         List partitioning
             Pivot selection
         Implementation
         Heap sort
     Summary
 Selection Algorithms
     Selection by sorting
     Randomized selection
         Quick select
             Partition step
     Deterministic selection
         Pivot selection
         Median of medians
         Partitioning step
     Summary
 Object-Oriented Design
     Introducing object-oriented
     Objects and classes
     Specifying attributes and behaviors
         Data describes objects
         Behaviors are actions
     Hiding details and creating the public interface
     Composition
     Inheritance
         Inheritance provides abstraction
         Multiple inheritance
     Case study
     Exercises
     Summary
 Objects in Python
     Creating Python classes
         Adding attributes
         Making it do something
             Talking to yourself
             More arguments
         Initializing the object
         Explaining yourself
     Modules and packages
         Organizing modules
             Absolute imports
             Relative imports
     Organizing module content
     Who can access my data?
     Third-party libraries
     Case study
     Exercises
     Summary
 When Objects Are Alike
     Basic inheritance
         Extending built-ins
         Overriding and super
     Multiple inheritance
         The diamond problem
         Different sets of arguments
     Polymorphism
     Abstract base classes
         Using an abstract base class
         Creating an abstract base class
         Demystifying the magic
     Case study
     Exercises
     Summary
 Expecting the Unexpected
     Raising exceptions
         Raising an exception
         The effects of an exception
         Handling exceptions
         The exception hierarchy
         Defining our own exceptions
     Case study
     Exercises
     Summary
 When to Use Object-Oriented Programming
     Treat objects as objects
     Adding behaviors to class data with properties
         Properties in detail
         Decorators – another way to create properties
         Deciding when to use properties
     Manager objects
         Removing duplicate code
         In practice
     Case study
     Exercises
     Summary
 Python Object-Oriented Shortcuts
     Python built-in functions
         The len() function
         Reversed
         Enumerate
         File I/O
         Placing it in context
     An alternative to method overloading
         Default arguments
         Variable argument lists
         Unpacking arguments
     Functions are objects too
         Using functions as attributes
         Callable objects
     Case study
     Exercises
     Summary
 The Iterator Pattern
     Design patterns in brief
     Iterators
         The iterator protocol
     Comprehensions
         List comprehensions
         Set and dictionary comprehensions
         Generator expressions
     Generators
         Yield items from another iterable
     Coroutines
         Back to log parsing
         Closing coroutines and throwing exceptions
         The relationship between coroutines, generators, and functions
     Case study
     Exercises
     Summary
 Python Design Patterns I
     The decorator pattern
         A decorator example
         Decorators in Python
     The observer pattern
         An observer example
     The strategy pattern
         A strategy example
         Strategy in Python
     The state pattern
         A state example
         State versus strategy
         State transition as coroutines
     The singleton pattern
         Singleton implementation
         Module variables can mimic singletons
     The template pattern
         A template example
     Exercises
     Summary
 Python Design Patterns II
     The adapter pattern
     The facade pattern
     The flyweight pattern
     The command pattern
     The abstract factory pattern
     The composite pattern
     Exercises
     Summary
 Testing Object-Oriented Programs
     Why test?
         Test-driven development
     Unit testing
         Assertion methods
         Reducing boilerplate and cleaning up
         Organizing and running tests
         Ignoring broken tests
     Testing with pytest
         One way to do setup and cleanup
         A completely different way to set up variables
         Skipping tests with pytest
     Imitating expensive objects
     How much testing is enough?
     Case study
         Implementing it
     Exercises
     Summary
 Other Books You May Enjoy
     Leave a review - let other readers know what you think
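
If the titles are needed as data and not just as printed output, the same recursion can return a flat list. This is a minimal sketch: it relies only on the element_title and subchapters keys used in the answer, the function name flatten_toc is made up here, and sample_toc is a hand-written stand-in for data["data"]["book_toc"] built from the first few titles printed above.

```python
def flatten_toc(chapters, depth=0):
    """Return a list of (depth, title) pairs in reading order."""
    rows = []
    for chapter in chapters:
        rows.append((depth, chapter["element_title"]))
        # "subchapters" may be missing, None, or empty, hence the `or []`.
        rows.extend(flatten_toc(chapter.get("subchapters") or [], depth + 1))
    return rows


# Hypothetical sample in the shape the answer's code expects.
sample_toc = [
    {"element_title": "Title Page", "subchapters": None},
    {
        "element_title": "Copyright and Credits",
        "subchapters": [{"element_title": "Getting Started with Python"}],
    },
    {
        "element_title": "About Packt",
        "subchapters": [
            {"element_title": "Why subscribe?", "subchapters": []},
            {"element_title": "Packt.com"},
        ],
    },
]

rows = flatten_toc(sample_toc)
for depth, title in rows:
    print("    " * depth + title)
```

The (depth, title) pairs can then be filtered, written to CSV, or loaded into a DataFrame without re-walking the nested structure.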

Please credit the source when reprinting. Link to this article: https://www.uj5u.com/qiye/527590.html

Tags: Python, web-scraping, beautifulsoup, python-requests
