主頁 > 作業系統 > scrapy停止抓取已解決的元素

scrapy停止抓取已解決的元素

2021-12-09 23:59:22 作業系統

這是我的蜘蛛代碼和我得到的日志。問題是蜘蛛似乎停止抓取從第 10 頁中間某處尋址的專案(而有 352 頁要抓取)。當我檢查其余元素的 XPath 運算式時,我發現它們在我的瀏覽器中是相同的。

這是我的蜘蛛:

# -*- coding: utf-8 -*-
import scrapy
import logging
import urllib.parse
parts = urllib.parse.urlsplit(u'http://fa.wikipedia.org/wiki/?????_????')
parts = parts._replace(path=urllib.parse.quote(parts.path.encode('utf8')))
encoded_url = parts.geturl().encode('ascii')
'https://fa.wikipedia.org/wiki/صفحهٔ_اصلی'



class CriptolernSpider(scrapy.Spider):
    name = 'criptolern'
    allowed_domains = ['arzdigital.com']

    def start_requests(self):
            yield scrapy.Request(url='https://arzdigital.com',
                callback= self.parse,dont_filter = True)
    

    def parse(self, response):
            posts=response.xpath("//a[@class='arz-last-post arz-row']")
            
            try:

                for post in posts:
                    post_title=post.xpath(".//@title").get()
                    post_link=post.xpath(".//@href").get()
                    post_date=post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-last-post__publish-time']/time/@datetime").get()

                    if post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-likes']/span[2]/text()"):
                        likes=int(post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-likes']/span[2]/text()").get())
                    else:
                        likes=0
                    if post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-comment']/span[2]/text()"):
                        commnents=int(post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-comment']/span[2]/text()").get())
                    else:
                        commnents=0
                
                    yield{
                        'post_title':post_title,
                        'post_link':post_link,
                        'post_date':post_date,
                        'likes':likes,
                        'commnents':commnents
                    }

                next_page=response.xpath("//div[@class='arz-last-posts__get-more']/a[@class='arz-btn arz-btn-info arz-round arz-link-nofollow']/@href").get()
                if next_page:
                    yield scrapy.Request(url=next_page, callback=self.parse,dont_filter = True)

                else:

                    next_pages= response.xpath("//div[@class='arz-pagination']/ul/li[@class='arz-pagination__item arz-pagination__next']/a[@class='arz-pagination__link']/@href").get()
                    if next_pages:
                        yield scrapy.Request(url=next_pages, callback=self.parse, dont_filter = True)

            except AttributeError:
                logging.error("The element didn't exist")

這是日志,當蜘蛛停止時:

2021-12-04 11:06:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://arzdigital.com/latest-posts/page/10/>
{'post_title': '???????? ?????: ?????? ??????? ?? ??? ??? ??????? ?????', 'post_link': 'https://arzdigital.com/russias-putin-says-crypto-has-value-but-maybe-not-for-trading-oil-html/', 'post_date': '2021-10-16', 'likes': 17, 'commnents': 1}
2021-12-04 11:06:51 [scrapy.core.scraper] ERROR: Spider error processing <GET https://arzdigital.com/latest-posts/page/10/> (referer: https://arzdigital.com/latest-posts/page/9/)
Traceback (most recent call last):
  File "C:\Users\shima\anaconda3\envs\virtual_workspace\lib\site-packages\scrapy\utils\defer.py", line 102, in iter_errback
    yield next(it)
  File "C:\Users\shima\anaconda3\envs\virtual_workspace\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 29, in process_spider_output
    for x in result:
  File "C:\Users\shima\anaconda3\envs\virtual_workspace\lib\site-packages\scrapy\spidermiddlewares\referer.py", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "C:\Users\shima\anaconda3\envs\virtual_workspace\lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "C:\Users\shima\anaconda3\envs\virtual_workspace\lib\site-packages\scrapy\spidermiddlewares\depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "C:\Users\shima\projects\arzdigital\arzdigital\spiders\criptolern.py", line 32, in parse
    likes=int(post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-likes']/span[2]/text()").get())
ValueError: invalid literal for int() with base 10: '?,???'
2021-12-04 11:06:51 [scrapy.core.engine] INFO: Closing spider (finished)
2021-12-04 11:06:51 [scrapy.extensions.feedexport] INFO: Stored csv feed (242 items) in: dataset.csv
2021-12-04 11:06:51 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 4112,
 'downloader/request_count': 12,
 'downloader/request_method_count/GET': 12,
 'downloader/response_bytes': 292561,
 'downloader/response_count': 12,
 'downloader/response_status_count/200': 12,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 12, 4, 7, 36, 51, 830291),
 'item_scraped_count': 242,
 'log_count/DEBUG': 254,
 'log_count/ERROR': 1,
 'log_count/INFO': 10,
 'request_depth_max': 10,
 'response_received_count': 12,
 'robotstxt/request_count': 1,
 'robotstxt/response_count': 1,
 'robotstxt/response_status_count/200': 1,
 'scheduler/dequeued': 11,
 'scheduler/dequeued/memory': 11,
 'scheduler/enqueued': 11,
 'scheduler/enqueued/memory': 11,
 'spider_exceptions/ValueError': 1,
 'start_time': datetime.datetime(2021, 12, 4, 7, 36, 47, 423017)}
2021-12-04 11:06:51 [scrapy.core.engine] INFO: Spider closed (finished)

我找不到問題,如果它與錯誤的 XPath 運算式有關。謝謝你的幫助!!

編輯:所以我想最好在這里看到兩個檔案。第一個是settings.py:

BOT_NAME = 'arzdigital'

SPIDER_MODULES = ['arzdigital.spiders']
NEWSPIDER_MODULE = 'arzdigital.spiders'


# Crawl responsibly by identifying yourself (and your website) on the user-agent
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'

# Obey robots.txt rules
ROBOTSTXT_OBEY = False
# See also autothrottle settings and docs
DOWNLOAD_DELAY = 10
# Enable or disable downloader middlewares
# See https://doc.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
    'arzdigital.middlewares.ArzdigitalDownloaderMiddleware': None,
    'arzdigital.middlewares.UserAgentRotatorMiddleware':400
}
# Enable and configure the AutoThrottle extension (disabled by default)
# See https://doc.scrapy.org/en/latest/topics/autothrottle.html
AUTOTHROTTLE_ENABLED = True
# The initial download delay
AUTOTHROTTLE_START_DELAY = 60
# The maximum download delay to be set in case of high latencies
AUTOTHROTTLE_MAX_DELAY = 120
# The average number of requests Scrapy should be sending in parallel to
# each remote server
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
#AUTOTHROTTLE_DEBUG = False

# Enable and configure HTTP caching (disabled by default)
# See https://doc.scrapy.org/en/latest/topics/downloader-middleware.html#httpcache-middleware-settings
HTTPCACHE_ENABLED = False
FEED_EXPORT_ENCODING='utf-8'

第二個檔案是 middlewares.py:

from scrapy import signals
from scrapy.downloadermiddlewares.useragent import UserAgentMiddleware
import random, logging


class UserAgentRotatorMiddleware(UserAgentMiddleware):
    user_agent_list=[
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
        'Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/2010010 1 Firefox/7.0.1',
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWeb Kit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.79 Safari/537.36 Edge/14.14393'
    ]
    def __init__(self, user_agent=''):
        self.user_agent= user_agent
    def process_request(self, request, spider):
        try:
            self.user_agent= random.choice(self.user_agent_list)
            request.headers.setdefault('User-Agent', self.user_agent)
        except IndexError:
            logging.error("Couldn't fetch the user agent")

uj5u.com熱心網友回復:

您的代碼按您的預期作業正常,問題出在分頁部分,我在 start_urls 中進行了分頁,這種分頁型別總是準確的,并且比下一頁快兩倍多。

代碼

import scrapy
import logging
#base url=https://arzdigital.com/latest-posts/
#start_url =https://arzdigital.com/latest-posts/page/2/

class CriptolernSpider(scrapy.Spider):
    name = 'criptolern'
    allowed_domains = ['arzdigital.com']
    start_urls=[f'https://arzdigital.com/latest-posts/page/{i}/'.format(i) for i in range(1,353)]

    def parse(self, response):
        posts = response.xpath("//a[@class='arz-last-post arz-row']")

        try:

            for post in posts:
                post_title = post.xpath(".//@title").get()
                post_link = post.xpath(".//@href").get()
                post_date = post.xpath(
                    ".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-last-post__publish-time']/time/@datetime").get()

                if post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-likes']/span[2]/text()"):
                    likes = int(post.xpath(
                        ".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-likes']/span[2]/text()").get())
                else:
                    likes = 0
                if post.xpath(".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-comment']/span[2]/text()"):
                    commnents = int(post.xpath(
                        ".//div[@class='arz-col-12 arz-col-md arz-last-post__link-box']/div/div[@class='arz-last-post__info']/div[@class='arz-post__info-comment']/span[2]/text()").get())
                else:
                    commnents = 0

                yield{
                    'post_title': post_title,
                    'post_link': post_link,
                    'post_date': post_date,
                    'likes': likes,
                    'commnents': commnents
                }

            
        except AttributeError:
            logging.error("The element didn't exist")

輸出:

2021-12-04 17:25:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://arzdigital.com/latest-posts/page/352/>
{'post_title': '????? ???? ???? ???? ????? ???? ?????? ???? ?? ????? ?????? ????? ?? ??? ???????', 'post_link': 'https://arzdigital.com/blockchain-investment/', 'post_date': '2017-07-27', 'likes': 4, 'commnents': 0}
2021-12-04 17:25:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://arzdigital.com/latest-posts/page/352/>
{'post_title': '???? ?????? ????? ?? ???? ICO', 'post_link': 'https://arzdigital.com/ico-risk/', 'post_date': '2017-07-27', 'likes': 9, 'commnents': 0}
2021-12-04 17:25:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://arzdigital.com/latest-posts/page/352/>
{'post_title': '\xa0??.??.?? ?????', 'post_link': 'https://arzdigital.com/what-is-ico/', 'post_date': '2017-07-27', 'likes': 7, 'commnents': 7}
2021-12-04 17:25:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://arzdigital.com/latest-posts/page/352/>
{'post_title': '???\xa0?????? ??? ???? ? ??????? ????? ??? ???? ?? ???? ??????? ???? ???\u200c?? ????', 'post_link': 'https://arzdigital.com/bitcoin-currency/', 'post_date': '2017-07-27', 'likes': 6, 'commnents': 0}
2021-12-04 17:25:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://arzdigital.com/latest-posts/page/352/>
{'post_title': '?????? ?????? Ethereum Classic ???? ?', 'post_link': 'https://arzdigital.com/what-is-ethereum-classic/', 'post_date': '2017-07-24', 'likes': 10, 'commnents': 2}
2021-12-04 17:25:19 [scrapy.core.engine] INFO: Closing spider (finished)
2021-12-04 17:25:19 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 111431,
 'downloader/request_count': 353,
 'downloader/request_method_count/GET': 353,
 'downloader/response_bytes': 8814416,
 'downloader/response_count': 353,
 'downloader/response_status_count/200': 352,
 'downloader/response_status_count/301': 1,
 'elapsed_time_seconds': 46.29503,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 12, 4, 11, 25, 19, 124154),
 'httpcompression/response_bytes': 55545528,
 'httpcompression/response_count': 352,
 'item_scraped_count': 7920

.. 很快

設定.py檔案

請確保settings.py檔案,您只需要更改取消注釋部分就可以了

# Crawl responsibly by identifying yourself (and your website) on the user-agent
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'

# Obey robots.txt rules
ROBOTSTXT_OBEY = False

# Configure maximum concurrent requests performed by Scrapy (default: 16)
#CONCURRENT_REQUESTS = 32

# Configure a delay for requests for the same website (default: 0)
# See https://docs.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
#DOWNLOAD_DELAY = 10
# The download delay setting will honor only one of:
#CONCURRENT_REQUESTS_PER_DOMAIN = 16
#CONCURRENT_REQUESTS_PER_IP = 16

# Disable cookies (enabled by default)
#COOKIES_ENABLED = False

# Disable Telnet Console (enabled by default)
#TELNETCONSOLE_ENABLED = False

# Override the default request headers:
#DEFAULT_REQUEST_HEADERS = {
#   'Accept': 'text/html,application/xhtml xml,application/xml;q=0.9,*/*;q=0.8',
#   'Accept-Language': 'en',
#}

# Enable or disable spider middlewares
# See https://docs.scrapy.org/en/latest/topics/spider-middleware.html
#SPIDER_MIDDLEWARES = {
#    'gs_spider.middlewares.GsSpiderSpiderMiddleware': 543,
#}

# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
#DOWNLOADER_MIDDLEWARES = {
#    'gs_spider.middlewares.GsSpiderDownloaderMiddleware': 543,
#}

# Enable or disable extensions
# See https://docs.scrapy.org/en/latest/topics/extensions.html
#EXTENSIONS = {
#    'scrapy.extensions.telnet.TelnetConsole': None,
#}

# Configure item pipelines
# See https://docs.scrapy.org/en/latest/topics/item-pipeline.html
#ITEM_PIPELINES = {
#    'gs_spider.pipelines.GsSpiderPipeline': 300,
#}

# Enable and configure the AutoThrottle extension (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/autothrottle.html
#AUTOTHROTTLE_ENABLED = True
# The initial download delay
#AUTOTHROTTLE_START_DELAY = 5
# The maximum download delay to be set in case of high latencies
#AUTOTHROTTLE_MAX_DELAY = 60
# The average number of requests Scrapy should be sending in parallel to
# each remote server
#AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
#AUTOTHROTTLE_DEBUG = False

轉載請註明出處,本文鏈接:https://www.uj5u.com/caozuo/377793.html

標籤:网页抓取 刮的 网络爬虫

上一篇:使用beautifulsoup抓取代碼的特定部分

下一篇:學習網頁抓取..需要對使用xpath插件的xpath="/html/body/div[3]/div[3]/div[4]/div/table[5]有所了解

標籤雲
其他(157675) Python(38076) JavaScript(25376) Java(17977) C(15215) 區塊鏈(8255) C#(7972) AI(7469) 爪哇(7425) MySQL(7132) html(6777) 基礎類(6313) sql(6102) 熊猫(6058) PHP(5869) 数组(5741) R(5409) Linux(5327) 反应(5209) 腳本語言(PerlPython)(5129) 非技術區(4971) Android(4554) 数据框(4311) css(4259) 节点.js(4032) C語言(3288) json(3245) 列表(3129) 扑(3119) C++語言(3117) 安卓(2998) 打字稿(2995) VBA(2789) Java相關(2746) 疑難問題(2699) 细绳(2522) 單片機工控(2479) iOS(2429) ASP.NET(2402) MongoDB(2323) 麻木的(2285) 正则表达式(2254) 字典(2211) 循环(2198) 迅速(2185) 擅长(2169) 镖(2155) 功能(1967) .NET技术(1958) Web開發(1951) python-3.x(1918) HtmlCss(1915) 弹簧靴(1913) C++(1909) xml(1889) PostgreSQL(1872) .NETCore(1853) 谷歌表格(1846) Unity3D(1843) for循环(1842)

熱門瀏覽
  • CA和證書

    1、在 CentOS7 中使用 gpg 創建 RSA 非對稱密鑰對 gpg --gen-key #Centos上生成公鑰/密鑰對(存放在家目錄.gnupg/) 2、將 CentOS7 匯出的公鑰,拷貝到 CentOS8 中,在 CentOS8 中使用 CentOS7 的公鑰加密一個檔案 gpg -a ......

    uj5u.com 2020-09-10 00:09:53 more
  • Kubernetes K8S之資源控制器Job和CronJob詳解

    Kubernetes的資源控制器Job和CronJob詳解與示例 ......

    uj5u.com 2020-09-10 00:10:45 more
  • VMware下安裝CentOS

    VMware下安裝CentOS 一、軟硬體準備 1 Centos鏡像準備 1.1 CentOS鏡像下載地址 下載地址 1.2 CentOS鏡像下載程序 點擊下載地址進入如下圖的網站,選擇需要下載的版本,這里選擇的是Centos8,點擊如圖所示。 決定選擇Centos8后,選擇想要的鏡像源進行下載,此 ......

    uj5u.com 2020-09-10 00:12:10 more
  • 如何使用Grep命令查找多個字串

    如何使用Grep 命令查找多個字串 大家好,我是良許! 今天向大家介紹一個非常有用的技巧,那就是使用 grep 命令查找多個字串。 簡單介紹一下,grep 命令可以理解為是一個功能強大的命令列工具,可以用它在一個或多個輸入檔案中搜索與正則運算式相匹配的文本,然后再將每個匹配的文本用標準輸出的格式 ......

    uj5u.com 2020-09-10 00:12:28 more
  • git配置http代理

    git配置http代理 經常遇到克隆 github 慢的問題,這里記錄一下幾種配置 git 代理的方法,解決 clone github 過慢。 目錄 git配置代理 git單獨配置github代理 git配置全域代理 配置終端環境變數 git配置代理 主要使用 git config 命令 git單獨 ......

    uj5u.com 2020-09-10 00:12:33 more
  • Linux npm install 裝包時提示Error EACCES permission denied解

    npm install 裝包時提示Error EACCES permission denied解決辦法 ......

    uj5u.com 2020-09-10 00:12:53 more
  • Centos 7下安裝nginx,使用yum install nginx,提示沒有可用的軟體包

    Centos 7下安裝nginx,使用yum install nginx,提示沒有可用的軟體包。 18 (flaskApi) [root@67 flaskDemo]# yum -y install nginx 19 已加載插件:fastestmirror, langpacks 20 Loading ......

    uj5u.com 2020-09-10 00:13:13 more
  • Linux查看服務器暴力破解ssh IP

    在公網的服務器上經常遇到別人爆破你服務器的22埠,用來挖礦或者干其他嘿嘿嘿的事情~ 這種情況下正確的做法是: 修改默認ssh的22埠 使用設定密鑰登錄或者白名單ip登錄 建議服務器密碼為復雜密碼 創建普通用戶登錄服務器(root權限過大) 建立堡壘機,實作統一管理服務器 統計爆破IP [root ......

    uj5u.com 2020-09-10 00:13:17 more
  • CentOS 7系統常見快捷鍵操作方式

    Linux系統中一些常見的快捷方式,可有效提高操作效率,在某些時刻也能避免操作失誤帶來的問題。 ......

    uj5u.com 2020-09-10 00:13:31 more
  • CentOS 7作業系統目錄結構介紹

    作業系統存在著大量的資料檔案資訊,相應檔案資訊會存在于系統相應目錄中,為了更好的管理資料資訊,會將系統進行一些目錄規劃,不同目錄存放不同的資源。 ......

    uj5u.com 2020-09-10 00:13:35 more
最新发布
  • vim的常用命令

    Vim的6種基本模式 1. 普通模式在普通模式中,用的編輯器命令,比如移動游標,洗掉文本等等。這也是Vim啟動后的默認模式。這正好和許多新用戶期待的操作方式相反(大多數編輯器默認模式為插入模式)。 2. 插入模式在這個模式中,大多數按鍵都會向文本緩沖中插入文本。大多數新用戶希望文本編輯器編輯程序中一 ......

    uj5u.com 2023-04-20 08:43:21 more
  • vim的常用命令

    Vim的6種基本模式 1. 普通模式在普通模式中,用的編輯器命令,比如移動游標,洗掉文本等等。這也是Vim啟動后的默認模式。這正好和許多新用戶期待的操作方式相反(大多數編輯器默認模式為插入模式)。 2. 插入模式在這個模式中,大多數按鍵都會向文本緩沖中插入文本。大多數新用戶希望文本編輯器編輯程序中一 ......

    uj5u.com 2023-04-20 08:42:36 more
  • docker學習

    ###Docker概述 真實專案部署環境可能非常復雜,傳統發布專案一個只需要一個jar包,運行環境需要單獨部署。而通過Docker可將jar包和相關環境(如jdk,redis,Hadoop...)等打包到docker鏡像里,將鏡像發布到Docker倉庫,部署時下載發布的鏡像,直接運行發布的鏡像即可。 ......

    uj5u.com 2023-04-19 09:26:53 more
  • 設定Windows主機的瀏覽器為wls2的默認瀏覽器

    這里以Chrome為例。 1. 準備作業 wsl是可以使用Windows主機上安裝的exe程式,出于安全考慮,默認情況下改功能是無法使用。要使用的話,終端需要以管理員權限啟動。 我這里以Windows Terminal為例,介紹如何默認使用管理員權限打開終端,具體操作如下圖所示: 2. 操作 wsl ......

    uj5u.com 2023-04-19 09:25:49 more
  • docker學習

    ###Docker概述 真實專案部署環境可能非常復雜,傳統發布專案一個只需要一個jar包,運行環境需要單獨部署。而通過Docker可將jar包和相關環境(如jdk,redis,Hadoop...)等打包到docker鏡像里,將鏡像發布到Docker倉庫,部署時下載發布的鏡像,直接運行發布的鏡像即可。 ......

    uj5u.com 2023-04-19 09:19:04 more
  • Linux學習筆記

    IP地址和主機名 IP地址 ifconfig可以用來查詢本機的IP地址,如果不能使用,可以通過install net-tools安裝。 Centos系統下ens33表示主網卡;inet后表示IP地址;lo表示本地回環網卡; 127.0.0.1表示代指本機;0.0.0.0可以用于代指本機,同時在放行設 ......

    uj5u.com 2023-04-18 06:52:01 more
  • 解決linux系統的kdump服務無法啟動的問題

    問題:專案麒麟系統服務器的kdump服務無法啟動,沒有相關日志無法定位問題。 1、查看服務狀態是關閉的,重啟系統也無法啟動 systemctl status kdump 2、修改grub引數,修改“crashkernel”為“512M(有的機器數值太大太小都會導致報錯,建議從128M開始試,或者加個 ......

    uj5u.com 2023-04-12 09:59:50 more
  • 解決linux系統的kdump服務無法啟動的問題

    問題:專案麒麟系統服務器的kdump服務無法啟動,沒有相關日志無法定位問題。 1、查看服務狀態是關閉的,重啟系統也無法啟動 systemctl status kdump 2、修改grub引數,修改“crashkernel”為“512M(有的機器數值太大太小都會導致報錯,建議從128M開始試,或者加個 ......

    uj5u.com 2023-04-12 09:59:01 more
  • 你是不是暴露了?

    作者:袁首京 原創文章,轉載時請保留此宣告,并給出原文連接。 如果您是計算機相關從業人員,那么應該經歷不止一次網路安全專項檢查了,你肯定是收到過資訊系統技術檢測報告,要求你加強風險監測,確保你提供的系統服務堅實可靠了。 沒檢測到問題還好,檢測到問題的話,有些處理起來還是挺麻煩的,尤其是線上正在運行的 ......

    uj5u.com 2023-04-05 16:52:56 more
  • 細節拉滿,80 張圖帶你一步一步推演 slab 記憶體池的設計與實作

    1. 前文回顧 在之前的幾篇記憶體管理系列文章中,筆者帶大家從宏觀角度完整地梳理了一遍 Linux 記憶體分配的整個鏈路,本文的主題依然是記憶體分配,這一次我們會從微觀的角度來探秘一下 Linux 內核中用于零散小記憶體塊分配的記憶體池 —— slab 分配器。 在本小節中,筆者還是按照以往的風格先帶大家簡單 ......

    uj5u.com 2023-04-05 16:44:11 more