Python: How to save the output of a loop to multiple callable variables

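I have the following Python code, in which the items are parsed from a string of concatenated XML data generated from two website requests/responses: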

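import xml.etree.ElementTree as ET

# `new` holds the full XML string shown further below
items = ET.fromstring(new)
for item in items:
    # URL of the request
    endpoint = item.find("url").text
    # Raw response text: status line, headers, blank line, body
    response = item.find("response").text
    responses = response.split("\n")
    # Keep only the lines before the first blank line (the header block)
    index = responses.index('')
    indexed = responses[:index]
    print(endpoint, *indexed, sep="\n")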

This prints:

https://www.youtube.com/sw.js_data
HTTP/2 200 OK
Content-Type: application/json; charset=utf-8
X-Content-Type-Options: nosniff
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Date: Mon, 14 Mar 2022 17:59:34 GMT
Content-Disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
Strict-Transport-Security: max-age=31536000
X-Frame-Options: SAMEORIGIN
Cross-Origin-Opener-Policy-Report-Only: same-origin; report-to="ATmXEA_XZXH6CdbrmjUzyTbVgxu22C8KYH7NsxKbRt94"
Permissions-Policy: ch-ua-arch=*, ch-ua-bitness=*, ch-ua-full-version=*, ch-ua-full-version-list=*, ch-ua-model=*, ch-ua-platform=*, ch-ua-platform-version=*
Accept-Ch: Sec-CH-UA-Arch, Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model, Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version
Server: ESF
X-Xss-Protection: 0
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
https://www.google.com/client_204?&atyp=i&biw=1440&bih=849&dpr=1.5&ei=Z4IvYpTtF5LU9AP1nIOICQ
HTTP/2 204 No Content
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=31536000
Content-Security-Policy: object-src 'none';base-uri 'self';script-src 'nonce-9KQUw4dRjvKnx/zTrOblTQ==' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/cdt1
Bfcache-Opt-In: unload
Date: Mon, 14 Mar 2022 17:59:10 GMT
Server: gws
Content-Length: 0
X-Xss-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2022-03-14-17; expires=Wed, 13-Apr-2022 17:59:10 GMT; path=/; domain=.google.com; Secure; SameSite=none
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

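Basically, I want to be able to evaluate the data generated by the code above separately for each response, so that I can check that certain header values are present in every website response. In this example, the code would first check the set of headers generated from the first website (youtube) and report that all headers look good. It would then check the set of headers from the second website (google) and report that the Strict-Transport-Security header is missing (for example). The goal is that, no matter how many responses are loaded into the initial string, the code can run validation against each of them and tell me whether any headers are missing.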

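Is there an easy way to do this? I assume that at some point each website's output (its list of headers) would be saved to a variable that can be referenced/called? Maybe that gets messy and isn't easy to do; I'm not sure! If there is a more efficient way to do what I'm after, I'm also happy to take suggestions for making this code cleaner.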

Thanks!

The full XML string is below:

<?xml version='1.0' encoding='utf8'?>
<items burpVersion="2022.2.3" exportTime="Mon Mar 14 14:28:18 EDT 2022">
  <item>
    <time>Mon Mar 14 13:59:37 EDT 2022</time>
    <url>https://www.youtube.com/sw.js_data</url>
    <host ip="142.250.190.142">www.youtube.com</host>
    <port>443</port>
    <protocol>https</protocol>
    <method>GET</method>
    <path>/sw.js_data</path>
    <extension>null</extension>
    <request base64="false">GET /sw.js_data HTTP/2
Host: www.youtube.com
Accept: */*
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: cors
Sec-Fetch-Dest: empty
Referer: https://www.youtube.com/sw.js
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9

</request>
    <status>200</status>
    <responselength>3524</responselength>
    <mimetype>JSON</mimetype>
    <response base64="false">HTTP/2 200 OK
Content-Type: application/json; charset=utf-8
X-Content-Type-Options: nosniff
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Date: Mon, 14 Mar 2022 17:59:34 GMT
Content-Disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
Strict-Transport-Security: max-age=31536000
X-Frame-Options: SAMEORIGIN
Cross-Origin-Opener-Policy-Report-Only: same-origin; report-to="ATmXEA_XZXH6CdbrmjUzyTbVgxu22C8KYH7NsxKbRt94"
Permissions-Policy: ch-ua-arch=*, ch-ua-bitness=*, ch-ua-full-version=*, ch-ua-full-version-list=*, ch-ua-model=*, ch-ua-platform=*, ch-ua-platform-version=*
Accept-Ch: Sec-CH-UA-Arch, Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model, Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version
Server: ESF
X-Xss-Protection: 0
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

)]}'

[["yt.sw.adr",null,[[["en","US","US","75.188.116.252",null,null,1,null,[],null,null,"","",null,null,"","QUFFLUhqbnREclEzblJmc25GVF9XSXQ1dFZQSm9sRGlmQXxBQ3Jtc0tuU3huS1RoOHQyaFlqN0dLdm4wcGMweXp0OURWQU5RbEJKRko1TlhGYjBoZ3N1Nnpla3QxUFRkN19uaWxoQVZTV0FRUGh0cUw2ckRWbmh5bGhxYkRjNFc2cUREbjB4MnFxMEpval9HUXNZeWU5d1Ztaw\u003d\u003d","CgtaVS1FWnl4ZTJEZyiGhb6RBg=="],"Vf114d778||"]]</response>
    <comment />
  </item>
  <item>
    <time>Mon Mar 14 13:59:14 EDT 2022</time>
    <url>https://www.google.com/client_204?&amp;atyp=i&amp;biw=1440&amp;bih=849&amp;dpr=1.5&amp;ei=Z4IvYpTtF5LU9AP1nIOICQ</url>
    <host ip="172.217.4.36">www.google.com</host>
    <port>443</port>
    <protocol>https</protocol>
    <method>GET</method>
    <path>/client_204?&amp;atyp=i&amp;biw=1440&amp;bih=849&amp;dpr=1.5&amp;ei=Z4IvYpTtF5LU9AP1nIOICQ</path>
    <extension>null</extension>
    <request base64="false">GET /client_204?&amp;atyp=i&amp;biw=1440&amp;bih=849&amp;dpr=1.5&amp;ei=Z4IvYpTtF5LU9AP1nIOICQ HTTP/2
Host: www.google.com
Sec-Ch-Ua: "(Not(A:Brand";v="8", "Chromium";v="99"
Sec-Ch-Ua-Mobile: ?0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36
Sec-Ch-Ua-Arch: "x86"
Sec-Ch-Ua-Full-Version: "99.0.4844.51"
Sec-Ch-Ua-Platform-Version: "10.0.0"
Sec-Ch-Ua-Bitness: "64"
Sec-Ch-Ua-Model: 
Sec-Ch-Ua-Platform: "Windows"
Accept: image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8
X-Client-Data: CJDnygE=
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: no-cors
Sec-Fetch-Dest: image
Referer: https://www.google.com/
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9

</request>
    <status>204</status>
    <responselength>781</responselength>
    <mimetype />
    <response base64="false">HTTP/2 204 No Content
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=31536000
Content-Security-Policy: object-src 'none';base-uri 'self';script-src 'nonce-9KQUw4dRjvKnx/zTrOblTQ==' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/cdt1
Bfcache-Opt-In: unload
Date: Mon, 14 Mar 2022 17:59:10 GMT
Server: gws
Content-Length: 0
X-Xss-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2022-03-14-17; expires=Wed, 13-Apr-2022 17:59:10 GMT; path=/; domain=.google.com; Secure; SameSite=none
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

</response>
    <comment />
  </item>
</items>

Update: I have been fiddling with the code for the past few days but still no luck. Any and all ideas are welcome!

A uj5u.com user replied:

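Simply save the output to a single dictionary variable that holds multiple items. Since your text splitting requires several steps, consider using a defined method.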

import xml.etree.ElementTree as ET

# DEFINED METHOD TO KEEP THE RESPONSE LINES BEFORE THE FIRST BLANK LINE
def split_text(resp):
    responses = resp.split("\n")
    index = responses.index('')
    indexed = responses[:index]

    return indexed

# PARSE XML STRING
doc = ET.fromstring(new)

# RETRIEVE ITEM NODES WITH DICTIONARY COMPREHENSION
website_items = {
    item.find("url").text: split_text(item.find("response").text)
    for item in doc.findall(".//item")
}

# REVIEW SAVED DATA WITH URLS AS KEYS
# (KEYS HOLD THE PARSED URL TEXT, SO "&amp;" IN THE XML BECOMES "&")
website_items["https://www.youtube.com/sw.js_data"]
website_items["https://www.google.com/client_204?&atyp=i&biw=1440&bih=849&dpr=1.5&ei=Z4IvYpTtF5LU9AP1nIOICQ"]
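
Once the header lines are grouped per URL, the validation the question asks for could look roughly like the sketch below. It is only an illustration under stated assumptions: the names REQUIRED_HEADERS, parse_headers, missing_headers and check_all are hypothetical, the list of required headers is a placeholder to adjust, and header names are compared case-insensitively because HTTP treats them that way.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of headers every response is expected to carry; adjust as needed.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "X-Frame-Options",
    "Content-Security-Policy",
)


def parse_headers(lines):
    """Turn ['HTTP/2 200 OK', 'Name: value', ...] into {lowercased name: value}."""
    headers = {}
    for line in lines[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep:  # ignore any line that has no colon
            headers[name.strip().lower()] = value.strip()
    return headers


def missing_headers(lines, required=REQUIRED_HEADERS):
    """Return the required header names that are absent from one response."""
    present = parse_headers(lines)
    return [name for name in required if name.lower() not in present]


def check_all(xml_text, required=REQUIRED_HEADERS):
    """Map each URL in the exported XML to its list of missing headers."""
    report = {}
    for item in ET.fromstring(xml_text).findall(".//item"):
        url = item.findtext("url")
        # splitlines() copes with both "\n" and "\r\n" line endings.
        raw = (item.findtext("response") or "").splitlines()
        header_lines = []
        for line in raw:
            if not line.strip():  # the first blank line ends the header block
                break
            header_lines.append(line)
        report[url] = missing_headers(header_lines, required)
    return report
```

Calling check_all(new) would return a dictionary keyed by URL, where an empty list means every required header was found. The same check also works on the dictionary built above, for example missing_headers(website_items[url]). Note that a dictionary keyed by URL keeps only the last response when the same URL appears more than once in the export.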


Tags: Python, xml, for-loop, parsing
