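I have the following Python code, where the variable new is a string of concatenated XML data generated from the requests/responses of two websites: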
import xml.etree.ElementTree as ET

# new holds the concatenated XML string shown further down
items = ET.fromstring(new)
for item in list(items):
    url = item.find("url")
    endpoint = url.text
    ##
    resp = item.find("response")
    response = resp.text
    responses = response.split("\n")
    index = responses.index('')  # the first blank line ends the headers
    indexed = responses[:index]
    print(endpoint, *indexed, sep = "\n")
This prints:
https://www.youtube.com/sw.js_data
HTTP/2 200 OK
Content-Type: application/json; charset=utf-8
X-Content-Type-Options: nosniff
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Date: Mon, 14 Mar 2022 17:59:34 GMT
Content-Disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
Strict-Transport-Security: max-age=31536000
X-Frame-Options: SAMEORIGIN
Cross-Origin-Opener-Policy-Report-Only: same-origin; report-to="ATmXEA_XZXH6CdbrmjUzyTbVgxu22C8KYH7NsxKbRt94"
Permissions-Policy: ch-ua-arch=*, ch-ua-bitness=*, ch-ua-full-version=*, ch-ua-full-version-list=*, ch-ua-model=*, ch-ua-platform=*, ch-ua-platform-version=*
Accept-Ch: Sec-CH-UA-Arch, Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model, Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version
Server: ESF
X-Xss-Protection: 0
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
https://www.google.com/client_204?&atyp=i&biw=1440&bih=849&dpr=1.5&ei=Z4IvYpTtF5LU9AP1nIOICQ
HTTP/2 204 No Content
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=31536000
Content-Security-Policy: object-src 'none';base-uri 'self';script-src 'nonce-9KQUw4dRjvKnx/zTrOblTQ==' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/cdt1
Bfcache-Opt-In: unload
Date: Mon, 14 Mar 2022 17:59:10 GMT
Server: gws
Content-Length: 0
X-Xss-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2022-03-14-17; expires=Wed, 13-Apr-2022 17:59:10 GMT; path=/; domain=.google.com; Secure; SameSite=none
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
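Basically, I want to be able to evaluate the data generated by the code above separately for each site, so that I can check that certain headers are present in every website's response. In this example, the code would first check the set of headers generated from the first website (youtube) and say that all headers look good. It would then check the set of headers generated from the second website (google) and say that the Strict-Transport-Security header is missing (for example). The goal is that, no matter how many responses are loaded in the initial string, the code can run the validation over all of those website responses and tell me whether any headers are missing.
Is there an easy way to do this? I assume that at some point each output (list of headers) for each website would be saved to a variable that can be referenced/called? Maybe this would get messy and is not easy to do, I am not sure! If there is a more efficient way to do what I am trying to do, I am also happy to take any suggestions for making this code cleaner.
Thanks!
The full XML string is below: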
<?xml version='1.0' encoding='utf8'?>
<items burpVersion="2022.2.3" exportTime="Mon Mar 14 14:28:18 EDT 2022">
<item>
<time>Mon Mar 14 13:59:37 EDT 2022</time>
<url>https://www.youtube.com/sw.js_data</url>
<host ip="142.250.190.142">www.youtube.com</host>
<port>443</port>
<protocol>https</protocol>
<method>GET</method>
<path>/sw.js_data</path>
<extension>null</extension>
<request base64="false">GET /sw.js_data HTTP/2
Host: www.youtube.com
Accept: */*
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: cors
Sec-Fetch-Dest: empty
Referer: https://www.youtube.com/sw.js
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
</request>
<status>200</status>
<responselength>3524</responselength>
<mimetype>JSON</mimetype>
<response base64="false">HTTP/2 200 OK
Content-Type: application/json; charset=utf-8
X-Content-Type-Options: nosniff
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Date: Mon, 14 Mar 2022 17:59:34 GMT
Content-Disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
Strict-Transport-Security: max-age=31536000
X-Frame-Options: SAMEORIGIN
Cross-Origin-Opener-Policy-Report-Only: same-origin; report-to="ATmXEA_XZXH6CdbrmjUzyTbVgxu22C8KYH7NsxKbRt94"
Permissions-Policy: ch-ua-arch=*, ch-ua-bitness=*, ch-ua-full-version=*, ch-ua-full-version-list=*, ch-ua-model=*, ch-ua-platform=*, ch-ua-platform-version=*
Accept-Ch: Sec-CH-UA-Arch, Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model, Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version
Server: ESF
X-Xss-Protection: 0
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
)]}'
[["yt.sw.adr",null,[[["en","US","US","75.188.116.252",null,null,1,null,[],null,null,"","",null,null,"","QUFFLUhqbnREclEzblJmc25GVF9XSXQ1dFZQSm9sRGlmQXxBQ3Jtc0tuU3huS1RoOHQyaFlqN0dLdm4wcGMweXp0OURWQU5RbEJKRko1TlhGYjBoZ3N1Nnpla3QxUFRkN19uaWxoQVZTV0FRUGh0cUw2ckRWbmh5bGhxYkRjNFc2cUREbjB4MnFxMEpval9HUXNZeWU5d1Ztaw\u003d\u003d","CgtaVS1FWnl4ZTJEZyiGhb6RBg=="],"Vf114d778||"]]</response>
<comment />
</item>
<item>
<time>Mon Mar 14 13:59:14 EDT 2022</time>
<url>https://www.google.com/client_204?&amp;atyp=i&amp;biw=1440&amp;bih=849&amp;dpr=1.5&amp;ei=Z4IvYpTtF5LU9AP1nIOICQ</url>
<host ip="172.217.4.36">www.google.com</host>
<port>443</port>
<protocol>https</protocol>
<method>GET</method>
<path>/client_204?&amp;atyp=i&amp;biw=1440&amp;bih=849&amp;dpr=1.5&amp;ei=Z4IvYpTtF5LU9AP1nIOICQ</path>
<extension>null</extension>
<request base64="false">GET /client_204?&amp;atyp=i&amp;biw=1440&amp;bih=849&amp;dpr=1.5&amp;ei=Z4IvYpTtF5LU9AP1nIOICQ HTTP/2
Host: www.google.com
Sec-Ch-Ua: "(Not(A:Brand";v="8", "Chromium";v="99"
Sec-Ch-Ua-Mobile: ?0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36
Sec-Ch-Ua-Arch: "x86"
Sec-Ch-Ua-Full-Version: "99.0.4844.51"
Sec-Ch-Ua-Platform-Version: "10.0.0"
Sec-Ch-Ua-Bitness: "64"
Sec-Ch-Ua-Model:
Sec-Ch-Ua-Platform: "Windows"
Accept: image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8
X-Client-Data: CJDnygE=
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: no-cors
Sec-Fetch-Dest: image
Referer: https://www.google.com/
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
</request>
<status>204</status>
<responselength>781</responselength>
<mimetype />
<response base64="false">HTTP/2 204 No Content
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=31536000
Content-Security-Policy: object-src 'none';base-uri 'self';script-src 'nonce-9KQUw4dRjvKnx/zTrOblTQ==' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/cdt1
Bfcache-Opt-In: unload
Date: Mon, 14 Mar 2022 17:59:10 GMT
Server: gws
Content-Length: 0
X-Xss-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2022-03-14-17; expires=Wed, 13-Apr-2022 17:59:10 GMT; path=/; domain=.google.com; Secure; SameSite=none
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
</response>
<comment />
</item>
</items>
Update: I have been messing with the code for the past few days, but still no luck. Any and all ideas are welcome!
Answer from a uj5u.com user:
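Just save the output for the multiple items into a single dictionary variable. Since your text splitting takes several steps, consider moving it into a defined function.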
import xml.etree.ElementTree as ET

# DEFINED METHOD TO SPLIT RESPONSE BY LINE BREAKS
def split_text(resp):
    responses = resp.split("\n")
    index = responses.index('')  # FIRST BLANK LINE ENDS THE HEADERS
    indexed = responses[:index]
    return indexed

# PARSE XML STRING
doc = ET.fromstring(new)

# RETRIEVE ITEM NODES WITH DICTIONARY COMPREHENSION
website_items = {
    item.find("url").text: split_text(item.find("response").text)
    for item in doc.findall(".//item")
}

# REVIEW SAVED DATA WITH URLS AS KEYS
website_items["https://www.youtube.com/sw.js_data"]
website_items["https://www.google.com/client_204?&atyp=i&biw=1440&bih=849&dpr=1.5&ei=Z4IvYpTtF5LU9AP1nIOICQ"]
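To get from the saved headers to the actual "which headers are missing" check the question asks for, one possible approach is sketched below. It is only a sketch under some assumptions: the REQUIRED_HEADERS list, the helper names header_names and missing_headers, and the small sample XML are all made up for illustration and are not part of the original code. Header names are compared case-insensitively, and splitlines() is used so that both "\n" and "\r\n" line endings work and a response without a blank line does not raise an error.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of headers to require; adjust it to your own policy.
REQUIRED_HEADERS = [
    "Strict-Transport-Security",
    "X-Frame-Options",
    "X-Content-Type-Options",
]


def header_names(raw_response):
    """Return the lower-cased header names found in a raw HTTP response string."""
    names = set()
    # Skip the status line; splitlines() copes with "\n" and "\r\n".
    for line in raw_response.splitlines()[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, _value = line.partition(":")
        if sep:
            names.add(name.strip().lower())
    return names


def missing_headers(xml_text, required=REQUIRED_HEADERS):
    """Map each item's URL to the required headers absent from its response."""
    report = {}
    for item in ET.fromstring(xml_text).iter("item"):
        url = item.findtext("url")
        present = header_names(item.findtext("response", default=""))
        report[url] = [h for h in required if h.lower() not in present]
    return report


# Made-up sample in the same shape as the export above (not real data).
sample = """<items>
<item><url>https://a.example/</url><response>HTTP/2 200 OK
Strict-Transport-Security: max-age=31536000
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff

body</response></item>
<item><url>https://b.example/</url><response>HTTP/2 204 No Content
X-Frame-Options: SAMEORIGIN
</response></item>
</items>"""

for url, missing in missing_headers(sample).items():
    print(url, "OK" if not missing else "missing: " + ", ".join(missing))
```

With the real data you would call missing_headers(new) instead of using the sample. Note that a dictionary keyed by URL keeps only the last response when the same URL appears more than once in the export; if that can happen, collecting (url, missing) pairs in a list may be a better fit.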
