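What I'm trying to do here is basically this: I have a file containing a list of URL endpoints, and I want to split each link in the file on the slash separator to generate that endpoint's sub-endpoints. For example, given: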
https://www.somesite.com/path1/path2/path3
I want to get this:
https://www.somesite.com/path1/
https://www.somesite.com/path1/path2/
https://www.somesite.com/path1/path2/path3
I know how to do this in bash, but not in Python. I tried using the split function, but I could only get so far with it. I'd appreciate any help, thanks.
Reply from a uj5u.com user:
One option is to split on /, then slice the result and join each slice back together:
>>> url = 'https://www.somesite.com/path1/path2/path3'
>>> parts = url.split('/')
>>> ['/'.join(parts[:p + 1]) for p in range(3, len(parts))]
['https://www.somesite.com/path1', 'https://www.somesite.com/path1/path2', 'https://www.somesite.com/path1/path2/path3']
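Since the question starts from a file of URLs, the slicing approach above could be wrapped in a small function and applied line by line. This is only a minimal sketch: the names `sub_endpoints`, `expand_all`, and `sample_lines` are made up for illustration, and it assumes every line is an absolute URL of the form `scheme://host/...`.

```python
def sub_endpoints(url):
    """Return every path prefix of the URL as a full URL (hypothetical helper)."""
    parts = url.strip().split("/")
    # parts[0:3] is ['https:', '', 'host'], so path segments start at index 3.
    return ["/".join(parts[:p + 1]) for p in range(3, len(parts))]


def expand_all(lines):
    """Expand each non-empty line, keeping first-seen order and dropping repeats."""
    seen = {}
    for line in lines:
        if line.strip():
            for endpoint in sub_endpoints(line):
                seen.setdefault(endpoint, None)
    return list(seen)


# Stand-in for the lines of the input file (an assumption for this example).
sample_lines = [
    "https://www.somesite.com/path1/path2/path3\n",
    "https://www.somesite.com/path1/other\n",
]
for endpoint in expand_all(sample_lines):
    print(endpoint)
```

With a real file, the open file object can be passed to `expand_all` directly, since iterating over it yields lines.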
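Reply from a uj5u.com user:
Try something like this:
link = "https://www.somesite.com/path1/path2/path3"
splitted = link.split('/')
# Rebuild the scheme and host, e.g. "https://www.somesite.com/"
newLink = splitted[0] + "//" + splitted[2] + "/"
for i in range(3, len(splitted)):
    newLink += splitted[i]
    # Add a trailing slash to every link except the last one
    if i != len(splitted)-1:
        newLink += "/"
    print(newLink)
The output is: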
https://www.somesite.com/path1/
https://www.somesite.com/path1/path2/
https://www.somesite.com/path1/path2/path3
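But the trailing / isn't really needed on the links, so you can write it as:
link = "https://www.somesite.com/path1/path2/path3"
splitted = link.split('/')
# Rebuild the scheme and host without a trailing slash
newLink = splitted[0] + "//" + splitted[2]
for i in range(3, len(splitted)):
    newLink += "/" + splitted[i]
    print(newLink)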
Reply from a uj5u.com user:
For a generic "split" that keeps the separator, you can use the str.partition method: https://docs.python.org/3/library/stdtypes.html#str.partition
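As a rough illustration of how str.partition keeps the separator, the sketch below peels one segment at a time off the URL and accumulates the prefixes. The function name `sub_endpoints_partition` is invented for this example, and it assumes the input is an absolute URL containing `://`.

```python
def sub_endpoints_partition(url):
    """Build sub-endpoints with str.partition (hypothetical helper)."""
    # partition returns (before, separator, after); the separator is kept.
    scheme, scheme_sep, rest = url.partition("://")
    host, slash, path = rest.partition("/")
    prefix = scheme + scheme_sep + host + slash
    results = []
    while path:
        # sep is "/" while more segments follow and "" on the last segment,
        # so only the intermediate links end with a slash.
        segment, sep, path = path.partition("/")
        prefix += segment + sep
        results.append(prefix)
    return results


for endpoint in sub_endpoints_partition("https://www.somesite.com/path1/path2/path3"):
    print(endpoint)
```

Because the separator comes back as part of the result, the intermediate links keep their trailing slash, which matches the output format asked for in the question.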
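Now, for your specific use case, where you want all of the intermediate strings as a list, you can write some code that starts with urllib.parse to get the initial part of the URL without worrying about edge cases, and then manipulates the path with for, split and join.
from urllib.parse import urlparse, urlunparse

url = "https://www.somesite.com/path1/path2/path3"
# components[2] is the path part of the parsed URL
path = (components := list(urlparse(url)))[2]
path_comps_str = ""
# Accumulate "/path1", "/path1/path2", ... one segment at a time
path_comps = [(path_comps_str := path_comps_str + f"/{comp}") for comp in path.split("/")[1:]]
all_urls = []
for path in path_comps:
    url_parts = components[:]
    url_parts[2] = path
    all_urls.append(urlunparse(url_parts))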
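The same idea (parse with urllib.parse, then rebuild the path piece by piece) can also be sketched as a reusable function. This variant swaps in itertools.accumulate for the running string and urlsplit/urlunsplit for the parsing; the name `sub_endpoints_parsed` is made up, and dropping the query string and fragment from the generated links is an assumption, not something the original answer specifies.

```python
from itertools import accumulate
from urllib.parse import urlsplit, urlunsplit


def sub_endpoints_parsed(url):
    """Return full URLs for every path prefix (hypothetical helper)."""
    parts = urlsplit(url)
    # Ignore empty segments produced by leading, trailing, or doubled slashes.
    segments = [seg for seg in parts.path.split("/") if seg]
    # Yields "", "/path1", "/path1/path2", ...; the empty initial value is skipped below.
    paths = accumulate(segments, lambda acc, seg: f"{acc}/{seg}", initial="")
    # Assumption: query string and fragment are dropped from the sub-endpoints.
    return [
        urlunsplit((parts.scheme, parts.netloc, path, "", ""))
        for path in paths
        if path
    ]


for endpoint in sub_endpoints_parsed("https://www.somesite.com/path1/path2/path3"):
    print(endpoint)
```

Letting urlsplit separate the host from the path means a port number or a query string in the input does not end up being treated as a path segment.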
