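I'm having some trouble constructing a PowerShell -replace regular expression that isn't too greedy.
I want to convert occurrences of the pattern /sites/*/*/SitePages/*/*.aspx to: /sites/*/*/SitePages/*/*.html
The problem is that a single line contains more than one value to replace. The greedy match captures the whole line, so only the last occurrence is replaced.
Sample input: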
<div class="ms-wikicontent ms-rtestate-field" style="padding-right: 10px"><div class="ExternalClass8E56354CC4314DBA861E187B689F3A2B"><table id="layoutsTable" style="width:100%"><tbody><tr style="vertical-align:top"><td style="width:100%"><div class="ms-rte-layoutszone-outer" style="width:100%"><div class="ms-rte-layoutszone-inner" role="textbox" aria-haspopup="true" aria-autocomplete="both" aria-multiline="true"><a id="0::Home|Home" class="ms-wikilink" href="/sites/Team/Project/SitePages/Home.aspx">Home</a> - <a id="1::Jenkins|Jenkins" class="ms-wikilink" href="/sites/Team/Project/SitePages/Jenkins.aspx">Jenkins</a><h1 class="ms-rteElement-H1">Jenkins Integration with Deployment Tools</h1>
The failing regex segment:
% { $_ -Replace '(sites.*SitePages.*)\.aspx' , '${1}.html' }
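Any suggestions?
(Motivation: I'm trying to convert aspx page references to html because we have moved off SharePoint hosting. The pages are all static, so nothing needs to change apart from the page extension.)
uj5u.com user reply:
Without lookarounds, you can use a capture group as in the question. But while matching, you should not cross the double quotes that delimit the attribute value.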
(/sites\b[^\"]*/SitePages/[^\"]+)\.aspx\b
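Explanation
- ( opens capture group 1
- /sites\b matches /sites followed by a word boundary
- [^\"]*/SitePages/ optionally matches any characters except " and then matches /SitePages/
- [^\"]+ matches 1 or more characters other than "
- ) closes group 1
- \.aspx\b matches .aspx followed by a word boundary
See the regex demo.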
# $input is an automatic variable in PowerShell, so the sample is stored in $html instead.
$html = @'
<div class="ms-wikicontent ms-rtestate-field" style="padding-right: 10px"><div class="ExternalClass8E56354CC4314DBA861E187B689F3A2B"><table id="layoutsTable" style="width:100%"><tbody><tr style="vertical-align:top"><td style="width:100%"><div class="ms-rte-layoutszone-outer" style="width:100%"><div class="ms-rte-layoutszone-inner" role="textbox" aria-haspopup="true" aria-autocomplete="both" aria-multiline="true"><a id="0::Home|Home" class="ms-wikilink" href="/sites/Team/Project/SitePages/Home.aspx">Home</a> - <a id="1::Jenkins|Jenkins" class="ms-wikilink" href="/sites/Team/Project/SitePages/Jenkins.aspx">Jenkins</a><h1 class="ms-rteElement-H1">Jenkins Integration with Deployment Tools</h1>
'@
# [^\"] keeps each match inside one quoted attribute value, so both links are rewritten.
$html -replace '(/sites\b[^\"]*/SitePages/[^\"]+)\.aspx\b', '$1.html'
Output
<div class="ms-wikicontent ms-rtestate-field" style="padding-right: 10px"><div class="ExternalClass8E56354CC4314DBA861E187B689F3A2B"><table id="layoutsTable" style="width:100%"><tbody><tr style="vertical-align:top"><td style="width:100%"><div class="ms-rte-layoutszone-outer" style="width:100%"><div class="ms-rte-layoutszone-inner" role="textbox" aria-haspopup="true" aria-autocomplete="both" aria-multiline="true"><a id="0::Home|Home" class="ms-wikilink" href="/sites/Team/Project/SitePages/Home.html">Home</a> - <a id="1::Jenkins|Jenkins" class="ms-wikilink" href="/sites/Team/Project/SitePages/Jenkins.html">Jenkins</a><h1 class="ms-rteElement-H1">Jenkins Integration with Deployment Tools</h1>
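To see why the pattern from the question changes only the last link, it can be run next to the bounded pattern on a shortened sample. This is a minimal sketch: $sample, $greedy and $bounded are illustrative names, and the sample string is a cut-down version of the input from the question.

```powershell
# Shortened, illustrative version of the sample input from the question.
$sample = '<a href="/sites/Team/Project/SitePages/Home.aspx">Home</a> - <a href="/sites/Team/Project/SitePages/Jenkins.aspx">Jenkins</a>'

# Greedy .* runs from the first "sites" to the last ".aspx" on the line,
# so the whole line is one match and only the final extension changes.
$greedy = $sample -replace '(sites.*SitePages.*)\.aspx', '${1}.html'

# [^\"] cannot cross a double quote, so each href value is matched separately.
$bounded = $sample -replace '(/sites\b[^\"]*/SitePages/[^\"]+)\.aspx\b', '$1.html'

$greedy
$bounded
```

With this sample, $greedy should still contain Home.aspx, while $bounded should have both links rewritten.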
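If there are always exactly 2 path parts between /sites and /SitePages, another variant can repeat them with an exact quantifier, {2}, and, for example, assert a double quote after .aspx:
(/sites(?:/[^/\"]+){2}/SitePages/[^/\"]+)\.aspx(?=\")
See another regex demo.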
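As a rough sketch of how this second pattern could be applied with -replace (the $sample string is a made-up, shortened version of the input from the question, and $pattern and $result are illustrative names):

```powershell
# Hypothetical, shortened sample based on the input in the question.
$sample = '<a href="/sites/Team/Project/SitePages/Home.aspx">Home</a> - <a href="/sites/Team/Project/SitePages/Jenkins.aspx">Jenkins</a>'

# (?:/[^/\"]+){2} requires exactly two path segments between /sites and /SitePages,
# and (?=\") asserts that a double quote directly follows .aspx without consuming it.
$pattern = '(/sites(?:/[^/\"]+){2}/SitePages/[^/\"]+)\.aspx(?=\")'

$result = $sample -replace $pattern, '$1.html'
$result
```

Because the lookahead does not consume the closing quote, the replacement string only needs to restore group 1 and append .html.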
uj5u.com user reply:
Daniel has already shown an excellent solution using the character exclusion [^/]:
$_ -replace '(?<=/sites/[^/]*/[^/]*/SitePages/[^/]*)aspx', 'html'
- Demo and detailed explanation at regex101
Alternatively, you can make the quantifiers lazy with the ? modifier:
$_ -replace '(?<=/sites/.*?/.*?/SitePages/.*?)aspx', 'html'
- Demo and detailed explanation at regex101
While the latter looks cleaner, it performs worse because it requires more backtracking.
I ran a small benchmark:
$text = '<div style="padding-right: 10px"><div ><table id="layoutsTable" style="width:100%"><tbody><tr style="vertical-align:top"><td style="width:100%"><div style="width:100%"><div role="textbox" aria-haspopup="true" aria-autocomplete="both" aria-multiline="true"><a id="0::Home|Home" href="/sites/Team/Project/SitePages/Home.aspx">Home</a> - <a id="1::Jenkins|Jenkins" href="/sites/Team/Project/SitePages/Jenkins.aspx">Jenkins</a><h1 >Jenkins Integration with Deployment Tools</h1>'
$runs = 100000
$excludeMillis = (Measure-Command { foreach( $i in 1..$runs ) { $text -replace '(?<=/sites/[^/]*/[^/]*/SitePages/[^/]*)aspx', 'html' }}).TotalMilliseconds
$lazyMillis = (Measure-Command { foreach( $i in 1..$runs ) { $text -replace '(?<=/sites/.*?/.*?/SitePages/.*?)aspx', 'html' }}).TotalMilliseconds
[PSCustomObject]@{
RegExExclude = '{0} ms' -f [int]$excludeMillis
RegExLazy = '{0} ms ({1}%)' -f [int]$lazyMillis, [int]($lazyMillis / $excludeMillis * 100)
}
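Output on PS 7.2:
RegExExclude RegExLazy
------------ ---------
281 ms       350 ms (125%)
The difference is noticeable but not huge, so if performance doesn't matter you may prefer readability.
With a compiled RegEx, the performance difference between the two becomes even smaller: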
$text = '<div style="padding-right: 10px"><div ><table id="layoutsTable" style="width:100%"><tbody><tr style="vertical-align:top"><td style="width:100%"><div style="width:100%"><div role="textbox" aria-haspopup="true" aria-autocomplete="both" aria-multiline="true"><a id="0::Home|Home" href="/sites/Team/Project/SitePages/Home.aspx">Home</a> - <a id="1::Jenkins|Jenkins" href="/sites/Team/Project/SitePages/Jenkins.aspx">Jenkins</a><h1 >Jenkins Integration with Deployment Tools</h1>'
$runs = 100000
$rxExclude = [regex]::new( '(?<=/sites/[^/]*/[^/]*/SitePages/[^/]*)aspx', [Text.RegularExpressions.RegexOptions]::Compiled )
$rxLazy = [regex]::new( '(?<=/sites/.*?/.*?/SitePages/.*?)aspx', [Text.RegularExpressions.RegexOptions]::Compiled )
$excludeMillis = (Measure-Command { foreach( $i in 1..$runs ) { $rxExclude.Replace( $text, 'html' ) }}).TotalMilliseconds
$lazyMillis = (Measure-Command { foreach( $i in 1..$runs ) { $rxLazy.Replace( $text, 'html' ) }}).TotalMilliseconds
[PSCustomObject]@{
RegExExclude = '{0} ms' -f [int]$excludeMillis
RegExLazy = '{0} ms ({1}%)' -f [int]$lazyMillis, [int]($lazyMillis / $excludeMillis * 100)
}
Output on PS 7.2:
RegExExclude RegExLazy
------------ ---------
160 ms       178 ms (111%)
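Apart from speed, the two patterns do not necessarily match the same text, because .*? may step over slashes and quotes while [^/]* may not. The following is only a sketch on a made-up input ($sample, $viaExclude and $viaLazy are illustrative names) showing one case where the results differ:

```powershell
# Hypothetical input: a SitePages link that is already converted, followed by
# an unrelated file name that happens to end in .aspx.
$sample = '<a href="/sites/Team/Project/SitePages/Home.html">Home</a> see other.aspx'

# [^/]* cannot step over the "/" in "</a>", so the unrelated name is left alone.
$viaExclude = $sample -replace '(?<=/sites/[^/]*/[^/]*/SitePages/[^/]*)aspx', 'html'

# .*? may consume "/" and quotes, so the lookbehind also reaches "other.aspx".
$viaLazy = $sample -replace '(?<=/sites/.*?/.*?/SitePages/.*?)aspx', 'html'

$viaExclude
$viaLazy
```

On input like the sample from the question both patterns give the same result, so this only matters if other .aspx names can appear later on the same line.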
uj5u.com user reply:
Try
[string]$string = "<div class='ms-wikicontent ms-rtestate-field' style='padding-right: 10px'><div class='ExternalClass8E56354CC4314DBA861E187B689F3A2B'><table id='layoutsTable' style='width:100%'><tbody><tr style='vertical-align:top'><td style='width:100%'><div class='ms-rte-layoutszone-outer' style='width:100%'><div class='ms-rte-layoutszone-inner' role='textbox' aria-haspopup='true' aria-autocomplete='both' aria-multiline='true'><a id='0::Home|Home' class='ms-wikilink' href='/sites/Team/Project/SitePages/Home.aspx'>Home</a> - <a id='1::Jenkins|Jenkins' class='ms-wikilink' href='/sites/Team/Project/SitePages/Jenkins.aspx'>Jenkins</a><h1 class='ms-rteElement-H1'>Jenkins Integration with Deployment Tools</h1>"
$string.Replace('.aspx','.html')
Or, if you are looking to build a regular expression, take a look at https://rubular.com/ which helps with building regular expressions.
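One thing to keep in mind with the plain String.Replace approach: it is a literal, case-sensitive substitution, so it rewrites every .aspx in the text regardless of path and skips differently cased extensions, whereas -replace is case-insensitive by default. A small sketch with a made-up string ($sample, $literal and $scoped are illustrative names; the scoped pattern is the one from the first reply):

```powershell
# Hypothetical input with a SitePages link, an unrelated link, and an upper-case extension.
$sample = '<a href="/sites/Team/Project/SitePages/Home.aspx">Home</a> <a href="/other/Page.aspx">Other</a> <a href="/sites/Team/Project/SitePages/Old.ASPX">Old</a>'

# Literal replace: changes every ".aspx" (case-sensitive), including /other/Page.aspx.
$literal = $sample.Replace('.aspx', '.html')

# Scoped regex: only touches links under /sites/.../SitePages/, ignoring case.
$scoped = $sample -replace '(/sites\b[^\"]*/SitePages/[^\"]+)\.aspx\b', '$1.html'

$literal
$scoped
```

If every .aspx reference in the pages should become .html, as the question suggests, the literal version may be all that is needed.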
