How to remove `<a href="file://a">`keep this text`</a>` using sed or perl?

How can I remove every `<a href="file://???">keep this text</a>` wrapper, but leave other `<a></a>` pairs and `</a>` tags alone, using sed or perl?
Input:

    <p><a class="a" href="file://any" id="b">keep this text</a>, <a href="http://example.com/abc">example.com/abc</a>, more text</p>

Desired output:

    <p>keep this text, <a href="http://example.com/abc">example.com/abc</a>, more text</p>

I have a regex like this, but it is too greedy and removes every `</a>`:

    gsed -E -i 's/<a*href="file:[^>]*>(.+?)<\/a>/\1>/g' file.xhtml

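Answer from a uj5u.com user:

Assumptions:

OP does not have access to HTML-aware tools
the goal is to remove the `<a href="file:...">...some_text...</a>` wrapper, leaving only `...some_text...`
this applies only to `file:` entries
the input data has no line breaks in the middle of a `file:` entry

Sample data with several `file:` entries interspersed with some other (meaningless) entries: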

    $ cat sample.html
    <p><a href="https:/google.com">some text</a><a href="file://any" >keep this text</a>, <a href="http://example.com/abc">example.com/abc</a>, more text</p><a href="file://anyother" >keep this text,too</a>, last test</p>

One sed idea for removing the wrapper from every `file:` entry:

    sed -E 's|<a[^<>]+file:[^>]+>([^<]+)</a>|\1|g' "${infile}"

Note: some of the `[^..]` bracket expressions may be overkill, but the key goal is to short-circuit sed's default greedy matching...

This leaves:

    <p><a href="https:/google.com">some text</a>keep this text, <a href="http://example.com/abc">example.com/abc</a>, more text</p>keep this text,too, last test</p>
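The command above writes to standard output. Since the question's own attempt used `-i`, here is a minimal sketch of applying this answer's pattern (written with `+` quantifiers) in place while keeping a backup. It assumes a sed that accepts `-i` with an attached suffix, which GNU and BSD sed both do; the temporary directory and file name are hypothetical and exist only so the sketch is safe to run.

```shell
#!/bin/sh
# Throwaway working copy (hypothetical path, for illustration only).
workdir=$(mktemp -d)
infile="$workdir/sample.html"
printf '%s\n' '<p><a href="file://any" >keep this text</a>, <a href="http://example.com/abc">example.com/abc</a>, more text</p>' > "$infile"

# -i.bak rewrites the file in place and saves the original as sample.html.bak.
sed -i.bak -E 's|<a[^<>]+file:[^>]+>([^<]+)</a>|\1|g' "$infile"

edited=$(cat "$infile")        # file after the substitution
backup=$(cat "$infile.bak")    # untouched original
printf 'edited: %s\n' "$edited"
printf 'backup: %s\n' "$backup"

rm -rf "$workdir"
```

Keeping the `.bak` copy makes it easy to diff the result before discarding the original.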

Answer from a uj5u.com user:

One way:

    sed -E 's,<a[^>]*?href="file://[^>]*>([^<]*)</a>,\1,g'

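`<a[^>]*?href="file://[^>]*>` matches `<a`, any number of non-`>` characters, then `href="file://`, then any number of non-`>` characters, then `>`. The `?` is meant to make the first `*` non-greedy, but POSIX extended regular expressions, which `sed -E` uses, have no lazy quantifiers; the `[^>]` class is what keeps the match inside a single tag.
`([^<]*)` matches and captures any number of non-`<` characters.
`</a>` matches the literal closing tag.

Everything matched is replaced with the captured text, `\1`, and the trailing `g` applies the replacement to every occurrence on each line.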
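To make the role of the trailing `g` concrete, here is a small sketch that runs the substitution with and without it on a line holding two `file://` links. The sample line and variable names are made up for illustration, and the `?` after the first `*` is left out, on the assumption that it adds nothing in sed.

```shell
#!/bin/sh
# Hypothetical input line with two file:// links.
line='<a href="file://a">one</a> and <a href="file://b">two</a>'

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed -E 's,<a[^>]*href="file://[^>]*>([^<]*)</a>,\1,')

# With g: every match on the line is replaced.
every=$(printf '%s\n' "$line" | sed -E 's,<a[^>]*href="file://[^>]*>([^<]*)</a>,\1,g')

printf 'without g: %s\n' "$first_only"
printf 'with g:    %s\n' "$every"
```

Without `g` the second link survives; with `g` the line becomes `one and two`.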

Example:

    $ cat data
    <p><a class="a" href="file://any" id="b">keep this text</a>, <a id="file:ex" href="http://example.com/abc">example.com/abc</a>, more text</p>
    <p><a href="file://any" class="f">keep this text</a>, <a href="http://example.com/abc">example.com/abc</a>, more text</p>

    $ sed -E 's,<a[^>]*?href="file://[^>]*>([^<]*)</a>,\1,g' < data
    <p>keep this text, <a id="file:ex" href="http://example.com/abc">example.com/abc</a>, more text</p>
    <p>keep this text, <a href="http://example.com/abc">example.com/abc</a>, more text</p>
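The `id="file:ex"` attribute in the sample data is a useful edge case. As a hedged sketch of why anchoring on `href="file://` matters, the script below compares that pattern with a looser one that accepts `file:` anywhere inside the opening tag (the shape used in the earlier answer, written here with `+` quantifiers). The input line and variable names are illustrative only.

```shell
#!/bin/sh
# Hypothetical line: one file:// link, one http link whose id contains "file:".
line='<p><a href="file://any">keep</a>, <a id="file:ex" href="http://example.com/abc">example.com/abc</a></p>'

# Looser pattern: "file:" may appear anywhere inside the opening tag,
# so the id="file:ex" attribute is enough to trigger a match.
loose=$(printf '%s\n' "$line" | sed -E 's|<a[^<>]+file:[^>]+>([^<]+)</a>|\1|g')

# Stricter pattern: "file://" must begin the href value.
strict=$(printf '%s\n' "$line" | sed -E 's,<a[^>]*href="file://[^>]*>([^<]*)</a>,\1,g')

printf 'loose:  %s\n' "$loose"
printf 'strict: %s\n' "$strict"
```

With this input the looser pattern also unwraps the http link, while the stricter one leaves it intact.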

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qukuanlian/348503.html

Tags: regex bash macos sed grep
