I have this tree:
<TEI>
<teiHeader/>
<text>
<body>
<div type="chapter">
<p rend="b"><pb n="1"/>lorem ipsum...</p>
<p rend="b">lorem <pb n="2"/> ipsum2...</p>
<p>lorem ipsum3...</p>
</div>
<div type="chapter">
<p>lorem ipsum4...</p>
<p rend="b">lorem ipsum5...</p>
<p rend="b"><pb n="3"/> lorem ipsum6...</p>
</div>
</body>
</text>
</TEI>
I want to change every
<p rend="b">lorem ipsum...</p>
into
<p><hi rend="b">lorem ipsum...</hi></p>
The problem is that all the <pb n="X"/> tags get stripped out.
I tried this (where root is the XML tree above):
from lxml import etree

parser = etree.XMLParser(ns_clean=True, remove_blank_text=True)
root = etree.fromstring(root, parser)
for item in root.findall(".//p[@rend='b']"):
    hi = etree.SubElement(item, "hi", rend=font_variant[variant])
    hi.text = ''.join(item.itertext())
print(etree.tostring(root, pretty_print=True, xml_declaration=True))
For the first <p/>, for example, I get:
<p><pb n="1"/>lorem ipsum...<hi rend="b"> lorem ipsum...</hi></p>
The <pb n="1"/> is gone.
Can you help me?
A helpful uj5u.com user replied:
If I understand you correctly, you are probably looking for something like this:
for p in root.xpath('//p[@rend="b"]'):
    # clone the old <p>
    old = etree.fromstring(etree.tostring(p))
    # change its name
    old.tag = "hi"
    # create a new element
    new = etree.fromstring('<p/>')
    # append the clone to the new element
    new.append(old)
    new.tail = "\n"
    # delete the old <p> and replace it with the new element
    p.getparent().replace(p, new)
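For comparison, here is a minimal in-place sketch of the same idea. It is not taken from the thread. It moves the paragraph's existing nodes into a new <hi> instead of cloning, so every <pb/> and its trailing text survive. itertext() yields only strings, which is probably why the child elements vanished in the question's attempt. The names SAMPLE and wrap_bold_paragraphs are made up for illustration, and the import falls back to the standard library's ElementTree if lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib fallback, only the shared API is used
    import xml.etree.ElementTree as etree

# Hypothetical sample: the first chapter from the question.
SAMPLE = """<div type="chapter">
<p rend="b"><pb n="1"/>lorem ipsum...</p>
<p rend="b">lorem <pb n="2"/> ipsum2...</p>
<p>lorem ipsum3...</p>
</div>"""


def wrap_bold_paragraphs(root):
    """Turn <p rend="b">...</p> into <p><hi rend="b">...</hi></p> in place."""
    # Snapshot the matches first, because the tree is modified in the loop.
    for p in list(root.iter("p")):
        rend = p.get("rend")
        if rend != "b":
            continue
        hi = etree.Element("hi", rend=rend)
        # Hand the leading text over to <hi>.
        hi.text, p.text = p.text, None
        # Move each child; its tail text travels with it.
        for child in list(p):
            p.remove(child)
            hi.append(child)
        # The rend attribute now lives on <hi>, not on <p>.
        del p.attrib["rend"]
        p.append(hi)
    return root


root = wrap_bold_paragraphs(etree.fromstring(SAMPLE))
print(etree.tostring(root, encoding="unicode"))
```

Because the loop only touches each <p>'s own contents, the whitespace between paragraphs stays as it was and no getparent() call is needed.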
