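I'm using Python 2.7 (unfortunately I can't upgrade to a newer version) and I'm trying to parse two XML files with lxml, but something is off and I'm not sure what I'm doing wrong:
Code: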
from lxml import etree as ET

def string_to_lxml(string):
    xml_file = bytes(bytearray(string, encoding='utf-8'))
    return ET.XML(xml_file)

def find_all(tag, atr):
    return tag.xpath("//%s" % atr)

xml_str_1 = """<?xml version="1.0" encoding="UTF-8"?>
<A xmlns="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0">
<B name="SOME_NAME_0">
<C/>
<D>SOME NAME</D>
<AA>
<dir name="include" filters="*.h *.hpp *.tpp *.i"/>
</AA>
<H>
<TAG_1 name="main" default="true"/>
</H>
</B>
<TT>
<GG>
<FF configs="main">
<TAG_2 name="NAME_1"/>
<TAG_2 name="NAME_2"/>
<TAG_3 name="NAME_3"/>
<TAG_3 name="NAME_4"/>
<TAG_3 name="NAME_5"/>
</FF>
</GG>
</TT>
</A>"""
xml_str_2 = """<?xml version='1.0' encoding='UTF-8'?>
<A xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://obe.nce.amadeus.net/bms/metadata/1-0/">
<B name="NAME" version="VERSION">
<AA>SOME NAME</AA>
<CC>SOME OTHER NAME</CC>
</B>
<C>
<TAG_3 name="NAME_1" path="path_1"/>
<TAG_3 name="NAME_2" path="path_2"/>
<TAG_3 name="NAME_3" path="path_3"/>
</C>
<D>
<TAG_3 type="type" name="NAME_1" version="version_1"/>
<TAG_3 type="type" name="NAME_2" version="version_2"/>
<TAG_3 type="type" name="NAME_3" version="version_3"/>
</D>
</A>
"""
root = string_to_lxml(xml_str_1)
print(find_all(root, "TAG_3"))
root = string_to_lxml(xml_str_2)
print(find_all(root, "TAG_3"))
Output:
[]
[<Element TAG_3 at 0x7f257c126640>, <Element TAG_3 at 0x7f257c126be0>, <Element TAG_3 at 0x7f257c126b90>, <Element TAG_3 at 0x7f257c126e10>, <Element TAG_3 at 0x7f257c128730>, <Element TAG_3 at 0x7f257c128640>]
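Am I parsing the XML the wrong way?
A uj5u.com user replied:
The parsing itself is fine. The first XML document declares a default namespace (one with no prefix), and that has to be taken into account. Every unprefixed element in that document, including TAG_3, belongs to this namespace, whereas an unprefixed name in an XPath expression such as //TAG_3 only matches elements that are in no namespace. That is why the first query returns an empty list while the second document, which has no default namespace, works: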
xmlns="http://www.w3.org/2001/XMLSchema-instance"
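One quick way to see the effect of that declaration is to print the tag of a parsed element, since lxml reports namespaced tags in {uri}local form. The snippet below is only a small sketch: the `doc` string is a trimmed-down stand-in I made up for illustration, not the full XML from the question.

```python
from lxml import etree as ET

# Trimmed-down stand-in for xml_str_1 (an assumption, for illustration only).
doc = b'<A xmlns="http://www.w3.org/2001/XMLSchema-instance"><TAG_3 name="NAME_3"/></A>'
root = ET.XML(doc)
child = root[0]

print(child.tag)                  # qualified name in {uri}local form
print(ET.QName(child).localname)  # local part of the name only
print(root.xpath("//TAG_3"))      # unprefixed XPath name: matches no-namespace elements only
```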
To handle this, the XPath expression can match on the local name and ignore the namespace, like this:
def find_all(tag, atr):
    return tag.xpath("//*[local-name()= '%s']" % atr)
Result:
[<Element {http://www.w3.org/2001/XMLSchema-instance}TAG_3 at 0x7f39cf73de88>, <Element {http://www.w3.org/2001/XMLSchema-instance}TAG_3 at 0x7f39cf73df88>, <Element {http://www.w3.org/2001/XMLSchema-instance}TAG_3 at 0x7f39cf73dfc8>]
[<Element TAG_3 at 0x7f39cf73df88>, <Element TAG_3 at 0x7f39cf73dfc8>, <Element TAG_3 at 0x7f39cf73dec8>, <Element TAG_3 at 0x7f39cf762048>, <Element TAG_3 at 0x7f39cf762088>, <Element TAG_3 at 0x7f39cf762108>]
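If you would rather match the namespace explicitly than ignore it, lxml's xpath() also accepts a `namespaces` mapping from prefix to URI. The following is a minimal sketch of that alternative, not part of the answer above: the helper name `find_all_ns`, the prefix "d", and the shortened `xml` sample are all hypothetical choices made for this example. It should run on Python 2.7 as well as Python 3.

```python
from lxml import etree as ET

NS = "http://www.w3.org/2001/XMLSchema-instance"

def find_all_ns(root, tag, ns_uri=None):
    """Return every element named `tag`, optionally bound to a namespace URI."""
    if ns_uri is None:
        # No namespace requested: plain lookup, as in the question.
        return root.xpath("//%s" % tag)
    # "d" is an arbitrary prefix picked for this sketch. XPath 1.0 has no
    # default-namespace syntax, so the URI must be mapped to some prefix.
    return root.xpath("//d:%s" % tag, namespaces={"d": ns_uri})

# Shortened sample document (an assumption, for illustration only).
xml = b"""<A xmlns="http://www.w3.org/2001/XMLSchema-instance">
  <FF><TAG_3 name="NAME_3"/><TAG_3 name="NAME_4"/></FF>
</A>"""
root = ET.XML(xml)
names = [el.get("name") for el in find_all_ns(root, "TAG_3", NS)]
print(names)
```

The trade-off is that the local-name() version finds TAG_3 in any namespace, which suits files that differ in whether they declare one, while the prefix-mapped version only matches elements in the namespace you name.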
