Update (I got a bit further...)
So my goal is to write a parser for a script that uses a strange XML-like format that is not actually XML.
<[file][][]
<[cultivation][][]
<[string8][coordinate_system][lonlat]>
<[list_vegetation_map_exclusion_zone][vegetation_map_exclusion_zone_list][]
>
<[string8][buildings_texture_folder][]>
<[list_plant][plant_list][]
>
<[list_building][building_list][]
<[building][element][0]
<[vector3_float64][position][7.809637 46.182262 0]>
<[float32][direction][-1.82264196872711]>
<[float32][length][25.9434452056885]>
<[float32][width][17.4678573608398]>
<[int32][floors][3]>
<[stringt8c][roof][gable]>
<[stringt8c][usage][residential]>
> ...
So far I have this:
import pyparsing as pp
from pyparsing import (Combine, Literal, OneOrMore, Optional, SkipTo,
                       Suppress, Word, alphas, nums)

def toc_parser(file_path):
    # save complete file in variable
    with open(file_path, "r") as f:
        toc = f.read()
    parser = OneOrMore(Word(alphas))
    # exclude comments
    parser.ignore('//' + pp.restOfLine())
    # exclude <>
    klammern = Suppress("<")
    klammernzu = Suppress(">")
    eckig = Suppress("[")
    eckigzu = Suppress("]")
    element = Suppress("[element]")
    leer = Suppress("[]")
    # grammar:
    nameBuilding = "building"
    namePosition = "position"
    nameDirection = "direction"
    nameLength = "length"
    nameWidth = "width"
    nameFloors = "floors"
    nameRoof = "roof"
    nameUsage = "usage"
    buildingzahl = klammern + eckig + nameBuilding + eckigzu + element + eckig + Word(nums) + eckigzu
    pos = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + namePosition + eckigzu + eckig + Combine(Word(nums) + "." + Word(nums)) + Combine(Word(nums) + "." + Word(nums)) + Word(nums) + eckigzu + klammernzu
    direc = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameDirection + eckigzu + eckig + Combine(Optional("-") + Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    leng = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameLength + eckigzu + eckig + Combine(Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    widt = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameWidth + eckigzu + eckig + Combine(Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    floors = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameFloors + eckigzu + eckig + Word(nums) + eckigzu + klammernzu
    roof = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameRoof + eckigzu + eckig + Word(alphas) + eckigzu + klammernzu
    usag = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameUsage + eckigzu + eckig + Word(alphas) + eckigzu + klammernzu
    building = buildingzahl + pos + direc + leng + widt + floors + roof + usag + klammernzu
    file = klammern + eckig + Literal("file") + eckigzu + leer + leer + klammern + eckig + Literal("cultivation") + eckigzu + leer + leer
    vegexcl = Literal("<[list_vegetation_map_exclusion_zone][vegetation_map_exclusion_zone_list][]") + klammernzu
    coordsis = Literal("<[string8][coordinate_system][lonlat]>")
    textures = Literal("<[string8][buildings_texture_folder][]>")
    listPlants = Literal("<[list_plant][plant_list][]") + klammernzu
    listBuildings = Literal("<[list_building][building_list][]") + OneOrMore(building) + klammernzu
    listLights = Literal("<[list_light][light_list][]") + klammernzu
    listAirportLights = Literal("<[list_airport_light][airport_light_list][]") + klammernzu
    listXref = Literal("<[list_xref][xref_list][]") + klammernzu
    fileganz = file + coordsis + vegexcl + textures + listPlants + listBuildings + listLights + listAirportLights + listXref + klammernzu + klammernzu
    print(fileganz.parseString(toc))
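Question:
I need to be able to overwrite certain values from an external script, and I found out (here) that this is roughly how it is done, but it always ends up in the "else" branch.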
#define Values to be updated
valuesToUpdate = {
    "building":"home"
    ""
    }
def updateSelectedDefinitions(tokens):
    if tokens.name in valuesToUpdate:
        newVal = valuesToUpdate[tokens.name]
        return "%" % tokens.name, newVal
    else:
        raise ParseException(print("no Update definded"))
Thank you very much for your help :)
Answer:
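Here is a quick first pass.
First, we should try to describe the format in words:
"Each entry is enclosed in '<>' characters and contains 3 values in '[]' characters, followed by zero or more nested entries. The 3 values in '[]' hold a data type, an optional name, and zero or more values. The values can be numbers or strings, and may be parsed as scalar or list values depending on the data type."
Converting this to a quasi-BNF, where '*' means "zero or more":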
entry ::= '<' subentry subentry subentry entry* '>'
subentry ::= '[' value* ']'
value ::= number | alphanumeric word
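We can see that this is a recursive grammar, since an entry can contain elements that are themselves entries. So when we convert to pyparsing, we will define entry as a placeholder using pyparsing's Forward, and then define its structure after all the other expressions have been defined.
Converting this short BNF to pyparsing: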
import pyparsing as pp

# define some basic punctuation - useful at parse time, but we will
# suppress them since we don't really need them after parsing is done
# (we'll use pyparsing Groups to capture the structure that these
# characters represent)
LT, GT, LBRACK, RBRACK = map(pp.Suppress, "<>[]")
# define our placeholder for the nested entry
entry = pp.Forward()
# work bottom-up through the BNF
value = pp.pyparsing_common.number | pp.Word(pp.alphas, pp.alphanums + "_")
subentry = pp.Group(LBRACK - value[...] + RBRACK)
type_name_value = subentry*3
entry <<= pp.Group(LT
                   - type_name_value("type_name_value")
                   + pp.Group(entry[...])("contents") + GT)
At this point, you can use entry to parse your sample text (after adding enough closing '>' characters to make it a valid nested expression):
result = entry.parseString(sample)
result.pprint()
Prints:
[[['file'],
  [],
  [],
  [[['cultivation'],
    [],
    [],
    [[['string8'], ['coordinate_system'], ['lonlat'], []],
     [['list_vegetation_map_exclusion_zone'],
      ['vegetation_map_exclusion_zone_list'],
      [],
      []],
     [['string8'], ['buildings_texture_folder'], [], []],
     [['list_plant'], ['plant_list'], [], []],
     [['list_building'],
      ['building_list'],
      [],
      [[['building'],
        ['element'],
        [0],
        [[['vector3_float64'], ['position'], [7.809637, 46.182262, 0], []],
         [['float32'], ['direction'], [-1.82264196872711], []],
         [['float32'], ['length'], [25.9434452056885], []],
         [['float32'], ['width'], [17.4678573608398], []],
         [['int32'], ['floors'], [3], []],
         [['stringt8c'], ['roof'], ['gable'], []],
         [['stringt8c'], ['usage'], ['residential'], []]]]]]]]]]]
So that's a start. We can see that the values get parsed, and that they are converted to the right types.
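As noted above, the truncated sample needs its missing closing '>' characters before entry will accept it. Here is a minimal sketch of one way to add them automatically. The helper name close_open_entries is hypothetical, and it assumes that '<' and '>' never occur inside a value and that the trailing "..." has already been removed from the sample by hand.

```python
def close_open_entries(text):
    """Append one '>' for every '<' that is never closed.

    Assumption: '<' and '>' only appear as entry delimiters,
    never inside a value.
    """
    depth = 0
    for ch in text:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
    # max() guards against input that has more '>' than '<'
    return text + ">" * max(depth, 0)


truncated = "<[file][][]\n<[cultivation][][]\n<[int32][floors][3]>\n"
balanced = close_open_entries(truncated)
print(balanced)
```

The balanced string could then be passed to entry.parseString in place of the raw sample.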
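To convert these pieces into a more coherent structure, we can attach a parse action to entry. This is a parse-time callback that runs each time an entry is parsed.
In this case, we'll write a parse action that processes the type/name/value triple and then captures the nested contents, if there are any. We'll try to infer from the data type string how to structure the value or contents.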
def convert_entry_to_dict(tokens):
    # entry is wrapped in a Group, so ungroup to get the parsed elements
    parsed = tokens[0]
    # unpack data type, optional name and optional value
    data_type, name, value = parsed.type_name_value
    data_type = data_type[0] if data_type else None
    name = name[0] if name else None
    # save type and name in dict to be returned from the parse action
    ret = {'type': data_type, 'name': name}
    # if there were contents present, save them as the value; otherwise,
    # get the value from the third element in the triple (use the
    # parsed data type as a hint as to whether the value should be a
    # scalar, a list, or a str)
    if parsed.contents:
        ret["value"] = list(parsed.contents)
    else:
        if data_type.startswith(("vector", "list")):
            ret["value"] = [*value]
        else:
            ret["value"] = value[0] if value else None
            if ret["value"] is None and data_type.startswith("string"):
                ret["value"] = ""
    return ret

entry.addParseAction(convert_entry_to_dict)
Now when we parse the sample, we get this structure:
[{'name': None,
  'type': 'file',
  'value': [{'name': None,
             'type': 'cultivation',
             'value': [{'name': 'coordinate_system',
                        'type': 'string8',
                        'value': 'lonlat'},
                       {'name': 'vegetation_map_exclusion_zone_list',
                        'type': 'list_vegetation_map_exclusion_zone',
                        'value': []},
                       {'name': 'buildings_texture_folder',
                        'type': 'string8',
                        'value': ''},
                       {'name': 'plant_list',
                        'type': 'list_plant',
                        'value': []},
                       {'name': 'building_list',
                        'type': 'list_building',
                        'value': [{'name': 'element',
                                   'type': 'building',
                                   'value': [{'name': 'position',
                                              'type': 'vector3_float64',
                                              'value': [7.809637,
                                                        46.182262,
                                                        0]},
                                             {'name': 'direction',
                                              'type': 'float32',
                                              'value': -1.82264196872711},
                                             {'name': 'length',
                                              'type': 'float32',
                                              'value': 25.9434452056885},
                                             {'name': 'width',
                                              'type': 'float32',
                                              'value': 17.4678573608398},
                                             {'name': 'floors',
                                              'type': 'int32',
                                              'value': 3},
                                             {'name': 'roof',
                                              'type': 'stringt8c',
                                              'value': 'gable'},
                                             {'name': 'usage',
                                              'type': 'stringt8c',
                                              'value': 'residential'}]}]}]}]}]
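Once the data is in this shape, plain Python is enough to pull out the parts you care about. The following is a rough sketch of walking the nested dicts to collect every building. The helper names iter_entries and fields_by_name are hypothetical, and the tree below is a trimmed-down copy of the structure above, used only for illustration.

```python
def iter_entries(entry, type_name):
    """Yield every entry dict (at any depth) whose 'type' equals type_name."""
    if entry.get("type") == type_name:
        yield entry
    value = entry.get("value")
    if isinstance(value, list):
        for child in value:
            # vector values are lists of plain numbers, so skip non-dicts
            if isinstance(child, dict):
                yield from iter_entries(child, type_name)


def fields_by_name(entry):
    """Collapse an entry's children into a {name: value} dict."""
    return {child["name"]: child["value"] for child in entry["value"] or []}


# Trimmed-down copy of the parsed structure (illustration only).
tree = {
    "type": "file", "name": None,
    "value": [{
        "type": "list_building", "name": "building_list",
        "value": [{
            "type": "building", "name": "element",
            "value": [
                {"type": "int32", "name": "floors", "value": 3},
                {"type": "stringt8c", "name": "roof", "value": "gable"},
            ],
        }],
    }],
}

buildings = [fields_by_name(b) for b in iter_entries(tree, "building")]
print(buildings)
```

With the real parse result you would presumably pass its first element (the top-level 'file' dict) instead of tree.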
If you need to rename any of the field names, you can add that behavior to the parse action.
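As a hedged illustration of that idea, the renaming can be kept in a small helper that works on the dict built by the parse action. FIELD_RENAMES and rename_fields are hypothetical names, and the mapping itself is made up for the example.

```python
# Hypothetical mapping of old field name -> new field name.
FIELD_RENAMES = {"usage": "purpose"}


def rename_fields(entry_dict, renames=FIELD_RENAMES):
    """Return a copy of entry_dict whose 'name' is replaced if it is listed."""
    renamed = dict(entry_dict)
    old_name = entry_dict["name"]
    # names that are not in the mapping (including None) are kept as-is
    renamed["name"] = renames.get(old_name, old_name)
    return renamed


before = {"type": "stringt8c", "name": "usage", "value": "residential"}
after = rename_fields(before)
print(after)
```

Inside convert_entry_to_dict, this would amount to returning rename_fields(ret) instead of ret. Because the parse action runs for every entry, nested entries would already be renamed by the time their parent is processed.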
This should give you a good start on processing your markup.
