Pyparsing: writing to an external file

Update (I have figured out a bit more...)

So my goal is to write a parser for a script that is in an odd format: XML-like, but not XML.

<[file][][]
<[cultivation][][]
    <[string8][coordinate_system][lonlat]>
    <[list_vegetation_map_exclusion_zone][vegetation_map_exclusion_zone_list][]
    >
    <[string8][buildings_texture_folder][]>
    <[list_plant][plant_list][]
    >
    <[list_building][building_list][]
        <[building][element][0]
            <[vector3_float64][position][7.809637 46.182262 0]>
            <[float32][direction][-1.82264196872711]>
            <[float32][length][25.9434452056885]>
            <[float32][width][17.4678573608398]>
            <[int32][floors][3]>
            <[stringt8c][roof][gable]>
            <[stringt8c][usage][residential]>
        > ...

So far, I have this:

import pyparsing as pp
from pyparsing import (Combine, Literal, OneOrMore, Optional, SkipTo,
                       Suppress, Word, alphas, nums)

def toc_parser(file_path):
    # read the complete file into a variable
    with open(file_path, "r") as f:
        toc = f.read()
    parser = OneOrMore(Word(alphas))
    # ignore comments
    parser.ignore('//' + pp.restOfLine())
    # suppress the <> and [] delimiters
    klammern = Suppress("<")
    klammernzu = Suppress(">")
    eckig = Suppress("[")
    eckigzu = Suppress("]")
    element = Suppress("[element]")
    leer = Suppress("[]")

    # grammar:
    nameBuilding = "building"
    namePosition = "position"
    nameDirection = "direction"
    nameLength = "length"
    nameWidth = "width"
    nameFloors = "floors"
    nameRoof = "roof"
    nameUsage = "usage"

    buildingzahl = klammern + eckig + nameBuilding + eckigzu + element + eckig + Word(nums) + eckigzu
    pos = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + namePosition + eckigzu + eckig + Combine(Word(nums) + "." + Word(nums)) + Combine(Word(nums) + "." + Word(nums)) + Word(nums) + eckigzu + klammernzu
    direc = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameDirection + eckigzu + eckig + Combine(Optional("-") + Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    leng = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameLength + eckigzu + eckig + Combine(Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    widt = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameWidth + eckigzu + eckig + Combine(Word(nums) + Optional("." + Word(nums))) + eckigzu + klammernzu
    floors = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameFloors + eckigzu + eckig + Word(nums) + eckigzu + klammernzu
    roof = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameRoof + eckigzu + eckig + Word(alphas) + eckigzu + klammernzu
    usag = klammern + eckig + SkipTo(Literal("]")) + eckigzu + eckig + nameUsage + eckigzu + eckig + Word(alphas) + eckigzu + klammernzu

    building = buildingzahl + pos + direc + leng + widt + floors + roof + usag + klammernzu

    file = klammern + eckig + Literal("file") + eckigzu + leer + leer + klammern + eckig + Literal("cultivation") + eckigzu + leer + leer
    vegexcl = Literal("<[list_vegetation_map_exclusion_zone][vegetation_map_exclusion_zone_list][]") + klammernzu
    coordsis = Literal("<[string8][coordinate_system][lonlat]>")
    textures = Literal("<[string8][buildings_texture_folder][]>")
    listPlants = Literal("<[list_plant][plant_list][]") + klammernzu
    listBuildings = Literal("<[list_building][building_list][]") + OneOrMore(building) + klammernzu
    listLights = Literal("<[list_light][light_list][]") + klammernzu
    listAirportLights = Literal("<[list_airport_light][airport_light_list][]") + klammernzu
    listXref = Literal("<[list_xref][xref_list][]") + klammernzu

    fileganz = file + coordsis + vegexcl + textures + listPlants + listBuildings + listLights + listAirportLights + listXref + klammernzu + klammernzu
    print(fileganz.parseString(toc))

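Question:

I need to be able to overwrite certain values in the external script, and I found (here) that this is roughly how you do it, but it always ends up in the "else" branch: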

#define Values to be updated
valuesToUpdate = {
    "building":"home"
    ""
    }

def updateSelectedDefinitions(tokens):
    if tokens.name in valuesToUpdate:
        newVal = valuesToUpdate[tokens.name]
        return "%" % tokens.name, newVal
    else:
        raise ParseException(print("no Update definded"))

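Thank you very much for your help :)

Answer:

Here is a quick first pass.

First, we should try to describe the format in words:

"Each entry is enclosed in '<>' characters and contains 3 values in '[]' characters, followed by zero or more nested entries. The 3 values in '[]' contain a data type, an optional name, and an optional value or values. These values can be numbers or strings, and may be parsed as scalar or list values depending on the data type."

Converting this to a quasi-BNF, where '*' is used for "zero or more":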

entry ::= '<' subentry subentry subentry entry* '>'
subentry ::= '[' value* ']'
value ::= number | alphanumeric word

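We can see that this is a recursive grammar, since an entry can contain elements that are themselves entries. So when we convert to pyparsing, we will define entry as a placeholder using a pyparsing Forward, and then define its structure after all the other expressions have been defined.

Converting this short BNF to pyparsing: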

import pyparsing as pp

# define some basic punctuation - useful at parse time, but we will
# suppress them since we don't really need them after parsing is done
# (we'll use pyparsing Groups to capture the structure that these 
# characters represent)
LT, GT, LBRACK, RBRACK = map(pp.Suppress, "<>[]")

# define our placeholder for the nested entry
entry = pp.Forward()

# work bottom-up through the BNF
value = pp.pyparsing_common.number | pp.Word(pp.alphas, pp.alphanums + "_")
subentry = pp.Group(LBRACK - value[...] + RBRACK)
type_name_value = subentry*3
entry <<= pp.Group(LT
                   - type_name_value("type_name_value")
                   + pp.Group(entry[...])("contents") + GT)

At this point, you can use entry to parse your sample text (after adding enough closing '>' characters to make it a valid nested expression):

result = entry.parseString(sample)
result.pprint()

Prints:

[[['file'],
  [],
  [],
  [[['cultivation'],
    [],
    [],
    [[['string8'], ['coordinate_system'], ['lonlat'], []],
     [['list_vegetation_map_exclusion_zone'],
      ['vegetation_map_exclusion_zone_list'],
      [],
      []],
     [['string8'], ['buildings_texture_folder'], [], []],
     [['list_plant'], ['plant_list'], [], []],
     [['list_building'],
      ['building_list'],
      [],
      [[['building'],
        ['element'],
        [0],
        [[['vector3_float64'], ['position'], [7.809637, 46.182262, 0], []],
         [['float32'], ['direction'], [-1.82264196872711], []],
         [['float32'], ['length'], [25.9434452056885], []],
         [['float32'], ['width'], [17.4678573608398], []],
         [['int32'], ['floors'], [3], []],
         [['stringt8c'], ['roof'], ['gable'], []],
         [['stringt8c'], ['usage'], ['residential'], []]]]]]]]]]]

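So this is a start. We can see that the values are being parsed, and that they are converted to the proper types (ints, floats, and strings).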
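As a quick check of that type conversion, here is a minimal sketch that runs pyparsing_common.number on two of the values from the sample. The variable names are only for illustration.

```python
import pyparsing as pp

# the same number expression used for `value` in the grammar
number = pp.pyparsing_common.number

# an integer literal is converted to int, a decimal literal to float
int_value = number.parseString("3")[0]
float_value = number.parseString("-1.82264196872711")[0]

print(type(int_value).__name__, int_value)
print(type(float_value).__name__, float_value)
```

The conversion happens inside the expression's own parse action, so no extra casting is needed later on.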

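To convert these pieces into a more coherent structure, we can attach a parse action to entry. This is a parse-time callback that runs each time an entry is parsed.

In this case, we will write a parse action that processes the type/name/value triple and then captures the nested contents, if present. We will try to infer from the data type string how to structure the value or contents.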

def convert_entry_to_dict(tokens):
    # entry is wrapped in a Group, so ungroup to get the parsed elements
    parsed = tokens[0]

    # unpack data type, optional name and optional value
    data_type, name, value = parsed.type_name_value
    data_type = data_type[0] if data_type else None
    name = name[0] if name else None

    # save type and name in dict to be returned from the parse action
    ret = {'type': data_type, 'name': name}

    # if there were contents present, save them as the value; otherwise,
    # get the value from the third element in the triple (use the
    # parsed data type as a hint as to whether the value should be a 
    # scalar, a list, or a str)
    if parsed.contents:
        ret["value"] = list(parsed.contents)
    else:
        if data_type.startswith(("vector", "list")):
            ret["value"] = [*value]
        else:
            ret["value"] = value[0] if value else None
            if ret["value"] is None and data_type.startswith("string"):
                ret["value"] = ""

    return ret

entry.addParseAction(convert_entry_to_dict)

Now when we parse the sample, we get this structure:

[{'name': None,
  'type': 'file',
  'value': [{'name': None,
             'type': 'cultivation',
             'value': [{'name': 'coordinate_system',
                        'type': 'string8',
                        'value': 'lonlat'},
                       {'name': 'vegetation_map_exclusion_zone_list',
                        'type': 'list_vegetation_map_exclusion_zone',
                        'value': []},
                       {'name': 'buildings_texture_folder',
                        'type': 'string8',
                        'value': ''},
                       {'name': 'plant_list',
                        'type': 'list_plant',
                        'value': []},
                       {'name': 'building_list',
                        'type': 'list_building',
                        'value': [{'name': 'element',
                                   'type': 'building',
                                   'value': [{'name': 'position',
                                              'type': 'vector3_float64',
                                              'value': [7.809637,
                                                        46.182262,
                                                        0]},
                                             {'name': 'direction',
                                              'type': 'float32',
                                              'value': -1.82264196872711},
                                             {'name': 'length',
                                              'type': 'float32',
                                              'value': 25.9434452056885},
                                             {'name': 'width',
                                              'type': 'float32',
                                              'value': 17.4678573608398},
                                             {'name': 'floors',
                                              'type': 'int32',
                                              'value': 3},
                                             {'name': 'roof',
                                              'type': 'stringt8c',
                                              'value': 'gable'},
                                             {'name': 'usage',
                                              'type': 'stringt8c',
                                              'value': 'residential'}]}]}]}]}]
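Once the data is in this shape, plain Python is enough to look things up. The following is a small sketch, not part of the parser itself: iter_entries is a hypothetical helper, and the tree literal is a shortened stand-in for the parsed result shown above.

```python
def iter_entries(entry):
    """Yield this entry dict and every nested entry dict, depth first."""
    yield entry
    value = entry["value"]
    if isinstance(value, list):
        for child in value:
            # vectors hold plain numbers; only dicts are nested entries
            if isinstance(child, dict):
                yield from iter_entries(child)


# shortened stand-in for the structure produced by the parse action
tree = {
    "type": "list_building", "name": "building_list",
    "value": [{
        "type": "building", "name": "element",
        "value": [
            {"type": "vector3_float64", "name": "position",
             "value": [7.809637, 46.182262, 0]},
            {"type": "int32", "name": "floors", "value": 3},
        ],
    }],
}

floors = [e["value"] for e in iter_entries(tree) if e["name"] == "floors"]
print(floors)
```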

If you need to rename any of the field names, you can add that behavior to the parse action.
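One way that renaming might look is sketched below. It is a cut-down variant of the parse action above, not a drop-in replacement: FIELD_RENAMES and its single mapping are made up for illustration, the triple is unpacked by position instead of by results name, and the data-type hints are left out to keep it short.

```python
import pyparsing as pp

LT, GT, LBRACK, RBRACK = map(pp.Suppress, "<>[]")
entry = pp.Forward()
value = pp.pyparsing_common.number | pp.Word(pp.alphas, pp.alphanums + "_")
subentry = pp.Group(LBRACK - value[...] + RBRACK)
entry <<= pp.Group(LT - subentry * 3 + pp.Group(entry[...]) + GT)

# hypothetical mapping of parsed names to the names we want to emit
FIELD_RENAMES = {"usage": "purpose"}


def entry_to_dict(tokens):
    # the Group holds the three bracketed subentries plus the nested contents
    data_type, name, value, contents = tokens[0]
    name = name[0] if name else None
    return {
        "type": data_type[0] if data_type else None,
        # look the name up in the mapping, falling back to the parsed name
        "name": FIELD_RENAMES.get(name, name),
        "value": list(contents) if contents else list(value),
    }


entry.addParseAction(entry_to_dict)

sample = "<[building][element][0] <[stringt8c][usage][residential]> >"
result = entry.parseString(sample)[0]
print(result)
```

Keeping the mapping in a dict outside the parse action means the grammar stays untouched when the list of renames changes.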

This should give you a good start on processing your markup.
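For the original goal of overwriting values and writing the result back out, one possible approach is to edit the parsed dicts and then serialize them again. The sketch below assumes the dict layout produced by the parse action above; dump_entry and format_value are hypothetical helpers, and the indentation style is guessed from the sample. Note that this dict layout does not keep the third bracketed value of an entry that has nested contents (such as the 0 in [building][element][0]), so the sketch writes an empty '[]' there.

```python
def format_value(value):
    """Render a scalar or a flat list as the text inside the third '[]'."""
    if value is None:
        return ""
    if isinstance(value, list):
        return " ".join(str(item) for item in value)
    return str(value)


def dump_entry(entry, depth=0):
    """Serialize one entry dict, and any nested entries, back to text."""
    pad = "    " * depth
    value = entry["value"]
    head = f"{pad}<[{entry['type'] or ''}][{entry['name'] or ''}]"
    # a list made up only of dicts (or an empty list) is treated as nested contents
    is_container = isinstance(value, list) and all(
        isinstance(item, dict) for item in value)
    if not is_container:
        return f"{head}[{format_value(value)}]>"
    lines = [f"{head}[]"]
    lines.extend(dump_entry(child, depth + 1) for child in value)
    lines.append(f"{pad}>")
    return "\n".join(lines)


building = {
    "type": "building", "name": "element",
    "value": [
        {"type": "int32", "name": "floors", "value": 3},
        {"type": "stringt8c", "name": "roof", "value": "gable"},
    ],
}

# overwrite one value, then serialize the modified structure
building["value"][1]["value"] = "flat"
text = dump_entry(building)
print(text)
```

The resulting string can then be written to a file with the usual open(path, "w") and write calls.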
