Using a Perl look-ahead assertion to find individual lists

Given a list like this:

direct_SQL_statement ::=
  directly_executable_statement semicolon

directly_executable_statement ::=
    direct_SQL_data_statement
  | SQL_schema_statement
  | SQL_transaction_statement
  | SQL_connection_statement
  | SQL_session_statement
  | direct_implementation_defined_statement

direct_SQL_data_statement ::=
    delete_statement__searched
  | direct_select_statement__multiple_rows
  | insert_statement
  | update_statement__searched
  | truncate_table_statement
  | merge_statement
  | temporary_table_declaration

direct_implementation_defined_statement ::=
  "!! See the Syntax Rules."

apostrophe ::=
  "'"
/*
5.2     token and separator

Function

Specify lexical units (tokens and separators) that participate in SQL language.


Format
*/
token ::=
    nondelimiter_token
  | delimiter_token

identifier_part ::=
    identifier_start
  | identifier_extend
/*
identifier_start ::=
  "!! See the Syntax Rules."
identifier_extend ::=
  "!! See the Syntax Rules."
*/
large_object_length_token ::=
  digit  multiplier

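Is it possible to use Perl's look-ahead assertions to break it into individual definition lists?

I tried

perl -0777ne 'print "$&\n^^\n\n" while /(?=\w+\s*::=)\w+\s*::=\s*.+/gs;'

but it just returned the whole thing (as if the look-ahead assertion were not working at all), while

perl -0777ne 'print "$&\n^^\n\n" while /(?=\w+\s*::=)\w+\s*::=\s*.+?/gs;'

comes out too short: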

direct_SQL_statement ::=
  d
^^

directly_executable_statement ::=
    d
^^

direct_SQL_data_statement ::=
    d
^^

direct_implementation_defined_statement ::=
  "
^^

I need to break it into individual BNF definition blocks for further processing, e.g., for the initial test data:

direct_SQL_statement ::=
  directly_executable_statement semicolon
^^


directly_executable_statement ::=
    direct_SQL_data_statement
  | SQL_schema_statement
  | SQL_transaction_statement
  | SQL_connection_statement
  | SQL_session_statement
  | direct_implementation_defined_statement
^^


direct_SQL_data_statement ::=
    delete_statement__searched
  | direct_select_statement__multiple_rows
  | insert_statement
  | update_statement__searched
  | truncate_table_statement
  | merge_statement
  | temporary_table_declaration
^^


direct_implementation_defined_statement ::=
  "!! See the Syntax Rules."
^^

Notes:

The above output is from the initial test data.
The whole A ::= B thing is called a BNF definition. The "^^" is only there as a visual indication of whether the separation was done correctly.
apostrophe and the token below it are distinct BNF definitions and should be treated as such.
The /* ... */ comments should be filtered out of the output.
Comments may come without empty lines surrounding them. That is why I need to rely on the look-ahead assertion instead of paragraph mode.
This question is a follow-up to "How can EBNF or BNF be parsed?", whose solution is "W3C EBNF doesn't end a production with a semicolon because a ::= operator comes after the LHS symbol of a new production."
The whole file can be found at github.com/ronsavage/SQL/blob/master/sql-2016.ebnf

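Answer:

The question was edited, and now there are comments, /* ... */, to be omitted.

With comments (/* ... */) that may need to be omitted:

perl -0777 -wnE'say for m{(.*?::=.*?)\n (?: \n+ | (?:/\*.*?\*/) | \z)}gsx' bnf.txt

This captures a line with ::= and everything that follows it, up to a newline followed by either more newlines, a /*...*/ comment, or the end of the string.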
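To make the pieces of that pattern easier to see, here is the same regex spread out with /x comments in a small script. This is only a sketch: the sub name capture_definitions and the sample text are made up for illustration. Two things to be aware of: a capture that comes right after a comment starts with a newline (trimmed below), and a comment that follows a blank line is not consumed as a terminator, so it would end up at the start of the next capture.

```perl
use strict;
use warnings;

# Hypothetical helper name; the pattern is the one-liner's, spread out with /x.
sub capture_definitions {
    my ($text) = @_;
    my @defs = $text =~ m{
        ( .*? ::= .*? )    # capture: rule name, the ::= operator, lazy body
        \n                 # the newline that ends the body
        (?: \n+            # followed by one or more further newlines,
          | /\* .*? \*/    # or by a comment, consumed but not captured,
          | \z             # or by the end of the string
        )
    }gsx;
    s/\A\s+// for @defs;   # a capture right after a comment starts with a newline
    return @defs;
}

# Made-up sample with a comment that directly follows a definition.
my $sample = <<'END';
apostrophe ::=
  "'"
/*
5.2     token and separator
*/
token ::=
    nondelimiter_token
  | delimiter_token
END

print "$_\n^^\n\n" for capture_definitions($sample);
```

The comment is matched only as a terminator, outside the capture group, which is why it does not show up in the output.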

Or, first remove the comments and then split on multiple newlines:

perl -0777 -wnE's{ (?: /\* .*? \*/ ) }{\n}gsx; say for split /\n\n+/;' bnf.txt
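If this has to live in a script rather than a one-liner, the same two steps could look roughly like the sketch below. The sub name split_definitions and the sample text are hypothetical, and it assumes that "/*" and "*/" never appear inside a quoted terminal in the grammar.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the BNF definition blocks found in $text.
# Assumption: "/*" and "*/" never occur inside quoted terminals.
sub split_definitions {
    my ($text) = @_;
    # Step 1: replace every /* ... */ comment with a newline.
    $text =~ s{/\*.*?\*/}{\n}gs;
    # Step 2: split on runs of two or more newlines; drop empty pieces.
    my @blocks = grep { /\S/ } split /\n{2,}/, $text;
    # Trim the trailing newline the last block usually carries.
    s/\s+\z// for @blocks;
    return @blocks;
}

# Made-up sample: one comment glued to its neighbours, one blank-line break.
my $sample = <<'END';
apostrophe ::=
  "'"
/*
5.2     token and separator
*/
token ::=
    nondelimiter_token
  | delimiter_token

identifier_part ::=
    identifier_start
  | identifier_extend
END

print "$_\n^^\n\n" for split_definitions($sample);
```

Removing the comments first also gets rid of commented-out definitions, so they cannot be mistaken for real ones.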

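The original post, which reads the file in paragraph mode, follows. It no longer seems suitable after the question was edited, since comments can now "connect" two definitions, which then are no longer separate paragraphs.

If there is always an empty line separating the blocks of interest, the file can be processed in paragraphs:

perl -00 -wne'print' file

This retains the empty lines, which you seem to want to keep anyway. If not, they can be removed.

(Then, curiously, one can even simply do perl -00 -pe'1' file.)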
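Inside a script, the counterpart of the -00 switch is setting $/ to the empty string. A minimal sketch, with a hypothetical sub name read_paragraphs and a made-up in-memory sample standing in for the file:

```perl
use strict;
use warnings;

# Hypothetical helper: reads all paragraphs from an open filehandle.
sub read_paragraphs {
    my ($fh) = @_;
    local $/ = "";           # paragraph mode, the script equivalent of -00
    my @paragraphs = <$fh>;
    chomp @paragraphs;       # in paragraph mode chomp removes all trailing newlines
    return @paragraphs;
}

# Made-up sample, read through an in-memory filehandle instead of a real file.
my $sample = "a ::=\n  b\n\n\nc ::=\n    d\n  | e\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my @paragraphs = read_paragraphs($fh);
close $fh;

print "$_\n^^\n\n" for @paragraphs;
```

As with the one-liner, this only separates definitions that have an empty line between them.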

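Otherwise, one can break that string on multiple newlines:

perl -0777 -wnE'@chunks = split /\n\n+/; say for @chunks' file

or, if you really only need to print them,

perl -0777 -wnE'say for split /\n\n+/' file

The empty lines between the blocks are now removed.

I see no reason for a look-ahead:

perl -0777 -wnE'say for /(.+?::=.*?)\n(?:\n+|\z)/gs' file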
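That said, if a look-ahead is wanted, the problem in the question's attempts is where it sits. A look-ahead at the front only asserts what the pattern goes on to match anyway. Then, under /s, a greedy .+ runs to the end of the input, while a lazy .+? with nothing after it stops after one character. Putting the look-ahead at the end gives the lazy match something to stop at. The sketch below is one way to do it, not taken from the one-liners above. The sub name definitions_by_lookahead is made up, and it assumes rule names start in column one, bodies are indented, and "/*" never appears inside a quoted terminal.

```perl
use strict;
use warnings;

# Hypothetical helper: splits $text into definitions using a trailing look-ahead.
# Assumptions: rule names start at the beginning of a line, rule bodies are
# indented, and "/*" and "*/" never occur inside quoted terminals.
sub definitions_by_lookahead {
    my ($text) = @_;
    $text =~ s{/\*.*?\*/}{\n}gs;    # drop the comments first
    my @defs;
    # Match from a rule name at the start of a line up to, but not including,
    # the next line that starts a rule, or up to the end of the string.
    while ($text =~ /^(\w+[ \t]*::=.*?)(?=^\w+[ \t]*::=|\z)/gms) {
        my $def = $1;
        $def =~ s/\s+\z//;          # trim trailing blank lines
        push @defs, $def;
    }
    return @defs;
}

# Made-up sample: no blank line between the two definitions at all.
my $sample = "a ::=\n    b\n  | c\nd ::=\n  e\n";
print "$_\n^^\n\n" for definitions_by_lookahead($sample);
```

Because the look-ahead consumes nothing, the next match starts exactly at the next rule name, so this does not depend on empty lines between definitions.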
