Parsing System (v0.7)
Ambiguous Symbols

GOLD, and therefore Goldie, allows grammars to have different symbols that share the same name as long as their SymbolType is different. For instance, in this grammar:

Foo = {Letter}+
<Foo> ::= Foo

According to GOLD, that is a perfectly legal grammar with two symbols named Foo: the first is a SymbolType.Terminal, and the second is a SymbolType.NonTerminal. This can cause problems: if you try to refer to a symbol named Foo, there are two possible symbols you might mean. The symbol Foo is ambiguous.

Because of this, Goldie considers the angle brackets on a nonterminal to be part of the symbol's name. So in the example above, the terminal symbol is named Foo, but the nonterminal is named <Foo>. In Goldie, a nonterminal's name always includes the surrounding angle brackets.

That usually solves the problem, but there are still cases where it doesn't help, such as these three:

<Foo> ::= '<Foo>'
<A> ::= 'EOL'
<B> ::= 'Error'
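
The above is a perfectly legal grammar according to GOLD, but it contains three ambiguities. In the first rule, for example, the name <Foo> could refer to either the nonterminal <Foo> or the terminal '<Foo>'.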
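
As a rough illustration of the <Foo> case, the sketch below models a symbol table in which two symbols share a name and differ only in type. The Symbol struct, the two-member SymbolType enum, and the helper functions are hypothetical stand-ins written for this example; they are not Goldie's actual types or API.

```d
// Hypothetical stand-ins for illustration only; NOT Goldie's real types.
enum SymbolType { Terminal, NonTerminal }

struct Symbol
{
    string name;
    SymbolType type;
}

// Symbols that the rule  <Foo> ::= '<Foo>'  would produce, plus <A>.
immutable Symbol[] symbols = [
    Symbol("<Foo>", SymbolType.NonTerminal), // left-hand side of the rule
    Symbol("<Foo>", SymbolType.Terminal),    // the quoted literal '<Foo>'
    Symbol("<A>",   SymbolType.NonTerminal),
];

// Count the symbols that match by name alone.
size_t countByName(const Symbol[] table, string name)
{
    size_t n = 0;
    foreach (s; table)
        if (s.name == name)
            ++n;
    return n;
}

// Count the symbols that match by both name and type.
size_t countByNameAndType(const Symbol[] table, string name, SymbolType type)
{
    size_t n = 0;
    foreach (s; table)
        if (s.name == name && s.type == type)
            ++n;
    return n;
}

// A name is ambiguous when more than one symbol carries it.
bool isAmbiguous(const Symbol[] table, string name)
{
    return countByName(table, name) > 1;
}

unittest
{
    assert(isAmbiguous(symbols, "<Foo>"));  // two symbols share this name
    assert(!isAmbiguous(symbols, "<A>"));   // only one symbol has this name
    // Adding the type to the lookup pins down exactly one symbol.
    assert(countByNameAndType(symbols, "<Foo>", SymbolType.Terminal) == 1);
}
```

A lookup by name alone finds two candidates for <Foo>, while a lookup by name and type finds exactly one.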

To solve these problems:

If you're checking tokens by looking at their symbol name (i.e., Token.name), you'll need to remember to check that the symbol type (i.e., Token.type) is also what you expect.

If you're passing a symbol name into a Goldie function or template, Goldie will raise an error if the symbol name is ambiguous. This is a compile-time error for static-style and a run-time Exception for dynamic-style.

If you get such an "ambiguous symbol" error, you will need to disambiguate. For now, there is no built-in way to disambiguate for dynamic-style; you'll have to do it manually by checking the token's type. For static-style, suppose you have the following in your code:

// Ex #1
Token_mylang!("<Foo>") // Which <Foo>? Terminal or nonterminal?
// Ex #2
Token_mylang!("<Foo>", "subSymbolOfFoo") // Which <Foo>? Terminal or nonterminal?
// Ex #3
Token_mylang!("<FooParent>", "<Foo>", ";") // Which <Foo>? Terminal or nonterminal?

If <Foo> is ambiguous, you can disambiguate like this:

// Ex #1
Token_mylang!(SymbolType.NonTerminal, "<Foo>");
// or
Token_mylang!(SymbolType.Terminal, "<Foo>");
// Ex #2
// No need to change this. All parameters after the first string are sub-symbols,
// and *only* a SymbolType.NonTerminal can ever have sub-symbols.
Token_mylang!("<Foo>", "subSymbolOfFoo")
// Ex #3
Token_mylang!("<FooParent>", Token_mylang!(SymbolType.NonTerminal, "<Foo>"), ";")
// or
Token_mylang!("<FooParent>", Token_mylang!(SymbolType.Terminal, "<Foo>"), ";")

Note that you can always use the disambiguated forms, even on symbols that are not ambiguous. Also note that a disambiguated form always refers to the *exact same type* as the plain form, not merely a separate-but-equivalent type:

// Grammar:
// "Start Symbol" = <Plus>
// <Plus> ::= '+'

// Terminal token
static assert(
    is( Token_mylang!"+" == Token_mylang!(SymbolType.Terminal, "+") )
);

// NonTerminal token
static assert(
    is( Token_mylang!"<Plus>" == Token_mylang!(SymbolType.NonTerminal, "<Plus>") )
);

// Rule token
static assert(
    is(
        Token_mylang!("<Plus>", "+") ==
        Token_mylang!("<Plus>", Token_mylang!(SymbolType.Terminal, "+"))
    )
);

// Or even:
static assert(
    is(
        Token_mylang!("<Plus>", "+") ==
        Token_mylang!("<Plus>", Token_mylang!"+")
    )
);
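
The "exact same type" behavior matches how D templates work in general: two spellings that resolve to the same template instantiation name the same type. The sketch below shows this with a hypothetical Tok class and TokOf alias templates. It is not Goldie's implementation. In particular, the inferType helper simply guesses the symbol type from the angle brackets, whereas Goldie reports an error for an ambiguous name.

```d
// Hypothetical stand-ins for illustration only; NOT Goldie's real types.
enum SymbolType { Terminal, NonTerminal }

// Hypothetical token type, parameterized on symbol type and name.
class Tok(SymbolType type, string name) {}

// Simplifying assumption: a name wrapped in angle brackets is a nonterminal,
// and anything else is a terminal. Runs at compile time via CTFE.
SymbolType inferType(string name)
{
    return name.length >= 2 && name[0] == '<' && name[$ - 1] == '>'
        ? SymbolType.NonTerminal
        : SymbolType.Terminal;
}

// Plain form: the symbol type is inferred from the name.
template TokOf(string name)
{
    alias TokOf = Tok!(inferType(name), name);
}

// Disambiguated form: the symbol type is given explicitly.
template TokOf(SymbolType type, string name)
{
    alias TokOf = Tok!(type, name);
}

// Both spellings alias the same instantiation, so they are the same type.
static assert(is(TokOf!"+" == TokOf!(SymbolType.Terminal, "+")));
static assert(is(TokOf!"<Plus>" == TokOf!(SymbolType.NonTerminal, "<Plus>")));

// Same name with a different symbol type is a different type.
static assert(!is(TokOf!(SymbolType.Terminal, "<Foo>") ==
                  TokOf!(SymbolType.NonTerminal, "<Foo>")));
```

Because both TokOf overloads are aliases for Tok!(type, name), the compiler sees a single type no matter which spelling is used.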