Parsing System (v0.5)

class Token

This is the main interface for processing parse trees.

See the explanation of Tokens vs Symbols.

module goldie.token

enum CommentType

None
Line
Block

module goldie.token

enum TokenToStringMode

Compact
Omit all whitespace, error and comment tokens.
CompactWithSpaces
Like Compact, but adds a space between each token.
default Smart
Like Compact, but adds a space between two tokens whenever the last character of the first token and the first character of the second token are both either alphanumeric or an underscore.
Full

Includes all whitespace, error and comment tokens.

Note: Doesn't currently work after the parse phase because whitespace, error and comment tokens are not currently preserved by the parser.
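
The spacing rule described for Smart can be illustrated with a small standalone sketch. This is not GoldieLib's implementation: joinSmart and isWordChar are hypothetical helper names, and the input is assumed to be the text of each token after whitespace, error and comment tokens have already been dropped.

```d
import std.ascii : isAlphaNum;

/// True for the characters the Smart rule cares about:
/// ASCII letters, digits and the underscore.
bool isWordChar(char c)
{
    return isAlphaNum(c) || c == '_';
}

/// Hypothetical helper (not part of GoldieLib): concatenates token texts,
/// inserting a single space only where the last character of the text so far
/// and the first character of the next token are both word characters.
string joinSmart(string[] tokens)
{
    string result;
    foreach (tok; tokens)
    {
        if (tok.length == 0)
            continue; // nothing to append, and no characters to compare

        if (result.length > 0 && isWordChar(result[$ - 1]) && isWordChar(tok[0]))
            result ~= ' ';

        result ~= tok;
    }
    return result;
}
```

With this sketch, the token texts "int", "x", "=", "x1", "+", "_y", ";" join to "int x=x1+_y;": only the boundary between "int" and "x" has word characters on both sides, so only that boundary gets a space.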

module goldie.token

class Token

this(Symbol symbol, Token[] sub, Language lang, int ruleId)
Constructor for nonterminals. Normally, only GoldieLib itself needs to instantiate tokens, unless you want to create/modify a parse tree or Token array manually.
this(Symbol symbol, Language lang, string content, string file="{unknown}", int line=0, int srcIndexStart=0, int srcIndexEnd=0, CommentType commentMode=CommentType.None, string debugInfo="")
Constructor for terminals. Normally, only GoldieLib itself needs to instantiate tokens, unless you want to create/modify a parse tree or Token array manually.
readonly property Language lang
The Language this Token is associated with.
Token[] subX
The sub-tokens of this Token (if this is a nonterminal).
readonly property int ruleId
If this Token is a nonterminal, then this is the ID of the reduction rule that was used to create the token. This ID is an index into Language.ruleTable.
SymbolType type
The SymbolType of this Token. See the explanation of Tokens, Symbols, and Symbol Types for more information.
Symbol symbol
The Symbol of this Token. See the explanation of Tokens, Symbols, and Symbol Types for more information.
readonly property string typeName
The name of this Token's SymbolType.
readonly property string name
The name of this Token's Symbol.
readonly property string fullName
This is just typeName ~ "." ~ name.
DEPRECATED readonly property int id
This has been removed. Use symbol.id instead.
readonly property int line
readonly property string file

The file and line number of the original source where this Token starts. See Goldie's conventions relating to Line and Column Numbers.

For the line number where this Token ends, or the column number where this Token starts or ends, use srcIndexStart or srcIndexEnd together with Lexer.lineIndicies and Lexer.lineAtIndex.
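
As a rough illustration of how a source index such as srcIndexStart or srcIndexEnd maps to a line and column, here is a standalone sketch that does not use GoldieLib at all. The names lineStarts, locate and LineCol are hypothetical, and the exact shape of Lexer.lineIndicies and Lexer.lineAtIndex is not assumed. The sketch numbers lines and columns from zero and counts columns in code units, which may differ from Goldie's own line and column conventions.

```d
/// Zero-based line and column of a position in the source.
struct LineCol
{
    size_t line; // zero-based line number
    size_t col;  // offset from the start of that line, in code units
}

/// Hypothetical helper: returns the source index at which each line starts.
/// The first line always starts at index 0.
size_t[] lineStarts(string source)
{
    size_t[] starts = [0];
    foreach (i, char c; source)
    {
        if (c == '\n')
            starts ~= i + 1; // the next line begins right after the newline
    }
    return starts;
}

/// Hypothetical helper: binary-searches for the last line start that is
/// less than or equal to srcIndex, then derives the column from it.
LineCol locate(const size_t[] starts, size_t srcIndex)
{
    assert(starts.length > 0, "expected at least one line start");

    size_t lo = 0;
    size_t hi = starts.length;
    while (hi - lo > 1)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (starts[mid] <= srcIndex)
            lo = mid;
        else
            hi = mid;
    }
    return LineCol(lo, srcIndex - starts[lo]);
}
```

For the source "ab\ncde\n\nf", the line starts are 0, 3, 7 and 8, so index 4 (the character 'd') is located at line 1, column 1.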

readonly property int srcIndexStart
readonly property int srcIndexEnd
readonly property int srcLength
The locations (zero-indexed) in the original source where this Token starts and ends, and the difference between them.
readonly property CommentType commentMode
Indicates whether or not this token exists inside a comment and, if so, what type of comment.
readonly property Token firstLeaf
readonly property Token lastLeaf
The first and last terminals in this Token. If this Token isn't a nonterminal, then these both just return this.
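
The idea behind firstLeaf and lastLeaf can be sketched on a minimal stand-in tree. Node below is a hypothetical type, not GoldieLib's Token, and the behavior for a nonterminal with no sub-tokens (returning the node itself) is an assumption of this sketch rather than documented Goldie behavior.

```d
/// Minimal stand-in for a parse-tree node (hypothetical, not GoldieLib's Token).
class Node
{
    string text; // content, used by terminals
    Node[] sub;  // children, used by nonterminals

    /// Creates a terminal.
    this(string text) { this.text = text; }

    /// Creates a nonterminal from its children.
    this(Node[] sub) { this.sub = sub; }

    /// Follows the first child repeatedly until a node without children is reached.
    Node firstLeaf()
    {
        return sub.length == 0 ? this : sub[0].firstLeaf();
    }

    /// Follows the last child repeatedly until a node without children is reached.
    Node lastLeaf()
    {
        return sub.length == 0 ? this : sub[$ - 1].lastLeaf();
    }
}
```

For a tree built as new Node([new Node([a]), plus, b]), where a, plus and b are terminals, firstLeaf returns a and lastLeaf returns b, while calling either on plus returns plus itself.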
readonly property string debugInfo
A place for extra debugging information to be stored.
bool matches(string parentSymbol, string[] subSymbols...)

Determines whether the Token matches (i.e., was created from) a particular reduction rule.

Example:

    // Did this token come from this reduction rule?
    //   <Add Exp> ::= <Add Exp> '+' <Mult Exp>
    bool checkToken(Token tok)
    {
        return tok.matches("<Add Exp>", "<Add Exp>", "+", "<Mult Exp>");
    }

string toString()
string toString(TokenToStringMode mode)
string toStringCompact()
string toStringCompactWithSpaces()
string toStringSmart()
string toStringFull()

Converts the Token to a string that resembles the original source. See TokenToStringMode for descriptions of the different modes of conversion.

Note: Depending on the language and the chosen mode of conversion, the result might not be valid code in the original language, or may have subtly changed meaning. Not all modes of conversion are suitable for all purposes or all languages. Depending on the language or purpose, it may be that none of these are appropriate and you'll have to create a string by walking the Token tree manually. These functions are merely provided as a convenience.

semitwist.treeout.TreeNode toTreeNode()

TreeNode is a type from SemiTwist D Tools that provides an easy way to convert a tree to a text format such as JSON or XML.

Example: To convert a Token to JSON:

    import semitwist.treeout;

    string tokenToJSON(Token tok, bool prettyPrint)
    {
        if(prettyPrint)
            return tok.toTreeNode().format(formatterPrettyJSON);
        else
            return tok.toTreeNode().format(formatterTrimmedJSON);
    }

Note, however, that if you want to use the resulting JSON in JsonViewer and get JsonViewer's enhanced source-viewing features, you'll need to add a few things to the returned TreeNode before formatting it to a string. See the source of the Parse tool for an example.

module {user-specified package}.token

class Token_{languageName}!{symbol}

{languageName} = Name of static-style language
{symbol} = [ SymbolType staticSymbolType, ] string staticName=null

This type is for tokens representing a specific Symbol in a static-style language.

This is a templated type. Instantiation example:

    // Assume the language is named "calc"

    // For a SymbolType.Terminal symbol named "Ident":
    // These are the SAME type:
    Token_calc!"Ident"
    Token_calc!(SymbolType.Terminal, "Ident")

    // For a SymbolType.NonTerminal symbol named "<Add Exp>"
    // These are the SAME type (but different from the above types):
    Token_calc!"<Add Exp>"
    Token_calc!(SymbolType.NonTerminal, "<Add Exp>")

    // All the above share common base-types: Token_calc and Token.

    // This only shares a common base-type of Token
    // (since it's from a different language).
    Token_anotherCalc!"Ident"

The two-parameter form is needed if there are two Symbols with the same name.

Attempting to instantiate a Token_{languageName}!{symbol} with a symbol that doesn't exist in the language will result in a compile-time error.

static enum string StringOf

This is a workaround for DMD Bug #1748.

Evaluates to Token_{languageName}!(SymbolType.{symbolType}, "{symbolName}").

Example:

    void showStringOf(Token_foo!"Ident" tok)
    {
        // Output: Token_foo!(SymbolType.Terminal, "Ident")
        writeln(typeof(tok).StringOf);
    }

module {user-specified package}.token

class Token_{languageName}!{rule}

{languageName} = Name of static-style language
{rule} = string staticName, ( int staticRuleId | subTokenTypes... )

Nonterminals have one Token_{languageName}!{rule} for each rule that can create them.

This is a templated type. See Static And Dynamic Styles: Types and Inheritance for an explanation of how it works.

Instantiation example:

    // Assume the language is named "calc"

    // These three are all the SAME type, and
    // are for a nonterminal Token created from
    // this reduction rule:
    //   <Add Exp> ::= <Add Exp> '+' <Mult Exp>
    Token_calc!("<Add Exp>", "<Add Exp>", "+", "<Mult Exp>")
    Token_calc!("<Add Exp>", "<Add Exp>", Token_calc!(SymbolType.Terminal, "+"), "<Mult Exp>")
    Token_calc!("<Add Exp>", ruleIdOf_calc!("<Add Exp>", "<Add Exp>", "+", "<Mult Exp>"))

    // This is a different type, but shares the common
    // base-class of Token_calc!"<Add Exp>" with the above:
    Token_calc!("<Add Exp>", "<Mult Exp>")

    // This is another different type, but the only base-types this
    // one shares with the above are Token_calc and Token (because
    // it has a different reduction symbol, ie the first argument).
    Token_calc!("<Mult Exp>", "<Negate Exp>")

    // The only base-type this shares with any of the above is Token,
    // since it's from a different language:
    Token_anotherCalc!("<Add Exp>", "<Mult Exp>")

    // Use null to refer to a rule that has no sub-tokens, such as in this:
    //   <OptionalHello> ::= 'Hello'
    //                    |
    Token_foo!("<OptionalHello>", "Hello") // First rule
    Token_foo!("<OptionalHello>", null)    // Second rule

See also the documentation on Ambiguous Symbols.

Attempting to instantiate a Token_{languageName}!{rule} with a rule that doesn't exist in the language will result in a compile-time error.

this(Token[] sub, Language lang)
Constructor. Normally, only GoldieLib itself needs to instantiate tokens, unless you want to create/modify a parse tree or Token array manually.
readonly property Token_{languageName}!{symbol} sub(int index)

Type-safe static-style counterpart to Token.subX.

Sample usage: myToken.sub!2

Example:

    // Assume the language "calc":
    //   <Mult Exp> ::= <Mult Exp> '*' <Negate Exp>
    //               |  <Mult Exp> '/' <Negate Exp>
    //               |  <Negate Exp>
    //   <Negate Exp> ::= '-' <Value>
    //                 |  <Value>
    void foo(Token_calc!("<Mult Exp>", "<Mult Exp>", "*", "<Negate Exp>") tok)
    {
        // The third subtoken is known (even at compile-time) to be
        // a <Negate Exp>. The others are also known.
        // These are actually checked at compile-time.
        // If you get them mixed up, you'll get a type-mismatch error when compiling.
        Token_calc!"<Negate Exp>" negateTok = tok.sub!2;
        Token_calc!"<Mult Exp>" multTok = tok.sub!0;
        Token_calc!"*" operatorTok = tok.sub!1;

        // Determine exact type of the <Negate Exp> subtoken:
        // Can't know this at compile-time because it depends on
        // the actual code that was parsed.
        if( cast(Token_calc!("<Negate Exp>", "-", "<Value>")) negateTok )
        {
            writeln("negateTok came from: <Negate Exp> ::= '-' <Value>");
        }
        else if( cast(Token_calc!("<Negate Exp>", "<Value>")) negateTok )
        {
            writeln("negateTok came from: <Negate Exp> ::= <Value>");
        }
        else
            writeln("Forgot to handle some other rule!");
    }

static enum string StringOf

This is a workaround for DMD Bug #1748.

Evaluates to Token_{languageName}!(SymbolType.NonTerminal, "{symbolName}", ...).

Example:

    void showStringOf(Token_calc!("<Negate Exp>", "-", "<Value>") tok)
    {
        // Output: Token_calc!(SymbolType.NonTerminal, "<Negate Exp>", ...)
        writeln(typeof(tok).StringOf);
    }