Parser (package)

This package provides the parsing functionality of Splicer and contains the following classes:

TM1RuleParser - implements the custom actions that the generated parser calls to build the object tree representing a parsed TM1 rule.
ArgumentParser - implements CLI parsing. Splicer.py uses it to run the requested action: it instantiates TM1SpliceExecutor and calls the method that corresponds to the action given on the command line.

`TM1RuleParser` (class in `TM1RuleParser.py`)

This class translates a parsed rule into in-memory objects that represent the parsed syntax. Methods whose names start with make_ or add_ match the custom actions defined in TM1RuleGrammar.peg (described later). These methods are called while a TM1 rule is being parsed and receive their data from the parser defined in TM1RuleGrammar.py.

Representation of Parse Tree

Let’s explain how the parser stores a parsed rule in memory using the following example.

Note: The code below is simplified rule content meant to demonstrate parsing and the resulting in-memory representation; it is not a functional rule.

SKIPCHECK;
FEEDSTRINGS;
#Expand-Area-Definition(ACT_YEARS, id_ACT_YEARS):'T Year':@mdx

#Region ACT_YEARS
[ 'T Month':'T Month':'M00', 'T Year':'T Year':'2023' ] = 
  N:0;
#EndRegion ACT_YEARS

This rule is parsed by the TM1RuleParser method get_parse_tree, and the result is returned as an instance of the TM1Splice.Parser.Grammar.TM1RuleGrammar.TreeNode class. The class has an elements property, which is a list of parsed tokens. A simplified representation of the parsed rule is shown in the picture below.
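To make the nesting of elements lists concrete, here is a minimal sketch that walks such a tree depth first. `Node` is a hypothetical stand-in for the generated TreeNode class (only `text` and `elements` are modelled), and `dump_tree` is an illustrative helper, not part of Splicer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for the generated TreeNode (text + elements)."""
    text: str
    elements: List["Node"] = field(default_factory=list)


def dump_tree(node, depth=0):
    """Return one indented line per node, visiting children depth first."""
    lines = ["  " * depth + repr(node.text)]
    for child in node.elements:
        lines.extend(dump_tree(child, depth + 1))
    return lines


# A tiny tree: the root covers the whole text, children cover single statements.
root = Node("SKIPCHECK;\nFEEDSTRINGS;\n", [
    Node("SKIPCHECK;\n"),
    Node("FEEDSTRINGS;\n"),
])
print("\n".join(dump_tree(root)))
```

In the real parse tree, entries produced by custom actions may be other Python objects rather than nodes, so a helper like this would need to handle those as well.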

It is important to understand how the parser accesses the parsed tokens in the custom actions. Let’s demonstrate this with the following example:

The following simplified code shows the PEG rules that the custom action methods implemented in TM1RuleParser refer to.

DirDimElemPairList      <- DirDimElemPair DirDimElemPairs* %add_dim_to_map
DirDimElemPairs         <- _ "," _ DirDimElemPair %add_dim_to_map2
DirDimElemPair          <- DimIdentifier _ ":" _ DirElementList %make_dim_elem_pair
DimIdentifier           <- "'" [^\'\"\@\<\!\[\]\^\*\>\=\\\|\?\/\;\,\~\%\&\:\n\r]+ "'" %make_ident

The parser generated by Canopy exposes all tokens through an elements list. This list is always passed as an argument to a custom action method and contains tokens indexed according to the PEG parsing rule that triggered the custom action.

For a better understanding, let’s look at how the first parsing rule determines the content of the elements list when the add_dim_to_map method is called. Note: This method is called once the parser has parsed one or more dimension/element pairs.

    def add_dim_to_map(self, input, start, end, elements):
        dim_map = dict()
        if elements[0] and elements[1]:
            for dim, elem in [*[elements[0]], *(elements[1].elements)]:
                dim_map[dim] = elem
        return dim_map

As the method header and implementation show, the code reads the DirDimElemPair content through elements[0]. Index 0 corresponds to the position of the DirDimElemPair non-terminal symbol in the rule: it is the first symbol parsed, and since the elements list is zero-based, its index is 0. Note that the content becomes available only after the DirDimElemPair non-terminal symbol has been fully expanded, that is, after the parser has expanded all non-terminal symbols down to the terminal symbols it reads from the input rule.

Similarly, DirDimElemPairs is the second symbol in the rule, so its index is 1. Note that DirDimElemPairs is followed by * in the PEG rule, which indicates zero or more repetitions of the non-terminal symbol. That is why we access elements[1].elements: we want to read the content of every occurrence of the DirDimElemPairs non-terminal symbol that has been parsed.

A non-terminal symbol by definition needs to be expanded further, so we can follow the parsing logic of DirDimElemPair down to the DimIdentifier rule. As the rule itself shows, once DirDimElemPair has been parsed, the parser calls the custom method make_dim_elem_pair.
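The indexing described above can be tried out without the generated parser. The following is a minimal sketch: `Repetition` is a hypothetical stand-in for the node the parser builds for `DirDimElemPairs*`, the function is the same logic as add_dim_to_map written without `self`, and the shape of the input tuples (a dimension name with a list of hierarchy/element pairs) is an assumption based on the other custom actions.

```python
class Repetition:
    """Hypothetical stand-in for the node built for `DirDimElemPairs*`."""

    def __init__(self, elements):
        self.elements = elements


def add_dim_to_map(input, start, end, elements):
    # Same logic as the method above, written as a plain function.
    dim_map = dict()
    if elements[0] and elements[1]:
        # elements[0] is the first pair, elements[1].elements holds the rest.
        for dim, elem in [elements[0], *elements[1].elements]:
            dim_map[dim] = elem
    return dim_map


# Assumed results of earlier custom actions for
# 'T Month':'T Month':'M00', 'T Year':'T Year':'2023'
first = ("T Month", [("T Month", "M00")])
rest = Repetition([("T Year", [("T Year", "2023")])])

dim_map = add_dim_to_map("", 0, 0, [first, rest])
print(dim_map)
```

When only one pair is present, `rest.elements` is simply empty and the map contains a single dimension.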

    def make_dim_elem_pair(self, input, start, end, elements):
        if type(elements[4]) is list:
            pairs = []
            for (hierarchy, element) in elements[4]:
                pairs.append((hierarchy if hierarchy else elements[0], element))
            return (elements[0], pairs)
        else:
            if not elements[4][0]:
                elements[4] = (elements[0], elements[4][1])
            return (elements[0], [elements[4]])

Again, the custom action method header follows the same pattern as in the previous case. The elements list gives access to all tokens parsed during the expansion of DirDimElemPair. In this case, elements[0] holds the DimIdentifier content, and elements[1] holds any whitespace characters that follow the dimension identifier (if there are none, elements[1] is empty). Next, elements[2] holds the single character : that separates the element name from the dimension name. Similarly, elements[3] holds any whitespace characters that follow the : separator. Finally, elements[4] refers to the tokens collected while parsing the DirElementList non-terminal symbol.

The examples above followed the expansion of each non-terminal symbol, but the parser needs terminal symbols to decide which parsing rules apply. Let’s demonstrate this with the last rule the parser uses when reading a dimension name: the expansion of DimIdentifier. This rule can be read as follows: DimIdentifier is a string enclosed in single quotes that consists of at least one allowed character. If the parser reads such a token, it calls the custom action method make_ident. Let’s have a look at how the method is implemented:
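Both branches of make_dim_elem_pair can be exercised with a hand-built elements list. This is a sketch under assumptions: the whitespace and separator entries are plain strings here instead of parser nodes, the `(hierarchy, element)` tuples mimic what the element-list actions are described as returning, and `T Year Alt` is an invented hierarchy name.

```python
def make_dim_elem_pair(input, start, end, elements):
    # Same logic as the method above, written as a plain function.
    if type(elements[4]) is list:
        pairs = []
        for (hierarchy, element) in elements[4]:
            # Fall back to the dimension name when no hierarchy was given.
            pairs.append((hierarchy if hierarchy else elements[0], element))
        return (elements[0], pairs)
    else:
        if not elements[4][0]:
            elements[4] = (elements[0], elements[4][1])
        return (elements[0], [elements[4]])


# Layout follows: DimIdentifier _ ":" _ DirElementList
single = ["T Year", "", ":", "", (None, "2023")]
several = ["T Year", "", ":", "", [(None, "2022"), ("T Year Alt", "2023")]]

single_result = make_dim_elem_pair("", 0, 0, single)
several_result = make_dim_elem_pair("", 0, 0, several)
print(single_result)
print(several_result)
```

In both cases a missing hierarchy is replaced by the dimension name, so callers always receive complete hierarchy/element pairs.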

    def make_ident(self, input, start, end, elements):
        ident = ""
        for char in elements:
            ident = ident + char.text
        # ignore leading/trailing apostrophes
        ident = ident.strip("'")
        return ident

The method has the same header as in the cases above, but the elements list now contains only terminal symbols, as the rule for DimIdentifier suggests. Note: Terminal symbols are essentially the characters read from the source text being parsed.

That means the parser “consumes” terminal symbols up to the last matching character, which is the single quote that closes a dimension name used in an area statement or a directive definition. The consumed characters are available in the elements list. Again, we must refer to the matching PEG grammar rule to see how the characters read from the source map to individual entries in the list. In our case, let’s say we have parsed the following string:

'T Year'

The elements list will consist of terminal symbols as shown in the table below.

	`elements`
Index	`0`	`1`	`2`	`3`	`4`	`5`	`6`	`7`
`TreeNode.text`	`'`	`T`	(space)	`Y`	`e`	`a`	`r`	`'`

Since the parser represents terminal symbols as TreeNode objects, the actual content is stored in the text property of each TreeNode object in the elements list. To get the identifier, we therefore concatenate the text property of each member. As the very last step, we strip the leading and trailing single quotes, as they have no value for splicing.
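The concatenate-then-strip step can be checked in isolation. In this sketch, `Char` is a hypothetical stand-in for a TreeNode that holds one matched piece of text, and the function is an equivalent, more compact formulation of make_ident without `self`.

```python
class Char:
    """Hypothetical stand-in for a TreeNode holding matched text."""

    def __init__(self, text):
        self.text = text


def make_ident(input, start, end, elements):
    # Join the text of every node, then drop the enclosing single quotes.
    return "".join(node.text for node in elements).strip("'")


source = "'T Year'"
elements = [Char(c) for c in source]  # one node per character, as in the table
ident = make_ident(source, 0, len(source), elements)
print(ident)  # T Year
```

Because the texts are joined before stripping, the result is the same whether the name arrives as one node per character or as fewer, longer nodes.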

TM1RuleParser Properties and Methods

The class defines the following attributes:

Attribute	Type	Usage
`directives_resolver`	`TM1DirectiveResolver`	Instance of directive resolver used for splicing (https://apliqoux.atlassian.net/wiki/spaces/AFD/pages/2540863497) of the rule.
`regions`	`TM1RegionContainer`	Regions container for splicing of the rule.
`parser`	`Parser`	Instance of parser (Parser/Grammar (package)) to be used to parse the rule.
`current_rule_id`	`TM1RuleID`	Checksum of the current rule calculated from the rule source code (unused).
`generate_rule_id`	`bool`	Flag indicating whether to generate rule IDs or not.

The class exposes the following methods:

Method	Usage
`make_empty_line`	Custom action: Creates a `TM1RuleToken` instance representing an empty line.
`make_comment`	Custom action: Creates a `TM1RuleToken` instance representing a comment or a directive.
`make_command`	Custom action: Creates a `TM1RuleToken` or `TM1RuleCommand` instance based on type of parsed token. `TM1RuleCommand` instance will be created for calculation rule or a feeder rule (CalcRule non-terminal symbol in grammar), an instance of `TM1RuleToken` in other cases (`skipcheck`, `feeders`, `feedstrings`).
`begin_region`	Custom action: Creates instance of `TM1RegionStart` when `#Region` was parsed.
`end_region`	Custom action: Creates instance of `TM1RegionEnd` when `#EndRegion` was parsed.
`make_directive`	Custom action: Will call `directives_resolver`.`consume_directive` to create an instance of a new `TM1Directive` in the `directives_resolver`.`directives_by_scope` internal store when the parser parsed a directive.
`make_calc_rule`	Custom action: Creates a `TM1CalcRule` instance when a calculation rule statement was parsed.
`make_feeder_rule`	Custom action: Creates a `TM1CalcRule` instance when a feeder rule statement was parsed.
`make_area_statement`	Custom action: Creates a `TM1RuleAreaStatement` instance when an area statement of a calculation or a feeder rule was parsed. The instance is the `area_statement` property of the `TM1CalcRule` created by `make_calc_rule` or `make_feeder_rule`.
`make_ident`	Custom action: Returns a string representing an identifier parsed by the parser - for example it may be a dimension, hierarchy or element name.
`make_catalog_key`	Custom action: Returns a string representing an ID to use to retrieve MDX query from `fpm.json` `Catalog` object that is associated with a directive that has been parsed.
`make_region_definition`	Custom action: Returns a string representing name of a region after `#Region` or `#EndRegion` have been parsed.
`add_dim_to_map`	Custom action: Returns a dictionary of elements that follow a dimension name in a directive or in an area statement indexed by the dimension name.
`add_dim_to_map2`	Custom action: Returns a tuple containing a dimension and an element name.
`make_dim_elem_pair`	Custom action: Returns a tuple containing a dimension identifier and element list.
`make_hier_elem`	Custom action: Returns a hierarchy name (optional) and element name.
`ignore_dim_elem_pair`	Custom action: Returns an element name with no hierarchy.
`make_elem_list`	Custom action: Returns a list of element names.
`make_elem_list2`	Custom action: Returns a list of element names.
`make_mdx_query`	Custom action: Returns `@mdx` flag indicating unlimited splicing scope within a dimension/hierarchy.
`make_hex_footprint`	Custom action: Returns a hex footprint parsed from a placeholder before actual rule for which the checksum was calculated.
`make_rule_id`	Custom action: Returns an instance of `TM1RuleID` representing a checksum calculated for actual rule when the checksum has been parsed.
`splice_rule`	Returns a spliced rule as string. The method will first parse the rule, then it will retrieve parse tree and in turn it will run `convert` method on all objects included in the parse tree.
`get_rule`	Returns a rule as string. The method will parse the rule, then retrieve parse tree and run `get_text` method on all objects included in the parse tree.
`get_parse_tree`	Returns a parse tree, root element of the tree is a `TreeNode` instance representing entire rule file. The method runs `parser.parse()` to get the parse tree from the `Parser` instance.
`locate_region_subtrees`	Returns a list of parse subtrees that match a supplied region name.
`exclude_subtree`	Returns a modified parse subtree that will not contain a specified subtree that is removed.
`inject_subtree`	Returns a modified parse subtree that will contain a specified subtree that has been added to a location of supplied region.
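The difference between `get_rule` and `splice_rule` comes down to which method is called on each object in the parse tree. The sketch below illustrates that idea only: `PlainToken`, `UpperToken` and `render` are invented for the example, and upper-casing is a placeholder for whatever transformation a real `convert` performs.

```python
class PlainToken:
    """Hypothetical parsed object that is emitted unchanged."""

    def __init__(self, text):
        self.text = text

    def get_text(self):
        return self.text

    def convert(self):
        return self.text


class UpperToken(PlainToken):
    """Hypothetical parsed object whose converted form differs from its source."""

    def convert(self):
        return self.text.upper()


def render(objects, method_name):
    # Call the named method on every object and join the results.
    return "".join(getattr(obj, method_name)() for obj in objects)


objects = [PlainToken("SKIPCHECK;\n"), UpperToken("n:0;\n")]
original = render(objects, "get_text")  # analogous to get_rule
spliced = render(objects, "convert")    # analogous to splice_rule
print(original)
print(spliced)
```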

`ArgumentParser` (class in `ArgumentParser.py`)

The class implements command line parsing and is used by the main application defined in Splicer.py.

The class uses the argparse library, which is less flexible than the click library. Consider switching libraries to provide a better CLI user experience. The class might also help expose Splicer as a web service by providing basic parsing of web service headers.
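The pattern of mapping a CLI action to an executor method can be sketched with argparse as follows. The action names (`splice`, `show`), the `rule_file` argument and `DemoExecutor` are assumptions made for illustration; they are not the actual Splicer command line or the TM1SpliceExecutor interface.

```python
import argparse


class DemoExecutor:
    """Hypothetical stand-in for TM1SpliceExecutor; method names are invented."""

    def splice(self, rule_file):
        return f"splice {rule_file}"

    def show(self, rule_file):
        return f"show {rule_file}"


def build_parser():
    parser = argparse.ArgumentParser(prog="Splicer.py")
    # Restricting choices keeps the later getattr lookup safe.
    parser.add_argument("action", choices=["splice", "show"])
    parser.add_argument("rule_file")
    return parser


def run(argv):
    args = build_parser().parse_args(argv)
    executor = DemoExecutor()
    # Dispatch to the executor method named after the requested action.
    return getattr(executor, args.action)(args.rule_file)


print(run(["splice", "rule.txt"]))
```

Keeping the action names identical to the executor method names means a new action only needs a new method and one more entry in `choices`.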
