
Copper User Manual


Contents

1. %lex{
       terminal INT ::= /int/;
       terminal FLOAT ::= /float/;
       terminal String IDENTIFIER ::= /[a-z]+/ {: RESULT = lexeme; :};
       disambiguate ID_FLOAT:(FLOAT,IDENTIFIER) ::= FLOAT;
       disambiguate ID_INT:(INT,IDENTIFIER) ::= INT;
   %lex}

3.5 Context-free syntax blocks

Context-free syntax blocks are enclosed in the markers %cf{ and %cf}. They may include one declaration of a start symbol and any number of declarations of nonterminals, operator precedence relations, and productions. With very few exceptions, these take the same form as in CUP.

3.5.1 Nonterminal/start symbol declarations

Nonterminal declarations take the familiar form:

    non terminal [nttype] ntname1, ntname2, ...;

This declares one or more grammar nonterminals. If a type nttype is provided, the variable RESULT declared in the semantic action of any production with one of these nonterminals on its left-hand side will be of type nttype. If a type is not provided, the default is Object.

The declaration of a grammar's start symbol takes the self-explanatory form:

    start with ntname;

3.5.2 Operator precedence/associativity declarations

Operator precedence and associativity declarations take the familiar form:

    precedence (left|right|nonassoc) term1, term2, ...;

Terminals listed on the same line have identical operator precedence, while terminals listed on successi
2.      3.5.3 Production declarations
            3.5.3.1 Semantic actions
    3.6 User code blocks
        3.6.1 Auxiliary
        3.6.2 Initialization
    3.7 Parser attributes
4 Running Copper
    4.1 Requirements
    4.2 Command line
        4.2.1 Quick start
        4.2.2 Switches
            4.2.2.1 -q and -v
            4.2.2.2 -runv
            4.2.2.3 -package
            4.2.2.4 -parser
            4.2.2.5 -logfile
            4.2.2.6 -skin
            4.2.2.7 -engine
    4.3 Copper ANT task
    4.4 Grammar troubleshooting
        4.4.1 Heap overflow
        4.4.2 Conflicting definition
        4.4.3 Contradiction involving terminals
        4.4.4 Parse table conflict
        4.4.5 Lexical ambiguity
5 Running a Copper parser
    5.1 Requirements
    5.2 Constructors
    5.3 Parse methods
A CUP skin grammar
B Mini-Java example specification

Chapter 1
3. * Semantic action for ntname ::= RHS1 */ :} %prec termname %layout (term1, ...) | RHS2 ... ;

This form is identical in most respects to that used in CUP. The declaration starts with a nonterminal, giving the left-hand side of the productions to follow, followed by ::=. Then come one or more sequences of zero or more terminals and nonterminals (right-hand sides), separated by vertical bars. Each right-hand side may optionally have a semantic action and two attributes:

• Custom operator. As in CUP, adding the attribute %prec termname to a production will change the production's operator from the default (the last terminal on the right-hand side) to termname.
• Custom layout. The only bit of production syntax differing from CUP's, this allows specification of custom layout on productions. Adding the attribute %layout (term1, ..., termn) to a production will change the layout on that production from the universal layout set to the set {term1, ..., termn}.

3.5.3.1 Semantic actions

Semantic actions on productions are identical to those in CUP. Any right-hand-side symbols that have been labeled may be accessed inside the semantic action using the label name, as demonstrated in Algorithm 2.

3.6 User code blocks

3.6.1 Auxiliary

Auxiliary code is inserted in the body of the parser class. It is meant to hold fields and methods accessed by semantic actions and/or outside classes, such as additional constructors
4. Introduction

1.1 Copper in a nutshell

This manual contains instructions on how to use Copper, a Java-based LALR(1) parser generator with expanded parsing capability compared to the Java-based CUP (http://www2.cs.tum.edu/projects/cup) or the C-based Bison (http://www.gnu.org/software/bison).

Like CUP and Bison, Copper takes the specification of a formal grammar and generates from it a program (specifically, a Java class) that can parse the language of that grammar. However, unlike CUP and Bison, Copper gives you everything necessary to do so. Parsers from most generators require an external scanner, built by another tool, in order to work: a scanner generator such as JLex (http://www.cs.princeton.edu/~appel/modern/java/JLex), usually used with CUP, or Flex (http://flex.sourceforge.net), usually used with Bison. Copper generates both the scanner and the parser from a single specification and puts them in a single Java class; this integration enables Copper to parse a larger class of grammars than CUP or Bison.

This manual assumes a working knowledge of LR parsing; knowledge of CUP and JLex may also be helpful.

1.2 Example specification

Copper is designed to support several "skins," or different formats for input, to suit a wide range of grammar writers. There are two such skins: the "native" skin, meant for use with machine-generated grammar specifications, and a skin mimicking the input styles of JLex and CUP as closely as possible, me
5. termname ::= /regex/ in (terminal classes), < (submit list), > (dominate list) {: semantic action :} %prefix prefixname;

This declares a terminal that is a member of all terminal classes on the list following "in", with submit and dominate lists containing at least the terminals provided on the lists following < and >, respectively, a semantic action returning a designated type, and a transparent prefix.

Submit and dominate lists may contain the names of terminal classes as well as the names of terminals. Placing a terminal class on the list is shorthand for placing all the members of that class on the list.

3.4.2.1 Semantic actions on terminals

Semantic actions on terminals work differently in Copper than in other scanner generators. While in JLex semantic actions are specified on regexes and return an object identifying the matched terminal, as described above, in Copper the semantic action is only run after it is certain what terminal has been matched. Therefore, the semantic actions of terminals take on an identical format to those of productions in CUP. A variable RESULT of the type specified by termtype (default is Object) is available inside the semantic action block; what is written to RESULT will be returned and is available to access in production semantic actions.

In JLex's semantic actions, the matching lexeme is referred to by the name yytext. In Copper, the name lexeme is used instead.

A
6. -1 * e; :} {: RESULT = e; :} {: RESULT = n; :};

for this grammar will recognize arithmetic expressions over integers using the four arithmetic operations as well as unary negation. Note that there are features of this specification that would not be found in grammar specifications written for traditional tools, such as two terminals sharing the same regex; these will be discussed in further detail below.

The structure of the rest of the manual is as follows. Chapter 2 discusses the novel features of Copper and how to utilize them. Chapter 3 contains an exhaustive list of the various components of a Copper specification. Chapter 4 contains information about running Copper, such as command-line syntax and how to interpret errors. Chapter 5 contains information about utilizing the generated parser. Appendix A contains the grammar of the CUP skin's concrete syntax, while Appendix B contains a more elaborate example of Copper's use, in the form of a grammar for the Mini-Java language from Andrew Appel's Modern Compiler Implementation in Java.

Chapter 2
Copper's parsing algorithms

2.1 Context-aware scanning

The most crucial difference between Copper and the standard LALR(1) parser generator is the addition of context-aware scanning. The typical scanner will scan through the input file and separate it into a stream of tokens, with no feedback from the parser. A scanner in Copper, by contrast, contains a distinct su
7. If the semantic action returns an object, that object (the token) is added to the stream of tokens being returned to the parser. If no object is returned, the regex is judged to represent what we call layout: parts of the input that are supposed to be invisible to the parser and have no meaning thereto. The most common forms of layout are whitespace and comments.

2.2.2 How Copper handles layout

Copper has a more sophisticated method for specifying and handling layout. While building a parser, Copper keeps track of which terminals have been designated to appear as layout in which contexts, and builds a sub-scanner for each parser state that scans only for the layout that is valid at that location. Then, each time the parser calls out to the scanner for a new token, it will scan first using the layout sub-scanner, since many tokens of layout (series of spaces, comment blocks, etc.) may be present. Then, when no more layout is present, it will scan using the sub-scanner for non-layout tokens.

2.2.3 How to specify layout in Copper

2.2.3.1 Universal layout

The simplest sort of layout in Copper is universal layout, which should suit the needs of most grammar writers. To designate a terminal as universal layout, simply place the modifier "ignore" in front of its declaration, as is done on the terminal WS in Algorithm 1. This has a similar effect to giving a terminal no semantic action in Lex or JLex, with the important exception tha
8. more than one terminal (e.g., the group x, y, and z), and this ambiguity cannot be resolved through dominate and submit lists, the scanner will check whether a disambiguation function has been specified for x, y, and z. If so, it will execute the function, which takes in the matched lexeme and returns exactly one of x, y, and z.

One use for disambiguation functions is the typename/identifier ambiguity occurring when parsing C: typenames and identifiers share a regex; if a name has been defined as a type using a typedef, it is scanned as a typename, and otherwise it is scanned as an identifier. This ambiguity may be resolved with a disambiguation function specified for typenames and identifiers, which returns "typename" if the lexeme is on a list of typenames and "identifier" otherwise.

A disambiguation group is a special case of the disambiguation function: instead of a function returning a terminal, it simply specifies the terminal to return.

N.B.: Disambiguation functions and groups are context-sensitive; i.e., if there is a disambiguation group on terminals x and y specifying that terminal x should be returned, then in a context where only terminal y is valid, y will be matched.

2.4 Transparent prefixes

The concept of a transparent prefix can best be described by example. Suppose that in some grammar there was a terminal IntConst matching integers (regex [0-9]+) and a terminal FloatConst matching floati
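The typename/identifier case above can be sketched in plain Java. This is only an illustration of the logic a disambiguation function body might carry, not Copper's API: the class TypenameDisambiguator, its Terminal enum, and the typedefNames set are all hypothetical names introduced here, and in a real specification the body would sit inside a disambiguate declaration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; Copper generates its own terminal representation.
final class TypenameDisambiguator {
    enum Terminal { TYPENAME, IDENTIFIER }

    // Names introduced by typedef so far (assumed to be filled in by semantic actions).
    private final Set<String> typedefNames = new HashSet<>();

    void recordTypedef(String name) {
        typedefNames.add(name);
    }

    // The disambiguation function proper: takes the lexeme, returns exactly one terminal.
    Terminal disambiguate(String lexeme) {
        return typedefNames.contains(lexeme) ? Terminal.TYPENAME : Terminal.IDENTIFIER;
    }

    // Context sensitivity: the function is consulted only when more than one
    // of its terminals is valid in the current context.
    Terminal scan(String lexeme, Set<Terminal> validHere) {
        if (validHere.size() == 1) {
            return validHere.iterator().next();
        }
        return disambiguate(lexeme);
    }
}
```

The scan method mirrors the N.B. above: when only one of the terminals is valid in the current context, that terminal is returned without consulting the function.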
9. to guarantee that there is no lexical ambiguity in its scanners if certain compile-time checks pass. When any such checks fail, it is reported as a compilation error of this form:

    Danger at parser state(s) ...: Lexical ambiguity between/among tokens: ...

This means that the terminals in the given set are not on each other's dominate or submit lists, and there is no disambiguation function or group assigned to the set. There are three ways to resolve the ambiguity:

• Modify the dominate and submit lists of the set of terminals.
• Add a disambiguation function or group to disambiguate the set appropriately.
• Alter the context-free syntax so this set of terminals does not appear in the same context.

Chapter 5
Running a Copper parser

5.1 Requirements

To run a Copper parser, you need:

• Java Runtime Environment, v. 1.5 or greater.
• CopperRuntime.jar or CopperCompiler.jar on the classpath.

5.2 Constructors

No parameters need to be passed to a Copper parser on construction. It is possible to specify additional constructors in the auxiliary code; however, the constructor with no parameters cannot be specified in that manner.

5.3 Parse methods

Each parse engine provides several methods used to run the parser:

• parse(Reader input)
• parse(Reader input, String inputLabel)
• parse(Reader input, String inputLabel, edu.umn.cs.melt.copper.runtime.auxiliary.ErrorReporter
10. An auxiliary code block takes this form:

    %aux{
        code block
    %aux}

3.6.2 Initialization

Initialization code is inserted in the body of a method run when the parser is started. It is meant to hold initializations of parser attributes. An initialization code block takes this form:

    %init{
        code block
    %init}

3.7 Parser attributes

A parser attribute is a variable meant for use exclusively in semantic actions. Unlike fields specified in auxiliary code, parser attributes can be accessed neither from auxiliary code nor from outside classes. A parser attribute is declared as follows:

    %attr attrtype attrname;

Both type and name are mandatory.

Chapter 4
Running Copper

4.1 Requirements

To compile a Copper parser, you need:

• Java Runtime Environment, v. 1.5 or greater.
• 256MB of memory (512-768MB recommended if compiling large grammars).
• CopperCompiler.jar on the classpath.

4.2 Command line

Copper's full command-line syntax is:

    java -jar [location]/CopperCompiler.jar [-q|-v] [-runv] [-package packagename] [-parser classname] [-logfile file] [-skin (cup|native)] [-engine (stripped|moded)] specfile > parserfile

4.2.1 Quick start

The simplest usage of Copper is:

    java -jar [location]/CopperCompiler.jar specfile > parserfile

This command takes a grammar specification in specfile, written in the CUP skin, and compiles it to a parser class of th
11. Copper User Manual
August Schwerdfeger
Started March 12, 2008

Contents

1 Introduction
    1.1 Copper in a nutshell
    1.2 Example specification
2 Copper's parsing algorithms
    2.1 Context-aware scanning
    2.2 Specification of whitespace and other layout
        2.2.1 Layout in traditional tools
        2.2.2 How Copper handles layout
        2.2.3 How to specify layout in Copper
            2.2.3.1 Universal layout
            2.2.3.2 Layout per production
    2.3 Lexical precedence paradigm
        2.3.1 Dominate/submit lists
        2.3.2 Disambiguation functions/groups
    2.4 Transparent prefixes
3 Format of grammar specifications
    3.1 Comments and whitespace
    3.2 Preamble
    3.3 Parser name
    3.4 Lexical syntax blocks
        3.4.1 Terminal class declarations
        3.4.2 Terminal declarations
            3.4.2.1 Semantic actions on terminals
        3.4.3 Disambiguation functions/groups
    3.5 Context-free syntax blocks
        3.5.1 Nonterminal/start symbol declarations
        3.5.2 Operator precedence/associativity declarations
12. ant for use by flesh-and-blood grammar writers. This manual concerns itself exclusively with the latter skin.

Input to Copper consists, loosely, of preamble materials (package and import declarations, etc.), lexical syntax (terminal symbols and regexes, used to build the scanner), and context-free syntax (nonterminal symbols and productions, used to build the parser). Optionally, semantic actions may be supplied with productions and terminals.

For an example of a grammar specification written for the CUP skin of Copper, see Algorithms 1 (no semantic actions) and 2 (with semantic actions). The parser compiled

Algorithm 1 Recognizer for simple arithmetic grammar

    package math;
    /* This is a RECOGNIZER for a simple arithmetic language:
       it will give errors when invalid strings are entered,
       but takes no action on valid strings. */
    %%
    %parser ArithmeticParser
    /* Lexical syntax */
    %lex{
        /* Whitespace */
        ignore terminal WS ::= /[ ]*/;
        /* Grammar terminals */
        terminal PLUS ::= /\+/;
        terminal UNARY_MINUS ::= /-/;
        terminal BINARY_MINUS ::= /-/;
        terminal TIMES ::= /\*/;
        terminal DIVIDE ::= /\//;
        terminal LPAREN ::= /\(/;
        terminal RPAREN ::= /\)/;
        terminal NUMBER ::= /0|([1-9][0-9]*)/;
    %lex}
    /* Context-free syntax */
    %cf{
        /* Nonterminals */
        non terminal expr;
        /* Start symbol */
        start with expr;
        /* Precedences */
        precedence left PLUS, BINARY_MINUS;
        precedence left TIMES, DIVIDE;
        precedence
13. at static precedence disambiguator, scanner state ...: Contradiction involving terminals ... on graph: ...

This means that (1) there is a cyclic precedence relation among the listed terminals, i.e., there is no way to say that one of the terminals has the maximum precedence; and (2) they can all occur in the same context.

The precedence graph output with the contradiction gives the precedence relations among the terminal set. A 1 in row x and column y of the graph means that terminal x is on terminal y's submit list (takes precedence over y). For instance, the following graph:

    Vertices: x1, x2, x3
    Adjacency matrix:
             x1  x2  x3
        x1    0   0   1
        x2    1   0   0
        x3    0   1   0

means that x2 is on x1's submit list, x3 on x2's, and x1 on x3's.

4.4.4 Parse table conflict

As in a traditional parser generator, a parse table conflict occurs when two actions are placed in the same cell of the LR parse table; such a conflict is usually resolved by specifying precedence and associativity on terminals.

Unlike a traditional parser generator, however, Copper does not make any attempt to resolve such conflicts automatically, with one exception: using the CUP skin, reduce-reduce conflicts are automatically resolved by the order in which conflicting productions appear in the file, as is done in CUP. Any shift-reduce conflicts are reported as compilation errors; undefined behavior occurs if a parser compiled from a conflicting table is run.

4.4.5 Lexical ambiguity

Copper is able
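Condition (1) of the contradiction error, a cycle in the precedence graph, can be checked with an ordinary depth-first search over the adjacency matrix. The sketch below is an illustration only and makes no claim about how Copper implements the check; the class and method names are hypothetical, and condition (2), that the terminals share a context, is not modeled.

```java
// Hypothetical sketch of cycle detection on a precedence graph.
final class PrecedenceGraph {
    // adj[x][y] == true means terminal x is on terminal y's submit list.
    static boolean hasCycle(boolean[][] adj) {
        int n = adj.length;
        int[] color = new int[n]; // 0 = unvisited, 1 = on the DFS stack, 2 = finished
        for (int v = 0; v < n; v++) {
            if (color[v] == 0 && dfs(adj, v, color)) {
                return true;
            }
        }
        return false;
    }

    private static boolean dfs(boolean[][] adj, int v, int[] color) {
        color[v] = 1;
        for (int w = 0; w < adj.length; w++) {
            if (!adj[v][w]) {
                continue;
            }
            if (color[w] == 1) {
                return true; // back edge: w is still on the stack, so a cycle exists
            }
            if (color[w] == 0 && dfs(adj, w, color)) {
                return true;
            }
        }
        color[v] = 2;
        return false;
    }
}
```

Applied to the three-terminal example, where each terminal sits on the next one's submit list in a ring, hasCycle reports a cycle, which is why no terminal can be said to have maximum precedence.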
14. b-scanner for every state of the parser; scanner and parser work in lock-step, and for each token a different scanner will run, scanning only for those terminals that are valid syntax at that location. (If no such terminal is matched, the scanner will then scan for all terminals to procure information for an error message.)

This enables such constructs as the two minus terminals in the arithmetic grammar of Algorithm 1: UNARY_MINUS occurs before an expression, while BINARY_MINUS occurs between expressions. However, it also requires more careful planning of lexical syntax, as described in Sections 2.2 and 2.3.

2.2 Specification of whitespace and other layout

2.2.1 Layout in traditional tools

With a scanner generator such as Lex or JLex, the specification consists of a list of regexes, optionally with semantic actions attached:

    layout_regex  { /* No semantic action */ }
    regex1        { return tok(sym.TERM1, yytext()); }
    regex2        { return tok(sym.TERM2, yytext()); }
    regex3        { ... }

When the scanner runs on an input, it will check its list of regexes in downward order until it finds one that matches the head of the input. It will then dequeue the matching part of the input (the lexeme, which in Lex and JLex is stored in the variable yytext) and run the semantic action associated with that regex; it will then scan again at the point in the input immediately following that scan's lexeme.
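The traditional scanning loop described above can be sketched in a few lines of Java. This is a simplified model of the behavior as this manual describes it (rules tried in downward order, first match wins), not the actual algorithm of Lex or JLex; the class FirstMatchScanner and its methods are hypothetical, and a null terminal name stands in for a rule with no semantic action.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a first-match scanner with no parser feedback.
final class FirstMatchScanner {
    // A rule pairs a regex with a terminal name; a null name marks layout.
    private static final class Rule {
        final Pattern regex;
        final String terminal;

        Rule(String regex, String terminal) {
            this.regex = Pattern.compile(regex);
            this.terminal = terminal;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    FirstMatchScanner rule(String regex, String terminal) {
        rules.add(new Rule(regex, terminal));
        return this;
    }

    List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            boolean matched = false;
            for (Rule r : rules) { // downward order: the first matching rule is used
                Matcher m = r.regex.matcher(input).region(pos, input.length());
                if (m.lookingAt() && m.end() > pos) {
                    if (r.terminal != null) {
                        tokens.add(r.terminal + ":" + m.group()); // token reaches the parser
                    }
                    pos = m.end(); // dequeue the lexeme; layout is silently dropped
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("No rule matches at offset " + pos);
            }
        }
        return tokens;
    }
}
```

Because the rule list is fixed and the parser is never consulted, a single "-" rule here could not be split into UNARY_MINUS and BINARY_MINUS; that distinction is what the per-state sub-scanners of context-aware scanning provide.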
15. d engine would run only three (1, +, 2). On the other hand, on the input "1 + 2" (with spaces), both engines would run five. In this way the moded engine adheres most closely to the JLex convention.

4.3 Copper ANT task

In addition to the command-line interface, Copper provides an ANT task, extending the class org.apache.tools.ant.Task. See Table 4.1 for a list of the task bean's parameters and what switches they correspond to.

4.4 Grammar troubleshooting

In this section, five problems encountered when compiling grammars in Copper are discussed.

4.4.1 Heap overflow

The JVM is usually assigned a maximum heap size of 256MB. This is adequate for Copper when compiling smaller grammars, but inadequate for large ones. Therefore, if compilation terminates with an OutOfMemoryError, add a switch to the JVM allocating 512MB or 768MB:

    java -Xmx512m -jar ...

The -Xmx switch is nonstandard and subject to change without notice.

4.4.2 Conflicting definition

On occasion Copper will give an error of this form:

    Error at [file, line, col]: Conflicting definition involving [grammar construct]: multiple specifications of [attributes]

When compiling specifications in the CUP skin, this means that the same name has been given to two constructs. The invalid (duplicate) construct is at the given location.

4.4.3 Contradiction involving terminals

On occasion Copper will give an error of this form:

    Error
16. e package and class name specified in the specification itself; the source code of this parser class is then output to parserfile. The other settings are at defaults.

4.2.2 Switches

4.2.2.1 -q and -v

By default, when running, Copper outputs a series of progress indicators to standard error, detailing what phase of parser compilation it is in.

• The -q switch turns these off.
• The -v switch causes much debugging information (grammar descriptions, lexical precedence graphs, state-by-state descriptions of LR DFAs, parse tables, etc.) to be output to standard error. This switch is meant mainly for use by Copper developers.

4.2.2.2 -runv

Compiles the parser to output debugging information when run. Meant for use in debugging parsers.

4.2.2.3 -package

Specifies what package the output parser should be placed in. N.B.: Do not specify packages both on the command line and in the specification; this will cause an error when compiling the parser source.

4.2.2.4 -parser

Specifies what the name of the parser class should be. Overrides the name specified by a %parser directive.

4.2.2.5 -logfile

Specifies a file to which standard error should be redirected.

4.2.2.6 -skin

Chooses the skin to use when reading specfile. The default is cup.

4.2.2.7 -engine

Chooses the parsing engine (i.e., implementation of the parsing algorithm) for which the parser should be built. Options are:

• stripped: The default. Parsers built
17. eded package or import declarations, as well as any non-public classes to be included in the file. The preamble is terminated by the string %% alone on a line, as shown in Algorithm 1.

3.3 Parser name

The name of the parser class is provided by a line of the form

    %parser classname

occurring on the line directly after the %% ending the preamble.

3.4 Lexical syntax blocks

Lexical syntax blocks are enclosed in the markers %lex{ and %lex}. They may include any number of declarations of terminals, disambiguation functions, and disambiguation groups.

3.4.1 Terminal class declarations

For convenience, terminals may be formed together into non-disjoint sets known as terminal classes. Terminal classes are declared with a line of this form:

    class tclass1, tclass2, ...;

Note that such a line only declares the classes, as opposed to specifying which terminals a class contains. That is done in terminal declarations.

3.4.2 Terminal declarations

The simplest terminal declaration is of this form:

    terminal termname ::= /regex/;

This declares a terminal with a specified regex that is a member of no terminal classes, does not specify any precedence relations with other terminals (although another terminal may include it on its dominate or submit list), and does not have a transparent prefix or semantic action.

A terminal declaration specifying all optional attributes is of this form:

    [ignore] terminal [termtype]
18.     left UNARY_MINUS;
        expr ::= expr PLUS expr
               | expr BINARY_MINUS expr
               | expr TIMES expr
               | expr DIVIDE expr
               | UNARY_MINUS expr %layout ()
               | LPAREN expr RPAREN
               | NUMBER
               ;
    %cf}

Algorithm 2 Parser for simple arithmetic grammar

    package math;
    /* This is a PARSER for a simple arithmetic language:
       when run on a valid string, it will return the value
       of the expression represented by that string. */
    %%
    %parser ArithmeticParser
    /* Lexical syntax */
    %lex{
        /* Whitespace */
        ignore terminal WS ::= /[ ]*/;
        /* Grammar terminals */
        terminal PLUS ::= /\+/;
        terminal UNARY_MINUS ::= /-/;
        terminal BINARY_MINUS ::= /-/;
        terminal TIMES ::= /\*/;
        terminal DIVIDE ::= /\//;
        terminal LPAREN ::= /\(/;
        terminal RPAREN ::= /\)/;
        terminal Integer NUMBER ::= /0|([1-9][0-9]*)/
            {: RESULT = Integer.parseInt(lexeme); :};
    %lex}
    /* Context-free syntax */
    %cf{
        /* Nonterminals */
        non terminal Integer expr;
        /* Start symbol */
        start with expr;
        /* Precedences */
        precedence left PLUS, BINARY_MINUS;
        precedence left TIMES, DIVIDE;
        precedence left UNARY_MINUS;
        expr ::= expr:l PLUS expr:r          {: RESULT = l + r; :}
               | expr:l BINARY_MINUS expr:r  {: RESULT = l - r; :}
               | expr:l TIMES expr:r         {: RESULT = l * r; :}
               | expr:l DIVIDE expr:r        {: RESULT = l / r; :}
               | UNARY_MINUS expr:e          {: RESULT = -1 * e; :}
                 %layout ()
               | LPAREN expr:e RPAREN        {: RESULT = e; :}
               | NUMBER:n                    {: RESULT = n; :}
               ;
    %cf}
19. lgorithm 3 Terminal declaration example

    %lex{
        class keywords;
        terminal INT ::= /int/ in (keywords), < (), > ();
        terminal FLOAT ::= /float/ in (keywords), < (), > ();
        /* Return a String, the token's lexeme. */
        terminal String IDENTIFIER ::= /[a-z]+/ in (), < (keywords), > ()
            {: RESULT = lexeme; :};
    %lex}

Example. Consider a language with two keywords, INT and FLOAT, and identifiers defined as strings of one or more lowercase letters. This language is defined by the lexical syntax block in Algorithm 3.

3.4.3 Disambiguation functions/groups

A disambiguation function takes this form:

    disambiguate groupname:(term1,term2,term3,...)
    {:
        /* body of Java method returning one of term1, term2, ... */
    :};

A disambiguation group takes on this form:

    disambiguate groupname:(term1,term2,term3,...) ::= (one of term1, term2, ...);

Example. Consider once again the example from above. As specified there with dominate and submit lists, INT and FLOAT are reserved keywords, i.e., they cannot be used as identifiers, even in contexts where INT and FLOAT are invalid syntax.

Suppose that instead the strings "int" and "float" should only be interpreted as keywords in contexts where they are valid, and as identifiers everywhere else. Disambiguation groups may be used to implement this, as shown in Algorithm 4.

Algorithm 4 Disambiguation group example
20. nal x's dominate list is a list of terminals taking precedence over x, while x's submit list is a list of terminals over which x takes precedence. Formally, x is on y's submit list iff y is on x's dominate list; however, in the actual grammar specifications, one of these will do for both. For details of how to specify these lists in Copper, see Section 3.4.2.

N.B.: The precedence relation created by dominate and submit lists is intransitive; i.e., if terminal x is on terminal y's submit list, and y on z's, it does not follow that x is on z's submit list; x must be placed on that list explicitly in such a case.

N.B.: The precedence relation created by dominate and submit lists is context-insensitive; i.e., if terminal y is on terminal x's dominate list, then even in a sub-scanner that is scanning for x but not y, nothing matching y will match x.

2.3.2 Disambiguation functions/groups

The other kind of lexical precedence declarations in Copper are disambiguation functions and disambiguation groups.

A disambiguation function is a function (a Java method, in the case of Copper's implementation) specified for a set of terminals to disambiguate that particular set. It is meant as a second choice if dominate and submit lists do not fit the task. Disambiguation functions implement the second sort of statements described above.

A disambiguation function works as follows: if the input to the scanner at a given point matches the regex of
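The intransitivity note can be made concrete with a small Java model in which list membership is stored exactly as declared and no transitive closure is ever computed. This is a sketch for illustration only: the class LexicalPrecedence and its methods are hypothetical and do not reflect Copper's internal data structures.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of explicitly declared submit lists.
final class LexicalPrecedence {
    // submitLists.get(owner) holds the terminals placed on owner's submit list.
    private final Map<String, Set<String>> submitLists = new HashMap<>();

    void addToSubmitList(String owner, String member) {
        submitLists.computeIfAbsent(owner, k -> new HashSet<>()).add(member);
    }

    // Only explicit membership counts; nothing is inferred through other lists.
    boolean isOnSubmitList(String owner, String member) {
        return submitLists.getOrDefault(owner, Collections.<String>emptySet()).contains(member);
    }
}
```

With x on y's list and y on z's list, isOnSubmitList("z", "x") stays false until x is added to z's list explicitly, which is the behavior the note prescribes.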
21. ng-point numbers (regex [0-9]+(\.[0-9]+)?). Clearly any number without a decimal point matches both, so there is also a disambiguation group on the set {IntConst, FloatConst} specifying that IntConst should be returned. Now, in the absence of a decimal point, IntConst will be matched.

Now suppose that there must be some way for the user of the parser to indicate that a number without a decimal point is a floating-point number. This is done using a transparent prefix: a terminal FloatPrefix with regex "float" is defined and assigned to be the transparent prefix of FloatConst. Now the integer 214 would be entered as "214", while the floating-point 214 would be entered as "float 214". The "float" prefix is scanned and thrown away like layout (the parser never sees it, hence the "transparent"), but unlike layout, when it is scanned it produces a narrower context that allows FloatConst to be the only valid terminal.

N.B.: Never use transparent prefixes to disambiguate between two terminals that are on each other's dominate and submit lists; this does not work, due to context insensitivity.

Chapter 3
Format of grammar specifications

3.1 Comments and whitespace

Java-style comments (// followed by a comment and a newline, and comments enclosed in /* and */) are also recognized as layout in the CUP skin.

3.2 Preamble

The preamble is a block of Java code that will begin the parser source file to be output. It should contain any ne
22. nt also matches the regex given for identifiers.

In the operation of a scanner from a traditional scanner generator, as described above, no ambiguities are possible, because the regex list is gone through one at a time and the first one that matches is always used. Only if two terminals share the exact same regex is any further kind of disambiguation possible.

Lexical precedence on terminals is here defined as a relation that determines, whenever several terminals have a regex matching a certain lexeme, which terminal should match. The above approach mandates a linear order on terminals: each terminal must take a place on a line, and the terminal closest to the front of the line always matches.

Copper, on the other hand, allows a more generalized lexical precedence relation. Instead of putting terminals in a line, lexical precedence is specified in Copper by individual statements of one of the following forms:

1. "Terminal x has precedence over terminal y," or
2. "If an ambiguity occurs among terminals x, y, and z, return one of them."

Context-aware scanning, with its many sub-scanners scanning for restricted sets of regexes, eliminates most ambiguities and makes this scheme practical.

2.3.1 Dominate/submit lists

The primary sort of lexical precedence declarations used in Copper are dominate lists and submit lists, specified on terminals. They implement the first sort of statements listed above.

As the names might suggest, a termi
23. on the stripped engine run approximately three times slower than their CUP analogues.
• moded: Still an experimental engine; parsers run at speeds approaching that of CUP, but in compiling and running consume much more memory than those built on the stripped engine.

Table 4.1: ANT task parameters

    Task parameter   Type         Command-line equivalent   Notes
    compileVerbose   Boolean      -v
    engine           String       -engine
    fullClassName    String       -package and -parser      Fully qualified name sets package and class; unqualified name, class only
    input            Reader       None                      Input source of any kind
    inputLabel       String       None                      Name for source specified by input, e.g., a filename
    inputFile        String       specfile                  Input file by name
    output           PrintStream  None                      Output sink of any kind
    [unreadable]     [unreadable] None                      Name for sink specified by output, e.g., a filename
    outputFile       String       specfile                  Output file by name
    runVerbose       Boolean      -runv
    skin             String       -skin

N.B.: There are differences between the ways the two engines handle layout, with the following effect. The moded engine will run the semantic action on an optional layout terminal (one whose regex matches the empty string) only when the layout is nonempty, while the stripped engine will run it in all cases. For example, in the parser generated from the specification in Algorithm 1, on the input "1+2" (no spaces), the stripped engine would run five semantic actions (1, WS, +, WS, 2), but the mode
24. reporter)
• parse(String text)
• parse(String text, String inputLabel)
• parse(String text, String inputLabel, ErrorReporter reporter)

Each function returns an Object that is the RESULT of the last production reduced in the parse (i.e., the root of the parse tree) and throws the exceptions java.io.IOException and java.util.zip.DataFormatException.

The arguments to the methods are as follows:

• text is a string containing text to parse.
• input is a Reader containing text to parse.
• inputLabel (defaults to "<stdin>") is a label for input or text, as in Table 4.1.
• reporter is the logger that will be used to log any messages or errors that are output during the parse. This argument is only necessary when sending the text of errors to some output sink other than standard error, and will probably be altered in the near future; thus, for the moment, it remains undocumented.

Appendix A
CUP skin grammar

• Grammar of the CUP skin

Appendix B
Mini-Java example specification

• Appel's Mini-Java grammar, specified in CUP skin
25. t it does not per se prevent the parser from using that terminal in a non-layout capacity.

Any number of grammar-wide layout tokens may appear at the beginning and end of the input. They may also appear between any two input tokens, except where explicitly specified otherwise (see Section 2.2.3.2).

N.B.: In a traditional scanner, layout is always optional and therefore is specified as a nonempty regex (e.g., optional whitespace might be specified as "one or more spaces"). In Copper, layout is able to be made mandatory, and thus optional layout should be specified as a possibly empty regex (e.g., optional whitespace might be specified as "zero or more spaces"). Implemented in a naive manner, this arrangement would increase the number of scans performed when there was little to no whitespace in a file. In practice, Copper is able to scan for layout and non-layout concurrently.

2.2.3.2 Layout per production

Copper allows each production to override the grammar-wide layout and specify which terminals may appear as layout between the strings derived from symbols on its right-hand side.

An empty layout set may be explicitly specified, as is done on the production expr ::= UNARY_MINUS expr in Algorithm 1. In this example, spaces are not permitted between the negative sign and the expression it negates, although they are permitted inside the latter.

If instead layout sets contain one or more terminals, they behave similarly to uni
26. ve lines have successively higher precedence; e.g., in Algorithm 1, TIMES has a higher precedence than PLUS, while PLUS and BINARY_MINUS have equal precedence. All terminals on a line have the operator associativity specified on that line.

These precedences and associativities are used to resolve shift-reduce conflicts using the following logic:

• Operators for the shift and reduce actions are defined. The shift action's operator is the terminal that would be shifted. The reduce action's operator is the operator of the production that would be reduced. This is by default the last terminal on the right-hand side of the production (e.g., in NT NT NT NT), but this can be overridden (see the %prec attribute in the next section).
• If the two operators have different precedence, resolve the conflict in favor of the action whose operator has the higher precedence.
• If the two operators have the same precedence and the same associativity:
  - If the associativity is left, resolve in favor of the reduce action.
  - If the associativity is right, resolve in favor of the shift action.
  - If the associativity is nonassoc, remove both actions; the operator is meant to have its associativity defined through parentheses or in some other manner.
• Otherwise, report the conflict as unresolvable.

3.5.3 Production declarations

Production declarations take the form ntname sym1 label1 des
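The shift-reduce resolution logic listed in this section can be sketched in Java as follows. This is only an illustration of the rules as stated, not Copper's actual implementation: the names PrecedenceResolver, Operator, Assoc and Resolution are hypothetical, the numeric precedence levels are invented, and left associativity for PLUS, BINARY_MINUS and TIMES is an assumption made for the demonstration. Treating a missing operator as unresolvable is likewise an assumption.

```java
// Hypothetical sketch of precedence/associativity-based conflict resolution.
// None of these names come from Copper; they exist only for illustration.
class PrecedenceResolver {
    enum Assoc { LEFT, RIGHT, NONASSOC }
    enum Resolution { SHIFT, REDUCE, REMOVE_BOTH, UNRESOLVED }

    /** Precedence level (higher binds tighter) and associativity of one operator. */
    static final class Operator {
        final int precedence;
        final Assoc assoc;

        Operator(int precedence, Assoc assoc) {
            this.precedence = precedence;
            this.assoc = assoc;
        }
    }

    /**
     * shiftOp is the terminal that would be shifted; reduceOp is the operator
     * of the production that would be reduced.
     */
    static Resolution resolve(Operator shiftOp, Operator reduceOp) {
        // Assumption: an action without an operator cannot be resolved this way.
        if (shiftOp == null || reduceOp == null) {
            return Resolution.UNRESOLVED;
        }
        // Different precedence: the action with the higher-precedence operator wins.
        if (shiftOp.precedence != reduceOp.precedence) {
            return shiftOp.precedence > reduceOp.precedence
                    ? Resolution.SHIFT
                    : Resolution.REDUCE;
        }
        // Same precedence and same associativity: decide by associativity.
        if (shiftOp.assoc == reduceOp.assoc) {
            switch (shiftOp.assoc) {
                case LEFT:
                    return Resolution.REDUCE;
                case RIGHT:
                    return Resolution.SHIFT;
                default:
                    return Resolution.REMOVE_BOTH; // nonassoc
            }
        }
        // Otherwise the conflict is reported as unresolvable.
        return Resolution.UNRESOLVED;
    }

    public static void main(String[] args) {
        Operator plus = new Operator(1, Assoc.LEFT);         // assumed left-associative
        Operator binaryMinus = new Operator(1, Assoc.LEFT);  // same line as PLUS
        Operator times = new Operator(2, Assoc.LEFT);        // higher precedence

        System.out.println(resolve(times, plus));        // SHIFT
        System.out.println(resolve(plus, times));        // REDUCE
        System.out.println(resolve(binaryMinus, plus));  // REDUCE
    }
}
```

With these assumed levels, a pending reduction by a PLUS production yields to a TIMES lookahead, while a BINARY_MINUS lookahead at equal precedence causes the reduction to go ahead.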
27. versal layout in their designated contexts. For example, if the production expr ::= expr PLUS expr in Algorithm 1 had a layout terminal with regex "one or more underscores", the string 3 1___ __2 would be valid input, while the string 3 1 2 would not, because the space between "+" and "2" is not valid layout in that context. However, the empty string between "1" and "+" is valid, because the upcoming "+" could as well be any other arithmetic operator, and thus all the layout from every production representing an arithmetic operator is valid there.

The algorithm calculating what layout is valid where is quite complex, but rules of thumb when specifying layout are as follows:

• In any context where a terminal from the right-hand side of a certain production can be shifted (except the leftmost one), expect the layout of that production.
• In any context where a nonterminal from the right-hand side of a certain production can be reduced (except the rightmost one), expect the layout of that production.
• Layout specified on productions with zero or one symbols on the right-hand side is meaningless.

For details of how to implement layout per production in Copper, see Section 3.5.3.

2.3 Lexical precedence paradigm

It is possible for the languages of regexes to overlap, creating ambiguities in which several regexes match a given lexeme. The most common of these is the keyword-identifier ambiguity, where a language keyword such as i
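The overlap between regex languages described in this section can be illustrated with a small Java sketch built on java.util.regex. It does not use Copper itself: the class LexicalAmbiguityDemo, the terminal names WHILE, IDENTIFIER and INT, and their regexes are hypothetical choices made for the example. The sketch simply lists every terminal whose regex matches a whole lexeme, so a result with more than one entry corresponds to a lexical ambiguity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical demonstration of overlapping terminal regexes; not Copper code.
class LexicalAmbiguityDemo {
    // Assumed terminals, kept in declaration order.
    static final Map<String, Pattern> TERMINALS = new LinkedHashMap<>();
    static {
        TERMINALS.put("WHILE", Pattern.compile("while"));       // a keyword
        TERMINALS.put("IDENTIFIER", Pattern.compile("[a-z]+")); // lowercase names
        TERMINALS.put("INT", Pattern.compile("[0-9]+"));        // decimal integers
    }

    /** Returns the names of all terminals whose regex matches the entire lexeme. */
    static List<String> matchingTerminals(String lexeme) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Pattern> entry : TERMINALS.entrySet()) {
            if (entry.getValue().matcher(lexeme).matches()) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(matchingTerminals("while"));  // [WHILE, IDENTIFIER]
        System.out.println(matchingTerminals("whilst")); // [IDENTIFIER]
        System.out.println(matchingTerminals("42"));     // [INT]
    }
}
```

The lexeme "while" is matched by both the keyword regex and the identifier regex, which is the kind of overlap a lexical precedence scheme has to settle.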
