This is the home page for the ANTLR 3 space.
ANTLR 3 is the latest version of a language processing toolkit that was originally released as PCCTS in the mid-1990s. As was the case then, this release of the ANTLR toolkit advances the state of the art with its new LL(*) parsing engine. ANTLR (ANother Tool for Language Recognition) provides a framework for generating recognizers, compilers, and translators from grammatical descriptions. ANTLR grammatical descriptions can optionally include action code written in what is termed the target language (i.e. the implementation language of the source code artifacts generated by ANTLR).
When it was released, PCCTS supported C as its only target language, but through consulting with NeXT Computer, PCCTS supported C++ after 1994. Its immediate successor, ANTLR 2, supported Java, C# and Python in addition to C++. Although it is still in beta, ANTLR 3 has already demonstrated support for Java, C#, Objective-C, C, C++ and Ruby as target languages. As of July 2006, the Java target is complete and the C#, Objective-C, Ruby and C targets are nearly complete. Support for additional target languages, including C++, Perl6 and Oberon (yes, Oberon), is either expected or already in progress.
Put simply, ANTLR 3 generates the source code for language processing tools from a grammatical description. To this end, it is commonly categorised as a compiler generator or compiler compiler in the tradition of tools such as Lex/Flex and Yacc/Bison. ANTLR 3 can generate the source code for various tools that can be used to analyze and transform input in the language defined by the input grammar. The basic types of language processing tools that ANTLR can generate are Lexers (a.k.a. scanners, tokenizers), Parsers and TreeParsers (a.k.a. tree walkers, cf. visitors).
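To make the role of a lexer concrete, here is a minimal hand-written sketch in Java of what a scanner does: it turns a character stream into tokens and drops whitespace. This is not the code ANTLR generates; the class name `TinyLexerSketch`, the method `tokenize`, and the toy arithmetic input are assumptions chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration only; ANTLR-generated lexers are structured differently.
class TinyLexerSketch {

    /** Splits the input into NUMBER and operator tokens, discarding whitespace. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace never reaches the parser
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++; // consume one or more digits as a single token
                }
                tokens.add("NUMBER(" + input.substring(start, i) + ")");
            } else if ("+-*/".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [NUMBER(12), +, NUMBER(3), *, NUMBER(4)]
        System.out.println(tokenize("12 + 3*4"));
    }
}
```

A parser would then consume this token list rather than raw characters, which is the division of labour between the Lexers and Parsers described above.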
ANTLR 3 can save you time and resources by automating significant portions of the effort involved in building language processing tools. It is well established that generative tools such as compiler compilers have a major, positive impact on developer productivity. In addition, ANTLR v3 offers an improved analysis engine, significantly enhanced parsing strength via LL(*) parsing with arbitrary lookahead, vastly improved tree construction rewrite rules, and the simply fantastic AntlrWorks IDE.
Download and install ANTLR 3 from the ANTLR 3 page of the ANTLR website
Java:

    grammar SimpleCalc;

    tokens {
        PLUS  = '+' ;
        MINUS = '-' ;
        MULT  = '*' ;
        DIV   = '/' ;
    }

    @members {
        public static void main(String[] args) throws Exception {
            SimpleCalcLexer lex = new SimpleCalcLexer(new ANTLRFileStream(args[0]));
            CommonTokenStream tokens = new CommonTokenStream(lex);

            SimpleCalc parser = new SimpleCalc(tokens);

            try {
                parser.expr();
            } catch (RecognitionException e) {
                e.printStackTrace();
            }
        }
    }

    /*------------------------------------------------------------------
     * PARSER RULES
     *------------------------------------------------------------------*/

    expr    : term ( ( PLUS | MINUS ) term )* ;

    term    : factor ( ( MULT | DIV ) factor )* ;

    factor  : NUMBER ;

    /*------------------------------------------------------------------
     * LEXER RULES
     *------------------------------------------------------------------*/

    NUMBER      : (DIGIT)+ ;

    WHITESPACE  : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+ { $channel = HIDDEN; } ;

    fragment DIGIT : '0'..'9' ;

C#:

    grammar SimpleCalc;

    options {
        language=CSharp;
    }

    tokens {
        PLUS  = '+' ;
        MINUS = '-' ;
        MULT  = '*' ;
        DIV   = '/' ;
    }

    @members {
        public static void Main(string[] args) {
            SimpleCalcLexer lex = new SimpleCalcLexer(new ANTLRFileStream(args[0]));
            CommonTokenStream tokens = new CommonTokenStream(lex);

            SimpleCalc parser = new SimpleCalc(tokens);

            try {
                parser.expr();
            } catch (RecognitionException e) {
                Console.Error.WriteLine(e.StackTrace);
            }
        }
    }

    /*------------------------------------------------------------------
     * PARSER RULES
     *------------------------------------------------------------------*/

    expr    : term ( ( PLUS | MINUS ) term )* ;

    term    : factor ( ( MULT | DIV ) factor )* ;

    factor  : NUMBER ;

    /*------------------------------------------------------------------
     * LEXER RULES
     *------------------------------------------------------------------*/

    NUMBER      : (DIGIT)+ ;

    WHITESPACE  : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+ { $channel = HIDDEN; } ;

    fragment DIGIT : '0'..'9' ;

Objective-C: To be written. Volunteers?

    grammar SimpleCalc;

    options {
        language=ObjC;
    }

    OR : '||' ;

C: To be written. Volunteers?

    grammar SimpleCalc;

    options {
        language=C;
    }

    OR : '||' ;
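To give a feel for what the `expr`, `term` and `factor` rules of SimpleCalc mean operationally, the following is a hand-written recursive-descent recognizer in plain Java, with one method per rule. It is only a sketch under stated assumptions: it is not the code ANTLR emits, it scans characters directly instead of using a separate lexer, and the names `SimpleCalcSketch` and `recognizes` are made up for this illustration.

```java
// Hand-written sketch; NOT the code that ANTLR generates for SimpleCalc.
class SimpleCalcSketch {
    private final String input;
    private int pos = 0;

    SimpleCalcSketch(String input) {
        this.input = input;
    }

    /** Returns true if the whole input matches the expr rule. */
    static boolean recognizes(String text) {
        SimpleCalcSketch p = new SimpleCalcSketch(text);
        try {
            p.expr();
            p.skipWhitespace();
            return p.pos == p.input.length(); // reject trailing garbage
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // expr : term ( ( PLUS | MINUS ) term )* ;
    private void expr() {
        term();
        while (peekIs('+') || peekIs('-')) {
            pos++;
            term();
        }
    }

    // term : factor ( ( MULT | DIV ) factor )* ;
    private void term() {
        factor();
        while (peekIs('*') || peekIs('/')) {
            pos++;
            factor();
        }
    }

    // factor : NUMBER ;    NUMBER : (DIGIT)+ ;
    private void factor() {
        skipWhitespace();
        int start = pos;
        while (pos < input.length() && isDigit(input.charAt(pos))) {
            pos++;
        }
        if (pos == start) {
            throw new IllegalStateException("NUMBER expected at position " + pos);
        }
    }

    // fragment DIGIT : '0'..'9' ;
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private boolean peekIs(char c) {
        skipWhitespace();
        return pos < input.length() && input.charAt(pos) == c;
    }

    // Plays the role of the WHITESPACE rule: whitespace is ignored between tokens.
    private void skipWhitespace() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
    }

    public static void main(String[] args) {
        System.out.println(recognizes("1 + 2 * 3")); // prints true
        System.out.println(recognizes("1 + * 3"));   // prints false
    }
}
```

Each `( ... )*` in the grammar becomes a `while` loop, and each rule reference becomes a method call, which is the general shape of a recursive-descent recognizer.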
java org.antlr.Tool SimpleCalc.g
To be written. Volunteers?
    parser grammar SimpleCalc;

    @header {
        package antlr3.tutorial.simple;
    }

    OR : '||' ;
Construct | Description | Example |
---|---|---|
(...)* | Kleene closure - matches zero or more occurrences | LETTER DIGIT* - match a LETTER followed by zero or more occurrences of DIGIT |
(...)+ | Positive Kleene closure - matches one or more occurrences | ('0'..'9')+ - match one or more occurrences of a numerical digit; LETTER (LETTER\|DIGIT)+ - match a LETTER followed by one or more occurrences of either LETTER or DIGIT |
fragment | fragment in front of a lexer rule instructs ANTLR that the rule is only used as part of another lexer rule (i.e. it only builds a fragment of a recognized token) | fragment DIGIT : '0'..'9' ; NUMBER : (DIGIT)+ ('.' (DIGIT)+ )? ; |
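For readers who know regular expressions, the closure constructs in the table behave much like `*`, `+` and `?` in `java.util.regex`. The sketch below restates the table's examples as Java patterns. ANTLR lexer rules are not regular expressions, so this is an analogy only, and it assumes that LETTER means an ASCII letter (the table does not define it); the class name `ClosureSketch` is likewise hypothetical.

```java
import java.util.regex.Pattern;

// Regex analogies for the grammar constructs in the table; an illustration, not ANTLR code.
class ClosureSketch {
    // LETTER DIGIT* : a letter followed by zero or more digits
    static final Pattern LETTER_DIGITS = Pattern.compile("[A-Za-z][0-9]*");

    // LETTER (LETTER|DIGIT)+ : a letter followed by one or more letters or digits
    static final Pattern LETTER_THEN_SOME = Pattern.compile("[A-Za-z][A-Za-z0-9]+");

    // NUMBER : (DIGIT)+ ('.' (DIGIT)+ )? ; with DIGIT inlined as [0-9]
    static final Pattern NUMBER = Pattern.compile("[0-9]+(\\.[0-9]+)?");

    /** True only if the entire string matches the pattern. */
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(LETTER_DIGITS, "x"));     // true: zero digits is allowed
        System.out.println(matches(LETTER_THEN_SOME, "x"));  // false: at least one more character needed
        System.out.println(matches(NUMBER, "3.14"));         // true
        System.out.println(matches(NUMBER, "3."));           // false: digits must follow the dot
    }
}
```

Note how DIGIT never appears as a pattern of its own here; that mirrors the `fragment` modifier, where DIGIT only contributes to NUMBER and is never produced as a token by itself.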
To be written. Volunteers?
Sure, we'll use one from the examples-v3 distribution.