Five minute introduction to ANTLR 3

This is the home page for the ANTLR 3 space.

What is ANTLR 3?

ANTLR 3 is the latest version of a language processing toolkit that was originally released as PCCTS in the mid-1990s. As was the case then, this release of the ANTLR toolkit advances the state of the art with its new LL(*) parsing engine. ANTLR (ANother Tool for Language Recognition) provides a framework for generating recognizers, compilers, and translators from grammatical descriptions. These descriptions can optionally include action code written in the target language (i.e. the implementation language of the source code that ANTLR generates).

When it was released, PCCTS supported C as its only target language, but through consulting work with NeXT Computer, PCCTS supported C++ after 1994. Its immediate successor, ANTLR 2, supported Java, C# and Python in addition to C++. Although it is still in beta, ANTLR 3 has already demonstrated support for Java, C#, Objective-C, C, C++ and Ruby as target languages. As of July 2006, the Java target is complete and the C#, Objective-C, Ruby and C targets are nearly complete. Support for additional target languages, including C++, Perl6 and Oberon (yes, Oberon), is either expected or already in progress.

What does ANTLR 3 do?

Put simply, ANTLR 3 generates the source code for language processing tools from a grammatical description. For this reason it is commonly categorised as a compiler generator or compiler compiler, in the tradition of tools such as Lex/Flex and Yacc/Bison. ANTLR 3 can generate the source code for various tools that analyze and transform input in the language defined by the input grammar. The basic types of language processing tools that ANTLR can generate are Lexers (a.k.a. scanners, tokenizers), Parsers and TreeParsers (a.k.a. tree walkers, cf. visitors).
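To make the first of these tool types concrete, here is a minimal hand-written Java sketch of what a lexer does: it turns a sequence of characters into a list of typed tokens and discards whitespace. It is not ANTLR output and does not use the ANTLR runtime; the class `TinyLexerSketch`, its `Token` type and the token type names are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration only; this is not code generated by ANTLR.
final class TinyLexerSketch {

    // One token: a type name (assumed for this sketch) plus the matched text.
    static final class Token {
        final String type;
        final String text;

        Token(String type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "(" + text + ")";
        }
    }

    private static final String OPERATORS = "+-*/";
    private static final String[] OPERATOR_TYPES = {"PLUS", "MINUS", "MULT", "DIV"};

    // Splits the input into NUMBER and operator tokens; whitespace is skipped.
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int op = OPERATORS.indexOf(c);
            if (Character.isWhitespace(c)) {
                i++; // separates tokens but is never reported
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add(new Token("NUMBER", input.substring(start, i)));
            } else if (op >= 0) {
                tokens.add(new Token(OPERATOR_TYPES[op], String.valueOf(c)));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("12 + 3*4"));
    }
}
```

A parser would then consume this token list rather than raw characters, which is the division of labour that ANTLR's generated lexer and parser classes follow.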

Why should I use ANTLR 3?

Because it can save you time and resources by automating significant portions of the effort involved in building language processing tools. It is well established that generative tools such as compiler compilers have a major, positive impact on developer productivity. In addition, ANTLR v3's improved analysis engine, its significantly enhanced parsing strength via LL(*) parsing with arbitrary lookahead, its vastly improved tree construction rewrite rules and the availability of the simply fantastic ANTLRWorks IDE offer productivity benefits over other comparable generative language processing toolkits.

How do I use ANTLR 3?

1. Get ANTLR 3

Download and install ANTLR 3 from the ANTLR 3 page of the ANTLR website.

2. Run ANTLR 3 on a simple grammar

2.1 Create a simple grammar

Java

    grammar SimpleCalc;

    tokens {
        PLUS  = '+' ;
        MINUS = '-' ;
        MULT  = '*' ;
        DIV   = '/' ;
    }

    @members {
        public static void main(String[] args) throws Exception {
            SimpleCalcLexer lex = new SimpleCalcLexer(new ANTLRFileStream(args[0]));
            CommonTokenStream tokens = new CommonTokenStream(lex);

            // Released ANTLR 3.x versions name the generated parser class
            // SimpleCalcParser; adjust the class name if your version does.
            SimpleCalc parser = new SimpleCalc(tokens);

            try {
                parser.expr();
            } catch (RecognitionException e) {
                e.printStackTrace();
            }
        }
    }

    /*------------------------------------------------------------------
     * PARSER RULES
     *------------------------------------------------------------------*/

    expr    : term ( ( PLUS | MINUS ) term )* ;

    term    : factor ( ( MULT | DIV ) factor )* ;

    factor  : NUMBER ;

    /*------------------------------------------------------------------
     * LEXER RULES
     *------------------------------------------------------------------*/

    NUMBER  : (DIGIT)+ ;

    WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+ { $channel = HIDDEN; } ;

    fragment DIGIT : '0'..'9' ;

C#

    grammar SimpleCalc;

    options {
        language=CSharp;
    }

    tokens {
        PLUS  = '+' ;
        MINUS = '-' ;
        MULT  = '*' ;
        DIV   = '/' ;
    }

    @members {
        public static void Main(string[] args) {
            SimpleCalcLexer lex = new SimpleCalcLexer(new ANTLRFileStream(args[0]));
            CommonTokenStream tokens = new CommonTokenStream(lex);

            // Released ANTLR 3.x versions name the generated parser class
            // SimpleCalcParser; adjust the class name if your version does.
            SimpleCalc parser = new SimpleCalc(tokens);

            try {
                parser.expr();
            } catch (RecognitionException e) {
                Console.Error.WriteLine(e.StackTrace);
            }
        }
    }

    /*------------------------------------------------------------------
     * PARSER RULES
     *------------------------------------------------------------------*/

    expr    : term ( ( PLUS | MINUS ) term )* ;

    term    : factor ( ( MULT | DIV ) factor )* ;

    factor  : NUMBER ;

    /*------------------------------------------------------------------
     * LEXER RULES
     *------------------------------------------------------------------*/

    NUMBER  : (DIGIT)+ ;

    WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+ { $channel = HIDDEN; } ;

    fragment DIGIT : '0'..'9' ;

Objective-C

To be written. Volunteers?

    grammar SimpleCalc;

    options {
        language=ObjC;
    }

    OR : '||' ;

C

To be written. Volunteers?

    grammar SimpleCalc;

    options {
        language=C;
    }

    OR : '||' ;
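To see how the `expr`, `term` and `factor` rules of the Java and C# grammars fit together, it can help to read them as recursive-descent methods, one per rule. The following is a hand-written sketch under simplifying assumptions, not the code ANTLR generates: the class name `SimpleCalcRecognizerSketch` is hypothetical, it reads characters directly instead of a token stream, it approximates the `WHITESPACE` rule with `Character.isWhitespace`, and (unlike a bare call to `parser.expr()`) it insists that the whole input is consumed.

```java
// Hand-written recognizer for the SimpleCalc rules; illustration only.
final class SimpleCalcRecognizerSketch {
    private final String input;
    private int pos = 0;

    private SimpleCalcRecognizerSketch(String input) {
        this.input = input;
    }

    // Returns true when the entire input matches the expr rule.
    static boolean recognizes(String input) {
        SimpleCalcRecognizerSketch p = new SimpleCalcRecognizerSketch(input);
        try {
            p.expr();
            p.skipWhitespace();
            return p.pos == input.length();
        } catch (IllegalStateException syntaxError) {
            return false;
        }
    }

    // expr : term ( ( PLUS | MINUS ) term )* ;
    private void expr() {
        term();
        while (peek() == '+' || peek() == '-') {
            pos++;
            term();
        }
    }

    // term : factor ( ( MULT | DIV ) factor )* ;
    private void term() {
        factor();
        while (peek() == '*' || peek() == '/') {
            pos++;
            factor();
        }
    }

    // factor : NUMBER ;   with   NUMBER : (DIGIT)+ ;
    private void factor() {
        skipWhitespace();
        int start = pos;
        while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        if (pos == start) {
            throw new IllegalStateException("expected NUMBER at index " + pos);
        }
    }

    // Skips whitespace, then returns the next character, or '\0' at end of input.
    private char peek() {
        skipWhitespace();
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    private void skipWhitespace() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
    }
}
```

Because `term` is called from inside `expr`, multiplication and division are grouped before addition and subtraction, which is how the grammar encodes operator precedence without any explicit precedence table.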

2.2 Run ANTLR 3 on the simple grammar

java org.antlr.Tool SimpleCalc.g

2.3 Revisit the simple grammar and learn basic ANTLR 3 syntax

To be written. Volunteers?

    parser grammar SimpleCalc;

    @header {
        package antlr3.tutorial.simple;
    }

    OR : '||' ;

Construct	Description	Example
`(...)*`	Kleene closure - matches zero or more occurrences	`LETTER DIGIT*` - match a `LETTER` followed by zero or more occurrences of `DIGIT`
`(...)+`	Positive Kleene closure - matches one or more occurrences	`('0'..'9')+` - match one or more occurrences of a numerical digit; `LETTER (LETTER|DIGIT)+` - match a `LETTER` followed by one or more occurrences of either `LETTER` or `DIGIT`
`fragment`	`fragment` in front of a lexer rule tells ANTLR that the rule is only used as part of another lexer rule (i.e. it only builds a fragment of a recognized token)	`fragment DIGIT : '0'..'9' ; NUMBER : (DIGIT)+ ('.' (DIGIT)+ )? ;`
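One way to read the constructs in the table is as control flow: `(...)*` behaves like a loop that may run zero times, `(...)+` like a loop that must run at least once, and `(...)?` (used in the `NUMBER` example) like an `if`. The Java sketch below hand-codes two of the table's examples on that basis. It is an illustration rather than ANTLR-generated code; the class `ClosureSketch` and its method names are hypothetical, `LETTER` is assumed to mean an ASCII letter since the table does not define it, and each method checks a whole string instead of scanning for the next token as a lexer would.

```java
// Hand-written matchers mirroring the closure constructs; illustration only.
final class ClosureSketch {

    // fragment DIGIT : '0'..'9' ;
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Assumed definition of LETTER: 'a'..'z' | 'A'..'Z'
    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // (DIGIT)+ : consumes one or more digits from pos.
    // Returns the position after the digits, or -1 if there is no digit at pos.
    private static int matchDigitsPlus(String s, int pos) {
        int start = pos;
        while (pos < s.length() && isDigit(s.charAt(pos))) {
            pos++;
        }
        return pos > start ? pos : -1;
    }

    // NUMBER : (DIGIT)+ ('.' (DIGIT)+ )? ;
    static boolean isNumber(String s) {
        int pos = matchDigitsPlus(s, 0);
        if (pos < 0) {
            return false;
        }
        if (pos < s.length() && s.charAt(pos) == '.') { // the optional ( ... )? part
            pos = matchDigitsPlus(s, pos + 1);
            if (pos < 0) {
                return false;
            }
        }
        return pos == s.length();
    }

    // LETTER DIGIT*
    static boolean isLetterThenDigits(String s) {
        if (s.isEmpty() || !isLetter(s.charAt(0))) {
            return false;
        }
        int pos = 1;
        while (pos < s.length() && isDigit(s.charAt(pos))) { // zero or more digits
            pos++;
        }
        return pos == s.length();
    }
}
```

The private `isDigit` helper plays the role of the `fragment` rule: it never produces a result of its own and exists only to be used by the other matchers.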

How about a more complex ANTLR 3 grammar?

To be written. Volunteers?

Sure, we'll use one from the examples-v3 distribution.
