Overview of LEX and YACC - Department of Computer Engineering


International Institute of Information Technology, Pune Department of Computer Engineering

Systems Programming & Operating Systems

Unit – III

Case Study: Overview of LEX and YACC Prof. Deptii Chaudhari Assistant Professor Department of Computer Engineering



LEX & YACC
• What is Lex?
• Lex is a lexical analyser generator: from a set of patterns it produces a lexical analyser (also called a lexer or scanner).
• The lexer's main job is to break up an input stream into more usable elements (tokens).
• In other words, it identifies the "interesting bits" in a text file.
• What is Yacc?
• Yacc is a parser generator: from a grammar it produces a parser.
• In the course of its normal work, the parser also verifies that the input is syntactically sound.
• YACC stands for "Yet Another Compiler Compiler", because this kind of analysis of text files is normally associated with writing compilers.

Deptii Chaudhari, Dept of Computer Engineering, Hope Foundation’s International Institute of Information Technology, I²IT, P-14, Rajiv Gandhi Infotech Park, MIDC Phase 1, Hinjawadi, Pune – 411057 Tel - +91 20 22933441/2/3 | www.isquareit.edu.in | info@isquareit.edu.in




LEX Program Structure

Definitions
%{
C global variables, prototypes, comments
%}
%%
Production Rules
%%
User Subroutine Section (Optional)


• In the rules section, each rule is made up of two parts: a pattern and an action, separated by whitespace.
• The lexer that lex generates executes the action when it recognizes the pattern.
• The user subroutine section consists of any legal C code.
• Lex copies it to the C file after the end of the lex-generated code.
• Lex translates the Lex specification into a C source file called lex.yy.c, which we compile and link with the lex library using -ll.
• Then we can execute the resulting program to check that it works as expected.



Example

%{
#include <stdio.h>
%}
%%
[0123456789]+           printf("NUMBER\n");
[a-zA-Z][a-zA-Z0-9]*    printf("WORD\n");
%%

• Running the Program
$ lex example_lex.l
$ gcc lex.yy.c -ll
$ ./a.out



Pattern Matching Primitives

Metacharacter   Matches
.               any character except newline
\n              newline
*               zero or more copies of the preceding expression
+               one or more copies of the preceding expression
?               zero or one copy of the preceding expression
^               beginning of line
$               end of line
a|b             a or b
(ab)+           one or more copies of ab (grouping)
"a+b"           literal "a+b" (C escapes still work)
[]              character class



Pattern Matching Examples

Expression      Matches
abc             abc
abc*            ab, abc, abcc, abccc, ...
abc+            abc, abcc, abccc, abcccc, ...
a(bc)+          abc, abcbc, abcbcbc, ...
a(bc)?          a, abc
[abc]           one of: a, b, c
[a-z]           any letter, a through z
[a\-z]          one of: a, -, z
[-az]           one of: -, a, z
[A-Za-z0-9]+    one or more alphanumeric characters
[ \t\n]+        whitespace
[^ab]           anything except: a, b
[a^b]           one of: a, ^, b
[a|b]           one of: a, |, b
a|b             one of: a, b



Operation of yylex()
• When lex compiles the input specification, it generates the C file lex.yy.c, which contains the routine int yylex(void).
• This routine reads the input, trying to match it against the token patterns specified in the rules section.
• On a match, the associated action is executed.
• Calling yylex() starts the process of pattern matching.
• Lex stores the matched string at the address pointed to by yytext.
• The length of the matched string is kept in yyleng, while the value of the token is kept in the variable yylval (set by the rule's action).


%{
#include <stdio.h>
int com = 0;   /* number of comments found */
%}
%%
"/*"[^\n]+"*/"   { com++; fprintf(yyout, " "); }   /* single-line comments only */
%%
int main()
{
    printf("Write a C program\n");
    yyout = fopen("output", "w");
    yylex();
    printf("Comment=%d\n", com);
    return 0;
}

$ cc lex.yy.c -ll
$ ./a.out
Write a C program
#include<stdio.h>
int main()
{
int a, b;
/*float c;*/
printf("Hi");
/*printf("Hello");*/
}
Comment=2
$ cat output
#include<stdio.h>
int main()
{
int a, b;
printf("Hi");
}



Lex Predefined Variables



YACC
• YACC is a parser generator that takes an input file containing an attribute-enriched BNF (Backus-Naur Form) grammar specification.
• It generates the output C file y.tab.c, which contains the function int yyparse(void) that implements the parser.
• This function automatically invokes yylex() every time it needs a token to continue parsing.




Structure of YACC Program

Definitions
%{
C global variables, prototypes, comments
%}
%%
Context-free grammar and action for each production
%%
Subroutines/Functions


arithmatic.l

%{
#include <stdio.h>
#include <stdlib.h>   /* atoi */
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+   { yylval = atoi(yytext); return NUMBER; }
[\t]     ;
[\n]     return 0;
.        return yytext[0];
%%
int yywrap() { return 1; }

How To Run:
$ yacc -d arithmatic.y
$ lex arithmatic.l
$ gcc lex.yy.c y.tab.c
$ ./a.out



References
• https://www.epaperpress.com/lexandyacc/
• John R. Levine, Tony Mason and Doug Brown, "Lex and Yacc", O'Reilly



THANK YOU For further details, please contact Deptii Chaudhari deptiic@isquareit.edu.in Department of Computer Engineering Hope Foundation’s International Institute of Information Technology, I²IT P-14,Rajiv Gandhi Infotech Park MIDC Phase 1, Hinjawadi, Pune – 411057 Tel - +91 20 22933441/2/3 www.isquareit.edu.in | info@isquareit.edu.in
