[Programming] Lex Basic Concepts: Practice, Part 2

**Table 3: Lex Predefined Variables**
Name	Function
`int yylex(void)`	call to invoke lexer, returns token
`char *yytext`	pointer to matched string
`yyleng`	length of matched string
`yylval`	value associated with token
`int yywrap(void)`	wrapup, return 1 if done, 0 if not done
`FILE *yyout`	output file
`FILE *yyin`	input file
`INITIAL`	initial start condition
`BEGIN condition`	switch start condition
`ECHO`	write matched string

This program does nothing.
Every input character is matched, but no action is associated with any pattern, so nothing is output.

%%
.
\n

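The following example prepends a line number to each line of a file. Some lex implementations predefine and maintain a line counter named yylineno.
The input file for lex is yyin, which defaults to stdin.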

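%{
int yylineno;
%}

%%
^(.*)\n printf("%4d\t%s", ++yylineno, yytext);

%%

int main(int argc, char *argv[]) {
    /* read the file named on the command line, else keep the default input */
    if (argc > 1 && (yyin = fopen(argv[1], "r")) == NULL) {
        perror(argv[1]);
        return 1;
    }
    yylex();
    if (yyin != stdin) fclose(yyin);
    return 0;
}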
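
To make the behavior of the rule `^(.*)\n` easier to see, here is a minimal hand-written C sketch of the same idea. It is not what lex generates; the helper name `number_lines` and its buffer-based interface are assumptions made only for illustration. Like the scanner, it numbers only lines that end with a newline and copies any trailing unterminated text through unnumbered (in the scanner that text falls through to the default echo rule).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of lex): writes a line-numbered copy of
 * `text` into `out`, which holds `cap` bytes including the final NUL.
 * Returns the number of lines it started numbering; stops early if
 * `out` is too small. */
int number_lines(const char *text, char *out, size_t cap)
{
    int lineno = 0;
    size_t used = 0;

    if (cap == 0)
        return 0;
    out[0] = '\0';

    while (*text != '\0') {
        const char *nl = strchr(text, '\n');
        int n;

        if (nl != NULL) {
            /* complete line, newline included: mirrors the rule ^(.*)\n */
            int len = (int)(nl - text) + 1;
            n = snprintf(out + used, cap - used, "%4d\t%.*s",
                         ++lineno, len, text);
            text = nl + 1;
        } else {
            /* trailing text without a newline: copied through unnumbered */
            n = snprintf(out + used, cap - used, "%s", text);
            text += strlen(text);
        }
        if (n < 0 || (size_t)n >= cap - used)
            break;              /* output was truncated */
        used += (size_t)n;
    }
    return lineno;
}
```

Calling `number_lines("a\nb\n", buf, sizeof buf)` with a sufficiently large `buf` fills it with two numbered lines, each prefixed by a right-aligned number and a tab.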

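The definitions section is composed of substitutions, code, and start states. Code in the definitions section is copied as-is into the generated C file and must be enclosed in %{ and %}.
Substitutions simplify the pattern-matching rules.
For example, we can define digits and letters: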
digit [0-9]
letter [A-Za-z]

%{
int count;
%}

%%
    /* match identifier; the comment is indented so lex does not read it as a pattern */
{letter}({letter}|{digit})* count++;

%%
int main(void) {
    yylex();
    printf("number of identifiers = %d\n", count);
    return 0;
}

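Whitespace must separate the defining term from its associated expression.
In the rules section, a reference to a substitution is surrounded by braces, as in {letter}, to distinguish it from literal text.
When a pattern in the rules section matches, the associated C code is executed.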
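
As a rough illustration of what the identifier rule matches, the following C sketch counts identifiers by hand. The function name `count_identifiers` is hypothetical, and the sketch assumes the default "C" locale, where `isalpha` and `isalnum` agree with `[A-Za-z]` and `[A-Za-z0-9]`. It imitates two lex behaviors: the longest possible match is taken, and a character that starts no identifier is skipped one at a time (the generated scanner would echo it).

```c
#include <ctype.h>

/* Hypothetical helper: counts matches of {letter}({letter}|{digit})*
 * in the string `s`, scanning left to right as the lex rule would. */
int count_identifiers(const char *s)
{
    int count = 0;

    while (*s != '\0') {
        if (isalpha((unsigned char)*s)) {
            /* longest match: consume the whole run of letters and digits */
            while (isalnum((unsigned char)*s))
                s++;
            count++;
        } else {
            /* no rule matches here: skip a single character */
            s++;
        }
    }
    return count;
}
```

For the input `"abc 123 x1"` this counts two identifiers, `abc` and `x1`. For `"123abc"` it counts one, because the digits are skipped before `abc` is matched.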

Here is a scanner that counts the number of characters, words, and lines in a file (somewhat like Unix wc).

%{
int nchar, nword, nline;
%}
%%
\n         { nline++; nchar++; }
[^ \t\n]+  { nword++, nchar += yyleng; }
.          { nchar++; }
%%
int main(void) {
    yylex();
    printf("%d\t%d\t%d\n", nchar, nword, nline);
    return 0;
}
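
One way to sanity-check the three rules is to compare them with a plain C counter that follows the same definitions: every character counts once, a newline ends a line, and a word is a maximal run of characters other than space, tab, and newline. This is only a sketch; the `Counts` struct and `count_text` function are names invented for the example and are not part of lex.

```c
/* Hypothetical result type, mirroring nchar, nword, nline in the scanner. */
typedef struct {
    int nchar;
    int nword;
    int nline;
} Counts;

/* Hypothetical helper: applies the scanner's counting rules to a string. */
Counts count_text(const char *s)
{
    Counts c = {0, 0, 0};
    int in_word = 0;            /* nonzero while inside a [^ \t\n]+ run */

    for (; *s != '\0'; s++) {
        c.nchar++;              /* every rule adds the matched characters */
        if (*s == '\n') {
            c.nline++;          /* rule: \n */
            in_word = 0;
        } else if (*s == ' ' || *s == '\t') {
            in_word = 0;        /* rule: . (only space or tab reach it) */
        } else if (!in_word) {
            c.nword++;          /* start of a new [^ \t\n]+ match */
            in_word = 1;
        }
    }
    return c;
}
```

For the text `"hello world\nfoo\n"` this yields 16 characters, 3 words, and 2 lines, the same totals the three rules produce for that input.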

About the author

蕾咪
