Scanner for the EPSP Database Management Programming Project
Implementing the project usually requires code that reads and parses
input, e.g. to determine which command is being issued and what its
arguments are.
This is not difficult, but it can be time-consuming. I have prepared
some C code that you can use as a starting point for writing the
scanning functions that you might need.
The scanner is an example of a general class of parsing and scanning
techniques known as "recursive-descent parsing": the organization and
structure of the code are dictated by the formal grammar that describes
the input that the code is to process.
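As a rough illustration of how a grammar dictates the shape of the code, here is a minimal sketch in C. The two-rule grammar, the stream struct, and the functions match_kind, parse_items and parse_line are hypothetical and are not taken from epspscan.c; only the token kinds mirror token_type. Each grammar rule becomes one function, and the body of each function follows the right-hand side of its rule.

```c
#include <stddef.h>

/* Same token kinds as the scanner's token_type. */
enum token_type {ALPHA, NUMERIC, OTHER, END_OF_LINE, END_OF_FILE};

/*
 * Hypothetical token stream: an array of token kinds plus a read position.
 * The array must end with END_OF_FILE so the parser never reads past it.
 */
struct stream {
    const enum token_type *kinds;
    size_t pos;
};

/* Consume the current token if it has the expected kind; return 1 on success. */
static int match_kind(struct stream *s, enum token_type want)
{
    if (s->kinds[s->pos] != want)
        return 0;
    if (want != END_OF_FILE)   /* never step past the end marker */
        s->pos++;
    return 1;
}

/* Rule: items ::= NUMERIC { NUMERIC } */
static int parse_items(struct stream *s)
{
    if (!match_kind(s, NUMERIC))
        return 0;
    while (match_kind(s, NUMERIC)) {
        /* keep consuming further numbers */
    }
    return 1;
}

/* Rule: line ::= ALPHA items END_OF_LINE */
int parse_line(struct stream *s)
{
    return match_kind(s, ALPHA) && parse_items(s) && match_kind(s, END_OF_LINE);
}
```

In a real parser the token kinds would come from the scanner one at a time rather than from an array; the array only keeps the sketch self-contained.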
Source code
Download the following zip file. It
contains the following files:
- epspscan.c
- Source code of the scanner
- epspscan.h
- Include file with the scanner's declarations
- simpdemo.cpp
- Simple program demonstrating how to use the scanner.
- hwdemo.c
- A more complex example showing how to create the parser for
one of the sections of the final project (reading orders). Read
the comments at the top to see what kind of input it
expects. Terminate it by typing Ctrl-Z (end of file on DOS/Windows;
on Unix-like systems use Ctrl-D).
Warning
If you use the EPSP Scanner, you should not use any other input
functions; Get_Token is all you need. Mixing it with other input
functions might cause unexpected behaviour in your application.
Definitions
token_type
enum token_type {ALPHA, NUMERIC, OTHER, END_OF_LINE, END_OF_FILE};
A token has one of five types:
- ALPHA
- alphanumeric: it starts with an alphabetic character (as defined
by the C library function isalpha; check its documentation) followed
by alphabetic or numeric characters.
- NUMERIC
- a sequence of digits, optionally preceded by a + or - sign.
- END_OF_LINE
- the scanner found an end of line
- END_OF_FILE
- the scanner found the end of the input
- OTHER
- Anything else falls into this category.
Spaces and tabs are used as separators and are not included in any token.
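The classification rules above can be sketched in C as follows. This is not the code from epspscan.c: sketch_next_token is a hypothetical stand-in that reads from an in-memory string instead of the program's input, and it makes two assumptions the text does not state, namely that an OTHER token is a single character and that a sign not followed by a digit counts as OTHER. As with the scanner, the caller supplies the buffer that receives the token text.

```c
#include <ctype.h>
#include <stddef.h>

enum token_type {ALPHA, NUMERIC, OTHER, END_OF_LINE, END_OF_FILE};

/*
 * Hypothetical stand-in for the scanner. Reads one token from the string at
 * *cursor, copies it into token (capacity cap >= 1, including the '\0';
 * longer tokens are truncated), and advances *cursor past the token.
 */
enum token_type sketch_next_token(const char **cursor, char *token, size_t cap)
{
    const char *p = *cursor;
    size_t n = 0;
    enum token_type type;

    /* Spaces and tabs only separate tokens. */
    while (*p == ' ' || *p == '\t')
        p++;

    if (*p == '\0') {
        type = END_OF_FILE;
    } else if (*p == '\n') {
        type = END_OF_LINE;
        p++;
    } else if (isalpha((unsigned char)*p)) {
        type = ALPHA;                       /* letter, then letters or digits */
        while (isalnum((unsigned char)*p)) {
            if (n + 1 < cap)
                token[n++] = *p;
            p++;
        }
    } else if (isdigit((unsigned char)*p) ||
               ((*p == '+' || *p == '-') && isdigit((unsigned char)p[1]))) {
        type = NUMERIC;                     /* optional sign, then digits */
        do {
            if (n + 1 < cap)
                token[n++] = *p;
            p++;
        } while (isdigit((unsigned char)*p));
    } else {
        type = OTHER;                       /* assumption: one character */
        if (n + 1 < cap)
            token[n++] = *p;
        p++;
    }

    token[n] = '\0';
    *cursor = p;
    return type;
}
```

Under these assumptions, repeated calls on the string "add 42 +7 ;" followed by a newline yield ALPHA "add", NUMERIC "42", NUMERIC "+7", OTHER ";", END_OF_LINE and finally END_OF_FILE.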
Get_Token
token_type Get_Token(char *token);
It reads the next token and returns its type (see token_type
above). Its parameter is set to the value of the token (as a string).
Get_Token assumes that space has already been allocated for the
token; it does not allocate memory for it.
Reset_Get_Token
Sometimes, when an error is detected, it is desirable to skip to
the beginning of the next line. Reset_Get_Token resets the
scanner so that the next token read is the first one on the next line.
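The idea of recovering from an error by resuming at the next line can be sketched in C on an in-memory string. Both functions below are hypothetical illustrations and do not show how Reset_Get_Token is implemented; the rule that a valid line starts with a letter is also just an assumption made for the example.

```c
#include <ctype.h>

/*
 * Hypothetical helper: advance the cursor to the first character of the
 * next line, or to the terminating '\0' if there is no next line.
 */
const char *skip_to_next_line(const char *p)
{
    while (*p != '\0' && *p != '\n')
        p++;
    if (*p == '\n')
        p++;
    return p;
}

/*
 * Count the lines that start with a letter (assumed here to be valid
 * commands). Any other line is treated as an error: the rest of it is
 * skipped and processing resumes at the start of the next line.
 */
int count_command_lines(const char *input)
{
    int good = 0;
    const char *p = input;

    while (*p != '\0') {
        if (isalpha((unsigned char)*p))
            good++;                 /* a real parser would process the line here */
        p = skip_to_next_line(p);   /* valid or not, continue with the next line */
    }
    return good;
}
```

The point of the design is that one malformed line does not derail the rest of the input: the parser gives up on the current line only and starts cleanly on the next one.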
Original implementation: Daniel M. German, Computer Systems Group, University of Waterloo
Date: May 26, 1999
Current maintenance: TRG