Introduction
gxc is a grammar/action compiler.
gxc converts a grammar with actions into a speech recognition
word lattice (HTK SLF format .lat file) and a C program that
carries out actions associated with the recognition results.
The grammar and actions are provided in a .gx (grammar with actions,
a.k.a. GX) input file, written by you, the developer.
A GX file contains a datatype declaration section, followed by a
grammar/action section. The datatype declarations specify the
return values for each grammar rule's actions.
The grammar/action section defines the grammar for the recognizer in
terms of an ordered set of rules, each of which has one or more
associated actions in the form of C code.
Let's look at a simple example in order to see how GX grammars work.
We'll explain what you need to know about simple.gx over the next few
paragraphs.
Line# ------------ begin simple.gx -----------------
1 TYPE "int" $YES $NO $ANSWER
2 %%
3 $YES = yes | yep | yeah | uhuh {1} | yes sir {1};
4 $NO = no | nope | nah | nuh_uh {0} | no sir {0};
5 $ANSWER = $YES {confirm_yes(); process_answer($1); $1}
6 | $NO {confirm_no(); process_answer($1); $1} ;
7 // comment, end of file
------------ end simple.gx ------------------
Line 1 says that all actions for $YES, $NO, and $ANSWER must return
an integer. (gxc needs to know a datatype for every action.)
Line 2, containing nothing but %%, separates the datatype
declaration section from the grammar/action section.
Line 3 says that if you see any of the words "yes", "yep", "yeah",
"uhuh", or "yes sir" then you have an instance of $YES; the
associated action is pretty dumb: it just says return a 1 (which is
indeed an integer, as we promised in the TYPE line, line 1).
Similarly, Line 4 says that any of "no", "nope", "nah", "nuh_uh",
and "no sir" counts as a $NO, whose action returns a 0.
Line 7 is a comment, which extends from // to the end of the line.
Comments are considered to be whitespace and are ignored.
Lines 5 & 6 make use of the other lines: they define $ANSWER in terms
of the $YES and $NO rules from lines 3 and 4.
$YES and $NO tell the recognizer what words exactly to look for when
it needs a $YES or a $NO. So they define the *RULES* that the
recognizer must follow. Rules can be named by any string of letters,
numbers, and underscore, beginning with $ and a letter. So $YES, $NO,
$ANSWER, $X and $THIS_IS_A_LONG_RULE_NAME_01, are all rule names. By
convention rulenames are in ALL CAPS.
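The naming rule above is easy to state as code. The following is only a sketch of the rule as described here, not gxc's actual lexer; the function name is_rule_name is made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s is a well-formed rule name: '$', then a letter,
   then any run of letters, digits, and underscores. Returns 0 otherwise.
   (Hypothetical helper; gxc's own check may differ.) */
int is_rule_name(const char *s)
{
    if (s == NULL || s[0] != '$')
        return 0;
    if (!isalpha((unsigned char)s[1]))
        return 0;
    for (size_t i = 2; s[i] != '\0'; i++) {
        if (!isalnum((unsigned char)s[i]) && s[i] != '_')
            return 0;
    }
    return 1;
}
```

Note that "$1" fails this check because a digit follows the $. That is what keeps rule names distinct from the action variables $1, $2, etc.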
EVERY RULE HAS TO BE DEFINED BEFORE IT CAN BE USED IN ANOTHER RULE.
So $YES is defined before it is itself used in the subsequent
definition of the $ANSWER rule in lines 5-6. Thinking backwards, the
last rule must be the one to contain everything else. Since it is the
last one, it can never be used in any other rule; instead, the last
rule is taken as the all-inclusive definition of the whole grammar.
That is, the purpose of the earlier rules is to make it easy for the
last rule to specify the complete list of things to listen for.
Whereas the purpose of the very last rule is to tell the recognizer
what its job is: Look for an $ANSWER. The recognizer ends up looking
for particular words and sequences of words because the all-inclusive
final rule can be expanded using other previously-defined rules, and
those also expanded, until finally all you have left is a set of
alternative sequences of actual words.
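To make the expansion idea concrete, here is a small sketch that expands the last rule of simple.gx into its word sequences. It is not how gxc builds the lattice. The data structures, the size limits, and the names expand and expand_simple_gx are all assumptions made for this illustration, and actions are left out.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ELEMS   4    /* slots per alternative; the last one stays NULL */
#define MAX_PENDING 16
#define MAX_PHRASE  128
#define MAX_RESULTS 32

struct alternative { const char *elems[MAX_ELEMS]; };  /* NULL-terminated */
struct rule { const char *name; const struct alternative *alts; int n_alts; };
struct results { char phrases[MAX_RESULTS][MAX_PHRASE]; int count; };

static const struct rule *find_rule(const struct rule *rules, int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (strcmp(rules[i].name, name) == 0)
            return &rules[i];
    return NULL;
}

/* Expands the pending elements left to right. A literal word is appended to
   the phrase built so far; a rule name is replaced by each of its
   alternatives in turn. */
static void expand(const struct rule *rules, int n_rules,
                   const char *const *pending, int n_pending,
                   const char *prefix, struct results *out)
{
    if (n_pending == 0) {                       /* nothing left: one full phrase */
        if (out->count < MAX_RESULTS)
            snprintf(out->phrases[out->count++], MAX_PHRASE, "%s", prefix);
        return;
    }
    const char *head = pending[0];
    if (head[0] != '$') {                       /* literal word */
        char next[MAX_PHRASE];
        snprintf(next, sizeof next, "%s%s%s", prefix, prefix[0] ? " " : "", head);
        expand(rules, n_rules, pending + 1, n_pending - 1, next, out);
        return;
    }
    const struct rule *r = find_rule(rules, n_rules, head);
    if (r == NULL)
        return;                                 /* undefined rule: skip */
    for (int a = 0; a < r->n_alts; a++) {
        const char *queue[MAX_PENDING];
        int n = 0;
        for (int e = 0; e < MAX_ELEMS && r->alts[a].elems[e] != NULL && n < MAX_PENDING; e++)
            queue[n++] = r->alts[a].elems[e];
        for (int p = 1; p < n_pending && n < MAX_PENDING; p++)
            queue[n++] = pending[p];
        expand(rules, n_rules, queue, n, prefix, out);
    }
}

/* The grammar of simple.gx, in definition order, with actions omitted. */
static const struct alternative YES_ALTS[] = {
    {{"yes"}}, {{"yep"}}, {{"yeah"}}, {{"uhuh"}}, {{"yes", "sir"}} };
static const struct alternative NO_ALTS[] = {
    {{"no"}}, {{"nope"}}, {{"nah"}}, {{"nuh_uh"}}, {{"no", "sir"}} };
static const struct alternative ANSWER_ALTS[] = { {{"$YES"}}, {{"$NO"}} };
static const struct rule SIMPLE_GX[] = {
    {"$YES", YES_ALTS, 5}, {"$NO", NO_ALTS, 5}, {"$ANSWER", ANSWER_ALTS, 2} };

/* Expands the last rule, which is taken as the whole grammar. */
void expand_simple_gx(struct results *out)
{
    const char *top[] = { SIMPLE_GX[2].name };
    out->count = 0;
    expand(SIMPLE_GX, 3, top, 1, "", out);
}
```

Expanding $ANSWER this way leaves ten alternative word sequences, five from $YES and five from $NO, which is the complete list of things the recognizer listens for.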
The $ANSWER rule has a couple of other things going on. First, it
contains multiple actions: one action for the $YES alternative and a
different action for the $NO alternative. In general the different
alternatives can have their own distinct actions.
Second, actions can contain special variables, named $1, $2, $3, etc.
(the number counts out the elements in the sequence). These variables
refer to the values of items found in the expansion of the rule. Since
alternatives might have different numbers of words, each variable is
resolved within the alternative that its action belongs to: $1 in the
action on line 5 is the value of $YES, while $1 in the action on line 6
is the value of $NO.
We need a technical vocabulary for talking about these things. Useful
terms include "rule", "rule name", "expansion", "element", "action",
"alternative", "TYPE", "value", and "action variable". Let's consider
each one, first with reference to an example, and then by defining it
more precisely. For an example, let's re-examine line 3 from simple.gx,
with its alternatives reordered so that a single action ends the rule.
Here it is underscored with different words corresponding to the different
pieces that we would like to have clear names for.
line 3: $YES = yes | yes sir | yep | yeah | uhuh {1} ;
        -------------------- rule --------------------
    rulename   --------------expansion--------------
               alt   --alt--   alt   alt    --alt---
               ele   ele ele   ele   ele    ele  act
"rule": The whole line is a "rule". A rule is made up of a
rule name, "=", an "expansion", and ";".
"rule name": $YES is the "rule name". A rule name begins with $,
followed by a letter, followed by any string of letters, numbers, and
underscore. A rule name in one place is used to define a rule, and
in another place is used to refer to that rule.
"expansion": Everything between "=" and ";" is the "expansion".
An expansion is a sequence of "alternatives" separated by "|".
"alternative": Here "yes", "yes sir", "yep", etc., are each different
"alternatives". Alternatives are the parts of an expansion which are
separated by the symbol, "|" (read it as "or"). (So the last
"alternative" is not "uhuh", but "uhuh {1}", including the action "{1}").
Alternatives *contain* actions, which allows us to say that
two alternatives can "have" two different actions. Each alternative
is a sequence of elements, followed optionally by an action.
"element": "yes sir" is a sequence of two elements, and within the
definition of $ANSWER, $YES is an "element" (but where $YES appears to
the left of "=" in its own definition, it is the rule name being
defined, not an "element"). So an "element" is a literal word or a
rule name; rules are expanded into their various alternatives'
element sequences.
"action": "{1}" is an "action", which is to say, it corresponds to a
function which gxc will generate for you, in this case a very
simple function which does nothing but return a 1. So an "action" is
a curly-brace-bracketed bit of code, which will be automatically
converted into a function by gxc. An action is an optional part of
an alternative. The action's code is converted directly into a C
function by adding a declaration including its return data type,
a unique function name, its arguments and their data types, and
by replacing the last expression, E, with "return E;". The
last expression in an action is going to be its return value;
it should NOT be followed by a semicolon, and its data type should
match the TYPE for the rule.
So the hierarchy is this: a rule contains an expansion, which contains
one or more alternatives, each of which can contain an action.
"TYPE": the TYPE for $YES, $NO, and $ANSWER is "int". A TYPE is
a C data type in double-quotes, declared for a rule at the beginning
of the .gx file.
"value": "sir"'s value is just the string, "sir". $YES's value is the
integer 1 (namely, the last expression in its action, which must have
the right TYPE). Actions need to do things with actual values,
whether strings, numbers or other data types; all of them are derived
from what the recognizer heard. So every element, whether word or
rule, must have a value, so that actions can use them. Literal words
have a boring value, namely the literal string itself. Rules have more
interesting values, which are of a declared TYPE, and which are returned
by the action function associated with a particular alternative expansion of
the rule. When you're writing all those actions, don't forget
that every action function associated with any alternative expansion
of a single rule must have the same return data type, namely the
rule's TYPE. Otherwise there will be big trouble!
"action variable": in "process_answer($1)" in line 6, $1 is an action
variable, referring to the value of the 1st element in that particular
alternative, namely $NO's value. $2, $3, etc., would refer to the
values of the 2nd, 3rd, etc., elements in the sequence within a
particular multi-element alternative.
Action variables are very useful; they make it possible to
build very smart and powerful systems. They are used to pass
data into actions, perhaps from other actions that have been
executed elsewhere in the rule hierarchy. The way it works is
that each action variable which is found to be mentioned in an action
is converted into an argument for the function that is automatically
derived from the action. So for example, the automatically-generated
stub function for the second alternative in $ANSWER, on line 6,
will look (effectively) like this:
int ANSWER_2(int x1) { confirm_no(); process_answer(x1); return x1; }
Using the action as a starting point, a valid C function was created
by adding a function name, arguments, datatypes, and return statement.
gxc got the function's return type from the TYPE line;
it created the function name from the rule name
combined with which numbered alternative it is, within the rule;
it got the datatype for its argument (the $1 element, namely $NO)
from the TYPE declaration for $NO;
it substituted the argument name x1 for every place $1 occurs,
and it modified the last statement to be a return statement of the
appropriate datatype.
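The rewriting steps just listed can be sketched as a small text transformation. This is only an illustration of the idea, not gxc's code generator. The name build_stub, its parameter list, and the buffer sizes are assumptions. The parameter declaration is passed in ready-made rather than looked up from TYPE lines, and the last expression is found by a plain search for the final semicolon, which would be fooled by a semicolon inside a string literal.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Copies src[0..len) into dst, rewriting each "$N" as "xN". */
static void copy_with_vars(char *dst, size_t cap, const char *src, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len && n + 1 < cap; i++) {
        if (src[i] == '$' && i + 1 < len && isdigit((unsigned char)src[i + 1]))
            dst[n++] = 'x';          /* "$1" becomes "x1" */
        else
            dst[n++] = src[i];
    }
    dst[n] = '\0';
}

/* Builds the text of a stub function from an action (the code between the
   curly braces). Everything after the last ';' is taken as the final
   expression E and becomes "return E;". Returns snprintf's result. */
int build_stub(char *out, size_t cap, const char *ret_type, const char *rule,
               int alt_no, const char *params, const char *action)
{
    char body[256], expr[128];
    const char *semi = strrchr(action, ';');
    const char *e = semi ? semi + 1 : action;
    size_t body_len = semi ? (size_t)(semi + 1 - action) : 0;

    while (*e == ' ')                /* trim spaces around the expression */
        e++;
    size_t e_len = strlen(e);
    while (e_len > 0 && e[e_len - 1] == ' ')
        e_len--;

    copy_with_vars(body, sizeof body, action, body_len);
    copy_with_vars(expr, sizeof expr, e, e_len);

    return snprintf(out, cap, "%s %s_%d(%s) { %s%sreturn %s; }",
                    ret_type, rule, alt_no, params,
                    body, body[0] ? " " : "", expr);
}
```

Given the action from line 6, rule name "ANSWER", alternative number 2, and the parameter string "int x1", this produces the same text as the ANSWER_2 stub shown above.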
gxc needs to make things line up in both the recognizer and the
stub results processor, so it keeps the task of setting the function names,
data types, and return values for itself. You cannot name the
functions or declare the action variables yourself, so the two can never
become inconsistent.
All you need to remember is that the action must be written as a valid
block of C statements, except that you can use action variables
and instead of writing "return X;" at the end, just write "X".
The other automatically generated functions will look something like
this:
int YES_1(void) { return 1; }
int NO_1(void) { return 0; }
int ANSWER_1(int x1) { confirm_yes(); process_answer(x1); return x1; }
The stub program is made to take a recognition result representing
some multi-level expansion of the top level rule, and to execute the
actions associated with each expansion, finding each function
from the rule and alternative that matched, and passing it the recognized
words (and other action-results)
according to the action variables you used in writing the action.
Thus for example, given the recognition result
"yes sir"
it will execute the function hierarchy:
ANSWER_1(YES_1());
The action variable $1 is used to pass the value of the element
$YES (namely the return value of the YES_1() function) to be used
in the appropriate action for $ANSWER, namely as an argument to
the ANSWER_1(int x1) function.
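If you want to try the conceptual stubs yourself, they compile as ordinary C once the functions called by the actions exist. In this sketch confirm_yes, confirm_no, and process_answer are placeholder implementations invented for illustration (simple.gx calls them but does not define them), and last_processed_answer is a hypothetical helper for inspecting the result.

```c
/* Placeholder application hooks, assumed for this example only. */
static int confirmations = 0;
static int last_answer = -1;

void confirm_yes(void) { confirmations++; }
void confirm_no(void)  { confirmations++; }
void process_answer(int answer) { last_answer = answer; }

/* The stub functions in their conceptual (pass-by-value) form. */
int YES_1(void) { return 1; }
int NO_1(void)  { return 0; }
int ANSWER_1(int x1) { confirm_yes(); process_answer(x1); return x1; }
int ANSWER_2(int x1) { confirm_no();  process_answer(x1); return x1; }

/* Hypothetical helper: reports the value last seen by process_answer. */
int last_processed_answer(void) { return last_answer; }
```

With these in place, ANSWER_1(YES_1()) returns 1 and hands 1 to process_answer, and ANSWER_2(NO_1()) does the same with 0.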
To be precise: An action variable is named with $ followed by a small
positive number. When gxc finds an action variable in an action,
it creates an argument for that action's auto-generated function, by
which the value of the corresponding element can be passed in to the
action's function. This argument's data type is declared as the TYPE
specified for the corresponding element (or char * in the default case
of a literal word). And wherever the action variable is used within
the action, the auto-generated function will have an instance of the
variable argument to the function. In addition, before calling the
action's function, the arguments are filled in with the values
associated with the 1st, 2nd, 3rd, etc., elements, whether those are
plain strings (if the element is a literal word), or with other data
values which have been returned by another rule's action function (if
the element is a rule name).
Footnote: the above description is conceptually correct, but
in implementation, it was found necessary to pass pointers
instead of values. So the various action variables for an
action are passed into the function via a single vector of
data pointers rather than as a comma-separated list of
variables, which allows all action functions to have the same
argument structure and to be handled uniformly by the calling
code. And the return value of an action is not the value of
the action's final expression, but a pointer to a memory
location sized to hold that value, into which that value has
been copied. Since returned values are fed upstairs to
higher-level actions using these same pointers, this should be
transparent to the user.
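The footnote's calling convention might look roughly like the sketch below. The exact signature gxc generates is not given here, so the action_fn typedef, the box_int helper, and run_chain are all assumptions. The calls to confirm_yes() and process_answer() are left out to keep the focus on how values travel through pointers.

```c
#include <stdlib.h>

/* Assumed uniform signature: every action receives a vector of pointers to
   its element values and returns a pointer to a copy of its result. */
typedef void *(*action_fn)(void **args);

/* Copies an int into freshly allocated memory sized to hold it. */
static void *box_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}

void *YES_1(void **args) { (void)args; return box_int(1); }
void *NO_1(void **args)  { (void)args; return box_int(0); }

/* $1 is read through args[0], which points at the value that the action
   for $YES (or $NO) returned. */
void *ANSWER_1(void **args)
{
    int x1 = *(int *)args[0];
    return box_int(x1);
}

/* Runs a child action and feeds its result upstairs to a parent action,
   the way calling code might for one level of the rule hierarchy.
   Returns the parent's int result, or -1 if an allocation failed. */
int run_chain(action_fn parent, action_fn child)
{
    void *child_result = child(NULL);
    if (child_result == NULL)
        return -1;
    void *args[1] = { child_result };
    void *parent_result = parent(args);
    int value = parent_result ? *(int *)parent_result : -1;
    free(parent_result);
    free(child_result);
    return value;
}
```

Because every action has the same argument structure, the calling code can treat YES_1, NO_1, and ANSWER_1 uniformly through the one function pointer type.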
Formal Summary of GX File Format
A GX file is a text file comprising a datatype declaration section,
a line containing nothing but "%%", and a grammar/action section.
The datatype declaration section is a sequence of lines
specifying the type for each rulename, in the following format:
TYPE "type string" rulename [rulename ..]
The type string is used in the stub program to declare the
return type of C functions which implement the actions of
the named rule. It also is used for runtime type checking
during action processing.
Since the type string may contain spaces, as in "char *", it
must be bounded by quotation marks ("). Omitting the quotes is an error.
The grammar/action section consists of one or more rules.
A rule is a rulename, an equals sign (=), an expansion, and a semicolon (;).
A rulename is a dollar sign ($), followed by a name.
A name is a sequence of letters, numbers and underscore (_)
but it must start with a letter.
An expansion is a sequence of alternatives separated by the pipe sign(|).
An alternative is a sequence of names and rulenames
followed by curly-brace-bracketed C code ({..}), call it an ACTION,
modified as described below.
The ACTION may be omitted in which case the next occurring ACTION
in a subsequent alternative for that rule will be used;
if there is no explicit ACTION in the remainder of the rule's expansion,
then the default ACTION, { return (char *)value; }, is assumed, where
value is the concatenation of the string-valued tokens & non-terminals
in the expansion. (It is an error for an ACTION-less expansion to
contain a nonterminal with a data type that is not a (char *) string;
if you want to return some other data type as the value of that
non-terminal, then you should provide an ACTION to interpret it.)
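As a rough picture of the default ACTION, the sketch below joins the string values of an alternative's elements into one newly allocated string. The text above says only "concatenation", so the single space between values is an assumption, as are the function name default_action and its signature.

```c
#include <stdlib.h>
#include <string.h>

/* Joins n string values into one heap-allocated string, separated by single
   spaces (the separator is an assumption). The caller frees the result.
   Returns NULL if allocation fails. */
char *default_action(const char *const *values, size_t n)
{
    size_t total = 1;                           /* terminating NUL */
    for (size_t i = 0; i < n; i++)
        total += strlen(values[i]) + (i > 0 ? 1 : 0);

    char *joined = malloc(total);
    if (joined == NULL)
        return NULL;

    joined[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (i > 0)
            strcat(joined, " ");
        strcat(joined, values[i]);
    }
    return joined;
}
```

Under these assumptions, an ACTION-less alternative "yes sir" would get the value "yes sir", built from its two tokens.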
The ACTION may contain references of the form $N, where N is
a number, which pick out the Nth (counting from 1) element (token or
rulename) in that alternative, and which are substituted for by variables
of the appropriate type (char * for tokens, TYPE types for nonterminals),
whose values are provided to the function through pointers to the value
of that element (a string in the case of tokens, and another ACTION's
return value in the case of non-terminals).
The ACTION must contain a final expression, not followed by a semicolon
(though other statements within the ACTION are separated from each other
by a semicolon). Call that expression E. E is replaced
by "return (type string) E;", where "type string" was TYPE declared
for the rule associated with the ACTION. It is an error when
the declared TYPE does not match the data type of the final expression.
Hopefully gxc will catch that, but don't count on it: check that
final expressions have the TYPE data type.
The ACTION is used to construct a function named by the rulename and
a sequence number.
The arguments to the function will be the return values of those
elements in the expansion which have been referred to using $N
expressions in the C code. The $N expressions will be replaced by
variable names. Those variables' data type declarations will be,
for tokens, "char *", and for rules, the return type of the rule
as given in its TYPE statement.
Copyright © 1996-2005
Sprex, Inc.
All rights reserved. Sprex, Speech in the Network, TallyGram and ANSR are trademarks of Sprex, Inc.