A2E::SArb - Parsing a String into a Syntactic Tree, recursively defined as a reference to a list of trees
Parse and transform lines of text that contain nested parentheses, function calls and the like, into semantic tree structures of arbitrarily many levels of hierarchy.
Analyse text where regular expressions are not good enough and generating a full-fledged Yacc-based parser would be overkill.
Usually this is done by writing a subclass to A2E::SArb, see e.g. A2E::SArb::Make and A2E::SArb::MLHT.
use A2E::SArb::Make;
my $vars = {};
my $p = new_ready A2E::SArb::Make vars => $vars;
$vars->{lib} = 'A2E::Prog';
$p->transform('my library is ${lib}');
# => my library is A2E::Prog
Of course much more complicated nested parsing is possible, see e.g. the A2E::SArb::Make application which imitates and extends the syntax of GNU Make.
http://a2e.de/adv/perl/A2E/SArb.pm
A2E::Prog(3)
Scalar::Util(3)
Data::Dumper(3)
A2E::SArb::MLHT(3)
A2E::SArb::Make(3)
A2E::SArb::Deplate(3)
This allows plugins but is line-based and carries a lot of baggage that made it seem less likely to be efficient for the applications envisaged here. The others are again probably overkill for the purpose of line-to-tree parsing envisaged here.
perl-byacc is not on CPAN as it is a C-based yacc compiler that produces perl programs.
Hartmut Pilch
Copyright (c) 2008 Hartmut Pilch (phm at a2e de)
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Register a parser, i.e. a function that tries to find a syntactic expression at a given position and, if successful, returns the new position to which it has advanced together with the syntactic expression ($arb) that it found.
Currently 'try_paren' and 'try_keyvals' are defined here, and 'make' is defined in A2E::SArb::Make.
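The contract described above (take a position, return the new position plus the tree that was found) can be illustrated with a minimal, self-contained sketch. The function name try_paren_sketch, its argument order and the 'paren' tag are assumptions made for this illustration; this is not the module's own try_paren and it does not use the registration interface.

```perl
use strict;
use warnings;

# Hypothetical stand-alone parser, not the module's try_paren.
# Takes a string and a start position. Returns (new position, tree) on
# success, or an empty list if no balanced parenthesised expression
# starts at that position.
sub try_paren_sketch {
    my ($str, $pos) = @_;
    return unless substr($str, $pos, 1) eq '(';
    my @arb = ('paren');    # a tree is a reference to a list of trees/strings
    my $lit = '';           # literal text collected so far
    my $p   = $pos + 1;
    while ($p < length $str) {
        my $c = substr($str, $p, 1);
        if ($c eq '(') {
            # recurse into the nested expression
            my ($np, $sub) = try_paren_sketch($str, $p) or return;
            push @arb, $lit if length $lit;
            $lit = '';
            push @arb, $sub;
            $p = $np;
        }
        elsif ($c eq ')') {
            push @arb, $lit if length $lit;
            return ($p + 1, \@arb);
        }
        else {
            $lit .= $c;
            $p++;
        }
    }
    return;    # unbalanced: no closing parenthesis found
}

my ($next, $arb) = try_paren_sketch('x(a(b)c)y', 1);
# $next is 8; $arb is [ 'paren', 'a', [ 'paren', 'b' ], 'c' ]
```

Returning an empty list on failure lets the caller distinguish "nothing found here" from a successful parse without inspecting the tree.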
$m->switch_parser('paren', 1)
$m->switch_parser('paren', 0)
Activate and deactivate a parser.
Try to find a syntactic expression at the position p and return p0, p and arb.
If nothing was found, p0 is unchanged, p moves ahead by 1, and arb is undefined.
Find an expression like
a=1 b="2.0"
and render it as [ 'keyvals', 'a', '1', 'b', '2.0' ].
This is [probably] designed for Deplate and not useful for MLHT.
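As an illustration of the target structure only, the following sketch turns the example line into the list shown above. The helper name keyvals_sketch and the regular expression are assumptions; the real try_keyvals works on a position within a line and may accept a different syntax.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's try_keyvals.
# Scans whitespace-separated key=value pairs, where a value is either
# double-quoted or a run of non-space characters.
sub keyvals_sketch {
    my ($str) = @_;
    my @arb = ('keyvals');
    while ($str =~ /\G\s*(\w+)=(?:"([^"]*)"|(\S+))/gc) {
        # $2 is set for quoted values, $3 for bare ones
        push @arb, $1, defined $2 ? $2 : $3;
    }
    return \@arb;
}

my $kv = keyvals_sketch('a=1 b="2.0"');
# $kv is [ 'keyvals', 'a', '1', 'b', '2.0' ]
```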
Return a pre-stored variable value; common subroutine of getfun and getarb.
args := typ, nom, vars
rets := val
typ => 'arbs' or 'vars' or 'funs', default 'vars'
typ :: name of global symbol table to search in
nom => 'url'
nom :: variable name
vars => { url => 'http://a2e.de' }
vars :: local symbol table
val :: string value
val => 'http://a2e.de'
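A lookup of this shape could be sketched as follows. The function name lookup_sketch, the named-argument calling style and the rule that the local table is searched before the global one are assumptions made for this illustration; only the argument names typ, nom and vars and the default 'vars' come from the description above.

```perl
use strict;
use warnings;

# Hypothetical lookup, not the module's own subroutine.
# $m is an object-like hashref holding the global tables vars, arbs, funs.
sub lookup_sketch {
    my ($m, %args) = @_;
    my $typ   = $args{typ} // 'vars';    # which global table to search
    my $nom   = $args{nom};              # variable name
    my $local = $args{vars} || {};       # local symbol table
    # Assumption: a local definition shadows a global one.
    return $local->{$nom} if exists $local->{$nom};
    return $m->{$typ}{$nom};
}

my $m = { vars => { lib => 'A2E::Prog' }, arbs => {}, funs => {} };

my $val = lookup_sketch($m, nom => 'url', vars => { url => 'http://a2e.de' });
# $val is 'http://a2e.de' (found in the local table)

my $lib = lookup_sketch($m, nom => 'lib');
# $lib is 'A2E::Prog' (found in the global 'vars' table)
```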
Return stored variable value. arg1 is key, arg2 is local symbol table (hashref).
Return the already-parsed arb corresponding to nom or parse+store+return it.
args := nom, vars?
rets := arb
arb :: tree structure corresponding to a dynamic variable (function)
arg1 is key, arg2 is local symbol table (hashref)
Help deal with constructions like
@tr = |pet|${TRIZ}|ahurl+${url}|
i.e. evaluate variables therein and allow recursion into multilayer structure on the last function argument, which is dealt with as lastfun in pred.
This takes the ahurl verb in a construction like
$(ahurl|http://a2e.de|Home)
or, after definition of
oam2osm := $(fill oam2osmS,$(s|ahurl|../spez))
the locally defined s verb in the value of oam2osmS as first argument and returns a corresponding Perl function with its initial arguments.
args := verb, vars
verb #= 'ahurl' or 's' from examples above
vars := hashref((nom => lst)+)
rets := funkorp, funarg+
funkorp := perl_function_code
funarg := commandline arguments of funkorp that come before any user-supplied command arguments
Analyse a string into a tree/arb.
Recurse into a potentially multilevel last argument as in
@tr = |pet|${TRIZ}|ahurl+${url}|
@shout = |underline|+boldface+ahurl/${url}+|
such that the text arguments of this function are applied to the inner-most level.
This is a subroutine of pred which is supported by subroutine maptrafo of getfuns.
Render the syntactic tree arg2 into a string in the target format, using the local definitions found in hashref arg1 as well as the global ones found in the namespaces $m->{vars}, $m->{arbs} and $m->{funs} of the object.
This is recursively invoked by many of the command verbs.
Parse and render: like render, but starts from a string (in the source format) rather than a tree as first argument. Do this only once.
Parse and render: like render, but starts from a string (in the source format) rather than a tree as first argument. Keep transforming until the result string is the same as the source string, i.e. until there is nothing more to parse.
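The difference between transforming once and transforming until the result no longer changes can be sketched with plain ${name} substitution. Both function names, the regular expression and the iteration cap are assumptions made for this illustration; the module's own methods parse into a tree and render it rather than substituting with a regex.

```perl
use strict;
use warnings;

# Hypothetical single pass: replace each ${name} by its value, leaving
# unknown names untouched. Replaced text is not rescanned in this pass.
sub transform_once_sketch {
    my ($str, $vars) = @_;
    $str =~ s/\$\{(\w+)\}/exists $vars->{$1} ? $vars->{$1} : '${' . $1 . '}'/ge;
    return $str;
}

# Hypothetical repeated pass: stop when the result equals the input,
# i.e. when there is nothing more to expand. The cap $max is an added
# safeguard against cyclic definitions, not something the module documents.
sub transform_fix_sketch {
    my ($str, $vars, $max) = @_;
    $max //= 20;
    for (1 .. $max) {
        my $next = transform_once_sketch($str, $vars);
        return $next if $next eq $str;
        $str = $next;
    }
    return $str;
}

my $vars = { lib => '${ns}::Prog', ns => 'A2E' };

my $once = transform_once_sketch('my library is ${lib}', $vars);
# $once is 'my library is ${ns}::Prog'

my $full = transform_fix_sketch('my library is ${lib}', $vars);
# $full is 'my library is A2E::Prog'
```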
This is now being used experimentally in the _* variables aka dynavars of A2E::Tmplfil
Used in directives like
itemlist_join = $(aresto|${BRA}list type=ul:|1|join|:call:ul_item|1|${ARB})
special miniverb ul itemlist_join
to store the special short form 'ul' of verb 'itemlist_join' in a hashref $m->{miniverbs} that is consulted (by getfuns) when a function is applied to text, allowing users to put some special verbs into a namespace where they do not interfere with lingual text chunks (lits). The local miniverb definitions performed by the fill function work in the same way and take precedence over the global miniverb definitions performed here. However, the global miniverbs are pre-compiled into a tree and thus execute faster. This also means that in order to change the verb's meaning later, the miniverb assignment has to be invoked once more.
This requires that function $m->{funs_intern}->{arb_args_call} is defined by the implementing library.
Support for something like the following
ul = <itemizedlist><title>${1}</title>${2@<listitem><para>@</para></listitem>}</itemizedlist>
will require an extension of &A2E::SArb::Make::pars_make, &A2E::SArb::Make::render_call and possibly &A2E::SArb::render, see A2E::SArb::Make(3) for more.