NAME "String::MatchInterpolate" - named regexp capture and interpolation from the same template. SYNOPSIS use String::MatchInterpolate; my $smi = String::MatchInterpolate->new( 'My name is ${NAME/\w+/}' ); my $vars = $smi->match( "My name is Bob" ); my $name = $vars->{NAME}; print $smi->interpolate( { NAME => "Jim" } ) . "\n"; DESCRIPTION This module provides an object class which represents a string matching and interpolation pattern. It contains named-variable placeholders which include a regexp pattern to match them on. An instance of this class represents a single pattern, which can be matched against or interpolated into. Objects in this class are not modified once constructed; they do not store any runtime state other than data derived arguments passed to the constructor. Template Format The template consists of a string with named variable placeholders embedded in it. It looks similar to a perl or shell string with interpolation: A string here with ${NAME/pattern/} interpolations The embedded variable is delmited by perl-style "${ }" braces, and contains a name and a pattern. The pattern is a normal perl regexp fragment that will be used by the "match()" method. This regexp should not contain any capture brackets "( )" as these will confuse the parsing logic. If the variable is not named, it will be assigned a name based on its position, starting from 1 (i.e. similar to regexp capture buffers). If a variable does not provide a matching pattern but the constructor was given a default with the "default_re" option, this will be used instead. Outside of the embedded variables, the string is interpreted literally; i.e. not as a regexp pattern. A backslash "\" may be used to escape the following character, allowing literal backslashes or dollar signs to be used. The intended use for this object class is that the template strings would come from a configuration file, or some other source of "trusted" input. In the current implementation, there is nothing to stop a carefully-crafted string from containing arbitrary perl code, which would be executed every time the "match()" or "interpolate()" methods are called. (See "SECURITY" section). This fact may be changed in a later version. Suffices By default, the beginning and end of the string match are both anchored. If the "allow_suffix" option is passed to the constructor, then the end of the string is not anchored, and instead, any suffix found by the "match()" method will be returned in a hash key called "_suffix". This may be useful, for example, when matching directory names, URLs, or other cases of strings with unconstrained suffices. The "interpolate()" method will not recognise this hash key; instead just use normal string concatenation on the result. my $userhomematch = String::MatchInterpolate->new( '/home/${USER/\w+/}/', allow_suffix => 1 ); my $vars = $userhomematch->match( "/home/fred/public_html" ); print "Need to fetch file $vars->{_suffix} from $vars->{USER}\n"; CONSTRUCTOR $smi = String::MatchInterpolate->new( $template, %opts ) Constructs a new "String::MatchInterpolate" object that represents the given template and returns it. $template A string containing the template in the format given above %opts A hash containing extra options. The following options are recognised: allow_suffix => BOOL A boolean flag. If true, then the end of the string will not be anchored, and instead, an extra suffix will be allowed to follow the matched portion. It will be returned as "_suffix" by the "match()" method. default_re => Regexp or STRING A precompiled Regexp or string defining a regexp to use if a variable does not provide a pattern of its own. delimiters => ARRAY of [Regexp or STRING] An array containing two precompliled Regexps or strings, giving the variable openning and closing delimiters. These default to "qr/\$\{/" and "qr/\}/" respectively, but by passing other values, other styles of template string may be parsed. delimiters => [ qr/\{/, qr/\}/ ] # To match {name/pattern/} METHODS @values = $smi->match( $str ) $vars = $smi->match( $str ) Attempts to match the given string against the template. In list context it returns a list of the captured variables, or an empty list if the match fails. In scalar context, it returns a HASH reference containing all the captured variables, or undef if the match fails. $str = $smi->interpolate( @values ) $str = $smi->interpolate( \%vars ) Interpolates the given variable values into the template and returns the generated string. The values may either be given as a list of strings, or in a single HASH reference containing named string values. @vars = $smi->vars() Returns the list of variable names defined / used by the template, in the order in which they appear. BENCHMARKS The template is compiled into a pair of strings containing perl code, which implement the matching and interpolation operations using normal perl regexps and string contatenation. These strings are then "eval()"ed into CODE references which the object stores. This makes it faster than a simple regexp that operates over the template string each time a match or interpolation needs to be performed. The following output compares the speed of "String::MatchInterpolate" against both direct hard-coded perl, and simple regexp operations. Comparing 'interpolate': Rate s/// S::MI native s/// 81938/s -- -44% -90% S::MI 145232/s 77% -- -82% native 806800/s 885% 456% -- Comparing 'match': Rate m// S::MI native m// 35354/s -- -46% -73% S::MI 65749/s 86% -- -50% native 131885/s 273% 101% -- (This was produced by the benchmark.pl file in the module's distribution.) SECURITY CONSIDERATIONS Because of the way the optimised match and interpolate functions are generated, it is possible to inject arbitrary perl code via the template given to the constructor. As such, this object should not be used when the source of that template is considered untrusted. Neither the "match()" nor "interpolate()" methods suffer this problem; any input into these is safe from exploit in this way. SEE ALSO The following may be used to provide just "interpolate()"-style operations: * String::Interpolate - Wrapper for builtin the Perl interpolation engine * Text::Sprintf::Named - sprintf-like function with named conversions The following may be used to provide just "match()"-style operations: * Regexp::NamedCaptures - Saves capture results to your own variables * perlre(1) - named capture buffers in perl 5.10 (the "(?<NAME>pattern)" format) AUTHOR Paul Evans <leonerd@leonerd.org.uk>