3 Pattern Matching

GXPARSE provides pattern-action rules that can be used like the similar rules in awk and Perl.
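
For readers unfamiliar with the awk idiom, the following is a minimal sketch of what "pattern-action" means, written with the standard java.util.regex classes only. It is not the GXPARSE API: the names PatternActionSketch, Rule and run are hypothetical and exist only for this illustration. Each rule pairs a regular expression with an action, and every rule is tried against every input line in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; this is not the GXPARSE API. */
final class PatternActionSketch {

    /** One rule: a regular expression plus the action to run when it matches. */
    static final class Rule {
        final Pattern pattern;
        final Function<Matcher, String> action;

        Rule(String regex, Function<Matcher, String> action) {
            this.pattern = Pattern.compile(regex);
            this.action = action;
        }
    }

    /** Try every rule against every line, awk-style, and collect the action results. */
    static List<String> run(List<Rule> rules, List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            for (Rule rule : rules) {
                Matcher m = rule.pattern.matcher(line);
                if (m.find()) {
                    out.add(rule.action.apply(m));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
                new Rule("^ERROR (.*)$", m -> "error: " + m.group(1)),
                new Rule("^#", m -> "comment"));
        List<String> lines = Arrays.asList("ERROR disk full", "# note", "ok");
        System.out.println(run(rules, lines));
    }
}
```

A line that matches no rule, such as "ok" above, produces no output, which mirrors how awk skips lines that no pattern selects.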

3.1 Recursive Pattern Matching

The scanner can match recursively because a ScanRule can invoke other ScanRules before returning a sequence of characters as an instance of MatchedText. The example below uses recursive invocation of ScanRules to recognize nested sets of opening and closing parentheses. The essential feature of this example is that ParenText invokes ParenInterior, which can in turn invoke ParenText to recognize an interior set of parentheses. The inner ParenText reports a match when it sees the closing parenthesis of the interior set, and the outer ParenText reports a match when it sees the closing parenthesis of the exterior set.

    /*
     * The classes in this example can use static initialization
     * (delegation to single ScanMatch instances) because the
     * ScanMatch instances are immutable.  Static initialization
     * is required to prevent recursive invocation of the
     * constructors of ParenText and ParenInterior.
     */

    /** Match parentheses and nested parentheses. */
    private static class ParenText implements ScanMatch
    {
        private final static ScanMatch scanMatch
                = quantifier.concat(new TextOpenParen(),
                                    new ParenInterior(),
                                    new TextCloseParen());

        public MatchedText match(ScanBuffer scanBuffer)
                throws IOException
        {
            return scanMatch.match(scanBuffer);
        }
    }

    /** Match any text except parentheses. */
    private static class TextExceptParen implements ScanMatch
    {
        private final static ScanMatch regexMatch
                = regexFactory.newRegexMatch("[^()]+");

        public MatchedText match(ScanBuffer scanBuffer)
                throws IOException
        {   return regexMatch.match(scanBuffer); }
    }

    /** Match one "open" parenthesis character. */
    private static class TextOpenParen implements ScanMatch
    {
        private final static ScanMatch regexMatch
                = regexFactory.newRegexMatch("\\(");

        public MatchedText match(ScanBuffer scanBuffer)
                throws IOException
        {   return regexMatch.match(scanBuffer); }
    }

    /** Match one "close" parenthesis character. */
    private static class TextCloseParen implements ScanMatch
    {
        private final static ScanMatch regexMatch
                = regexFactory.newRegexMatch("\\)");

        public MatchedText match(ScanBuffer scanBuffer)
                throws IOException
        {   return regexMatch.match(scanBuffer); }
    }

    /** Match the interior of parenthesized text, including any
     * nested parenthesized text. Interior may be of zero length.
     */
    private static class ParenInterior implements ScanMatch
    {
        private final static ScanMatch interiorMatch
                = quantifier.anyOf(new ParenText(),
                                   new TextExceptParen());
        private final static ScanMatch scanMatch
                = quantifier.zeroOrMore(interiorMatch);

        public MatchedText match(ScanBuffer scanBuffer)
                throws IOException
        {
            return scanMatch.match(scanBuffer);
        }
    }
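
To see the recursion in isolation, the same control flow can be sketched in plain Java without any GXPARSE classes. This is a hand-written approximation under stated assumptions, not the library's implementation: the class NestedParenSketch and its methods are hypothetical, positions are plain string indexes rather than a ScanBuffer, and a failed match is reported as -1 rather than through MatchedText. matchParenText plays the role of ParenText, and matchParenInterior plays the role of ParenInterior, consuming zero or more nested groups or non-parenthesis characters.

```java
/** Hypothetical illustration only; it does not use the GXPARSE classes. */
final class NestedParenSketch {

    /**
     * Match "(" interior ")" starting at pos.
     * Returns the index just past the closing parenthesis, or -1 on failure.
     */
    static int matchParenText(String s, int pos) {
        if (pos >= s.length() || s.charAt(pos) != '(') {
            return -1;
        }
        // Recursive step: the interior may itself contain parenthesized text.
        int end = matchParenInterior(s, pos + 1);
        if (end >= s.length() || s.charAt(end) != ')') {
            return -1;
        }
        return end + 1;
    }

    /**
     * Match zero or more of: a nested parenthesized group, or any
     * character that is not a parenthesis.  Never fails; the interior
     * may be empty.  Returns the index where the interior ends.
     */
    static int matchParenInterior(String s, int pos) {
        while (pos < s.length()) {
            char c = s.charAt(pos);
            if (c == '(') {
                int end = matchParenText(s, pos);   // recurse into the nested group
                if (end < 0) {
                    break;                          // unbalanced nested group: stop here
                }
                pos = end;
            } else if (c == ')') {
                break;                              // leave the ")" for the caller
            } else {
                pos++;                              // ordinary text
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(matchParenText("(a(b)c)", 0)); // prints 7
        System.out.println(matchParenText("(a(b", 0));    // prints -1
    }
}
```

As in the example above, the inner call returns when it reaches the closing parenthesis of the interior group, and the outer call then continues until it reaches its own closing parenthesis.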