Stream I/O Examples

Most of the Java stream classes support sequential input and output, but the sequence of XML input is not always directly related to the sequence of the required output. Moreover, the processing required for any part of the input may be influenced by content that occurs later in the document. The package ca.gorman.io provides helper classes that support the use of Java stream classes in XML applications that require nonsequential processing.

A WriterStack can be used to reverse the order of lines from an input file. Before writing each line to the WriterStack, a CharArrayWriter is pushed on the WriterStack. After all lines have been read, each CharArrayWriter is popped and copied to the final output. The method that writes the line to the WriterStack has no knowledge of the actual destination.
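The idea can be sketched with standard library classes only. The following is a minimal illustration, not the ca.gorman.io API: a java.util.Deque of CharArrayWriter stands in for the WriterStack, and the names ReverseLinesSketch, writeLine, reverse, and reverseText are hypothetical.

```java
import java.io.BufferedReader;
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the WriterStack idea; not the ca.gorman.io API.
class ReverseLinesSketch {

    // Writes one line to whatever buffer is currently on top of the stack.
    // This method has no knowledge of where the text finally ends up.
    static void writeLine(Deque<CharArrayWriter> stack, String line) {
        CharArrayWriter top = stack.peek();
        top.write(line, 0, line.length());
        top.write('\n');
    }

    static void reverse(Reader in, Writer out) throws IOException {
        Deque<CharArrayWriter> stack = new ArrayDeque<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            stack.push(new CharArrayWriter()); // push a fresh buffer before each line
            writeLine(stack, line);
        }
        // Pop the buffers (last in, first out) and copy each to the final output.
        while (!stack.isEmpty()) {
            stack.pop().writeTo(out);
        }
        out.flush();
    }

    // Convenience wrapper for in-memory text.
    static String reverseText(String text) {
        StringWriter out = new StringWriter();
        try {
            reverse(new StringReader(text), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(reverseText("one\ntwo\nthree\n")); // prints three, two, one on separate lines
    }
}
```

Note that writeLine only sees the top of the stack, which is the property the paragraph above describes: the code that produces text is decoupled from the code that decides where it goes.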

A Resequencer (implemented here by a ResequencingWriter) can be used as a substitute for random access to the output stream. When data is required at a particular point in the output stream, but is not available, a place marker is written instead. The value of the place marker can be set at any time before or after it is written. When the Resequencer is closed, all of the markers are replaced by their corresponding values. The final result is the same as if the output stream (which is a Writer) had been written with random access.
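The place-marker mechanism might look roughly like the sketch below. It is an assumption-laden illustration, not the actual ResequencingWriter: the class PlaceMarkerWriterSketch and its methods writeMarker and setValue are hypothetical names, and all output is simply buffered in memory until close().

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the place-marker idea; not the ResequencingWriter API.
class PlaceMarkerWriterSketch extends Writer {
    private final Writer target;
    private final List<String> parts = new ArrayList<>();      // literal text or marker names
    private final List<Boolean> isMarker = new ArrayList<>();  // parallel flags for parts
    private final Map<String, String> values = new HashMap<>();
    private StringBuilder current = new StringBuilder();
    private boolean closed = false;

    PlaceMarkerWriterSketch(Writer target) { this.target = target; }

    @Override
    public void write(char[] cbuf, int off, int len) {
        current.append(cbuf, off, len);
    }

    // Reserves a position in the output for a value that may not be known yet.
    void writeMarker(String name) {
        parts.add(current.toString()); isMarker.add(false);
        current = new StringBuilder();
        parts.add(name); isMarker.add(true);
    }

    // May be called before or after the marker is written.
    void setValue(String name, String value) { values.put(name, value); }

    @Override
    public void flush() { /* nothing reaches the target until close() */ }

    // Replaces every marker by its value and writes the result in order.
    @Override
    public void close() throws IOException {
        if (closed) return;
        closed = true;
        parts.add(current.toString()); isMarker.add(false);
        for (int i = 0; i < parts.size(); i++) {
            String part = parts.get(i);
            target.write(isMarker.get(i) ? values.getOrDefault(part, "") : part);
        }
        target.close();
    }

    static String demo() {
        StringWriter out = new StringWriter();
        try (PlaceMarkerWriterSketch w = new PlaceMarkerWriterSketch(out)) {
            w.write("Total lines: ");
            w.writeMarker("count");    // value not known yet
            w.write("\nalpha\nbeta\n");
            w.setValue("count", "2");  // set after the marker was written
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo()); // first line printed is "Total lines: 2"
    }
}
```

In this sketch a marker whose value is never set is replaced by the empty string; how the real class treats that case is not stated here.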

Stream Examples
Example | Java code | Input | Output
Using WriterStack to reverse the order of lines in a file | StackLines.java | lines.txt | stacklines.txt
Using ResequencingWriter to support nonsequential output to a Writer | ResequencingWriterDemo.java | lines.txt | resequencingdemo.txt

Pattern Matching Examples

Java supports regular expression pattern matching on java.lang.CharSequence, an interface that is implemented by java.lang.String and by java.nio.CharBuffer, among others.
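Because java.util.regex.Pattern.matcher accepts any CharSequence, the same compiled pattern can be applied to a String, a CharBuffer, or a StringBuilder without conversion. A minimal sketch (the class and method names are hypothetical):

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharSequenceMatchSketch {

    // Counts the matches of a pattern in any CharSequence.
    static int countMatches(Pattern pattern, CharSequence input) {
        Matcher m = pattern.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Pattern word = Pattern.compile("\\w+");
        System.out.println(countMatches(word, "alpha beta gamma"));                    // String: prints 3
        System.out.println(countMatches(word, CharBuffer.wrap("alpha beta gamma")));   // CharBuffer: prints 3
        System.out.println(countMatches(word, new StringBuilder("alpha beta gamma"))); // StringBuilder: prints 3
    }
}
```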

The pattern-action rules operate in a manner very similar to the rules in awk and perl, where a pattern is followed by a code block that is to be executed on the input that matches the pattern. The principal differences are:

  1. The rules are mutually exclusive: if two rules can recognize the same sequence of characters, only the first rule in the list does so.
  2. The rules support recursive pattern matching. A rule can be written to recognize text enclosed by pairs of parentheses, and to re-invoke itself if that text itself contains text enclosed by a pair of parentheses. The outer invocation will be suspended until the inner invocation is complete, and will then resume, thereby recognizing the complete nested sets of parentheses.
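Both properties can be illustrated with java.util.regex alone. The sketch below is not the ca.gorman.util.scan API; the class NestedRulesSketch and its rule layout are assumptions made for illustration. Rules are tried in list order at the current position, so only the first matching rule consumes the text, and the rule for an opening parenthesis re-invokes the scanner, suspending the outer invocation until the matching closing parenthesis is found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ordered, recursive pattern-action rules.
class NestedRulesSketch {
    private static final Pattern OPEN = Pattern.compile("\\(");
    private static final Pattern CLOSE = Pattern.compile("\\)");
    private static final Pattern TEXT = Pattern.compile("[^()]+");

    private final CharSequence input;
    private int pos = 0;

    NestedRulesSketch(CharSequence input) { this.input = input; }

    // Tries a pattern at the current position only.
    // On success, advances past the match and returns it; otherwise returns null.
    private String match(Pattern p) {
        Matcher m = p.matcher(input).region(pos, input.length());
        if (!m.lookingAt()) {
            return null;
        }
        pos = m.end();
        return m.group();
    }

    // Applies the rules in order until the input ends or a ")" closes this level.
    // Each parenthesized group is rewritten as [depth:content].
    String scan(int depth) {
        StringBuilder out = new StringBuilder();
        while (pos < input.length()) {
            String text;
            if (match(OPEN) != null) {
                // Rule 1: re-invoke for the nested group; this level waits.
                String inner = scan(depth + 1);
                out.append('[').append(depth + 1).append(':').append(inner).append(']');
            } else if (match(CLOSE) != null) {
                // Rule 2: the current group is complete; resume the outer level.
                return out.toString();
            } else if ((text = match(TEXT)) != null) {
                // Rule 3: plain text is copied through.
                out.append(text);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(new NestedRulesSketch("a(b(c)d)e").scan(0)); // prints a[1:b[2:c]d]e
    }
}
```

The sketch does not report unbalanced parentheses; a stray ")" at the outermost level simply ends the scan.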

Package ca.gorman.util.scan extends pattern matching to include the application of multiple pattern-action rules to a CharBuffer or a Reader, producing output to an Appendable or a Writer. The package can be used by itself, or with the XML parsing package.

The reference implementation is based on java.util.regex, but can handle multiple patterns in the same pass, and is sufficiently powerful and flexible to do recursive-descent parsing of an input stream.

Pattern Matching
Example | Java code | Input | Output
Implementing a Four-Function Calculator with Pattern Rules (Still In Development) | Calculator.java | Not Available Yet | Not Available Yet
JUnit test for recognizing nested parentheses. (This will be replaced later by an example.) | NestedParenthesesTest.java | Input is part of test | Output is part of test

XML Parsing Examples

GXPARSE allows a programmer to use a sequential processing paradigm (like SAX) while treating elements and other structures as single objects (like DOM), and it provides easy access to structural information (like DOM).

The Elements example illustrates the basic principles of parsing with GXPARSE.

Idrefs is a more elaborate example showing how to do more complex processing while still working in a stream-processing paradigm. A Resequencer is used to support the processing of ID and IDREF attributes without the need for complex programming. A WriterStack is used to simplify programming by allowing temporary redirection of output. An ElementMapper makes it unnecessary to test for the name of every element.

Taglist uses WriterStacks and a Resequencer to print summary information about an XML document.

XML Parsing
Example | Java code | Input | Output
Translate XML to plain text, giving some elements special handling and passing the rest through a single handler | Elements.java | example.xml | elements.txt
Handle ID and IDREF attributes using Resequencer, WriterStack, and ElementMapper as described above | Idrefs.java | example.xml | idrefs.txt
List element tags with element character count | Taglist.java | example.xml | taglist.txt

Docbook XML Examples

The following examples are taken from the docbook-to-HTML translator that is used to produce the HTML version of the GXPARSE user manual. They show how a docbook itemizedlist with listitem members is translated to an HTML unordered list (UL) with LI elements.

Because of the large number of elements, elements are mapped to classes instead of to methods. In this mapping, namespaces map to packages instead of to classes. It is not necessary to specify a handler for every element if the unspecified elements can all be handled in the same way by a default handler.
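One way such a mapping could work is by reflection: a namespace selects a class-name prefix, the element name selects the class, and a missing class falls back to the default handler. The sketch below is an assumption for illustration, not the GXPARSE ElementMapper. The Handler interface, the namespace URI "urn:example:docbook", and the use of nested classes in place of a real package (so that the example fits in one file) are all hypothetical; only the Element_ naming pattern is borrowed from the file names listed on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of element-to-class mapping with a default handler.
class ElementClassMapSketch {

    interface Handler { String handle(String content); }

    static class Element_itemizedlist implements Handler {
        public String handle(String content) { return "<UL>" + content + "</UL>"; }
    }

    static class Element_listitem implements Handler {
        public String handle(String content) { return "<LI>" + content + "</LI>"; }
    }

    // Used for every element that has no class of its own.
    static class DefaultHandler implements Handler {
        public String handle(String content) { return content; }
    }

    // Namespace URI -> class-name prefix (in a real layout, a package name plus ".").
    private final Map<String, String> prefixes = new HashMap<>();

    void mapNamespace(String uri, String prefix) { prefixes.put(uri, prefix); }

    Handler handlerFor(String uri, String localName) {
        String prefix = prefixes.get(uri);
        if (prefix != null) {
            try {
                Class<?> c = Class.forName(prefix + "Element_" + localName);
                return (Handler) c.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException | ClassCastException e) {
                // No usable class for this element: fall through to the default.
            }
        }
        return new DefaultHandler();
    }

    static String demo() {
        ElementClassMapSketch mapper = new ElementClassMapSketch();
        // Nested classes stand in for a package, hence the "$" separator.
        mapper.mapNamespace("urn:example:docbook", ElementClassMapSketch.class.getName() + "$");
        String item = mapper.handlerFor("urn:example:docbook", "listitem").handle("one");
        String list = mapper.handlerFor("urn:example:docbook", "itemizedlist").handle(item);
        String other = mapper.handlerFor("urn:example:docbook", "emphasis").handle("text");
        return list + " " + other;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints <UL><LI>one</LI></UL> text
    }
}
```

With this arrangement, supporting a new element means adding one class; no dispatch code needs to change.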

Docbook XML Examples
Description | Java code
Setting up and running the parser | Transform.java
Invoking the ElementMapper | DocbookListener.java
Processing itemizedlist | Element_itemizedlist.java
Processing listitem | Element_listitem.java
Processing an element that has no declared handler | DefaultHandler.java
Discarding the content of an element | Element_author.java