It is easy for a single designer to overlook some need or problem. If you see such a need or problem, please post your comments to Feature requests or Bugs.
This is an interim release with extensive changes to the pattern-matching package and some changes to the tests, but is not fully tested.
The scanning/pattern-matching package is still sparsely documented in the
user guide, but the JUnit tests (package ca.gorman.util.scan.junit) provide
some useful programming examples.
Change | Impact |
---|---|
Extensive redesign of packages ca.gorman.util.scan and ca.gorman.util.scan.spi. Includes new classes and interfaces: AbstractScanRule, AsynchronousScanState, Modifier, ScanState, and AbstractScanState. | Easier to use, more flexible, and supports asynchronous scanning, but breaks previous application code. |
Revised JUnit testing of the pattern-matching/scanning package; new ScannerExeceptionTest. | No impact on application code. |
Changed javadoc in ca.gorman.xml.parse, classes Element, ElementListener, ElementMapper, ListenerException, and ListenerRuntimeException. | None. |
New interface ca.gorman.xml.parse.NamespaceMapper extends ca.gorman.xml.parse.ElementMapper. | Supports combination of element mappers; see ca.gorman.xml.parse.util.NamespacePackageMapper and ca.gorman.xml.parse.util.ElementClassMapper. |
Added XML support classes to ca.gorman.xml.parse.util: ElementClassMapper and NamespacePackageMapper. | Support for automatic recognition and loading of element handlers. |
This release includes design changes to package ca.gorman.util.scan to
simplify the code required for pattern matching. The default implementation,
based on java.util.regex, now supports "possessive" pattern matching with
quantifiers and logical operators, as described in the javadoc for
java.util.regex.Pattern. It is not possible to fully support "greedy" or
"reluctant" pattern matching.
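The three quantifier styles can be compared directly in java.util.regex. The sketch below uses only the standard library, not the ca.gorman.util.scan API; the class name QuantifierDemo and the helper firstMatch are made up for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Compares greedy, reluctant, and possessive quantifiers in java.util.regex. */
class QuantifierDemo {

    /** Returns the first match of the regex in the input, or null if there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: consumes everything, then backs off until "foo" fits.
        System.out.println(firstMatch(".*foo", input));
        // Reluctant: consumes as little as possible before "foo".
        System.out.println(firstMatch(".*?foo", input));
        // Possessive: consumes everything and never backs off, so "foo" cannot match.
        System.out.println(firstMatch(".*+foo", input));
    }
}
```

A possessive quantifier never gives back what it has consumed, so a matcher built on it does not need to revisit earlier input.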
Change | Impact |
---|---|
ScannerFactory can now return a Scanner instance based on a Reader. | No longer necessary to create a buffer on the Reader and then create a Scanner on the buffer. |
Replaced Scanner method getWriterStack by method out, returning Writer instead of WriterStack, and modified AbstractScanner to require an Appendable instead of a Writer as the default output. | Requires change of method name in application code. Severs a not very useful connection to package ca.gorman.io. Allows a wider range of outputs without requiring any substantial programming changes. Scanner can be used for output to an NIO buffer, as well as to a Writer. |
Added class ca.gorman.util.scan.Modifier, with operators to create new ScanMatch instances from other instances. | Simplifies the design of complex pattern matching classes (subclasses of ScanMatch) by providing equivalents of the quantifiers and logical operators in Java regex pattern matching. |
Removed variable mapping feature from class MatchedText to classes Scanner and ScanBuffer, replaced by a capture method in class Modifier. | Better implementation of subsequence capture as named character sequences. |
Removed lookahead method from ScanMatch class, replaced by lookahead and lookaheadNot methods in class Modifier. | Simplifies the design of complex pattern matching classes (subclasses of ScanMatch) by providing equivalents of the quantifiers and logical operators in Java regex pattern matching. Also allows a lookahead (or negative lookahead failure) to be captured as a named character sequence. |
Removed classes AbstractScanMatch and AbstractScanRule because, without the lookahead method, class AbstractScanMatch implements no methods, and because class AbstractScanRule adds very little implementation. | Breaks code that did not replace the lookahead method or did not replace the action method. |
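The wider range of outputs follows from java.lang.Appendable being implemented by StringBuilder, by every Writer, and by java.nio.CharBuffer. The sketch below shows one output routine serving all three; AppendableDemo and emit are hypothetical names, and the GXPARSE Scanner itself is not used.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.CharBuffer;

/** One output routine that works with any Appendable target. */
class AppendableDemo {

    /** Appends the text in square brackets; a stand-in for scanner output. */
    static void emit(Appendable out, CharSequence text) {
        try {
            out.append('[').append(text).append(']');
        } catch (IOException e) {
            // Appendable.append declares IOException; rethrow unchecked for brevity.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        emit(builder, "abc");

        StringWriter writer = new StringWriter();
        emit(writer, "abc");

        CharBuffer buffer = CharBuffer.allocate(16); // NIO buffer as output
        emit(buffer, "abc");
        buffer.flip(); // switch the buffer from writing to reading

        System.out.println(builder);
        System.out.println(writer);
        System.out.println(buffer);
    }
}
```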
This release is a design change that breaks previous application code,
replacing an API class (ParseReader) by a Java SDK interface
(java.lang.CharSequence).
Change | Impact |
---|---|
Replaced ParseReader by java.lang.CharSequence. | Breaks previous application code, but supports closer integration of pattern-matching with XML parsing, and supports parsers that do not keep data in memory. |
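Code written against java.lang.CharSequence accepts a String, a StringBuilder, or a java.nio.CharBuffer without conversion. The sketch below illustrates this with standard library types only; CharSequenceDemo and leadingLetters are hypothetical names and are not part of GXPARSE.

```java
import java.nio.CharBuffer;

/** A scanning helper that depends only on the CharSequence interface. */
class CharSequenceDemo {

    /** Counts the letters at the start of the input. */
    static int leadingLetters(CharSequence input) {
        int i = 0;
        while (i < input.length() && Character.isLetter(input.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(leadingLetters("abc123"));                       // String
        System.out.println(leadingLetters(new StringBuilder("name=value"))); // StringBuilder
        System.out.println(leadingLetters(CharBuffer.wrap("xy 9")));        // NIO buffer
    }
}
```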
This release includes, as a major enhancement, scanning and pattern-matching tools that can do recursive pattern-matching, such as the recognition of nested sets of parentheses. Recursive descent pattern matching can be used to construct simple recursive descent parsers. The length of data that can be matched is limited only by the length of the input buffer created by the application.
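Recognition of nested parentheses is the standard case that a non-recursive pattern cannot handle. The sketch below shows the general technique in plain Java, assuming the input is available as a CharSequence; NestedParens and matchGroup are hypothetical names, and the code does not use the GXPARSE scanning classes.

```java
/** Recursive matching of balanced parentheses over a CharSequence. */
class NestedParens {

    /**
     * Tries to match a balanced group "( ... )" that starts at index start.
     * Returns the index just past the closing parenthesis, or -1 if the
     * input does not hold a complete group at that position.
     */
    static int matchGroup(CharSequence in, int start) {
        if (start >= in.length() || in.charAt(start) != '(') {
            return -1;
        }
        int pos = start + 1;
        while (pos < in.length()) {
            char c = in.charAt(pos);
            if (c == ')') {
                return pos + 1;             // this group is closed
            }
            if (c == '(') {
                pos = matchGroup(in, pos);  // recurse into the nested group
                if (pos < 0) {
                    return -1;              // nested group never closed
                }
            } else {
                pos++;                      // ordinary character inside the group
            }
        }
        return -1;                          // ran out of input before ')'
    }

    public static void main(String[] args) {
        System.out.println(matchGroup("(a(b)c)d", 0));
        System.out.println(matchGroup("((a)", 0));
    }
}
```

As in the text above, the only limit on the matched length is the size of the character sequence handed to the matcher.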
This release includes a change (upgrading to support SAX 2.0.1) that was previously planned for a beta release. The change was moved up to an alpha version because it could break application code that worked with a previous release.
Change | Impact |
---|---|
Text scanning software now operational, tested against java.lang.CharSequence and tested with java.io.Reader. | Supports scanning rules similar to the action rules (see a pattern, do something) of awk and perl. Also supports recursive pattern matching. |
Modified Listener and packages ca.gorman.xml.parse.sax and ca.gorman.xml.parse.toolkit to support SAX 2.0.1 (Java 1.5.0) interfaces Attributes2, EntityResolver2, and Locator2. | Breaks code that uses Listener.resolveEntity(String name, String systemId) and that must now use Listener.resolveEntity(String name, String publicId, String baseURI, String systemId). Applications can now use the information from the SAX extensions Attributes2, DefaultHandler2, EntityResolver2, and Locator2 when the underlying SAX parser supports them. |
Renamed DtdValidatingParserFactory to ValidatingParserFactory. | Breaks code that used DtdValidatingParserFactory. |
Changed XMLParser to an abstract class. | Breaks code that used any of the constructors, but makes it easier for end users to replace the class in future applications. |
Removed "public" modifiers from the members of all interfaces. | None. |
Moved some methods out of Listener to new subclasses CharacterListener and SkippedEntityListener. | No impact on existing code. Supports delegation of character methods and skipped entity method. |
Revised SkipListener and moved it from package ca.gorman.xml.parse.toolkit to package ca.gorman.xml.parse.util. | Breaks code using previous version. CurrentElement.skipContent can now be used with any implementation, by wrapping the original Listener in a SkipListener. |
More detailed testing (JUnit) of Listener.doElement. | None. |
Removed generic types from package ca.gorman.io.regex, renamed package to ca.gorman.util.scan. | No impact on existing code because the package is not yet working. The generic type to support subclasses of WriterStack added very little benefit; removing it greatly enhances the reusability of ScanAction and ScanAction subclasses, because application code will not be constrained to a particular subclass of WriterStack. |
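The four-argument resolveEntity listed in the table has the same parameter list as the standard SAX 2.0.1 callback in org.xml.sax.ext.EntityResolver2. The sketch below overrides that callback on the JDK class DefaultHandler2 and calls it directly; RecordingResolver is a hypothetical name, and the GXPARSE Listener interface is not shown because its full signature is not given here.

```java
import java.io.StringReader;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

/** Records the arguments of the SAX 2.0.1 four-argument resolveEntity callback. */
class RecordingResolver extends DefaultHandler2 {
    String lastName;
    String lastBaseURI;
    String lastSystemId;

    @Override
    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId) {
        lastName = name;
        lastBaseURI = baseURI;
        lastSystemId = systemId;
        // Substitute empty content instead of fetching the external entity.
        InputSource empty = new InputSource(new StringReader(""));
        empty.setSystemId(systemId);
        return empty;
    }

    public static void main(String[] args) {
        RecordingResolver resolver = new RecordingResolver();
        InputSource source =
                resolver.resolveEntity("chapter1", null, "file:/book/", "chapter1.xml");
        System.out.println(resolver.lastName + " relative to "
                + resolver.lastBaseURI + " resolves " + source.getSystemId());
    }
}
```

The entity name and the base URI are the two pieces of information that the older two-argument form could not pass to the application.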
A preliminary (and very rudimentary) user guide is included. An example application is included to illustrate the use of GXPARSE for converting Docbook XML to HTML.
Some JUnit tests have been added.
The source and build directory structure have been modified to conform
more closely with the structure that will be used in the beta release. More
information is given in the main build file (gxparse/admin/build.xml) and
in the preliminary user guide.
Change | Impact |
---|---|
Added a prototype of a package to provide supplementary pattern matching services on input streams and output streams. | None, because package is not yet working. |
Modified package ca.gorman.xml.parse to use interfaces and abstract classes, except for classes Input and ParseReader. Moved ErrorListener to package ca.gorman.xml.parse as abstract class AbstractErrorListener. | Breaks backward compatibility. Provides greater separation between application code and the implementation of the API. |
Removed iterator method from Resequencer, ResequencingWriter, and ResequencingWriter because of ambiguous meaning: does it return all Marks in the MarkGroup (definition actually used) or only those Marks that have been written to the particular Resequencer? | Application code must replace invocation of iterator() by invocation of getMarkGroup().iterator(). |
Changed ParseReader methods toString, toCharArray, appendTo, and WriteTo to not remove the remaining characters from the ParseReader. | Existing code must be modified if it depends on these methods clearing the ParseReader. However, the methods will behave more like other methods of the same name in package java.io. |
This version includes full namespace support. The namespace support has been used to translate a very minimal "book" from docbook XML to HTML, but has not been tested in any other way.
This version compiles with Java 1.5.0 beta 2 and Java 1.5.0 RC ("Release Candidate"). It should be used with Java 1.5.0 RC or later.
Change | Impact |
---|---|
Moved AbstractAttribute, AbstractAttribute, and AbstractAttribute from package ca.gorman.xml.parse to package ca.gorman.xml.parse.toolkit. | Reduced clutter in package ca.gorman.xml.parse. Implementations need to be recompiled. |
Added runtime exceptions ReparsedContentException (thrown when an element handler tries to parse content more than once) and UnparsedContentException (thrown when an element handler does not parse content at all). | Easier debugging of application code. No change required to existing applications. |
Added CoroutineTransferException to transfer checked exceptions between parser coroutine and application. | Easier debugging of application code because of reduced clutter in stack traces. No change required to existing applications. |
Added skipContent(), getRepeatCount(), getSiblingCount(), and getChildCount() methods to CurrentElement. | Breaks previous implementations, and requires applications to be recompiled. Allows an application to ignore content without explicitly discarding the content. Simplifies development of applications that need to handle the children of an element in ways that depend on the sequence or number of the children. |
Added another ElementMapper implementation (PackageNamespaceElementMapper) to map XML namespaces to Java packages, and to map XML elements of each namespace to individual Java classes in the corresponding packages. Each element can be handled by a single Java class. | Support for namespaces, and support for modular design in applications that deal with large numbers of elements. The new implementation will be used to support the docbook-to-HTML translator. |
"Prefixed name" operators removed from Element, Attribute, and Parser interfaces. Added DtdUtilities.getPrefixedName(QName) for use in applications where prefixes can be reliably used to identify namespaces. | Removes a source of application bugs when handling namespaces, because a prefixed name may not represent the same namespace at different points in the document. Breaks application code that used these operators. |
Replaced method getQuotedValue by method getQuotableValue and removed method getNameAndValue from interface Attribute and class AbstractAttribute. | Methods did not work correctly when a value contained both single and double quotes. The new method is more flexible because it replaces single and double quotes by corresponding entities, and the resulting value is not quoted. Breaks application code that used the previous methods. |
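The quoting change in the last row can be illustrated with a small stand-alone helper. This is a sketch of the idea only, not the GXPARSE implementation of getQuotableValue: the names QuotableValueDemo and quotable are hypothetical, and the helper deliberately handles nothing but the two quote characters.

```java
/** Replaces both quote characters by entities and adds no surrounding quotes. */
class QuotableValueDemo {

    /** Returns the value with " and ' replaced by the XML predefined entities. */
    static String quotable(String value) {
        return value.replace("\"", "&quot;").replace("'", "&apos;");
    }

    public static void main(String[] args) {
        String raw = "say \"hi\" to O'Neil";
        // The caller chooses the delimiter, because the value contains neither quote.
        System.out.println("title=\"" + quotable(raw) + "\"");
        System.out.println("title='" + quotable(raw) + "'");
    }
}
```

Because the result contains neither quote character, it can be placed between single or double quotes, which a pre-quoted value cannot offer when the text contains both.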
Change | Impact |
---|---|
Change to support namespaces. Global interface added to identify a user-defined object that carries a global scope for application data and methods. Parser interface modified to carry a subclass of Global; Listener interface modified to pass Parser to every method. Generic (parameterized) types are used to avoid the need for typecasting and to give stronger typing. | All Listener methods will have access to a global application scope even when element methods for each namespace are in a different class (one namespace per class) or a different package (one namespace per package). Application code must be modified to accept an extra parameter in each Listener method. |
Change to support namespaces. ca.gorman.xml.parse.ElementMapper class replaced by ca.gorman.xml.parse.ElementMapper interface and ca.gorman.xml.parse.util.SimpleElementMapper class. | Application code will have a choice of ElementMapper implementations with or without namespace support. Application code that previously instantiated ElementMapper with "new ElementMapper" must now instantiate with "new SimpleElementMapper". |
Good Java practice requires that current versions of an API remain compatible with code that worked for prior versions of the API. Some of the planned changes are incompatible with this requirement, and GXPARSE will remain in "alpha" status until those changes are completed.
Change | Impact |
---|---|
Preliminary implementation of Catalog | Partial support for docbook applications |
Move demo packages out of GXPARSE into preliminary user guide. Move XMLCheck and XMLDump from package ca.gorman.xml.parse.demo to package ca.gorman.xml.parse.util | Less clutter for experienced users |
Change | Impact |
---|---|
Fully implement Catalog | Full support for XML catalogs |
User guide | Easier first-time use |
Better exception transfer between application and parser coroutine | Current exception handling makes debugging difficult |