2.6.4 - Selective XML Parser - JdlSelectiveXMLParser

John W. Campbell

2.6.4.1 Introduction

This class enables selective parsing of XML-type data. It was designed for use in the JdlDocument object, to parse documents which contain a mixture of special tags and XHTML tags. It makes no reference to any DTD. For this selective parsing, XML-type elements and tokens are divided into four categories as follows:
  1. Tree elements. These are the elements which form the basic structure of the document as viewed from the perspective from which the parsing is to take place. They, and their associated data, will be stored in an appropriately nested manner.
  2. Pass elements. These elements are not parsed but are passed without modification from the input to the output. Though they are not parsed as such, these elements must be children of the tree elements and may not have tree elements as their children.
  3. Cut elements. If these elements are encountered, they are cut out of the document and are not passed to the output.
  4. Cut tokens. These tokens are not passed to the output but any data from the elements associated with these tokens is passed unmodified to the output.
To illustrate the treatment of the last three categories, take the token 'I' and the string:
<I> Test Case </I>
In the 'Pass elements' case, the following string will be passed to the output:
<I> Test Case </I>
In the 'Cut elements' case, no string will be passed to the output.
In the 'Cut tokens' case, the following string will be passed to the output:
Test Case
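
The three treatments above can be imitated with a few lines of standalone Java. This is an illustrative sketch only and is not the JdlSelectiveXMLParser implementation: the class name CategoryTreatmentSketch and its methods are hypothetical, it uses regular expressions rather than a real parser, it does not handle nested elements of the same name, and it leaves the white space around the data untouched.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the 'Pass elements', 'Cut elements' and
// 'Cut tokens' treatments. Not part of Jdl.JdlLib.
class CategoryTreatmentSketch {

    // Pass element: the element is copied to the output unchanged.
    static String passElement(String input, String name) {
        return input;
    }

    // Cut element: start token, data and end token are all removed.
    static String cutElement(String input, String name) {
        String q = Pattern.quote(name);
        Pattern p = Pattern.compile(
                "<" + q + "(\\s[^>]*)?>.*?</" + q + "\\s*>", Pattern.DOTALL);
        return p.matcher(input).replaceAll("");
    }

    // Cut token: only the start and end tokens are removed; the data stays.
    static String cutToken(String input, String name) {
        String q = Pattern.quote(name);
        Pattern p = Pattern.compile("</?" + q + "(\\s[^>]*)?>");
        return p.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String s = "<I> Test Case </I>";
        System.out.println("pass:        [" + passElement(s, "I") + "]");
        System.out.println("cut element: [" + cutElement(s, "I") + "]");
        System.out.println("cut token:   [" + cutToken(s, "I") + "]");
    }
}
```

The brackets printed by main make it easy to see that the 'Cut elements' case produces an empty string, while the 'Cut tokens' case keeps the data between the tokens.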

Class, constructor and methods:

Class Details
Accessible Fields
Constructor
Methods

2.6.4.2 Class Details

Package:
Jdl.JdlLib;
Class name:
JdlSelectiveXMLParser
Class definition:
public class JdlSelectiveXMLParser
Extends:
Object
Implements:
none
Actions:
none

2.6.4.3 Accessible Fields

No fields with public, package or protected access defined.

2.6.4.4 Constructor

2.6.4.4.1 Introduction

A single constructor is available.

Constructor:

Standard constructor

2.6.4.4.2 Standard constructor

This constructs a JdlSelectiveXMLParser object with four lists of element names in the categories described above.

Constructor Definition:
public JdlSelectiveXMLParser(String[] tree_els, String[] pass_els, String[] cut_els, String[] cut_toks)
Parameters List:
tree_els
List of token names for elements to be treated as forming the structure of the document.
pass_els
List of token names for elements to be passed without modification to the output. (Note that these may not have tree elements as children).
cut_els
List of token names for elements to be cut out.
cut_toks
List of token names for which the tokens but not their associated data are to be cut out.

2.6.4.5 Methods

2.6.4.5.1 Introduction

The main method, parseXML, parses the input document as described above and returns the parsed data in a JdlDocObject object. The other methods report details of any error conditions found.

Methods:

Parse document - parseXML
Elements ok - elementsOK
List of unknown elements - unknownElements
Nesting ok - nestingOK
Children ok - childrenOK
Error message list - returnErrorMessages

2.6.4.5.2 Parse document - parseXML

This method parses the input document as described above and returns the parsed data in a JdlDocObject object.

Method Definition:
public JdlDocObject parseXML(String doc_str, JdlError err)
Parameters List:
doc_str
The input document as a String.
err
This object returns details of the first error found. If there was an error, 'err.err' will be set to true and 'err.flag' will give an indication of the number of errors encountered. The messages in 'err.msg1' and 'err.msg2' will describe the first error found. Further error messages may be retrieved using the returnErrorMessages() method.
Method Return:
Returns the parsed document as a JdlDocObject object.

2.6.4.5.3 Elements ok - elementsOK

This method returns a flag indicating whether or not the elements encountered were all recognised.

Method Definition:
public boolean elementsOK()
Parameters List:
none
Method Return:
Returns true if all the elements were recognised or false if unknown elements were encountered.

2.6.4.5.4 List of unknown elements - unknownElements

This method will return a list of any unknown elements encountered.

Method Definition:
public String[] unknownElements()
Parameters List:
none
Method Return:
Returns a list of the unknown element names, or null if there were none.
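
The idea of an 'unknown element' can be sketched as follows: any start token whose name appears in none of the four lists given to the constructor. This is a standalone illustration under stated assumptions, not the library code: the class UnknownElementsSketch is hypothetical, names are compared exactly (case-sensitively), and start tokens are found with a simple regular expression.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of collecting unknown element names.
// Not part of Jdl.JdlLib.
class UnknownElementsSketch {

    // Returns the names found in the document that are in none of the four
    // lists, in order of first appearance, or null if every name is known.
    static String[] unknownElements(String doc, String[] treeEls,
            String[] passEls, String[] cutEls, String[] cutToks) {
        Set<String> known = new HashSet<>();
        known.addAll(Arrays.asList(treeEls));
        known.addAll(Arrays.asList(passEls));
        known.addAll(Arrays.asList(cutEls));
        known.addAll(Arrays.asList(cutToks));

        Set<String> unknown = new LinkedHashSet<>();
        // Matches the name of each start token; end tokens ("</...") are skipped.
        Matcher m = Pattern.compile("<([A-Za-z][A-Za-z0-9_:.-]*)").matcher(doc);
        while (m.find()) {
            if (!known.contains(m.group(1))) {
                unknown.add(m.group(1));
            }
        }
        return unknown.isEmpty() ? null : unknown.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String doc = "<DOC><P>Hi <B>x</B> <FOO>y</FOO></P></DOC>";
        String[] result = unknownElements(doc, new String[] {"DOC"},
                new String[] {"P"}, new String[] {}, new String[] {"B"});
        System.out.println(Arrays.toString(result));
    }
}
```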

2.6.4.5.5 Nesting ok - nestingOK

This method indicates whether or not the nesting of the tokens was consistent, i.e. no unclosed tokens, no end tokens before start tokens and no overlapping elements. As no DTD is associated with the parsing, this is a general check and further checks may be needed at a later stage when the parsed data are being processed.

Method Definition:
public boolean nestingOK(JdlError err)
Parameters List:
err
This returns details of the first nesting error found if there was one. In such a case 'err.err' will be returned as true, 'err.flag' will be returned as -1 and 'err.msg1' and 'err.msg2' will give details of the error and its context.
Method Return:
Returns true if the nesting was OK or false if there were one or more nesting errors.
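
A general nesting check of this kind, needing no DTD, is commonly done with a stack of open element names. The following standalone sketch shows that technique; it is an assumption about how such a check can work, not the code used by nestingOK. The class NestingCheckSketch and its message texts are hypothetical, and comments, CDATA sections and processing instructions are ignored for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a stack-based nesting check.
// Not part of Jdl.JdlLib.
class NestingCheckSketch {

    // Returns null if the nesting is consistent, otherwise a short
    // description of the first problem found.
    static String firstNestingError(String doc) {
        Deque<String> open = new ArrayDeque<>();
        // group 1: "/" for an end token; group 2: the name;
        // group 3: "/" for an empty element such as <BR/>.
        Matcher m = Pattern.compile(
                "<(/?)([A-Za-z][A-Za-z0-9_:.-]*)[^>]*?(/?)>").matcher(doc);
        while (m.find()) {
            boolean isEnd = !m.group(1).isEmpty();
            boolean isEmptyElement = !m.group(3).isEmpty();
            String name = m.group(2);
            if (!isEnd) {
                if (!isEmptyElement) {
                    open.push(name);   // start token: remember it
                }
                continue;
            }
            if (open.isEmpty()) {
                return "end token </" + name + "> has no start token";
            }
            String top = open.pop();
            if (!top.equals(name)) {
                return "end token </" + name
                        + "> does not match open element <" + top + ">";
            }
        }
        return open.isEmpty() ? null : "unclosed token <" + open.peek() + ">";
    }

    public static void main(String[] args) {
        System.out.println(firstNestingError("<A><B>x</B></A>"));
        System.out.println(firstNestingError("<A><B>x</A></B>"));
        System.out.println(firstNestingError("<A>x"));
    }
}
```

Each of the three conditions named above maps to one branch: an empty stack at an end token (end before start), a mismatch at the top of the stack (overlap), and a non-empty stack at the end of the input (unclosed token).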

2.6.4.5.6 Children ok - childrenOK

This method indicates whether or not all the children were valid, i.e. none of the 'Pass elements' were parents of 'Tree elements' in the input document.

Method Definition:
public boolean childrenOK(JdlError err)
Parameters List:
err
This returns details of the first child error found if there was one. In such a case 'err.err' will be returned as true, 'err.flag' will be returned as -1 and 'err.msg1' and 'err.msg2' will give details of the error and its context.
Method Return:
Returns true if the children were OK or false if there were one or more child errors.
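
The child rule can be sketched by tracking which pass elements are currently open and flagging any tree element that starts inside one. This is a standalone illustration, not the childrenOK implementation: the class ChildrenCheckSketch is hypothetical, the input is assumed to be consistently nested, and the sketch flags a tree element at any depth inside a pass element, which is a slightly broader reading than "direct child".

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the 'no tree elements inside pass elements'
// rule. Not part of Jdl.JdlLib.
class ChildrenCheckSketch {

    // Returns null if no tree element occurs inside a pass element,
    // otherwise a short description of the first violation.
    static String firstChildError(String doc, String[] treeEls, String[] passEls) {
        Set<String> tree = new HashSet<>(Arrays.asList(treeEls));
        Set<String> pass = new HashSet<>(Arrays.asList(passEls));
        Deque<String> openPass = new ArrayDeque<>();
        Matcher m = Pattern.compile(
                "<(/?)([A-Za-z][A-Za-z0-9_:.-]*)[^>]*?(/?)>").matcher(doc);
        while (m.find()) {
            boolean isEnd = !m.group(1).isEmpty();
            boolean isEmptyElement = !m.group(3).isEmpty();
            String name = m.group(2);
            if (!isEnd && tree.contains(name) && !openPass.isEmpty()) {
                return "tree element <" + name + "> inside pass element <"
                        + openPass.peek() + ">";
            }
            if (pass.contains(name) && !isEmptyElement) {
                if (!isEnd) {
                    openPass.push(name);        // a pass element opens
                } else if (!openPass.isEmpty()) {
                    openPass.pop();             // a pass element closes
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] tree = {"SEC"};
        String[] pass = {"P"};
        System.out.println(firstChildError("<SEC><P>text</P></SEC>", tree, pass));
        System.out.println(firstChildError("<P><SEC>text</SEC></P>", tree, pass));
    }
}
```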

2.6.4.5.7 Error message list - returnErrorMessages

This method returns a list of strings with pairs of error messages and strings indicating the context where the error was encountered.

Method Definition:
public String[] returnErrorMessages()
Parameters List:
none
Method Return:
Returns a list of error messages and context strings (null if none).
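
A caller has to allow for the null return and then walk the list in pairs. The sketch below shows one way to do that. It assumes that the returned array alternates message, context, message, context; the description above suggests this layout but does not state it explicitly, so check it against the actual output. The class ErrorMessagePairsSketch and the output format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for a String[] of the kind returned by
// returnErrorMessages(). Assumes alternating message/context entries.
// Not part of Jdl.JdlLib.
class ErrorMessagePairsSketch {

    // Joins each (message, context) pair into one line.
    // A null input (no errors) gives an empty list.
    static List<String> formatPairs(String[] messages) {
        List<String> lines = new ArrayList<>();
        if (messages == null) {
            return lines;
        }
        // A trailing entry without a partner is ignored.
        for (int i = 0; i + 1 < messages.length; i += 2) {
            lines.add(messages[i] + " [context: " + messages[i + 1] + "]");
        }
        return lines;
    }

    public static void main(String[] args) {
        String[] sample = {"Unclosed token", "<B> some text"};
        for (String line : formatPairs(sample)) {
            System.out.println(line);
        }
    }
}
```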
