John W. Campbell
This class enables selective parsing of XML-style data. It was designed
to be used in the JdlDocument object to parse documents that contain a
mixture of special tags and XHTML tags. It makes no reference to any DTD.
For this selective parsing, XML-style elements and tokens are divided
into four categories as follows:
- Tree elements. These are the elements which form the basic structure
of the document as viewed from the perspective from which the parsing
is to take place. They, and their associated data, will be stored in an
appropriately nested manner.
- Pass elements. These elements are not parsed but are passed without
modification from the input to the output. Though they are not parsed
as such, these elements must be children of the tree elements and
may not have tree elements as their children.
- Cut elements. If these elements are encountered, they are cut out
of the document and are not passed to the output.
- Cut tokens. These tokens are not passed to the output but any data
from the elements associated with these tokens is passed unmodified to
the output.
To illustrate the treatment of the last three cases, take the token 'I'
and the string:
<I> Test Case </I>
In the 'Pass elements' case, the following string will be passed
to the output:
<I> Test Case </I>
In the 'Cut elements' case, no string will be passed to the output.
In the 'Cut tokens' case, the following string will be passed
to the output:
Test Case
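The three treatments above can be sketched in a few lines of Java. This is only an illustration of the documented behaviour, not the parser's own code: the class CategoryDemo, its Category enum and the apply method are hypothetical names, and the sketch assumes a single, non-nested element without attributes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Jdl.JdlLib.
final class CategoryDemo {
    enum Category { PASS_ELEMENT, CUT_ELEMENT, CUT_TOKEN }

    /** Applies one category's treatment to each non-nested <name>...</name> element. */
    static String apply(String input, String name, Category category) {
        // Reluctant match so each start token pairs with the nearest end token.
        Pattern element = Pattern.compile(
                "<" + Pattern.quote(name) + ">(.*?)</" + Pattern.quote(name) + ">",
                Pattern.DOTALL);
        Matcher m = element.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());   // text before the element is kept
            switch (category) {
                case PASS_ELEMENT:
                    out.append(m.group());        // tokens and data pass unmodified
                    break;
                case CUT_ELEMENT:
                    break;                        // tokens and data are both dropped
                case CUT_TOKEN:
                    out.append(m.group(1));       // tokens dropped, data kept
                    break;
            }
            last = m.end();
        }
        out.append(input, last, input.length());  // text after the last element
        return out.toString();
    }
}
```

In this sketch the 'Cut tokens' case keeps the data exactly as it appears between the tokens, so the spaces around "Test Case" are retained.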
Class, constructor and methods:
Class Details
Accessible Fields
Constructor
Methods
- Package:
- Jdl.JdlLib;
- Class name:
- JdlSelectiveXMLParser
- Class definition:
- public class JdlSelectiveXMLParser
- Extends:
- Object
- Implements:
- none
- Actions:
- none
No fields with public, package or protected access defined.
A single constructor is available.
Constructor:
Standard constructor
This constructs a JdlSelectiveXMLParser object with four lists of element
names in the categories described above.
- Constructor Definition:
- public JdlSelectiveXMLParser(String[] tree_els, String[] pass_els, String[] cut_els, String[] cut_toks)
- Parameters List:
- tree_els
- List of token names for elements to be treated as forming
the structure of the document.
- pass_els
- List of token names for elements to be passed without
modification to the output. (Note that these may not have tree elements
as children).
- cut_els
- List of token names for elements to be cut out.
- cut_toks
- List of token names for which the tokens but not their
associated data are to be cut out.
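As a rough sketch of how four name lists of this kind can partition element names, the following stand-alone class takes the same four arrays in the documented order and looks up the category of a name. ElementCategories, Category and categoryOf are hypothetical names introduced for illustration. Exact, case-sensitive name matching and the handling of null lists are assumptions, not documented behaviour of JdlSelectiveXMLParser.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the library's implementation.
final class ElementCategories {
    enum Category { TREE, PASS, CUT_ELEMENT, CUT_TOKEN, UNKNOWN }

    private final Map<String, Category> byName = new HashMap<>();

    // Mirrors the documented parameter order: tree_els, pass_els, cut_els, cut_toks.
    ElementCategories(String[] treeEls, String[] passEls, String[] cutEls, String[] cutToks) {
        register(treeEls, Category.TREE);
        register(passEls, Category.PASS);
        register(cutEls, Category.CUT_ELEMENT);
        register(cutToks, Category.CUT_TOKEN);
    }

    private void register(String[] names, Category category) {
        if (names == null) {
            return; // assumption: a missing list is treated as empty
        }
        for (String name : names) {
            byName.put(name, category); // assumption: a name listed twice takes the later category
        }
    }

    /** Returns the category of a name, or UNKNOWN if it is in none of the four lists. */
    Category categoryOf(String name) {
        return byName.getOrDefault(name, Category.UNKNOWN);
    }
}
```

A name that falls into none of the lists comes back as UNKNOWN here, which corresponds to the "unknown elements" reported by the methods described below.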
The main method, parseXML, parses the input document as described above
and returns the parsed data in a JdlDocObject object. The other methods
enable details of error conditions to be found.
Methods:
Parse document - parseXML
Elements ok - elementsOK
List of unknown elements - unknownElements
Nesting ok - nestingOK
Children ok - childrenOK
Error message list - returnErrorMessages
This method parses the input document as described above and
returns the parsed data in a JdlDocObject object.
- Method Definition:
- public JdlDocObject parseXML(String doc_str, JdlError err)
- Parameters List:
- doc_str
- The input document as a String.
- err
- This object returns details of the first error found. If there
was an error 'err.err' will be set to true and 'err.flag' will give an
indication of the number of errors encountered. The messages in 'err.msg1'
and 'err.msg2' will give a description of the first error found. Further
error messages may be retrieved using the returnErrorMessages() method.
- Method Return:
- Returns the parsed document as a JdlDocObject object.
This method returns a flag indicating whether or not the elements encountered
were all recognised.
- Method Definition:
- public boolean elementsOK()
- Parameters List:
- none
- Method Return:
- Returns true if the elements were recognised or false if unknown
elements were encountered.
This method will return a list of any unknown elements encountered.
- Method Definition:
- public String[] unknownElements()
- Parameters List:
- none
- Method Return:
- Returns a list of the unknown element names, or null if there were none.
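One possible way to collect unknown element names is sketched below. It is a minimal illustration rather than the class's actual algorithm. UnknownElementScan and its regular expression are hypothetical, and the set of known names is assumed to be the union of the four constructor lists. The null return for "none found" follows the documented return convention.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the library's implementation.
final class UnknownElementScan {
    // Captures the name in a start token such as <I> or <A href="...">.
    // End tokens (</I>) do not match because '/' is not a letter.
    private static final Pattern START_TOKEN =
            Pattern.compile("<([A-Za-z][A-Za-z0-9]*)[^>]*>");

    /** Returns each unknown name once, in order of first appearance, or null if there are none. */
    static String[] unknownElements(String doc, Set<String> knownNames) {
        Set<String> unknown = new LinkedHashSet<>();
        Matcher m = START_TOKEN.matcher(doc);
        while (m.find()) {
            String name = m.group(1);
            if (!knownNames.contains(name)) {
                unknown.add(name);
            }
        }
        return unknown.isEmpty() ? null : unknown.toArray(new String[0]);
    }
}
```

Because the result may be null, a caller of a method with this convention needs a null check before iterating over the array.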
This method indicates whether or not the nesting of the tokens was
consistent, i.e. no unclosed tokens, no end tokens before start tokens
and no overlapping elements. As no DTD is associated with the parsing,
this is a general check and further checks may be needed at a later stage
when the parsed data are being processed.
- Method Definition:
- public boolean nestingOK(JdlError err)
- Parameters List:
- err
- This returns details of the first nesting error found if there was
one. In such a case 'err.err' will be returned as true, 'err.flag' will be
returned as -1 and 'err.msg1' and 'err.msg2' will give details of the error
and its context.
- Method Return:
- Returns true if the nesting was OK or false if there were one or
more nesting errors.
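The three nesting conditions (no unclosed tokens, no end token before its start token, no overlapping elements) can all be tested with a single stack. The sketch below shows that general technique. It is not taken from the class itself: NestingCheck and its token pattern are hypothetical, and treating tokens of the form <BR/> as empty elements that need no end token is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the library's implementation.
final class NestingCheck {
    // Group 1 is "/" for an end token, group 2 is the element name,
    // group 3 is "/" for an empty-element token such as <BR/>.
    private static final Pattern TOKEN =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>");

    static boolean nestingOK(String doc) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TOKEN.matcher(doc);
        while (m.find()) {
            boolean isEnd = !m.group(1).isEmpty();
            boolean isEmptyElement = !m.group(3).isEmpty();
            String name = m.group(2);
            if (isEnd) {
                // An end token must close the most recently opened element;
                // otherwise it is either unmatched or overlaps another element.
                if (open.isEmpty() || !open.pop().equals(name)) {
                    return false;
                }
            } else if (!isEmptyElement) {
                open.push(name);
            }
        }
        return open.isEmpty(); // anything left on the stack was never closed
    }
}
```

A check of this kind is purely structural. As noted above, without a DTD it cannot tell whether a given element is allowed at a given position.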
This method indicates whether or not all the children were valid, i.e.
none of the 'Pass elements' were parents of 'Tree elements' in the input
document.
- Method Definition:
- public boolean childrenOK(JdlError err)
- Parameters List:
- err
- This returns details of the first child error found if there was
one. In such a case 'err.err' will be returned as true, 'err.flag' will be
returned as -1 and 'err.msg1' and 'err.msg2' will give details of the error
and its context.
- Method Return:
- Returns true if the children were OK or false if there were one or
more children errors.
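A child check of this kind could be sketched as follows. ChildrenCheck is a hypothetical class, not the library's code. It assumes the input is already well nested and contains no empty-element tokens, and it flags a tree element anywhere inside an open pass element, which is a slightly broader reading than "direct parent".

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the library's implementation.
final class ChildrenCheck {
    // Group 1 is "/" for an end token, group 2 is the element name.
    private static final Pattern TOKEN =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    static boolean childrenOK(String doc, Set<String> treeNames, Set<String> passNames) {
        int openPass = 0; // number of pass elements currently open
        Matcher m = TOKEN.matcher(doc);
        while (m.find()) {
            boolean isEnd = !m.group(1).isEmpty();
            String name = m.group(2);
            if (passNames.contains(name)) {
                openPass += isEnd ? -1 : 1;
            } else if (!isEnd && treeNames.contains(name) && openPass > 0) {
                return false; // a tree element starts inside a pass element
            }
        }
        return true;
    }
}
```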
This method returns a list of strings with pairs of error messages and strings
indicating the context where the error was encountered.
- Method Definition:
- public String[] returnErrorMessages()
- Parameters List:
- none
- Method Return:
- Returns a list of error messages and context strings (null if none).