SAX is a standard interface for event-based XML parsing, developed collaboratively by the members of the XML-DEV mailing list.
There are two major types of XML (or SGML) APIs:
A tree-based API compiles an XML document into an internal tree structure, then allows an application to navigate that tree using the Document Object Model (DOM), a standard tree-based API for XML and HTML documents.
An event-based API, on the other hand, reports parsing events (such as the start and end of elements) directly to the application through callbacks, and does not usually build an internal tree. The application implements handlers to deal with the different events, much like handling events in a graphical user interface.
Tree-based APIs are useful for a wide range of applications, but they often put a great strain on system resources, especially if the document is large (under very controlled circumstances, it is possible to construct the tree in a lazy fashion to avoid some of this problem). Furthermore, some applications need to build their own, different data trees, and it is very inefficient to build a tree of parse nodes, only to map it onto a new tree.
In both of these cases, an event-based API provides simpler, lower-level access to an XML document: you can parse documents much larger than your available system memory, and you can construct your own data structures using your callback event handlers.
To use SAX, an xmlsaxcb structure is initialized with function pointers and passed to the xmlinit() call. A pointer to a user-defined context structure may also be included; that context pointer will be passed to each SAX function.
The SAX callback structure:
typedef struct
{
    sword (*startDocument)(void *ctx);
    sword (*endDocument)(void *ctx);
    sword (*startElement)(void *ctx, const oratext *name, const struct xmlarray *attrs);
    sword (*endElement)(void *ctx, const oratext *name);
    sword (*characters)(void *ctx, const oratext *ch, size_t len);
    sword (*ignorableWhitespace)(void *ctx, const oratext *ch, size_t len);
    sword (*processingInstruction)(void *ctx, const oratext *target, const oratext *data);
    sword (*notationDecl)(void *ctx, const oratext *name,
                          const oratext *publicId, const oratext *systemId);
    sword (*unparsedEntityDecl)(void *ctx, const oratext *name, const oratext *publicId,
                                const oratext *systemId, const oratext *notationName);
    sword (*nsStartElement)(void *ctx, const oratext *qname,
                            const oratext *local, const oratext *nsp,
                            const struct xmlnodes *attrs);
} xmlsaxcb;
typedef unsigned char oratext;
typedef signed int sword;
typedef struct xmlattrs xmlattrs;
Note: The contents of xmlattrs are private and must not be accessed by users.
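The initialization described above can be sketched roughly as follows. This is only an illustration: the types are re-declared locally, and the structure is cut down to two slots so that the fragment compiles without the parser headers. The names demo_saxcb, my_ctx, on_start_doc, on_end_elem, simulate_parse, and demo_count_closed are hypothetical, the zero return value meaning "continue" is an assumption, and the real hand-off to xmlinit() is deliberately not shown.

```c
/* Local stand-ins so the sketch compiles without the parser headers. */
typedef unsigned char oratext;
typedef signed int sword;

/* Hypothetical two-slot subset of the callback structure shown above. */
typedef struct
{
    sword (*startDocument)(void *ctx);
    sword (*endElement)(void *ctx, const oratext *name);
} demo_saxcb;

/* User-defined context; the same pointer reaches every callback. */
typedef struct
{
    int documents_started;
    int elements_closed;
} my_ctx;

static sword on_start_doc(void *ctx)
{
    ((my_ctx *)ctx)->documents_started++;
    return 0; /* assumption: zero means "continue parsing" */
}

static sword on_end_elem(void *ctx, const oratext *name)
{
    (void)name; /* the name is not needed just to count */
    ((my_ctx *)ctx)->elements_closed++;
    return 0;
}

/* Stand-in for the parser: fires the events that a document like
   <a><b/></a> would produce for the two slots above. */
static void simulate_parse(const demo_saxcb *cb, void *ctx)
{
    cb->startDocument(ctx);
    cb->endElement(ctx, (const oratext *)"b");
    cb->endElement(ctx, (const oratext *)"a");
}

/* Wires handlers and context together, then returns the count. */
int demo_count_closed(void)
{
    demo_saxcb cb;
    my_ctx state = {0, 0};

    cb.startDocument = on_start_doc;
    cb.endElement = on_end_elem;

    simulate_parse(&cb, &state);
    return state.elements_closed;
}
```

Keeping all mutable state in the context structure, rather than in globals, lets one set of handlers serve several parses at once.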
characters
PURPOSE
This callback function receives notification of character data inside an element.
SYNTAX
sword (*characters)(void *ctx, const oratext *ch, size_t len);
PARAMETERS
ctx (IN) - client context pointer
ch (IN) - the characters
len (IN) - number of characters to use from the character pointer
COMMENTS
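One possible handler appends each chunk to a buffer held in the client context. The sketch below assumes one byte per character, that ch is not necessarily NUL-terminated, and that a single run of text may arrive over several calls; none of these is stated here. The names text_buf and on_characters, and the nonzero "failure" return value, are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins so the sketch compiles without the parser headers. */
typedef unsigned char oratext;
typedef signed int sword;

/* Hypothetical context: a fixed-size accumulator for character data. */
typedef struct
{
    char data[64];
    size_t used;
} text_buf;

/* Appends len characters from ch; only len is trusted, not a terminator. */
sword on_characters(void *ctx, const oratext *ch, size_t len)
{
    text_buf *buf = (text_buf *)ctx;

    /* Leave room for the trailing NUL added below. */
    if (len >= sizeof buf->data - buf->used)
        return 1; /* assumption: nonzero reports a failure */

    memcpy(buf->data + buf->used, ch, len);
    buf->used += len;
    buf->data[buf->used] = '\0'; /* keep the buffer usable as a C string */
    return 0; /* assumption: zero means "continue parsing" */
}
```

The buffer must start out zeroed with used set to 0; a real application would typically reset it in its start-element handler and consume it in its end-element handler.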
endDocument
PURPOSE
This callback function receives notification of the end of the document.
SYNTAX
sword (*endDocument)(void *ctx);
PARAMETERS
ctx (IN) - client context
COMMENTS
endElement
PURPOSE
This callback function receives notification of the end of an element.
SYNTAX
sword (*endElement)(void *ctx, const oratext *name);
PARAMETERS
ctx (IN) - client context
name (IN) - element type name
COMMENTS
ignorableWhitespace
PURPOSE
This callback function receives notification of ignorable whitespace in element content.
SYNTAX
sword (*ignorableWhitespace)(void *ctx, const oratext *ch, size_t len);
PARAMETERS
ctx (IN) - client context
ch (IN) - whitespace characters
len (IN) - number of characters to use from the character pointer
COMMENTS
notationDecl
PURPOSE
This callback function receives notification of a notation declaration.
SYNTAX
sword (*notationDecl)(void *ctx, const oratext *name, const oratext *publicId, const oratext *systemId);
PARAMETERS
ctx (IN) - client context
name (IN) - notation name
publicId (IN) - notation public identifier, or null if not available
systemId (IN) - notation system identifier
COMMENTS
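Because publicId may be null, a handler should test it before use. The sketch below records a one-line summary of the most recent notation, and also guards systemId as a precaution. The names notation_log, or_none, and on_notation_decl are hypothetical, and the sketch assumes NUL-terminated single-byte strings and a zero return value meaning "continue".

```c
#include <stdio.h>
#include <stddef.h>

/* Local stand-ins so the sketch compiles without the parser headers. */
typedef unsigned char oratext;
typedef signed int sword;

/* Hypothetical context: summary of the last notation plus a counter. */
typedef struct
{
    char last[128];
    int count;
} notation_log;

/* Maps a null identifier to a printable placeholder. */
static const char *or_none(const oratext *s)
{
    return s != NULL ? (const char *)s : "(none)";
}

sword on_notation_decl(void *ctx, const oratext *name,
                       const oratext *publicId, const oratext *systemId)
{
    notation_log *log = (notation_log *)ctx;

    /* snprintf truncates rather than overflows if the text is too long. */
    snprintf(log->last, sizeof log->last, "%s public=%s system=%s",
             (const char *)name, or_none(publicId), or_none(systemId));
    log->count++;
    return 0; /* assumption: zero means "continue parsing" */
}
```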
processingInstruction
PURPOSE
This callback function receives notification of a processing instruction.
SYNTAX
sword (*processingInstruction)(void *ctx, const oratext *target, const oratext *data);
PARAMETERS
ctx (IN) - client context
target (IN) - processing instruction target
data (IN) - processing instruction data, or null if none is supplied
COMMENTS
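A handler for this callback usually dispatches on the target and must allow for null data. The following is a minimal sketch under those assumptions: pi_ctx and on_pi are hypothetical names, the strings are taken to be NUL-terminated and single-byte, and zero is assumed to mean "continue".

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins so the sketch compiles without the parser headers. */
typedef unsigned char oratext;
typedef signed int sword;

/* Hypothetical context: tallies of the instructions seen. */
typedef struct
{
    int stylesheets;   /* xml-stylesheet instructions that carry data */
    int without_data;  /* instructions reported with null data */
} pi_ctx;

sword on_pi(void *ctx, const oratext *target, const oratext *data)
{
    pi_ctx *p = (pi_ctx *)ctx;

    if (data == NULL) {
        /* Nothing to interpret; record it and move on. */
        p->without_data++;
        return 0;
    }
    if (strcmp((const char *)target, "xml-stylesheet") == 0)
        p->stylesheets++;
    return 0; /* assumption: zero means "continue parsing" */
}
```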
startDocument
PURPOSE
This callback function receives notification of the beginning of the document.
SYNTAX
sword (*startDocument)(void *ctx);
PARAMETERS
ctx (IN) - client context
COMMENTS
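The document-level callbacks are a natural place to set up and finish per-document state. The sketch below pairs a start handler with an end handler and rejects an end event that arrives without a matching start. The names doc_state, on_start_document, and on_end_document are hypothetical, as is the convention that zero means "continue" and nonzero reports a failure.

```c
/* Local stand-in so the sketch compiles without the parser headers. */
typedef signed int sword;

/* Hypothetical context tracking the document lifecycle. */
typedef struct
{
    int in_document; /* 1 between the start and end notifications */
    int documents;   /* number of documents seen to completion */
} doc_state;

sword on_start_document(void *ctx)
{
    doc_state *st = (doc_state *)ctx;

    st->in_document = 1;
    return 0; /* assumption: zero means "continue parsing" */
}

sword on_end_document(void *ctx)
{
    doc_state *st = (doc_state *)ctx;

    if (!st->in_document)
        return 1; /* assumption: nonzero reports a failure */

    st->in_document = 0;
    st->documents++;
    return 0;
}
```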
startElement
PURPOSE
This callback function receives notification of the beginning of an element.
SYNTAX
sword (*startElement)(void *ctx, const oratext *name, const struct xmlattrs *attrs);
PARAMETERS
ctx (IN) - client context
name (IN) - element type name
attrs (IN) - specified or defaulted attributes
COMMENTS
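Paired with the end-element callback, this callback is enough to track nesting without building a tree. The sketch below counts elements and records the deepest nesting level. The attrs argument is never dereferenced, since its contents are private. The names depth_ctx, on_start_element, and on_end_element are hypothetical, and the return value convention (zero to continue, nonzero on failure) is an assumption.

```c
/* Local stand-ins so the sketch compiles without the parser headers. */
typedef unsigned char oratext;
typedef signed int sword;
struct xmlattrs; /* opaque here: declared but never inspected */

/* Hypothetical context: running statistics about element nesting. */
typedef struct
{
    int depth;     /* number of currently open elements */
    int max_depth; /* deepest nesting seen so far */
    int elements;  /* total elements started */
} depth_ctx;

sword on_start_element(void *ctx, const oratext *name,
                       const struct xmlattrs *attrs)
{
    depth_ctx *d = (depth_ctx *)ctx;

    (void)name;
    (void)attrs;
    d->elements++;
    d->depth++;
    if (d->depth > d->max_depth)
        d->max_depth = d->depth;
    return 0; /* assumption: zero means "continue parsing" */
}

sword on_end_element(void *ctx, const oratext *name)
{
    depth_ctx *d = (depth_ctx *)ctx;

    (void)name;
    if (d->depth == 0)
        return 1; /* assumption: nonzero reports a failure */
    d->depth--;
    return 0;
}
```

Memory use stays constant regardless of document size, which is the main advantage of the event-based approach over a tree.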
unparsedEntityDecl
PURPOSE
This callback function receives notification of an unparsed entity declaration.
SYNTAX
sword (*unparsedEntityDecl)(void *ctx, const oratext *name, const oratext *publicId, const oratext *systemId,
const oratext *notationName);
PARAMETERS
ctx (IN) - client context
name (IN) - entity name
publicId (IN) - entity public identifier, or null if not available
systemId (IN) - entity system identifier
notationName (IN) - name of the associated notation
COMMENTS
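An application that later needs to resolve an unparsed entity to its notation can record the pairs as they are declared. The sketch below copies the strings into a small table because it does not assume the pointers remain valid after the callback returns. The names entity_rec, entity_table, copy_text, on_unparsed_entity, and notation_of are hypothetical; the strings are assumed to be NUL-terminated and single-byte, and names longer than the fixed field are silently truncated.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins so the sketch compiles without the parser headers. */
typedef unsigned char oratext;
typedef signed int sword;

#define MAX_ENTITIES 8
#define NAME_LEN 32

/* Hypothetical record and table for declared unparsed entities. */
typedef struct
{
    char name[NAME_LEN];
    char notation[NAME_LEN];
} entity_rec;

typedef struct
{
    entity_rec items[MAX_ENTITIES];
    size_t count;
} entity_table;

/* Copies at most NAME_LEN - 1 bytes and always terminates dst. */
static void copy_text(char *dst, const oratext *src)
{
    strncpy(dst, (const char *)src, NAME_LEN - 1);
    dst[NAME_LEN - 1] = '\0';
}

sword on_unparsed_entity(void *ctx, const oratext *name,
                         const oratext *publicId, const oratext *systemId,
                         const oratext *notationName)
{
    entity_table *t = (entity_table *)ctx;

    (void)publicId; /* may be null; not needed for this lookup table */
    (void)systemId;
    if (t->count == MAX_ENTITIES)
        return 1; /* assumption: nonzero reports a failure */

    copy_text(t->items[t->count].name, name);
    copy_text(t->items[t->count].notation, notationName);
    t->count++;
    return 0; /* assumption: zero means "continue parsing" */
}

/* Returns the notation recorded for an entity, or NULL if unknown. */
const char *notation_of(const entity_table *t, const char *entity)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->items[i].name, entity) == 0)
            return t->items[i].notation;
    return NULL;
}
```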
nsStartElement
PURPOSE
This callback function receives notification of the beginning of a namespace-qualified element, reporting its qualified name, local name, and namespace URI.
SYNTAX
sword (*nsStartElement)(void *ctx, const oratext *qname, const oratext *local, const oratext *namespace,
const struct xmlattrs *attrs);
PARAMETERS
ctx (IN) - client context
qname (IN) - element fully qualified name
local (IN) - element local name
namespace (IN) - element namespace (URI)
attrs (IN) - specified or defaulted attributes
COMMENTS
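With the namespace URI and local name supplied separately, a handler can match elements without depending on whichever prefix the document chose. The sketch below counts XHTML anchor elements that way. The names ns_ctx and on_ns_start_element are hypothetical, and the namespace parameter is called nsp, as in the callback structure. The sketch also assumes NUL-terminated single-byte strings, that the namespace pointer may be null for an element in no namespace, and that zero means "continue".

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins so the sketch compiles without the parser headers. */
typedef unsigned char oratext;
typedef signed int sword;
struct xmlattrs; /* opaque here: declared but never inspected */

#define XHTML_NS "http://www.w3.org/1999/xhtml"

/* Hypothetical context: number of XHTML <a> elements seen. */
typedef struct
{
    int xhtml_links;
} ns_ctx;

sword on_ns_start_element(void *ctx, const oratext *qname,
                          const oratext *local, const oratext *nsp,
                          const struct xmlattrs *attrs)
{
    ns_ctx *c = (ns_ctx *)ctx;

    (void)qname; /* the prefix is irrelevant to the comparison */
    (void)attrs;

    /* Compare URI and local name; guard against a null namespace. */
    if (nsp != NULL &&
        strcmp((const char *)nsp, XHTML_NS) == 0 &&
        strcmp((const char *)local, "a") == 0)
        c->xhtml_links++;
    return 0; /* assumption: zero means "continue parsing" */
}
```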