doc

A document. Each file contains one or more documents. Typically there is one document per file for books, ephemera, poems, etc., but a newspaper file may hold hundreds of documents: one file contains the whole daily issue, and each document corresponds to an article.

A document is identified by a numerical id attribute; documents are simply numbered within a file, starting at 1. For ease of local reference, the name of the file in which the document resides is repeated on every document in that file, in the file attribute. The full path to the archive is used for the file reference, even though care has been taken to identify every file in the CNC (and thus in the PDT as well) uniquely by its filename alone.
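The numbering and file conventions described above can be checked mechanically. The following is a minimal sketch in Python, not part of the DTD itself. It assumes a hypothetical fragment that happens to be well-formed XML (real CSTS data is SGML and may need an SGML-aware parser); the wrapper element name, the sample file value, and the helper check_docs are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two documents in one file. The wrapper element and
# the file value are made up for illustration; attribute values are quoted
# so that an XML parser accepts the fragment.
SAMPLE = """<csts>
<doc file="archive/sample_file" id="1"><a></a><c></c></doc>
<doc file="archive/sample_file" id="2"><a></a><c></c></doc>
</csts>"""


def check_docs(xml_text):
    """Return a list of problems found in the id/file attributes of <doc>."""
    root = ET.fromstring(xml_text)
    docs = root.findall("doc")
    problems = []

    # Every document in a file should repeat the same file attribute.
    files = {d.get("file", "") for d in docs}
    if len(files) > 1:
        problems.append(f"more than one file value: {sorted(files)}")

    # Documents are numbered within the file, starting at 1.
    for expected, d in enumerate(docs, start=1):
        if d.get("id") != str(expected):
            problems.append(f"expected id {expected}, found {d.get('id')!r}")
    return problems


print(check_docs(SAMPLE))  # prints [] for the sample above
```

An empty list means the sample follows both conventions; each string in a non-empty list names one violation.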

The document contains one header (<a>) and its contents (<c>). The header carries the genre, time period, and other bibliographical and classification information, as well as additional markup processing information (if any). The contents consist of a sequence of paragraphs (<p>), each containing sentences (<s>) that hold the linguistic material proper.
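The nesting described above (header and contents inside the document, paragraphs inside the contents, sentences inside paragraphs) can be illustrated with a short Python sketch. It assumes a simplified, hypothetical fragment that is well-formed XML: the header is left empty, the plain text inside <s> is only a placeholder for the real linguistic markup, and the helper name sentences_per_paragraph is invented for this example.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical document; real CSTS data is SGML and carries
# far richer markup inside <a> and <s>.
SAMPLE_DOC = """<doc file="archive/sample_file" id="1">
<a></a>
<c>
<p><s>First sentence.</s><s>Second sentence.</s></p>
<p><s>Third sentence.</s></p>
</c>
</doc>"""


def sentences_per_paragraph(doc_text):
    """Return the number of <s> elements in each <p> of the contents <c>."""
    doc = ET.fromstring(doc_text)
    header = doc.find("a")
    contents = doc.find("c")
    if header is None or contents is None:
        raise ValueError("a <doc> needs one header <a> and its contents <c>")
    return [len(p.findall("s")) for p in contents.findall("p")]


print(sentences_per_paragraph(SAMPLE_DOC))  # prints [2, 1]
```

The sketch only looks at direct children at each level, which mirrors the containment order given in the description.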


Content


ATTRIBUTES
CONTENT DECLARATION

Tag Minimization
Open Tag: REQUIRED
Close Tag: REQUIRED

Parent Elements




csts DTD