Patterns for descriptive documents: a formal analysis

A. Dattolo, A. Di Iorio, S. Duca, A. A. Feliziani, F. Vitali


University of Bologna (Italy). Department of Computer Science.

Combining expressiveness and plainness in the design of web documents is a difficult task. Validation languages are very powerful, and designers are tempted to over-design specifications. This paper discusses an offbeat approach: describing the structured content of any document using only a very small set of patterns, regardless of the format and layout of that document. A segmentation model, called Pentaformat, underpins our ideas and is presented in the first part of the paper. The core of this work, however, is a formal analysis of some structural patterns, based on grammars and language theory. The study has been performed on XML languages and DTDs and has a twofold goal: coding empirical patterns in a formal representation, and proving their completeness.