This paper discusses the mark-up of engineering document fragments, supported by the eXtensible Stylesheet Language Formatting Objects (XSL-FO). XSL-FO can be used to convert documents from native formats such as Word, Excel and PDF into XML. Once in XML, document fragments can be retrieved at will in response to a search query. The paper models the process of document fragment retrieval, based on the authors' decomposition scheme approach, and addresses the issue of converting documents into XML. Additionally, the use of document templates is discussed as a means of ensuring that the transformed XML documents comply with the decomposition schemes. Automating the reformatting of documents into XML and using templates make a document-fragment approach to retrieval more resource efficient, and so make its adoption in industry more practicable.
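The retrieval idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `fragment` element, its `type` attribute, the sample document, and the contents of `DECOMPOSITION_SCHEME` are all hypothetical, chosen only to show how fragments of an already-converted XML document might be checked against a scheme and matched to a query.

```python
import xml.etree.ElementTree as ET

# Hypothetical decomposition scheme: the fragment types a document may contain.
DECOMPOSITION_SCHEME = {"requirement", "calculation", "test-result"}

# Hypothetical XML form of an engineering document after conversion.
SAMPLE_XML = """
<document title="Pump housing report">
  <fragment type="requirement">The housing must withstand 12 bar pressure.</fragment>
  <fragment type="calculation">Wall thickness is derived from hoop stress.</fragment>
  <fragment type="test-result">Pressure test passed at 15 bar.</fragment>
</document>
"""


def is_compliant(root, scheme):
    """Return True if every fragment's type appears in the scheme."""
    return all(frag.get("type") in scheme for frag in root.iter("fragment"))


def retrieve_fragments(root, query):
    """Return (type, text) pairs for fragments containing every query term."""
    terms = query.lower().split()
    hits = []
    for frag in root.iter("fragment"):
        text = "".join(frag.itertext()).strip()
        lowered = text.lower()
        if all(term in lowered for term in terms):
            hits.append((frag.get("type"), text))
    return hits


root = ET.fromstring(SAMPLE_XML)
```

Calling `retrieve_fragments(root, "pressure")` on this sample returns only the fragments that mention the term, not the whole document, which is the behaviour the fragment-based approach aims for. A template-generated document would be expected to pass `is_compliant` by construction.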
|Title of host publication|Computational Science and Its Applications - ICCSA 2006|
|Subtitle of host publication|International Conference, Glasgow, UK, May 8-11, 2006. Proceedings, Part II|
|Number of pages|9|
|Publication status|Published - 2006|
|Name|Lecture Notes in Computer Science|