LanguageAnalyzer is the editor component of LanguageExplorer. It is a
convenient tool for editing documents, with a focus
on the analysis, segmentation, mark-up and linking of existing documents.
Like LanguageExplorer, LanguageAnalyzer can handle texts in any language supported by
the Unicode standard. Furthermore, facsimile
reproductions and eventually sound files can be processed and tagged.
LanguageAnalyzer is designed as a configurable framework which can easily be extended
by user-defined tools and plugins. The idea behind LanguageAnalyzer is to offer a uniform
and simple interface to the document content and the markup structures of the document.
Extension tools and plugins can use this interface to process the content as well as the markup and
to create new structures out of the existing ones. Besides the available tools, the
user can always manually edit or refine the content, the different
encodings of the content and the link structures between the different encoding elements.
Finally, the processed documents can be saved
together in the XTE XML format, a new external XML encoding which allows an arbitrary
number of documents, each with an arbitrary number of parallel and even overlapping encodings, to be saved in a single XML file.
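One way to picture parallel and overlapping encodings is as independent layers of character ranges over the same text. The sketch below is only an illustration of that idea and is not the XTE format itself; the names Span and overlaps, the layer names and the sample text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A labelled character range in a text (hypothetical helper type)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def overlaps(a: Span, b: Span) -> bool:
    """True if the spans cross: they intersect, but neither contains the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


text = "In the beginning was the Word"

# Two parallel encodings of the same text, kept as separate layers.
encodings = {
    "lines": [Span(0, 16, "line 1"), Span(17, 29, "line 2")],
    "phrases": [Span(0, 6, "p1"), Span(7, 20, "p2"), Span(21, 29, "p3")],
}

# Pairs of elements from the two layers that cross each other's boundaries.
crossing = [
    (line.label, text[phrase.start:phrase.end])
    for line in encodings["lines"]
    for phrase in encodings["phrases"]
    if overlaps(line, phrase)
]
```

Here the phrase "beginning was" crosses the boundary between the two lines, so the two layers cannot be nested inside each other as a single element tree. This is the kind of situation that overlapping encodings have to accommodate.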
Furthermore, the LanguageAnalyzer suite contains
tools which allow the automatic interlinking of two documents A and C if interlinked
versions of A and B and of B and C already exist. Another utility takes
(n*n-n)/n interlinked two-document versions and creates an n-document version which
may subsequently be viewed with LanguageExplorer.
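The automatic interlinking of A and C through a shared document B can be thought of as composing two link relations. The following is a minimal sketch of that idea, assuming links are stored as pairs of segment identifiers; the function name compose_links and the identifiers are hypothetical and do not reflect the actual implementation of the LanguageAnalyzer tools.

```python
from collections import defaultdict


def compose_links(ab, bc):
    """Derive A-C links from A-B and B-C links (illustrative sketch).

    ab: iterable of (a_id, b_id) pairs linking segments of A and B
    bc: iterable of (b_id, c_id) pairs linking segments of B and C
    Returns the set of (a_id, c_id) pairs that share a segment of B.
    """
    # Index the B-C links by their B segment for fast lookup.
    c_by_b = defaultdict(set)
    for b, c in bc:
        c_by_b[b].add(c)

    # Every A segment inherits the C segments of the B segments it links to.
    return {(a, c) for a, b in ab for c in c_by_b.get(b, ())}


# Hypothetical segment identifiers for three documents.
ab_links = {("a1", "b1"), ("a2", "b2"), ("a2", "b3")}
bc_links = {("b1", "c1"), ("b2", "c2"), ("b3", "c2")}

ac_links = compose_links(ab_links, bc_links)
```

In this example a2 is linked to both b2 and b3, which are both linked to c2, so the derived relation contains the single pair (a2, c2) in addition to (a1, c1). Segments of B that have no counterpart in C simply contribute no A-C link.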
For more information on the software architecture and implementation details of LanguageExplorer and LanguageAnalyzer,
as well as details of the XTE encoding scheme,
the interested reader may refer to the thesis
"A framework for processing and presenting parallel text corpora", which is available in PDF format (~7 MB).