Technology
There are two parts to the system driving this database:
- a batch process which converts the corpus of more than 10,000 items of Mueller's correspondence from Microsoft Word format into TEI XML documents, and
- a web application which organises and presents the resulting TEI XML files in the form of this website.
The batch conversion process uses LibreOffice to convert the Microsoft Word documents into OpenDocument format, and then further processes those OpenDocument files into TEI XML using a sequence of XSLT 3.0 stylesheets, organised in an XProc 1.0 pipeline.
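The first stage of that conversion can be sketched roughly as follows. This is an illustration only, written in Python rather than the XProc and XSLT the project actually uses. It assumes LibreOffice is installed and its `soffice` executable is on the PATH; the function names and file names are hypothetical. It shows a headless Word-to-OpenDocument conversion, followed by reading `content.xml` out of the resulting OpenDocument package (a zip archive), which is the kind of XML a later stylesheet stage would transform towards TEI.

```python
import subprocess
import zipfile
from pathlib import Path


def libreoffice_command(source: Path, out_dir: Path) -> list[str]:
    """Build the command line for a headless Word-to-OpenDocument conversion."""
    return [
        "soffice",              # LibreOffice executable; assumed to be on PATH
        "--headless",           # run without a user interface
        "--convert-to", "odt",  # target format: OpenDocument Text
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_odt(source: Path, out_dir: Path) -> Path:
    """Run LibreOffice and return the path where the .odt file is expected."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(libreoffice_command(source, out_dir), check=True)
    return out_dir / (source.stem + ".odt")


def read_content_xml(odt_path: Path) -> bytes:
    """Return the main document XML stored inside an OpenDocument package."""
    with zipfile.ZipFile(odt_path) as package:
        return package.read("content.xml")
```

A call such as `read_content_xml(convert_to_odt(Path("letter.doc"), Path("odt")))` would yield the OpenDocument XML for one letter; the XSLT stages that turn this into TEI are not shown here.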
This website is also a pipeline written in the XProc 1.0 language, consisting mostly of XSLT 3.0 stylesheets. The pipeline handles requests from web browsers, reading TEI XML files, submitting queries to the search engine, and formatting pages into HTML.
The XProc pipeline is hosted by a Java servlet called XProc-Z, which in turn runs inside the Java web server Apache Tomcat. The web application uses Apache Solr as its search engine, both to drive the search and browse functionality and to perform hit-highlighting.
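To give a sense of what a search request with hit-highlighting looks like, here is a minimal Python sketch that builds a URL for Solr's standard select handler. It is not the site's own code, which issues its queries from the XProc pipeline. The base URL, the core name `mueller`, and the field name `text` are assumptions made for the example; the parameters `q`, `df`, `hl`, `hl.fl`, `rows`, and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, core: str, text: str, field: str = "text") -> str:
    """Build a Solr select URL that asks for highlighted snippets."""
    params = {
        "q": text,       # the user's search terms
        "df": field,     # default field to search (hypothetical field name)
        "hl": "true",    # switch on hit-highlighting
        "hl.fl": field,  # field from which highlighted snippets are taken
        "rows": 10,      # number of results per page
        "wt": "json",    # response format
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance and core name.
url = solr_select_url("http://localhost:8983/solr", "mueller", "botanic garden")
```

The response to such a request carries a `highlighting` section alongside the matching documents, which a page-formatting step can use to mark the matched words in each result.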
The software for this website was developed for the Gardens by Conal Tuohy, a freelance software developer based in Brisbane.