AugXml
Intro
AugXml is a standalone tool, written in Java, that parses files stored in the original NLS binary file format used by the NLS and Augment systems. Once a file has been parsed, it can be transformed into any of several output formats; typically, XML is chosen.
AugXml is part of the collection of software produced by the HyperScope development team. The primary author of the AugXml code is Jonathan Cheyer, but much assistance, in the form of documentation and explanation of the file format, was given by key members of the HyperScope community, especially many of the original NLS team.
Rationale
A large number of documents have been written over the years by Doug Engelbart and various members of the NLS and Augment teams. In fact, the source code and documentation for the entire NLS system itself is stored in the NLS binary file format.
Having this information available to the general public in a human-readable, machine-parseable format would be invaluable.
There have been earlier attempts to do this. They compare with AugXml as follows:
- Augment itself includes a command (Create Sequential File) which saves an Augment file as standard text. Structured data and meta-data are lost.
- a2h is a tool written in Perl by Eugene Eric Kim around 2001 to help convert files that originated in an Augment system. It does not work on Augment files directly. Instead, it requires front-end processing of Augment files into an "exported Augment file" format which it can interpret. The front-end processing is done within Augment itself. The processing scripts were written by Doug Engelbart but are not currently available. Structured data and NIDs are maintained, but other meta-data is lost.
- AugXml maintains the complete set of meta-data, including NLS-internal meta-data and structures. It can export a complete, lossless version of the file in a human-readable form. It can also export a simplified, lossy subset of the meta-data that is easier to work with when performing further transformations.
How is AugXml different from a2h?
The a2h tool was an early effort to pull out specific text-related information (and some limited meta-data) from an Augment system and make it available specifically in HTML format. Files must first be processed from within Augment itself before a2h can be used.
AugXml was written more recently and is meant to be a complete standalone piece of software that operates directly on Augment files. It parses all information within the Augment file, including all meta-data. It does not require a running NLS system and will run on any machine that has a Java environment.
How the program works
The AugXml program has detailed knowledge of the internal structure of the NLS binary file format. It walks through the input file and builds an internal representation of it. It then maps that representation to high-level objects that can be queried (via Java) or transformed to XML, text, and other file formats.
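The parse, map, and transform stages described above can be illustrated with a minimal sketch. It is not AugXml's code and does not use the NLS binary file format: the record layout (one level byte, one length byte, then ASCII text) is invented purely for illustration, and the names ToyPipeline, Statement, parse, and toXml are hypothetical. It also assumes Java 7 or newer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a made-up toy format, NOT the NLS binary file format.
final class ToyPipeline {

    /** High-level object: one statement and its depth in the outline. */
    static final class Statement {
        final int level;
        final String text;

        Statement(int level, String text) {
            this.level = level;
            this.text = text;
        }
    }

    /**
     * Walks the input bytes and builds high-level objects.
     * Assumed toy record layout: [level byte][length byte][ASCII text bytes].
     * A truncated record raises java.nio.BufferUnderflowException.
     */
    static List<Statement> parse(byte[] data) {
        List<Statement> statements = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            int level = buf.get() & 0xFF;   // read as unsigned byte
            int length = buf.get() & 0xFF;  // read as unsigned byte
            byte[] text = new byte[length];
            buf.get(text);
            statements.add(new Statement(level, new String(text, StandardCharsets.US_ASCII)));
        }
        return statements;
    }

    /** Transforms the high-level objects into a simple XML string. */
    static String toXml(List<Statement> statements) {
        StringBuilder sb = new StringBuilder("<file>");
        for (Statement s : statements) {
            sb.append("<statement level=\"").append(s.level).append("\">")
              .append(escape(s.text))
              .append("</statement>");
        }
        return sb.append("</file>").toString();
    }

    /** Escapes the characters that are significant in XML element content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Two toy records: level 1 "Hi" and level 2 "a<b".
        byte[] data = {1, 2, 'H', 'i', 2, 3, 'a', '<', 'b'};
        System.out.println(toXml(parse(data)));
    }
}
```

Keeping the parsed objects separate from the XML writer is what would allow the same parsed file to feed several output formats; the real tool's object model is certainly richer than this.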
System Requirements
- Java (JDK or JRE) 1.5.x or newer.
Instructions
Replace [filename.AUG.1] with the actual NLS filename you are trying to transform.
If you are running Linux:
$ augxml-0.1/augxml.sh -a [filename.AUG.1]
If you are running Windows:
C:> java -classpath augxml-0.1/augxml.jar;augxml-0.1/base64.jar org.nlsaugment.augxml.AugXml -a [filename.AUG.1]
License
This software is released under GNU GPLv2.
Download
Status
As of 2007-10-31, AugXml is still in a preliminary state. It successfully parses some, but not all, NLS files. Some files cause parsing errors that terminate the program.
Future
While AugXml is quite useful for many NLS files as is, more work is needed in order to successfully parse and transform all NLS and Augment files. The goal is to be able to perform complete round-tripping of all existing NLS files to lossless XML format and back again while retaining bit-for-bit compatibility with the original files.
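One way to check the bit-for-bit goal would be to compare the original file with the file regenerated from XML, byte by byte. The following is a minimal sketch of such a check, not part of AugXml: the class name RoundTripCheck and the method firstMismatch are hypothetical, the regenerated file is assumed to have been produced by some other step, and Java 7 or newer is assumed for java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of AugXml.
final class RoundTripCheck {

    /**
     * Returns the index of the first byte at which the two arrays differ,
     * or -1 if they are identical. If one array is a strict prefix of the
     * other, the length of the shorter array is returned.
     */
    static int firstMismatch(byte[] original, byte[] regenerated) {
        int shared = Math.min(original.length, regenerated.length);
        for (int i = 0; i < shared; i++) {
            if (original[i] != regenerated[i]) {
                return i;
            }
        }
        return original.length == regenerated.length ? -1 : shared;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java RoundTripCheck <original> <regenerated>");
            return;
        }
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        byte[] regenerated = Files.readAllBytes(Paths.get(args[1]));
        int index = firstMismatch(original, regenerated);
        System.out.println(index < 0
                ? "files are bit-for-bit identical"
                : "first difference at byte offset " + index);
    }
}
```

Reporting the offset of the first difference, rather than only pass or fail, would point at the part of the file whose encoding is not yet preserved.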
Help is needed! If you are interested in working on this project, please contact me at 