P.J. Brown & Heather Brown
Department of Computer Science,
Harrison Building, Univ. of Exeter, Exeter EX4 4QF, UK
E-mail: P.J.Brown@ex.ac.uk,
H.Brown@ex.ac.uk.
Hypertext linking has long been a feature of electronic documents. More recently there has been attention to annotation of electronic documents, and a number of annotation systems have been built, e.g. Annotea, iMarkup and the "note" facility in the Opera browser.
In many ways hyperlinking and electronic annotation are similar, and some of the differences are just pragmatic ones: for example annotations tend to be short whereas hyperlinks can lead to long documents. The purpose of this paper is to examine the two concepts to see whether they can each be regarded as an instance of a single generalising concept.
We will start with some definitions of terms. We will use the term base document to mean the document to which an annotation is applied or to which a hyperlink leads. TODO: more
The paper proceeds by exploring a series of examples of the usage of hyperlinking and annotation, and seeing how these examples bring out general concepts.
Our first examples cover where annotation and hyperlinking are perhaps closest. Example 1 is where the user wants to add an explanation, within a document they own, of what the acronym FIFO means. This can be done either by making FIFO a hyperlink to a page that explains what it means, or by adding an annotation to FIFO that does the same thing.
Example 2 is the same as Example 1, except that we will assume the current author is not the owner of the base document and therefore cannot modify it.
There are three aspects to the implementation of links: anchoring, display and storage/retrieval.
The needs of anchoring, i.e. attaching links to fragments of base documents, are the same for annotations and hyperlinks. A pragmatic point is that anchors for annotations may be -- and commonly are -- null: they represent a point in the base document rather than a visible fragment. An example would be an anchor for an annotation representing an insertion between two words.
WWW offers limited facilities for specifying anchors, and there is a potential need, for both hyperlinks and annotations, for going beyond what WWW offers. One such need, relevant to Example 1, is to specify an anchor that covers all occurrences of a word ("FIFO", say) in the base document, without the need to mark each individual occurrence in advance. Todo: ref to Hall of DeRose. This can be extended beyond a single document to cover all occurrences in a given set of documents, or even to all occurrences in any document loaded. Thus the time at which anchors are fixed is moved from the time of authorship to the time of loading.
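As an illustration of fixing anchors at load time rather than at authorship time, the following minimal Python sketch computes an anchor for every whole-word occurrence of a term when a document is loaded. The function names `find_word_anchors` and `anchors_for_documents`, and the representation of an anchor as a pair of character offsets, are our assumptions for this sketch and are not taken from any existing system.

```python
import re
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open (start, end) character offsets (assumed representation)


def find_word_anchors(text: str, word: str) -> List[Span]:
    """Return a span for every whole-word occurrence of `word` in `text`."""
    # \b restricts matches to whole words, so "FIFOs" is not matched by "FIFO".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [match.span() for match in pattern.finditer(text)]


def anchors_for_documents(docs: Dict[str, str], word: str) -> Dict[str, List[Span]]:
    """Apply the same generic anchor to each document in a given set."""
    return {name: find_word_anchors(text, word) for name, text in docs.items()}


# Anchors are computed when the document is loaded; nothing is marked in advance.
doc = "A FIFO queue is simple. FIFOs differ from LIFO; FIFO order is fair."
spans = find_word_anchors(doc, "FIFO")
by_document = anchors_for_documents({"intro": doc, "empty": "No queues here."}, "FIFO")
```

Because the anchors are derived from the loaded text, the same specification continues to work if occurrences of the word are added to or removed from the base document.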
A second anchoring need, particularly relevant to annotations, is for overlapping anchors, e.g. one anchor covering a paragraph and another covering a sequence of words within that paragraph. This is easiest to implement when the overlapping anchors are such that one lies entirely within the other.
A third need, again specially relevant to annotations, is for a dispersed anchor, which is logically a single anchor but refers to more than one fragment of the base document. An example would be the annotation `these two contradict each other', whose dispersed anchor specifies two separate sentences, maybe far apart in the base document.
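The three anchoring needs above suggest a single data structure. The sketch below is one possible representation, under our own assumptions: an anchor is a tuple of character spans, a null anchor is a single zero-length span, and a dispersed anchor has more than one span. The class `Anchor` and the helper functions are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # half-open (start, end) character offsets (assumed representation)


@dataclass(frozen=True)
class Anchor:
    spans: Tuple[Span, ...]

    @property
    def is_null(self) -> bool:
        # A null anchor marks a point, e.g. an insertion between two words.
        return len(self.spans) == 1 and self.spans[0][0] == self.spans[0][1]

    @property
    def is_dispersed(self) -> bool:
        # A dispersed anchor is one logical anchor over several fragments.
        return len(self.spans) > 1


def contains(outer: Span, inner: Span) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def is_nested(a: Span, b: Span) -> bool:
    """The easy case of overlap: one span lies entirely within the other."""
    return contains(a, b) or contains(b, a)


paragraph: Span = (0, 200)
phrase: Span = (40, 60)         # nested inside the paragraph
straddling: Span = (190, 230)   # overlaps the paragraph without nesting

insertion_point = Anchor(((17, 17),))             # null anchor
contradiction = Anchor(((10, 50), (900, 940)))    # dispersed anchor over two sentences
```

With this representation the nested case can be detected cheaply, so an implementation could treat it specially and fall back to a more general mechanism only for overlaps that are not nested.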
In spite of these three extra needs, anchoring of annotations remains essentially similar to anchoring of hyperlinks.
Display covers three aspects: (1) awareness of anchors; (2) selection of anchors; (3) the destination of the link attached to an anchor.
Everyone is familiar with WWW's method of making the user aware of an anchor by displaying it in a different colour and underlining it. This approach works well, but needs some extension if null anchors and/or overlapping anchors are supported.
Everyone is also familiar with WWW's approach of selecting a link by clicking on its anchor, and this too can be carried over to annotations attached to anchors. There is, however, a case for annotations to be pre-selected, i.e. an annotation is shown without any need to ask for it. (After all, annotations on a paper document are always visible to the reader.) A pre-selection facility could be specified by the author -- perhaps some annotations will be pre-selected and others not -- or by the reader (e.g. by a menu command `Select all annotations'). Nielsen discusses TODO.
In current practice display of the link destination is done differently by hypertext systems and annotation systems. Typically a hypertext system overwrites the current page with the page that is the destination document, though there is often an alternative possibility of creating a new, separate, window.
In general an annotation should obscure the base document as little as possible. (Again taking the parallel of annotated paper documents, the reader can see both the base document and all its annotations.) Thus annotations are often displayed in margins or in the gap between lines. Zellweger TODO. TODO: ref to Guide.
For both annotation and hypertext there is a potential usage for pop-up display, i.e. the content of the destination pops up while the mouse is held down over the anchor. Overall there is scope for bringing the display practices for hyperlinks and annotations, currently different, closer together -- to the benefit of both.
TODO: embedded/separate link base; example 2 -- make a copy; two-way linking; general intro
The requirements for naming and retrieving the destination document appear to be the same for both hyperlinking and annotation. Thus if one were designing new software for implementing annotations, a good starting point might be, instead of designing yet another way to address annotations, to build on the existing development of WWW. For example WWW URLs could be used to address annotations. Obviously at the moment, when most annotations are personal rather than public, most annotations will be stored locally, but allowing annotations to come from anywhere in the world that can be accessed via a URL is surely a gain in flexibility.
The generality of URLs also brings other benefits. In particular URLs can lead to a script that generates a document dynamically. Some examples of how this might be used for annotations are:
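To make the idea of URL-addressed annotations concrete, here is a minimal Python sketch in which the content of an annotation is retrieved through an ordinary URL, using the standard `urllib.request` module. The `Annotation` record and the function `fetch_annotation_body` are hypothetical names of our own; the example uses a `data:` URL only so that it runs without network access, and a `file:` or `http:` URL could be substituted.

```python
from dataclasses import dataclass
from urllib.request import urlopen


@dataclass
class Annotation:
    anchor_text: str       # the fragment of the base document being annotated
    destination_url: str   # where the annotation content lives (assumed field)


def fetch_annotation_body(annotation: Annotation, encoding: str = "utf-8") -> str:
    """Retrieve the annotation content exactly as a hyperlink destination would be."""
    with urlopen(annotation.destination_url) as response:
        return response.read().decode(encoding)


# A self-contained data: URL stands in for a locally or remotely stored annotation.
note = Annotation("FIFO", "data:text/plain,First%20In%2C%20First%20Out")
body = fetch_annotation_body(note)
```

The retrieval code does not need to know whether the annotation is stored locally or elsewhere; that decision is carried entirely by the URL.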
TODO: privacy; offsets in a document -- but depends on HTML mark-up; collections of annotations;
Each anchor needs to be associated with its destination (which, we have argued above, might be represented by a URL). In WWW the mechanism for this is simple: anchors and their associated destinations are embedded in the base document, as in <a HREF="destination URL">name of anchor </a>. For many years it has been argued, particularly by the Southampton University research group, that this model is much too limited, and that the set of anchors plus their links should be stored as a separate entity. The term linkbase is used to describe such a database of link associations. This has many advantages:
The problem of change is most acute when:
The problem is that change in the base document may affect the location of anchors. Obviously change can affect the size of the base document. If, for example, an anchor is recorded as being from character 120 to 128 of the base document, then an insertion or deletion earlier in the base document will leave the recorded offsets pointing at the wrong text. Numerous ways have been found to make anchoring mechanisms more robust over change [REF], but inevitably none is perfect: if the base document is replaced by a completely different document, there is no way the previous anchors can sensibly be placed in the new base document.
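One simple way of making offset-based anchors more robust can be sketched as follows. This is only an illustration under our own assumptions, not a description of any particular published mechanism: the linkbase stores a copy of the anchored text alongside its offsets, and when the offsets no longer match, the anchor is re-attached to the nearest occurrence of that text. The names `StoredAnchor` and `reattach` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredAnchor:
    start: int   # character offset recorded at authorship time
    end: int
    text: str    # copy of the anchored text, kept in the linkbase (assumption)


def reattach(anchor: StoredAnchor, document: str) -> Optional[StoredAnchor]:
    """Return an anchor valid for `document`, or None if it cannot be placed."""
    if not anchor.text:
        # A null anchor carries no text to search for; this sketch gives up on it.
        return None
    if document[anchor.start:anchor.end] == anchor.text:
        return anchor  # the recorded offsets are still correct
    # Collect every occurrence of the stored text in the changed document.
    hits = []
    pos = document.find(anchor.text)
    while pos != -1:
        hits.append(pos)
        pos = document.find(anchor.text, pos + 1)
    if not hits:
        return None  # e.g. the base document was replaced entirely
    # Prefer the occurrence closest to the originally recorded position.
    best = min(hits, key=lambda p: abs(p - anchor.start))
    return StoredAnchor(best, best + len(anchor.text), anchor.text)


old = "Queues are FIFO structures."
start = old.index("FIFO")
anchor = StoredAnchor(start, start + len("FIFO"), "FIFO")

new = "In general, queues are FIFO structures."
moved = reattach(anchor, new)
lost = reattach(anchor, "A completely different document.")
```

As the text above notes, no such scheme is perfect: here the anchor is lost if the anchored words themselves are edited, and it may be attached to the wrong occurrence if the same words appear several times.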
Dynamic links are more robust over change. TODO: is this dynamic anchors?
Peter and Heather Brown are both Visiting Professors in the Computer Science department of Exeter University. Both have been interested in document systems since the early eighties, when, while at Stanford University, Heather worked on the display of TeX documents and Peter worked on embryo hypertext systems. This work came out of the UK initiative for Grand Challenges for Computing Research.