Context-aware retrieval: initial spec of summer job, 2000

Peter J. Brown
and
Gareth J. F. Jones

Department of Computer Science, University of Exeter, Exeter EX4 4PT, UK
P.J.Brown@exeter.ac.uk
and
G.J.F.Jones@exeter.ac.uk

Data sources

Decide example area for project data; this will probably be tourism, and, if so, we may wish to liaise with data providers such as "Exeter on-line" -- for links, see the summary of Exeter links. Decide how data is to be encoded (i.e. "marked up"). If the data comes from elsewhere, we may need to change its format, and/or add extra fields to it.

Internal data representation

Decide how the document collection, and the user's current context, are to be represented as Java objects. Decide on data types of fields (e.g. text, data, "location", range of values, collection of people, etc.). The Java program must be easily extendable by adding new types of data and their associated methods, such as matching algorithms (e.g. given location X and location Y, compute a score representing how well the two locations match). Values of numerical quantities may be ranges rather than single values: e.g. a time between 2 and 5, a temperature of 15-20, or a location within a certain rectangle or circle. Some previous work has been done at Kent University, and we have the source (which is in Waba, a Java subset for the Palm); we need to explore how much of this can be re-used.
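As a starting point for discussion, the range-valued fields and pluggable matching methods described above might be sketched as follows. This is a minimal sketch only: the names NumericRange, Matcher and RangeOverlapMatcher are hypothetical, and the scoring rule (overlap divided by the width of the narrower range) is an assumption chosen for illustration, not a decision of this spec.

```java
// Hypothetical sketch: a numeric field value stored as a closed range [lo, hi].
final class NumericRange {
    final double lo;
    final double hi;

    NumericRange(double lo, double hi) {
        if (lo > hi) throw new IllegalArgumentException("lo must not exceed hi");
        this.lo = lo;
        this.hi = hi;
    }

    // A single value is treated as a range of zero width.
    static NumericRange point(double v) { return new NumericRange(v, v); }

    double width() { return hi - lo; }
}

// Plug-in point: each data type supplies its own matcher by implementing this.
interface Matcher<T> {
    // Returns a score in [0, 1]: 0 means no match, 1 means the best possible match.
    double score(T documentValue, T contextValue);
}

// Assumed rule: length of the overlap divided by the width of the narrower range.
final class RangeOverlapMatcher implements Matcher<NumericRange> {
    @Override
    public double score(NumericRange a, NumericRange b) {
        double overlap = Math.min(a.hi, b.hi) - Math.max(a.lo, b.lo);
        if (overlap < 0) return 0.0;      // the ranges are disjoint
        double narrower = Math.min(a.width(), b.width());
        if (narrower == 0) return 1.0;    // a single value lying within the other range
        return overlap / narrower;
    }
}

class MatchDemo {
    public static void main(String[] args) {
        Matcher<NumericRange> m = new RangeOverlapMatcher();
        // A time of 3 against a document valid between 2 and 5.
        System.out.println(m.score(new NumericRange(2, 5), NumericRange.point(3)));      // prints 1.0
        // A temperature of 18-22 against a document written for 15-20.
        System.out.println(m.score(new NumericRange(15, 20), new NumericRange(18, 22))); // prints 0.5
    }
}
```

A location matcher (rectangle or circle) would implement the same Matcher interface with its own type, which is one way of meeting the requirement that new data types be easy to add.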

The display

The work will be done on a static PC or workstation, but with use on a small PDA or cellphone in mind. The PDA/cellphone screen and its associated buttons will be simulated as a small area within the web-page. For initial work we will assume a device like a cellphone but with a PDA-sized screen; it should, however, be easy to plug in simulations of other devices. (Question: could we find a free simulated cellphone interface written in Java on the web, to save re-implementing this?) The rest of the web-page may be a test harness, e.g. it may consist of controls whereby the experimenter can set the user's current context, in order to see what is retrieved and displayed on the simulated cellphone screen. (Sample controls: a map to set location -- or to draw out a path representing a series of locations; a slider to set a simulated temperature; a circular dial to set time-of-day. Other fields may be set automatically, e.g. a random number generator to simulate arrival of Virgin trains.) The retrieved information on the cellphone screen may follow a "ramping interface", as described by Rhodes.

Retrieval approaches

Must cover both user-driven and author-driven retrieval, as described in our current paper. Must be easy to plug in different retrieval strategies so that we can experiment. May incorporate a priority system: e.g. authors can assign relative priorities to different pieces of information, and users can also set priorities (e.g. share prices and traffic information are top priorities now) -- there should at least be hooks for providing this. Priorities may affect the order in which documents are presented to the user.
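One possible shape for the plug-in point and the priority hook is sketched below. All names (ScoredDoc, RetrievalStrategy, ScoreOnly, PriorityThenScore) are hypothetical, and the sketch assumes that a match score against the current context has already been computed and that author and user priorities have been combined into a single integer (higher meaning more important).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical candidate document: title, precomputed match score, combined priority.
final class ScoredDoc {
    final String title;
    final double matchScore; // how well the document matches the current context
    final int priority;      // assumed: higher value = more important

    ScoredDoc(String title, double matchScore, int priority) {
        this.title = title;
        this.matchScore = matchScore;
        this.priority = priority;
    }
}

// Plug-in point: each strategy decides the order in which documents are presented.
interface RetrievalStrategy {
    List<ScoredDoc> rank(List<ScoredDoc> candidates);
}

// Strategy 1: ignore priorities and order by match score, best first.
final class ScoreOnly implements RetrievalStrategy {
    @Override
    public List<ScoredDoc> rank(List<ScoredDoc> candidates) {
        List<ScoredDoc> out = new ArrayList<>(candidates); // leave the input untouched
        out.sort(Comparator.comparingDouble((ScoredDoc d) -> d.matchScore).reversed());
        return out;
    }
}

// Strategy 2: highest priority first; match score breaks ties within a priority.
final class PriorityThenScore implements RetrievalStrategy {
    @Override
    public List<ScoredDoc> rank(List<ScoredDoc> candidates) {
        List<ScoredDoc> out = new ArrayList<>(candidates);
        out.sort(Comparator.comparingInt((ScoredDoc d) -> d.priority).reversed()
                .thenComparing(Comparator.comparingDouble((ScoredDoc d) -> d.matchScore).reversed()));
        return out;
    }
}
```

Because the rest of the program would depend only on RetrievalStrategy, an experimenter could swap one strategy for another without touching the display or logging code.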

Producing experimental results

Part of the research will be to do some experiments that involve supplying a sequence of contexts, and seeing how well the retrieval works, in terms of retrieving the right documents. Thus the model needs to provide an interface whereby the experimenter can supply a sequence of contexts, and the system can keep a log of these and the documents retrieved. The sequence of contexts used for an experiment will probably be a description of how each contextual value changes with time, e.g. temperature rises one degree every ten minutes; location proceeds at a constant rate from A to B, taking 20 minutes over the journey; time period runs from 2pm to 5pm; traffic state cycles through light, medium, heavy, changing every 15 minutes.
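The example rules above could be scripted along the following lines. This is a sketch under stated assumptions: the class names ContextSnapshot and ContextScript, the flat x/y map coordinates, the starting temperature and the log line format are all hypothetical, and the retriever argument is simply a hook standing in for whatever retrieval strategy is under test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical snapshot of the simulated context at one instant.
final class ContextSnapshot {
    final int minute;         // minutes since the start of the run
    final double temperature; // units left open
    final double x, y;        // position on an assumed flat map
    final String traffic;

    ContextSnapshot(int minute, double temperature, double x, double y, String traffic) {
        this.minute = minute;
        this.temperature = temperature;
        this.x = x;
        this.y = y;
        this.traffic = traffic;
    }
}

// Describes how each contextual value changes with time (rules taken from the examples).
final class ContextScript {
    private static final String[] TRAFFIC = {"light", "medium", "heavy"};
    private final double startTemp, ax, ay, bx, by;

    ContextScript(double startTemp, double ax, double ay, double bx, double by) {
        this.startTemp = startTemp;
        this.ax = ax; this.ay = ay;
        this.bx = bx; this.by = by;
    }

    ContextSnapshot at(int minute) {
        double temp = startTemp + minute / 10;    // integer division: +1 per full ten minutes
        double f = Math.min(1.0, minute / 20.0);  // fraction of the 20-minute journey done
        double x = ax + f * (bx - ax);            // constant-rate travel from A to B
        double y = ay + f * (by - ay);
        String traffic = TRAFFIC[(minute / 15) % TRAFFIC.length]; // changes every 15 minutes
        return new ContextSnapshot(minute, temp, x, y, traffic);
    }

    // Steps through the run, calls the retriever at each step, and returns one
    // log line per step recording the context and the documents retrieved.
    List<String> runAndLog(int durationMinutes, int stepMinutes,
                           Function<ContextSnapshot, List<String>> retriever) {
        if (stepMinutes <= 0) throw new IllegalArgumentException("step must be positive");
        List<String> log = new ArrayList<>();
        for (int t = 0; t <= durationMinutes; t += stepMinutes) {
            ContextSnapshot c = at(t);
            log.add(t + " min: temp=" + c.temperature + " traffic=" + c.traffic
                    + " retrieved=" + retriever.apply(c));
        }
        return log;
    }
}
```

For the 2pm to 5pm period, a call such as runAndLog(180, 15, retriever) would sample the context every 15 minutes and give 13 log lines, which could then be compared against the documents the experimenter expected to see.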

Documentation

We emphasize that good quality documentation is vital throughout this work, as we want to continue to use it and modify it for several years. For imported materials (documents, map of Exeter, etc.) we need to be clear about copyright issues.

Focus

If we explored each of the above in detail, the work could take 8 years rather than 8 weeks. Thus we need to spend the initial period deciding where to focus. In particular this is not a user-interface project, so we just need adequate interfaces rather than fancy ones.