We, Dr. Jones (GJFJ) and Professor Brown (PJB), have been working together on Context-Aware Retrieval since late 1999. The collaboration was strengthened when Professor Brown moved from Kent to Exeter in February 2000. Our early work involved bringing together the ideas of two previously disparate communities: information retrieval and mobile applications. We analysed how delivering information to mobile users, both by proactive triggering and interactive user queries, related to traditional Information Retrieval (IR) and Information Filtering (IF). Our conclusions [Bro01] were that Context-Aware Retrieval (CAR) represented a rather challenging hybrid between IR and IF. The challenge results from the dynamic nature of all the data involved, and from the need to provide near-continuous retrieval to the user. We further concluded that many of the huge advances in IF and IR over the past thirty years, many of them recently driven by web searching, could be exploited in CAR. In particular we concluded that the widely adopted retrieval strategy of "best-match" offered a much better foundation than the Boolean retrieval strategies adopted by most current CAR systems. We published some ideas on the scoring algorithms that could be used in CAR [Jon00]. These ideas have been further refined [Bro01a]. The essence of our new ideas is that, since CAR is so hard relative to conventional IR and IF, we must exploit those advantages that it does have, in particular that the user's context (which translates into a retrieval query) normally changes gradually and semi-predictably. In detail these ideas relate to scoring algorithms (e.g. is it better to score a distance field using an N^2 algorithm?), to dynamic weighting algorithms (e.g. rapidly or suddenly changing fields being given higher weight), to the use of context caching, and to the use of a `Context Diary' that straddles the past and future. We give more details below.
To support this work we have implemented a prototype CAR system called the Context Matcher; this is designed to act as a flexible testbed into which we can plug algorithms in order to test and evaluate our ideas. This work was partly supported by an Emeritus Fellowship from the Leverhulme Foundation, which was awarded to PJB in December 1999 for two years. The implementation work has been carried out, in Java, by Dr. Lindsey Ford (LF).
As a result of this work we have attracted two potential industrial beneficiaries, Trilogy and Xerox, both of whom are collaborating on this bid.
In terms of combined expertise in retrieval and in mobile applications, we believe that Exeter University is now as well placed as any other research organisation in the world, with the exception of MIT, to contribute real advances in the burgeoning field of CAR. Todo: cite other relevant work at Exeter, especially Antony (temporal aspects) and Zoltan (distributed systems aspects). Possibility of Hungarian collaboration with Zoltan for discovery of relevant document collections; uses JINI.
The relevant track records of Dr. Jones, Professor Brown and Dr. Ford are as follows.
Dr. Jones Todo: existing grant, past awards, knowledge, etc.
Professor Brown has a track record of carrying research ideas into successful products. The most successful of these has been the Guide system, which embodied new research ideas in hypertext (this is becoming increasingly relevant to CAR, given Starner's view that location-aware applications are `physically-based hypertext'). Guide was turned into a product by OWL, a small company based in Edinburgh, and sold widely both to personal and corporate customers. In particular it was used by General Motors for their vehicle-bay information systems. Guide was a winner of the BCS Award. A promising candidate for future exploitation is the CAR system recently developed by Jason Pascoe, a research student working under Professor Brown. This system is used in fieldwork data-capture applications, and embodies new ideas in (a) `stick-e notes', (b) the use of context and (c) user interfaces. The system has already been adopted by Earthwatch, an ecology charity. Professor Brown has recently been a partner in a joint enterprise, involving authors from Xerox, Motorola, DARPA, MIT and the University of Oslo, to categorise context-aware applications [Bro01a].
Dr. Ford is best known for his R&D work as a Principal Consultant for Logica Cambridge, and then for his HCI research when a Senior Lecturer at Exeter University. Since taking early retirement he has worked on various R&D projects that require technical programming expertise in Java at the interface. He now continues to work on an engine and associated software to deliver context-aware retrieval data efficiently to PDAs.
The idea of context-aware applications was first proposed by researchers from Xerox PARC [Sch94] in 1994. These applications relate to mobile users carrying a PDA that has attached sensors to detect location, nearby equipment, orientation, temperature, etc. The outputs from the sensors, directly or indirectly, represent a current context for each user of an application. An application can use the contextual information to tailor its actions to each specific user. Thus if a sensor detects the user as being in the library, then an application can provide information about the library's procedures, and if the user is detected as being near a certain printer, the Print button on their PDA can relate to that printer.
Over the past seven years, a wide range of context-aware applications has been developed. These include guides for tourists and visitors, such as the University of Lancaster's Guide system [Che00], the Sentient Computing work at AT&T Laboratories in Cambridge [Cur99], and the location-based products now being introduced by cellphone service providers (nearest restaurant, local weather). The AT&T work particularly addresses issues of modelling the contextual environment and HCI. This, together with the Georgia Tech work on representing context [Sal99], is useful when designing the fundamentals for our project.
The above projects are not specifically addressed to context-aware retrieval. Probably the largest amount of CAR work performed recently is that of Rhodes and others at MIT [Rho00]. They have built three different CAR systems, and have performed a number of user trials. A key finding is that precision is even more important in CAR than it is in traditional IR. For retrieval they use a hybrid database/IR approach called `fuzzy matching'.
Three developments in hardware are multiplying the future opportunities for context-aware applications: (a) the convergence of PDAs and mobile phones; (b) the US Wireless E911 rules [FCC96] requiring that mobile phones be able to report their location; and (c) the increased cheapness and availability of sensors.
In terms of information retrieval, existing CAR applications either have a limited amount of data, such as tourist guides to a single city or attraction, or have a large body of data with identical structure (such as a list of restaurants). Such applications can use either database techniques or simple Boolean query retrieval techniques. Users, however, could benefit from much larger and more diverse sets of data; in addition, in order to tailor information to each user, applications will need to use a much richer context than just location. Overall, therefore, we believe that if the opportunities for context-aware applications are to be realised, much better retrieval methods will be needed. These retrieval methods need to cover both the possible approaches to generating retrieval requests: (a) proactive, where the application triggers information that the user may need (e.g. a document about a church says "trigger me when the user's location is near this church"), and (b) interactive, where requests are initiated by the user. Furthermore the really precise forms of retrieval are likely to be a combination of proactive and interactive (such as a body of triggered information that is selectively retrieved by the user); thus the retrieval methods must allow both these approaches to work in tandem.
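The difference between a Boolean trigger and best-match retrieval can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration only: the class `BestMatchSketch`, the `Doc` type, the positions in metres, and in particular the linear distance decay, which is a placeholder rather than a scoring algorithm we propose.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: Boolean triggering versus best-match ranking on location.
class BestMatchSketch {

    // A document that carries its own trigger context (here only a position in metres).
    static final class Doc {
        final String title;
        final double x, y;
        Doc(String title, double x, double y) { this.title = title; this.x = x; this.y = y; }
    }

    static double distance(Doc d, double ux, double uy) {
        return Math.hypot(d.x - ux, d.y - uy);
    }

    // Boolean retrieval: a document either fires (within the radius) or it does not.
    static List<Doc> booleanMatch(List<Doc> docs, double ux, double uy, double radius) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (distance(d, ux, uy) <= radius) hits.add(d);
        }
        return hits;
    }

    // Assumed score: 1.0 at the user's position, decaying linearly to 0.0 at `range`.
    static double score(Doc d, double ux, double uy, double range) {
        return Math.max(0.0, 1.0 - distance(d, ux, uy) / range);
    }

    // Best-match retrieval: rank every document by score and keep the top k.
    static List<Doc> bestMatch(List<Doc> docs, double ux, double uy, double range, int k) {
        List<Doc> ranked = new ArrayList<>(docs);
        ranked.sort(Comparator.comparingDouble((Doc d) -> score(d, ux, uy, range)).reversed());
        return ranked.subList(0, Math.min(k, ranked.size()));
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("Church", 120, 0), new Doc("Museum", 400, 0), new Doc("Cafe", 900, 0));
        // User stands at the origin; the Boolean trigger radius is 100 m.
        System.out.println("Boolean hits: " + booleanMatch(docs, 0, 0, 100).size());
        for (Doc d : bestMatch(docs, 0, 0, 1000, 2)) {
            System.out.println("Best match: " + d.title);
        }
    }
}
```

In this example the Boolean query returns nothing, because the nearest document is 120 m away and so just outside the 100 m radius, whereas the best-match ranking still returns the church followed by the museum. A richer context would contribute further fields to the score in the same way.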
Todo: discuss IR and IF research; progress made; relevance to CAR. In CAR the query is derived from the context. Discuss other relevant retrieval systems. Discuss special needs of CAR such as continuous high performance and dynamic data. Why a big challenge.
Two potential reactions by users can kill a CAR system: (1) `the application is too slow: by the time the information was delivered the need for it had passed'; (2) `I waste so much time looking at the irrelevant material delivered by this application, that all potential gains are lost'. Thus our first two aims are (1) to give good performance, remembering that information will usually be delivered on a resource-starved PDA; (2) to obtain good precision.
We have specific ideas for a retrieval system to meet the above aims, but this system would, of course, be only part of the overall application that met the user's needs. Therefore, as a third aim, we also wish to pursue more speculative research, suitable for a Ph.D. student, that examines wider issues. These include (a) the interface to the end-user (which can (i) affect the damage that delivery of an irrelevant document does (e.g. Rhodes' ramping interfaces), and (ii) help to guide the sorts of query the user issues); (b) user modelling; (c) trying to exploit future results of the semantic web community, and in particular trying to capture an associated context from an existing textual document; (d) trying to minimise the amount of re-calculation needed when one retrieval uses only a slightly changed context from its predecessor; (e) temporal aspects (time being a key field of almost any context).
Our specific objectives in realising these three aims are, respectively:
To make our research cost-effective we will exploit existing infrastructure where we can. In particular we will ride on top of the World Wide Web -- we envisage that properly web-enabled cellphones/PDAs will be widely available during the lifetime of our project. We plan that the retrieval engine and document collection will reside on the server side, and the user's current context will be generated on the client side. We think this will be an adequate base for the user testing we wish to do. In the longer term, if our work was commercially exploited, rather more function might be built into the client side. The notations we plan to use for representing contexts, etc., can be readily mapped into XML, thus allowing us to exploit the increasing body of tools for XML, XSL, etc. Our implementation will be based throughout on Java, to give good portability and relatively easy interfacing to the web.
In addition we plan to exploit the Context Matcher we have already built. This has been specially designed to allow plug-ins whereby researchers can plug in new algorithms for scoring, weighting, pre-processing, etc., in order to see if performance can be improved. The Context Matcher is not a finished product, and will need further work as our ideas evolve, but it is a good initial platform. The Context Matcher assumes that contexts are represented by a set of fields, each of which is a name/value pair. Values are tuples, and can be of several alternative datatypes, e.g. text, number, location.
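The field-based representation and the plug-in idea can be sketched roughly as follows. This is not the Context Matcher's actual interface, which is not reproduced in this document: the names `ContextPluginSketch`, `FieldScorer` and `match`, the two example scorers, the tolerance of 10 units and the weighted-average combination are all hypothetical. For brevity each field value is a single object rather than a tuple.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a field-based context with pluggable per-field scorers.
class ContextPluginSketch {

    // A plug-in that scores one field of the user's context against the
    // same-named field of a document's context; results lie in [0, 1].
    interface FieldScorer {
        double score(Object userValue, Object docValue);
    }

    // Example plug-in for text fields: 1.0 on a case-insensitive exact match, else 0.0.
    static final FieldScorer TEXT = (u, d) ->
            u.toString().equalsIgnoreCase(d.toString()) ? 1.0 : 0.0;

    // Example plug-in for numeric fields: closeness within an assumed tolerance of 10 units.
    static final FieldScorer NUMBER = (u, d) -> {
        double diff = Math.abs(((Number) u).doubleValue() - ((Number) d).doubleValue());
        return Math.max(0.0, 1.0 - diff / 10.0);
    };

    // Weighted average of the field scores, taken over fields that appear in
    // both contexts and have a scorer plugged in. Unlisted weights default to 1.0.
    static double match(Map<String, Object> user, Map<String, Object> doc,
                        Map<String, FieldScorer> scorers, Map<String, Double> weights) {
        double total = 0.0, weightSum = 0.0;
        for (Map.Entry<String, Object> field : user.entrySet()) {
            String name = field.getKey();
            if (!doc.containsKey(name) || !scorers.containsKey(name)) continue;
            double w = weights.getOrDefault(name, 1.0);
            total += w * scorers.get(name).score(field.getValue(), doc.get(name));
            weightSum += w;
        }
        return weightSum == 0.0 ? 0.0 : total / weightSum;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new HashMap<>();
        user.put("place", "library");
        user.put("temperature", 20.0);
        Map<String, Object> doc = new HashMap<>();
        doc.put("place", "Library");
        doc.put("temperature", 25.0);
        Map<String, FieldScorer> scorers = new HashMap<>();
        scorers.put("place", TEXT);
        scorers.put("temperature", NUMBER);
        Map<String, Double> weights = new HashMap<>();
        weights.put("place", 3.0);
        weights.put("temperature", 1.0);
        System.out.println(match(user, doc, scorers, weights)); // prints 0.875
    }
}
```

Replacing an entry in the `scorers` or `weights` map is the analogue of plugging in a new scoring or weighting algorithm: the matching loop itself does not change, so alternative algorithms can be compared on the same data.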
Our work aims to cover applications where the data is not uniformly structured, and indeed may be largely unstructured. Thus our basic technologies are taken from the worlds of IR and IF rather than from databases.
The methodology for our research programme will be:
Our novel ideas that we wish to explore are currently:
The programme will be managed by GJFJ. The three participants already have a track record of working together, and have evolved a series of interfaces whereby their contributions can be joined. This collaborative infrastructure proved itself especially during an unusual 3-month period when LF was in Australia, 12,000 miles from the other two. Generally LF will concentrate on the detailed Java implementation work, PJB will focus on the user's context and its pattern of change, and GJFJ will concentrate on retrieval methods. The research student will, we plan, have GJFJ as his/her main supervisor, with support from PJB and LF. If the project starts early in 2002, as we hope, it is quite likely that the research student will not start until October. Overall we feel that this time offset will bring more advantages than disadvantages, particularly as the research student will benefit from a more refined infrastructure. However, given that research programmes rarely start as soon as the researchers hope, we assume below for simplicity that the research student does start at the same time as the overall programme.
The detailed workplan and milestones are as follows:
To avoid the project becoming too diverse there are several research areas that we will not focus on. These include: security and privacy, building distributed systems, agent technology (e.g. for context discovery), structured contexts and levels of abstraction, synthesising high-level context (e.g. that the user is busy) from low-level sensor values, and collaborative applications with context-sharing. However, towards the end of the project -- particularly if exploitation is likely -- we may need to look more closely at these areas.
Feedback from discussions with several parties (Trilogy, Hewlett-Packard, etc.) suggests that 2-5 years from now, suppliers of mobile services will be looking to move on from their current generation of systems to provide high-performance, tailored, high-precision systems, which retrieve from diverse data sources. If the project succeeds we hope that these suppliers will be beating a path to our door.
Dissemination will be by the conventional means of papers in journals and conferences: to reach a wider audience, and to draw communities together, we plan to publish both in retrieval journals and in mobile computing journals/conferences. As regards exploitation, we do not see patents as the best way of protecting our IPR (though this is a changing world, and we may revise our ideas). Experience also shows that existing research-based implementations, such as our Context Matcher and its plug-ins, are not what exploiters want. Instead they want ideas, experience and expertise. Assuming our work is really successful, we would like to see a spin-off company set up; the University of Exeter now provides extensive support for doing this, and for formalising relationships with industrial partners.
Our largest resource is people, and we plan to use existing named people (Prof. Brown and Dr. Ford). These people are rather more expensive than an average RA, though not excessively so, but they bring considerable knowledge and expertise, and will not need any time to get up to speed. Todo: justify other resources; conference visits.
Todo: ref to Antony and other Exeter work, ref to Gareth papers.