== Results for cache of size 39 and look ahead of 3 km. ==


Research issues in context-aware retrieval: evaluation of straying outside a context-aware cache

Peter Brown  and Gareth Jones
Department of Computer Science, University of Exeter, Exeter EX4 4QF, UK
P.J.Brown@ex.ac.uk, G.J.F.Jones@ex.ac.uk

Summary

This experiment measures recall degradation when the user's path of locations strays outside the area covered by a cache.

Fixed parameters of the experiment

Assumed purpose of cache:
to cover disconnected operation.
Contextual fields used:
just location.
Document collection:
items relating to tourist attractions in the South West; consists of 626 items; reasonably evenly distributed over the geographical area, but with some concentration on the towns and cities. Each attraction has a point as its location. All attractions are on land, and the border of the area covered consists of coastline, plus county boundaries between those counties deemed to be in the South West and those not.
How the cache was built:
the cache for each test was built using a context consisting of a square, 20 km by 20 km, centred on the starting point for the test. We call this the cache-building square. This square set of locations was used as the current context, and matched against the location of each document in the document collection: the documents that matched, according to some threshold, formed the cache. The chosen threshold scores for building caches depended on the matching algorithm used; the aim was to get an average cache size for each test of about one fifteenth of the document collection. (In fact in each set of tests the lowest threshold used for retrieving from the cache was the same as the threshold used for building the cache.) The cache normally included many tourist attractions outside the cache-building square, provided their locations were close to the square and thus got a good matching score. (With the thresholds used for building the cache, all the sites associated with locations within the cache-building square qualified, as did some outside; hence it made no difference how big the cache-building square was: a size of 0 would give the same results, i.e. the same cache content. For high thresholds, however, the size would matter.) It was assumed that the user's device would be big enough to hold any of the caches generated (not an unrealistic assumption given the small size of the document collection, and hence the even smaller size of the caches).
Matching and scoring algorithms:
the basis for one set of tests is the default algorithms built into the Context Matcher. Scores for matching two locations decay roughly linearly as the locations move apart (actual algorithm: score = 2.0 * ((target - abs(target - query)) / target) ). A second set of tests was performed using the "research library" algorithms: these decay as N squared according to distance apart. (The "research library" is a Context Matcher facility for supplying new algorithms that override the default ones.) These algorithms generated much lower scores than the default ones (todo make scores higher and therefore a better basis for comparison?) and thus, to make the tests comparable, lower thresholds were set both for building the cache and for retrieving from it.
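
The scoring formula and the cache-building step described above can be sketched as follows. This is only an illustration under stated assumptions, not the Context Matcher's code: `linear_score` transcribes the quoted formula for scalar values, while `Document`, `toy_square_score`, `build_cache`, the 20 km fall-off distance and the inclusive threshold comparison are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Point = Tuple[float, float]  # (easting, northing) in metres


def linear_score(target: float, query: float) -> float:
    """Default-algorithm formula as quoted in the text (scalar inputs)."""
    return 2.0 * ((target - abs(target - query)) / target)


@dataclass(frozen=True)
class Document:
    name: str
    location: Point


def toy_square_score(doc: Document, centre: Point,
                     half_side_m: float = 10_000.0,
                     falloff_m: float = 20_000.0) -> float:
    """Hypothetical scorer: 1.0 inside the square, linear decay outside it."""
    de = max(abs(doc.location[0] - centre[0]) - half_side_m, 0.0)
    dn = max(abs(doc.location[1] - centre[1]) - half_side_m, 0.0)
    outside_m = (de ** 2 + dn ** 2) ** 0.5
    return max(0.0, 1.0 - outside_m / falloff_m)


def build_cache(documents: Iterable[Document],
                score: Callable[[Document], float],
                threshold: float) -> List[Document]:
    """Keep documents whose score reaches the threshold (inclusive: assumed)."""
    return [doc for doc in documents if score(doc) >= threshold]


centre = (250_000.0, 70_000.0)  # illustrative centre of a 20 km square
docs = [
    Document("inside", (250_000.0, 70_000.0)),
    Document("near", (265_000.0, 70_000.0)),  # 5 km beyond the east edge
    Document("far", (300_000.0, 70_000.0)),   # 40 km beyond the east edge
]
cache = build_cache(docs, lambda d: toy_square_score(d, centre), threshold=0.5)
print([d.name for d in cache])  # ['inside', 'near']
```

The "near" document illustrates why a cache can contain attractions outside the cache-building square: its score is reduced by distance but still clears the threshold.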

Experimental approach

The experiment consisted of a number of individual tests. Each test had a starting point, and the cache for the test was built using this starting point, i.e. it was the centre of the cache-building square. The starting points were chosen by hand, some being in the middle of popular towns and cities and some being "in the middle of nowhere". The choice was based on a hunch about where a tourist aid would be most used, not on any deep analysis.

In each test the user was assumed to proceed steadily in a straight-line path, performing a retrieval at regular retrieval points, a fixed distance apart. There were 11 retrieval points: the first was at the starting point of the test; the sixth was on the very edge of the square used to build the cache, and the remainder were increasingly further outside this area. (Actually the paths were diagonal ones, that went from the starting point to the NE corner of the square, and then an equal distance beyond.) The straight line paths were chosen so that the user remained in the overall area covered by the document collection, i.e. the South West of England (and did not stray into the sea!). The following picture shows how the user's path proceeds.
Picture of cache and user's progress
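
The geometry of the path can be checked with a short sketch. It assumes steps of (2000, 2000) metres, as reported in the appendices, and a 20 km cache-building square centred on the start; the function name `retrieval_points` and the start coordinates are illustrative only.

```python
import math
from typing import List, Tuple


def retrieval_points(start: Tuple[int, int],
                     step: Tuple[int, int] = (2000, 2000),
                     count: int = 11) -> List[Tuple[int, int]]:
    """Return equally spaced points along a straight NE diagonal path."""
    e0, n0 = start
    de, dn = step
    return [(e0 + i * de, n0 + i * dn) for i in range(count)]


start = (250_000, 70_000)  # illustrative start, in metres
points = retrieval_points(start)

corner = points[5]  # sixth point: NE corner of the 20 km square
last = points[-1]   # eleventh point: an equal distance beyond the corner
beyond_km = math.hypot(last[0] - corner[0], last[1] - corner[1]) / 1000
print(corner, last, round(beyond_km, 1))  # (260000, 80000) (270000, 90000) 14.1
```

With these assumptions the last retrieval point lies about 14 km beyond the corner of the square.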

The retrieval at each of the 11 retrieval points used the cache; the results of this retrieval were compared with what would have been retrieved if the whole document collection had been used rather than the cache. Each retrieval retrieved all the documents whose score was greater than a given threshold (a document's score would not, of course, be affected by whether it came from the cache or from the original document collection). A count was made of the total number of documents "lost" over the 11 retrieval points (i.e. documents retrieved from the original document collection but not in the cache). We call these lost retrievals. A count was also made of the total number of documents retrieved from the cache.
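
The counting of lost retrievals can be sketched as below. The scoring function `toy_score` (linear decay to zero at 20 km) and the three-document collection are invented for illustration and are not the Context Matcher algorithms; only the counting logic follows the description above.

```python
import math
from typing import Dict, Iterable, Set, Tuple

Point = Tuple[float, float]  # (easting, northing) in metres


def toy_score(doc_location: Point, here: Point, range_m: float = 20_000.0) -> float:
    """Hypothetical score: 1.0 at zero distance, falling to 0.0 at range_m."""
    distance = math.hypot(doc_location[0] - here[0], doc_location[1] - here[1])
    return max(0.0, 1.0 - distance / range_m)


def count_lost(points: Iterable[Point],
               collection: Dict[str, Point],
               cache_names: Set[str],
               threshold: float) -> Tuple[int, int]:
    """Return (benchmark retrievals, lost retrievals) summed over all points."""
    benchmark_total = 0
    lost_total = 0
    for here in points:
        # Benchmark: what the whole collection would return at this point.
        benchmark = {name for name, loc in collection.items()
                     if toy_score(loc, here) > threshold}
        benchmark_total += len(benchmark)
        # Lost: benchmark documents that are absent from the cache.
        lost_total += len(benchmark - cache_names)
    return benchmark_total, lost_total


collection = {"a": (0.0, 0.0), "b": (10_000.0, 10_000.0), "c": (20_000.0, 20_000.0)}
cache_names = {"a", "b"}
points = [(0.0, 0.0), (10_000.0, 10_000.0), (20_000.0, 20_000.0)]

strict = count_lost(points, collection, cache_names, threshold=0.5)
lax = count_lost(points, collection, cache_names, threshold=0.2)
print(strict, lax)  # (3, 1) (7, 2)
```

The number of documents retrieved from the cache is the benchmark count minus the lost count. Even in this toy case, lowering the threshold increases both the benchmark count and the number of lost retrievals.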

Each test was repeated -- we call this a sequence of sub-tests -- with different thresholds (the same threshold was used for all 11 retrieval points).

Commentary on results

The first of the 11 retrieval points is at the starting point and, assuming this retrieval does not have a lower threshold than that used to build the cache, it will have no failures -- i.e. everything that would have been retrieved from the original document collection is in the cache. The last of the 11 retrieval points is well outside the cache-building square, and this point is likely to have the most failures. Of the remaining points, the nearer they are to the start the fewer failures are likely. In the tests the best results are likely to be where there are plenty of attractions inside the cache-building square and many fewer just outside. Thus tests that started at the centre of cities had good results. Not surprisingly the worst results were from the test on Dartmoor, which started in a wild area with few attractions; its cache was therefore small, and its rate of failure turned out to be even worse than one would expect from the proportional size of the cache.

Tourists do not flock to the Somerset Levels, yet surprisingly the test centred here (at the small town of Somerton) had the worst results, i.e. the most sites missed by the cache. It was proportionally worse than the other cases, especially for high retrieval thresholds (e.g. at a 99% threshold, Plymouth had 26 hits and no losses, whereas Somerton had 6 hits and 2 losses). The direction of progress from the starting point took the user ever closer to more important tourist areas such as Wells, Bath and Glastonbury.

The number of retrievals of course increased as successively lower thresholds were used, but the proportion of lost retrievals increased quite sharply. This was surprising, though some increase would be expected. (To take an illustrative example: if you are at the last retrieval point, about 14 kilometres outside the cache-building square, you might, with a low threshold, get sites 20 kilometres away, and these might be 34 kilometres outside the cache-building square; such far-away sites are unlikely to be in the cache, even though the cache is built with a low threshold.)
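
As a worked illustration of this trend, the sketch below aggregates the Appendix A figures (default algorithm, cache size 39, 3 km look-ahead) by threshold. The data are copied from the per-test tables; the variable names are arbitrary.

```python
# (benchmark retrievals, retrievals lost) per threshold %, copied from Appendix A
appendix_a = {
    "Dartmoor": {99: (5, 2), 98: (16, 4), 96: (51, 16), 93: (258, 93)},
    "Exeter": {99: (28, 1), 98: (49, 7), 96: (159, 20), 93: (465, 148)},
    "Plymouth": {99: (25, 0), 98: (49, 0), 96: (152, 13), 93: (376, 107)},
    "Somerton": {99: (6, 0), 98: (25, 3), 96: (164, 20), 93: (483, 183)},
}

# Sum benchmark and lost counts over the four starting points.
totals = {}
for rows in appendix_a.values():
    for threshold, (benchmark, lost) in rows.items():
        b, l = totals.get(threshold, (0, 0))
        totals[threshold] = (b + benchmark, l + lost)

for threshold in sorted(totals, reverse=True):
    benchmark, lost = totals[threshold]
    print(f"{threshold}%: {lost}/{benchmark} lost = {100 * lost / benchmark:.1f}%")
```

On these figures the proportion lost rises from about 5% (3 of 64) at the 99% threshold to about 34% (531 of 1582) at the 93% threshold.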

Todo: more refined conclusions.

Possible text for paper

We have performed some preliminary tests of context-aware caching. In order to reduce the number of variables we concentrated on one contextual field, location. The other contextual fields were kept constant during the experiments, and were not active in matching. Our document collection consisted of information about tourist sites, each of which had an associated location. We chose a set of different starting points, all well within the area covered by the document collection, and each representing a possible place a tourist might start wanting information. For each starting point, we built a context-aware cache that encompassed sites whose location matched a square centred on the starting point. We call this square the cache-building square. The cache-building square had sides 20 kilometres long. We set a fairly generous threshold of 50% for inclusion in the cache, and as a result about one fifteenth of the documents in our collection went into the cache. (Many of these were outside the cache-building square, since locations outside, but close to, the square would still get a good score.) We assumed the cache was being used during disconnected operation, and thus there was no way of updating it.

Our tests of the cache were quite demanding: we assumed that the user went in a straight-line path, starting at the centre of the cache-building square, proceeding to the edge of the square, and continuing until he was an equal distance outside. We assumed he made retrievals at 11 points equally spaced along the way. (Thus the first five points were inside the square, the next one was on the edge, and the remaining five were increasingly far outside.)

We counted the total number of documents retrieved at the 11 points. We then repeated each experiment using the original document collection rather than the cache, and again counted the number of documents retrieved. The difference between the two numbers represented the number of potential retrievals lost because of the use of the cache, i.e. the lost retrievals. We accumulated these numbers for all our starting points.

We repeated these experiments using different threshold scores (e.g. 99%, 98%, 95%, ...) for retrieval. The number of lost retrievals increased dramatically as the threshold decreased. We expected some increase (a lax threshold would allow the retrieval of documents associated with locations a long way from the cache-building square) but were surprised at its magnitude.

As a final step we tried different algorithms for matching two locations: one algorithm decayed linearly according to the distance apart of the locations, and the other decayed as N squared.

Some preliminary conclusions are:





APPENDIX A: results of individual tests using default algorithm and cache of size 39

Test with starting point at Dartmoor using default (linear) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 63).

Point furthest from centre of cache (25300000, 07300000) is 2330563, 0847571; distance away is: 23.1709 kilometres

Results for Dartmoor (OS 250,070) using default (linear) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
99%          5                             2
98%          16                            4
96%          51                            16
93%          258                           93

Default matching algorithm used was: (version pjb May 29 2002).

Commentary on this starting point: the cache is centred on a wild moorland area.

Test with starting point at Exeter using default (linear) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 78).

Point furthest from centre of cache (29500000, 09500000) is 3096091, 0852581; distance away is: 17.5841 kilometres

Results for Exeter (OS 292,092) using default (linear) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
99%          28                            1
98%          49                            7
96%          159                           20
93%          465                           148

Default matching algorithm used was: (version pjb May 29 2002).

Commentary on this starting point: this may be a favourable case as the cache is centred on an area with lots of attractions, and the area outside the cache has fewer attractions.

Test with starting point at Plymouth using default (linear) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 44).

Point furthest from centre of cache (25000000, 05600000) is 2377095, 0801320; distance away is: 27.0573 kilometres

Results for Plymouth (OS 247,053) using default (linear) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
99%          25                            0
98%          49                            0
96%          152                           13
93%          376                           107

Default matching algorithm used was: (version pjb May 29 2002).

Commentary on this starting point: this may be a favourable case as the cache is centred on an area with lots of attractions, and the area outside the cache has fewer attractions; the cache may, however, be smaller as Plymouth is on the sea, so part of the cache area covers the sea -- which has no tourist attractions.

Test with starting point at Somerton using default (linear) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 68).

Point furthest from centre of cache (35300000, 13300000) is 3341510, 1329660; distance away is: 18.9003 kilometres

Results for Somerton (OS 350,130) using default (linear) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
99%          6                             0
98%          25                            3
96%          164                           20
93%          483                           183

Default matching algorithm used was: (version pjb May 29 2002).

Commentary on this starting point: the cache is centred on a non-prime tourist area, but there are prime areas nearby.

APPENDIX B: results of individual tests using research library algorithm and cache of size 39

Test with starting point at Dartmoor using research library (N-squared) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 95).

Point furthest from centre of cache (25300000, 07300000) is 2309563, 0594948; distance away is: 25.9494 kilometres

Results for Dartmoor (OS 250,070) using research library (N-squared) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
50%          0                             0
20%          1                             1
10%          13                            2
5%           340                           139

Non-default algorithm used was: Context Matcher with N-squared location-matching algorithm: version of Sept 4.1 2002.

Commentary on this starting point: the cache is centred on a wild moorland area.

Test with starting point at Exeter using research library (N-squared) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 112).

Point furthest from centre of cache (29500000, 09500000) is 2955010, 1129010; distance away is: 17.907 kilometres

Results for Exeter (OS 292,092) using research library (N-squared) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
50%          1                             0
20%          10                            0
10%          42                            4
5%           534                           192

Non-default algorithm used was: Context Matcher with N-squared location-matching algorithm: version of Sept 4.1 2002.

Commentary on this starting point: this may be a favourable case as the cache is centred on an area with lots of attractions, and the area outside the cache has fewer attractions.

Test with starting point at Plymouth using research library (N-squared) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 71).

Point furthest from centre of cache (25000000, 05600000) is 2733038, 0700543; distance away is: 27.1825 kilometres

Results for Plymouth (OS 247,053) using research library (N-squared) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
50%          0                             0
20%          6                             0
10%          41                            0
5%           426                           130

Non-default algorithm used was: Context Matcher with N-squared location-matching algorithm: version of Sept 4.1 2002.

Commentary on this starting point: this may be a favourable case as the cache is centred on an area with lots of attractions, and the area outside the cache has fewer attractions; the cache may, however, be smaller as Plymouth is on the sea, so part of the cache area covers the sea -- which has no tourist attractions.

Test with starting point at Somerton using research library (N-squared) matching algorithm and cache size 39

Cache is built using a square with sides 20 km. User's path has steps of (2000, 2000) metres, with a total of 11 retrievals. Amount of look-ahead (both E and N) is 3km. Cache size is 39. (Size is set as a fixed value, derived from original size of 105).

Point furthest from centre of cache (35300000, 13300000) is 3396450, 1186420; distance away is: 19.6703 kilometres

Results for Somerton (OS 350,130) using research library (N-squared) algorithm and 3 look-ahead
Threshold    Total benchmark retrievals    Retrievals lost
50%          0                             0
20%          1                             0
10%          13                            2
5%           561                           251

Non-default algorithm used was: Context Matcher with N-squared location-matching algorithm: version of Sept 4.1 2002.

Commentary on this starting point: the cache is centred on a non-prime tourist area, but there are prime areas nearby.

APPENDIX C: summary for cache built with 20km square and 3km look-ahead and cache of size 39

RUNS USING DEFAULT (LINEAR) ALGORITHMS

Total number of tables is 4. Total size of the 4 caches is 156. Total number of benchmark retrievals (over the 11 retrieval points) is 2311. Total number of retrievals lost is 617.

RUNS USING N-SQUARED ALGORITHMS

Total number of tables is 4. Total size of the 4 caches is 156. Total number of benchmark retrievals (over the 11 retrieval points) is 1989. Total number of retrievals lost is 721.

APPENDIX D: summary of results for different amounts of look-ahead and cache of size 39

Total size of the 48 caches is 1872. Average cache size is 39. Maximum cache size is 39. Minimum cache size is 39.

Results summary
Algorithm    Look-ahead (km)    Benchmark retrievals    Retrievals lost
default      0                  2311                    799
default      3                  2311                    617
default      5                  2311                    517
default      10                 2311                    415
default      15                 2311                    627
default      20                 2311                    1298
research     0                  1989                    832
research     3                  1989                    721
research     5                  1989                    648
research     10                 1989                    586
research     15                 1989                    686
research     20                 1989                    1095
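
To make the summary easier to compare across look-ahead values, the sketch below turns the figures in the table above into loss percentages and picks out the look-ahead with the fewest lost retrievals for each algorithm. The data are copied from the table; the variable names are arbitrary.

```python
# (algorithm, look-ahead in km, benchmark retrievals, retrievals lost)
rows = [
    ("default", 0, 2311, 799),
    ("default", 3, 2311, 617),
    ("default", 5, 2311, 517),
    ("default", 10, 2311, 415),
    ("default", 15, 2311, 627),
    ("default", 20, 2311, 1298),
    ("research", 0, 1989, 832),
    ("research", 3, 1989, 721),
    ("research", 5, 1989, 648),
    ("research", 10, 1989, 586),
    ("research", 15, 1989, 686),
    ("research", 20, 1989, 1095),
]

best = {}  # algorithm -> (look-ahead km, retrievals lost) with the fewest losses
for algorithm, look_ahead_km, benchmark, lost in rows:
    print(f"{algorithm:8} {look_ahead_km:>2} km  {100 * lost / benchmark:5.1f}% lost")
    if algorithm not in best or lost < best[algorithm][1]:
        best[algorithm] = (look_ahead_km, lost)

print(best)  # {'default': (10, 415), 'research': (10, 586)}
```

On these figures both algorithms lose the fewest retrievals at a 10 km look-ahead, and losses rise again for larger values.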