http://www.dmst.aueb.gr/dds/pubs/jrnl/2003-PC-GTWeb/html/gtweb.html
This is an HTML rendering of a working paper draft that led to a publication. The publication should always be cited in preference to this draft.

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.


© 2003 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Position-annotated Photographs: The Geotemporal Web

Diomidis D. Spinellis
Department of Management Science and Technology
Athens University of Economics and Business
Patision 76, GR-104 34 Athens, Greece
email: dds@aueb.gr

Abstract

The GTWeb system demonstrates how trip diaries can be created and presented by exploiting the synergies of integrating different information appliances and publicly accessible databases. A GTWeb site consists of a trip overview, timelines, maps, and annotated photographs. The site is created by processing a user's GPS track log and digital camera pictures, linking them with a gazetteer database, topography, and coastline data. Our choice of consumer-grade appliances demonstrated that many interesting pervasive computing applications can be constructed by combining ordinary equipment in innovative ways, while a survey showed that end-users are interested in tools that allow them to effectively use the new data types generated by their appliances.

Keywords

GPS; digital camera picture; GIS; map; trip diary

1  Introduction

With the advent of digital cameras, photographs are no longer gathering dust forgotten in old shoeboxes; they are instead lying unused in hard disk directories and on CDs. The GTWeb system demonstrates how this phenomenon can be addressed by automatically converting raw data from the typical vacation trip into a lively web site. The GTWeb belongs to the "capture and access" class of ubiquitous computing applications [1]; it is constructed by integrating pictures taken by a consumer-grade digital camera, a track log recorded by a handheld GPS device [12], and publicly accessible coastline, topography, and gazetteer data. The initial construction of a GTWeb is fully automatic. After the GTWeb has been created, the resulting HTML pages can be manually edited and further enhanced; the application thus serves as a tool for creating an initial outline of a trip's diary that can then be annotated with detailed textual descriptions. The GTWeb is accessed through the Internet by a home page providing a trip overview on a topographical map substrate, the trip's location on an azimuthal orthographic projection of the globe, a textual description of the trip, and links to detailed timelines, maps, and photograph galleries (Figure 1).

Figure 1: The overview page of a personal GTWeb.

The purpose behind the realization of the GTWeb is to examine how trip diaries can be created and presented with minimal effort by exploiting the synergies of integrating different consumer-grade information appliances and publicly accessible databases. An information appliance [22] can be defined as one designed to perform a specific activity, able to share information with other appliances [15]. Many researchers have already examined personal navigation systems and location-based applications; as an example [21] proposes the WorldBoard system for associating information with places on a global scale, [4] describes the use of a personal digital assistant to aid the navigation in exhibition areas and fairs, while [2] describes how sentient computing systems can change their behavior based on their environment. Although the combination of positional data with visual media has been examined as a way to augment existing geographical information systems (GIS) [13], modern research, often performed within the wearable computing community [14], explores how the integration of data from digital cameras and GPS devices can be used to empower human cognition and intelligence. Thus, [20] examine the use of georeferenced photographs in an educational setting as a way to investigate community change, and [24] theorize how tourist photos could be annotated based on GPS-acquired information; an application also proposed by [5] who in addition suggest using the positional data to assist the user in photograph composition and the reconstruction of 3D images. In parallel, more recent research has examined the use of positional data to infer significant locations in a person's life [3], and the automatic summarization of continuously acquired personal multimedia content [10].

The contribution of this article is the concrete demonstration of the following ideas:

  1. Combining digital photos with position information in the form of a GTWeb adds value to both. Our application thus exemplifies the documented pervasive computing phenomenon where the whole is more than the sum of its parts [7].
  2. A GTWeb can be readily implemented using existing consumer-grade hardware. With the integration of GPS ports [16,9] and scripting extensions [18] in digital cameras, consumer photography integrated with positional data is becoming ready for the mainstream.
  3. Openly available digital map and gazetteer data provides a rich substrate for building captivating position-based applications.
  4. Positional data representing travel can be usefully combined with stationary images.
  5. The use of static Web pages as the system's output format can avoid some obvious risks regarding the long-term access of the contents [19,11].

In the following sections we describe the GTWeb's design and implementation, present examples of its actual use, and discuss the lessons we learned from its implementation.

2  Functional Description

Figure 2: The GTWeb functional decomposition.

A GTWeb site consists of the trip overview, timelines, maps, photographs, and a technical summary of the processing details.1 A UML diagram of the GTWeb content tree appears in Figure 2.

The trip overview provides a textual narrative of the whole trip like the following:2

"From 2.08 km S of Kastraki (hill) (topological, street map) (Sun Aug 19, 2001 10:48:55) to 1.74 km W of Metokhion Konstamonitou (populated place) (topological, street map) (Sat Aug 25, 2001 09:14:29) covering a travel distance of 898.02 km at an average speed of 60 km/h over an area of 45909 sq km. Duration 5 day(s), travel time 14:45 (travel map)."

We use timelines to order trip events (such as approaches to geographical features and photographs) based on the time of their occurrence, thus generating a trip diary:

Wed Aug 22, 2001

12:51:29
Approached (topological, street map) 2.95 km SW of Megali Vigla (hill) (topological, street map) travelling at a speed of 18 km/h.
12:51:30
Photograph. About (most recent fix taken 1 seconds from the picture time) (topological, street map) 2.95 km SW of Megali Vigla (hill) (topological, street map) travelling at a speed of 18 km/h.
[ ... ]
12:57:53
Approached (topological, street map) 2.73 km W of Thivais (populated place) (topological, street map) travelling at a speed of 17 km/h.
13:08:56
Photograph. About (most recent fix taken 5 seconds from the picture time) (topological, street map) 1.43 km SE of Thivais (populated place) (topological, street map) travelling at a speed of 18 km/h.
13:10:25
Approached (topological, street map) 1.52 km SW of Monoxilitai (populated place) (topological, street map) travelling at a speed of 18 km/h.
13:25:16
Approached (topological, street map) 5.48 km S of Moni Khiliandhariou (monastery) (topological, street map) travelling at a speed of 18 km/h.
[ ... ]
13:40:00
Photograph. About (most recent fix taken 3 seconds from the picture time) (topological, street map) 0.39 km NW of Moni Xenofondos (monastery) (topological, street map) travelling at a speed of 19 km/h.
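Annotations such as "2.95 km SW of Megali Vigla" combine a distance with a compass direction measured from the feature towards the traveler. The following Python fragment is a minimal sketch of how such a phrase could be produced; it is not taken from the GTWeb code. The great-circle (haversine) distance, the eight-point compass rose, the Earth radius constant, and all function names are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant
COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing (0 = north, clockwise) from point 1 towards point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return math.degrees(math.atan2(y, x)) % 360.0


def describe(position, feature_name, feature_kind, feature_position):
    """Phrase a (lat, lon) position relative to a named feature."""
    dist = haversine_km(*feature_position, *position)
    # The direction is measured from the feature towards the traveler.
    bearing = bearing_deg(*feature_position, *position)
    point = COMPASS[int((bearing + 22.5) // 45) % 8]
    return f"{dist:.2f} km {point} of {feature_name} ({feature_kind})"
```

Each compass point covers a 45-degree sector centered on its nominal direction, so a traveler located almost due south of a feature is reported as "S" rather than "SE" or "SW".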

Figure 3: A detailed map of a trip leg.

Figure 4: A boat trip.

Maps and photographs are also arranged in chronological order and divided into separate pages based on the day the trip was made. GTWeb contains a separate overview map for each trip leg, and detailed maps covering smaller areas. Each detailed map shows the route traveled and geographic features (populated places, streams, hills, etc.) annotated with the time they were approached (Figures 3 and 4). Each map is prefixed by a description of the trip part it illustrates.

Figure 5: Index of boat-trip photographs.

Photographs are indexed in chronological order using thumbnails and annotated with a description of the time and place they were taken:

Wed Aug 22, 2001 13:42:56
About (most recent fix taken 2 seconds from the picture time) (topological, street map) 0.49 km SE of Moni Xenofondos (monastery) (topological, street map)

The same description, together with links to the corresponding trip leg map and the detailed trip part map, also appears under the full-sized image of each photograph. All descriptions contain links leading to dynamically generated topological and street maps available on public Web sites.

3  Application Design

Figure 6: Data-flow diagram of the GTWeb generation process.

The data-flow diagram of the GTWeb creation process appears in Figure 6. The GTWeb software first processes the GPS track log together with the gazetteer database to annotate the track log with the nearest (in Euclidean distance) geographical feature for each track point. Topography (a grid of altitude points on the globe) and coastline data (closed polygons) are then used to create the various maps. During this phase, the trip track and geographical features are superimposed on the maps drawn by matching the respective longitude and latitude coordinates. Finally, the pictures are allocated into different maps and textually annotated based on the time assigned by the respective appliance to each track log point and each digital picture. The availability of time information for both the track log points and the pictures was the crucial factor that allowed us to integrate the two different data sets.
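The time-based association between pictures and track log points can be sketched as follows. This Python fragment is an illustration under stated assumptions and not the GTWeb code: it assumes that both data sets carry comparable Unix timestamps, that the track log is sorted by time, and that a picture is linked to the most recent fix at or before the picture time (a rule suggested by the "most recent fix taken N seconds from the picture time" wording of the generated annotations). The function name and tuple layout are hypothetical.

```python
from bisect import bisect_right


def most_recent_fix(track, picture_time):
    """Return (fix, age_seconds) for the last fix at or before picture_time.

    `track` is a list of (unix_time, lat, lon) tuples sorted by time.
    Returns None when the picture predates the whole track log.
    """
    times = [t for t, _, _ in track]
    i = bisect_right(times, picture_time)
    if i == 0:
        return None
    fix = track[i - 1]
    return fix, picture_time - fix[0]


# Made-up track log and picture time, for illustration only.
track = [(100, 40.00, 24.00), (110, 40.01, 24.01), (120, 40.02, 24.02)]
print(most_recent_fix(track, 113))
```

The returned age allows the annotation to state how stale the fix is, which matters when the GPS receiver temporarily loses its signal.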

Figure 7: The GTWeb data model.

The data model used to construct the GTWeb is depicted as a UML diagram in Figure 7. The primary types of data objects are:

To create the GTWeb the three data objects are extended by combining features of their parent classes. Thus

The time and location of the traveler's "visit" to the vicinity of a given geographical feature is determined by the track log point that has the smallest Euclidean distance to the given feature. This can be formalized as follows:

  1. The coordinates of all known geographical features form a set $F$, while the coordinates of the track followed by the user form a set $T$. Given two coordinate pairs $a = (a_1, a_2)$ and $b = (b_1, b_2)$ we use the notation $|a-b|$ to denote the Euclidean distance between $a$ and $b$, that is, $\sqrt{(a_1-b_1)^2 + (a_2-b_2)^2}$.

  2. We then form an annotated track log $A$ by associating each track point $t$ with its nearest feature $f$:
    $A = \{(t,f) \mid t \in T \land f \in F \land \forall f' \in F : |t-f| \le |t-f'|\}$

  3. Finally, a set of "visits" $V$ is formed from the annotated track log points that are nearest to each feature:
    $V = \{(t,f) \mid (t,f) \in A \land \forall (t',f) \in A : |t-f| \le |t'-f|\}$
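The two set definitions translate almost directly into code. The following Python fragment is a brute-force sketch of that translation, not the GTWeb implementation; the function names are hypothetical, coordinates are assumed to be hashable pairs, and ties (which the set notation admits as multiple members) are resolved here by keeping the first candidate encountered.

```python
import math


def dist(a, b):
    """Euclidean distance |a - b| between two coordinate pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def annotate(track, features):
    """A: pair every track point t with its nearest feature f."""
    return [(t, min(features, key=lambda f: dist(t, f))) for t in track]


def visits(annotated):
    """V: for each feature appearing in A, keep the closest track point."""
    best = {}
    for t, f in annotated:
        if f not in best or dist(t, f) < dist(best[f], f):
            best[f] = t
    return [(t, f) for f, t in best.items()]


# Made-up coordinates, for illustration only.
track = [(0, 0), (1, 0), (2, 0), (5, 0)]
features = [(1, 1), (5, 1)]
print(visits(annotate(track, features)))
```

The annotation step compares every track point with every feature; for large gazetteers a spatial index would presumably be preferable, but the exhaustive scan keeps the correspondence with the definitions obvious.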

Most data is stored in its native format, apart from picture metadata where an intermediate program layer transforms filesystem-resident information into XML that is used for further processing. Thus a photograph's details will appear as follows:

<photo>
<name>DSC00007.JPG</name>
<time>998474606</time>
<caption>Ouranoupoli</caption>
<localtime>Wed Aug 22 13:03:26 2001</localtime>
<gmtime>Wed Aug 22 10:03:26 2001</gmtime>
</photo>
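A record of this shape could be produced along the following lines. The Python fragment below is a sketch rather than the intermediate layer actually used: the function name is hypothetical, the caller is assumed to supply the picture's Unix timestamp and caption explicitly, and the local time is derived from an assumed fixed UTC offset (three hours for the sample trip) instead of a time zone database. The date layout only approximates the one shown above, because `%d` zero-pads single-digit days.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

STAMP = "%a %b %d %H:%M:%S %Y"  # approximates the layout of the sample record


def photo_record(name, unix_time, caption, utc_offset_hours):
    """Build a <photo> element; utc_offset_hours is the trip's local offset."""
    utc = datetime.fromtimestamp(unix_time, tz=timezone.utc)
    local = utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    photo = ET.Element("photo")
    for tag, text in [
        ("name", name),
        ("time", str(unix_time)),
        ("caption", caption),
        ("localtime", local.strftime(STAMP)),
        ("gmtime", utc.strftime(STAMP)),
    ]:
        ET.SubElement(photo, tag).text = text
    return photo


record = photo_record("DSC00007.JPG", 998474606, "Ouranoupoli", 3)
print(ET.tostring(record, encoding="unicode"))
```

Storing the raw timestamp next to both formatted times keeps the record unambiguous even when the local offset is later found to be wrong.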

In the future, standardized schemas based on XML should probably be used for interfacing and accessing all data, thus avoiding incompatibilities between different cameras and GPS devices. Similarly, at the physical level the serial NMEA-based interface that we used for GPS data capture and the CompactFlash filesystem we used for transferring the photographs could probably be standardized through uniform USB or Bluetooth device profiles.

For the selection of the GTWeb presentation and implementation technologies we had to choose among three different alternatives. A query-based interface would present results (maps, photographs) based on conditions specified by a user (for example, "show me where I was on August 17th, 2001"). Such an approach, however, is unsuitable for casual browsing, which we felt was a highly desirable feature. If the above approach were supplanted by a dynamic browsing interface, users would be able to create content on the fly based on their actions, for instance by zooming and panning on the maps and photographs. However, the drawback of both approaches is that they would depend at run-time on a number of large software applications such as a relational database, an application server, and a GIS. The complexity of these applications might inhibit the adoption of a system we designed mainly for personal use. In addition, the platform's software and hardware requirements would introduce maintenance problems and would create a significant preservation risk for material that would typically be archived for decades. Who hasn't nostalgically browsed photo albums or diaries recorded 20 or 50 years ago? We feel that the static HTML content we selected as our GTWeb presentation format is a lot more likely to survive a series of system upgrades over a period of 10-50 years than a perhaps more versatile system that would create content dynamically.
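The static approach amounts to rendering every page once, at generation time, into a plain file. The following Python fragment sketches the idea for a single timeline page; it is an illustration only (the paper reports using Perl for composing the Web pages), and the function names, the definition-list markup, and the event tuple layout are assumptions.

```python
import html
from pathlib import Path


def timeline_page(day_title, events):
    """Render one day's timeline as a self-contained HTML string.

    `events` is a list of (time_string, description) pairs.
    """
    rows = "\n".join(
        f"<dt>{html.escape(when)}</dt>\n<dd>{html.escape(what)}</dd>"
        for when, what in events
    )
    title = html.escape(day_title)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n<dl>\n{rows}\n</dl>\n</body></html>\n"
    )


def write_page(directory, filename, content):
    """Write a page to disk; viewing it needs no server-side software."""
    path = Path(directory) / filename
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return path
```

Because the output is an ordinary file, it can be copied to a Web server or to removable media and browsed without a database, an application server, or a GIS.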

4  Implementation and Actual Use

The implementation of GTWeb relies heavily on a number of publicly available software packages and databases. Specifically, we used the GMT tools to draw the maps [23], the netpbm toolkit and Ghostscript to manipulate images, and the Perl language for composing the Web pages. In addition, we obtained geographical information from the GMT coastline database, the National Geophysical Data Center 5-minute earth topography (ETOPO5)3 and global land one-kilometer base elevation model (GLOBE)4 digital terrain data, and the National Imagery and Mapping Agency's (NIMA) GEOnet Names Server (GNS) gazetteer.5

The system has been implemented in the form described in the previous section and has been used to illustrate a number of trips. In practice it works well for summarizing relatively long (100 km) trips; shorter distances are less effectively presented due to the lack of publicly available low-scale digital geographic data. This is also the reason we are currently not providing picture hyperlinks from the maps. A large number of photographs taken at the same location will still be correctly categorized and ordered by the time they were taken, but their geographic annotation will not be very informative. In the future, publicly available coordinate positions for elements such as monuments, town areas, road names, and other notable features could be used to address this shortcoming. A database of such feature details can be created through cooperative community efforts by harvesting elements such as photograph captions. As currently realized, the system is more useful for presenting car, plane, or boat trips than, for example, hiking or bicycle excursions.

A small, but irritating, problem we encountered concerned the time synchronization of the two appliances. Setting an appliance's clock correctly is a notoriously neglected task; thus, correct time stamps may be unavailable when they are needed for synchronizing the data of the digital camera with that of the GPS. Furthermore, time zones and daylight saving time create additional challenges. Obviously both appliances need to be synchronized to use the same timeframe as a common reference. However, cameras typically operate on local time, GPS devices on UTC, and different PC operating systems on one or the other. In addition, meaningful captions and timelines have to be generated based not on the local time of the processing computer, but on the local time that was in effect at the place and on the date of the trip. Correctly handling and documenting this behavior is a problem that we never solved to our complete satisfaction.

One other important issue that will emerge once GTWebs are published on the web concerns the creators' privacy. A GTWeb may reveal more data and to more people than its publisher realizes. As an example, a speeding violation can be deduced by examining a trip's timeline. Appropriate measures have to be taken to distinguish an individual's or a family's intranet hosting personal experiences from the public Internet. The former could perhaps be kept on CD-ROM media and never shared on the Internet. However, publishing and sharing the details of a trip is a time-honored tradition that we humans seem to have enjoyed since early antiquity.

To investigate end-user views of the GTWeb's presentation format we conducted a small informal study by directing around 200 members of our academic community to view a sample trip report and fill in an online questionnaire. We had 30 responses; a 15% response rate. The (self-selected) population of our survey's respondents can be considered young, with ages ranging from 18 to 35 years and an average age of 22 years. Their sex was roughly balanced (46% female, 53% male), as were their perceived IT skills: 13% considered themselves beginners, 66% reported they used computers with confidence, and 20% considered themselves experts. When asked to compare GTWeb with other ways traditionally used to present photographs, 63% found GTWeb better than a paper-based album, while 83% found GTWeb better than a plain on-line photograph collection (26% and 13% respectively preferred the alternative presentation). Photograph captions and online maps were found interesting by 80% and 86% of the respondents, while 16% and 6% found them useless, and 3% and 6% found them irritating. Finally, 83% answered in two separate questions that they would like to use GTWeb to present information about their trips and also use it as the only way to present information about their trips. This figure dropped to 63% when asked whether they would like to use GTWeb to present their photograph collection, and to a meager 13% who said they were likely to use GTWeb as the only way to present their photograph collection.

The above figures, although not an outcome of a statistically rigorous study, indicate that the strongest advantage of GTWeb is its presentation of spatial data in the form of annotated maps. The organization of photographs, although not criticized, was mostly considered a "nice to have" feature that would probably be supplemented by additional dissemination forms such as photo albums, email, and (increasingly) multimedia messaging (MMS) exchanges. In retrospect, we should have been expecting this finding, since digital photographs, offering most affordances of their paper-based relatives (and some additional ones), live and compete in an already rich ecosystem that has evolved over a period of more than 150 years. In contrast, detailed spatial data of the form generated by GPS devices is a relatively new phenomenon. GPS receivers are gradually being added into consumer mobile electronics [8], but most people have no experience with using, presenting, and disseminating the data these devices generate.

5  Possible Extensions

GTWeb can be enhanced in a number of different ways. The information visualization can be evaluated following the waterfall process described in [6] and improved along the lines suggested by the LifeLines work [17]. With the emergence of digital video recorders and cheap storage devices, GPS-derived positional data could also be used for annotating video sequences. Moreover, the GTWeb maps could be further enhanced with hyperlinks leading directly to the relevant photographs. In addition, topography data could be used as a basis for implementing a three-dimensional virtual tour following the original tracks. Furthermore, digital compass information (available on some GPS devices) can be integrated with the rest of the data to provide directional details about each photograph. Cameras could be further enhanced to detect and record the camera's rotation and inclination for each photograph. Photographs containing three-angle rotational and lens setting metadata could then be automatically annotated to mark interesting features or create image-based hyperlinks. Perhaps the most interesting enhancement would be a facility linking a GTWeb with other people's GTWebs and similar cooperative endeavors such as the global confluence project.6 As an example, one could obtain a list of people who have visited the same place, and links to their respective GTWebs. This last enhancement also relates to our survey's most interesting result: the perceived need to present and disseminate the spatial data we are increasingly generating.

6  Lessons Learned

The design and implementation of the GTWeb taught us a number of important lessons regarding the presentation of trip diaries and the integration of information appliances.

The informal survey we conducted showed that end-users, although generally positive towards new ways for organizing, displaying, and disseminating digital photographs, are mostly interested in tools that allow them to effectively use the new data types generated by their appliances, in our case the trip log's coordinates. We believe that designers of other applications dealing with novel data types, such as RFID tag data streams captured from consumer goods, will face similar opportunities in the future.

Standardization played a vital part in our endeavor. All topography elements, gazetteer information, coastlines, and the GPS track log were based on the same standard geodetic system (WGS-84), making it possible to superimpose and link elements acquired from the end-user device and different agencies on the same map. In addition, track log information could be downloaded from the GPS device using a common hardware interface, and the photographs could be accessed on the camera's storage device in a filesystem format recognized by our workstation's operating system. Furthermore, modular open-source software and public databases that could be downloaded in their entirety provided the facilities for annotating the photographs and displaying the track logs in a meaningful context.

One other important lesson from building the GTWeb system concerns the importance of what was in effect ancillary data for integrating the two appliances. Both the GPS track log and the photographs were tagged with date and time information that we exploited to link them together. Most of the value-added GTWeb content was derived by joining (in the database sense of the word) data using as a join key approximate time or location matches. We believe that making this type of data, generated by many appliances, available will breed a number of innovative applications.

Our low-end choices of technology were also instructive. The content delivery mechanism we used (static web pages), although not sophisticated when compared to the various active content technologies, proved surprisingly effective, portable, and resilient. Similarly, our choice of consumer-grade appliances demonstrated that many interesting pervasive computing applications can be constructed by combining ordinary equipment in innovative ways.

References

[1]
Gregory D. Abowd and Elizabeth D. Mynatt. Charting past, present, and future research in ubiquitous computing. ACM Transactions on Computer-Human Interaction, 7(1):29-58, March 2000.

[2]
Mike Addlesee, Rupert Curwen, Steve Hodges, Joe Newman, Pete Steggles, Andy Ward, and Andy Hopper. Implementing a sentient computing system. Computer, 34(8):50-56, August 2001.

[3]
Daniel Ashbrook and Thad Starner. Learning significant locations and predicting user movement with GPS. In International Symposium on Wearable Computing, October 2002.

[4]
Gerald Bieber and Martin Giersich. Personal mobile navigation systems-design considerations and experiences. Computers & Graphics, 25:563-570, 2001.

[5]
Neill Campbell, Henk L. Muller, and Cliff Randall. Combining positional information with visual media. In Third International Symposium on Wearable Computers (ISWC '99). IEEE Computer Society, October 1999.

[6]
William E. Cartwright and Gary J. Hunter. Towards a methodology for the evaluation of multimedia geographical information products. GeoInformatica, 5(3):291-315, 2001.

[7]
Nigel Davies and Hans-Werner Gellersen. Beyond prototypes: Challenges in deploying ubiquitous systems. IEEE Pervasive Computing, 1(1):26-35, January-March 2002.

[8]
Melvin Diaz. Integrating GPS receivers into consumer mobile electronics. IEEE Multimedia, 6(4):88-90, October/December 1999.

[9]
Daintry Duffy. GIS goes worldwide. CIO, 15(20):70, 2002. Online http://www.cio.com/archive/080102/et_article.html (current August 2002).

[10]
M. Gelgon and K. Tilhou. Automated multimedia diaries of mobile device users need summarization. In F. Paternò, editor, 4th International Symposium on Mobile Human-Computer Interaction - Mobile HCI 2002, pages 36-44. Springer-Verlag, September 2002. Lecture Notes in Computer Science 2411.

[11]
Alan R. Heminger and Steven B. Robertson. The digital rosetta stone: A model for maintaining long-term access to static digital documents. Communications of the Association for Information Systems, 3:314-329, 2000.

[12]
Jeffrey Hightower and Gaetano Borriello. Location systems for ubiquitous computing. Computer, 34(8):57-66, August 2001.

[13]
Menno-Jan Kraak. Integrating multimedia in geographical information systems. IEEE Multimedia, 3(2):59-65, Summer 1996.

[14]
Steve Mann. Wearable computing: Toward humanistic intelligence. IEEE Intelligent Systems, 16(3):10-15, May/June 2001. Special Issue on Wearable Computing and Humanistic Intelligence.

[15]
Donald A. Norman. The Invisible Computer. MIT Press, Cambridge, 1998.

[16]
OPSiS, Alimos, Greece. PhotoGPS Users Guide, July 2001. Available online http://www.digital-opsis.com.

[17]
Catherine Plaisant, Brett Milash, Anne Rose, Seth Widoff, and Ben Shneiderman. Lifelines: visualizing personal histories. In Conference proceedings on Human factors in computing systems, pages 221-ff. ACM Press, 1996.

[18]
Rich Robinson. DigitaScript: A scripting language for digital cameras. Dr. Dobb's Journal, January 2001.

[19]
Jeff Rothenberg. Avoiding technological quicksand: Finding a viable technical foundation for digital preservation. Technical report, Council on Library and Information Resources, 1755 Massachusetts Av., Washington DC 20036, January 1999. Online http://www.clir.org/pubs/reports/rothenberg/pub77.pdf, current December 2002.

[20]
Brian K. Smith, Erik Blankinship, Alfred Ashford III, Michael Baker, and Timothy Hirzel. Inquiry with imagery: Historical archive retrieval with digital cameras. In Proceedings of the seventh ACM International Conference on Multimedia (Part 1), pages 405-408, New York, 1999. ACM Press.

[21]
James C. Spohrer. Information in places. IBM Systems Journal, 38(4):602-628, 1999.

[22]
Roy Want and Gaetano Borriello. Survey on information appliances. IEEE Computer Graphics and Applications, 20(3):24-31, May/June 2000.

[23]
P. Wessel and W. H. F. Smith. Free software helps map and display data. EOS Trans. Amer. Geophys. U., 72(41):441, 445-446, 1991.

[24]
Jie Yang, Weiyi Yang, Matthias Denecke, and Alex Waibel. Smart sight: A tourist assistant system. In Third International Symposium on Wearable Computers (ISWC '99). IEEE Computer Society, October 1999.

Biographical Information

Diomidis Spinellis holds an MEng in Software Engineering and a PhD in Computer Science both from Imperial College (University of London, UK). Currently he is an Assistant Professor at the Department of Management Science and Technology at the Athens University of Economics and Business, Greece. He is the author of the book Code Reading: The Open Source Perspective (Addison-Wesley, 2003) and more than 70 journal articles and conference papers. He has contributed software to the BSD Unix distribution, the X Window system, and is the author of a number of open-source software packages, libraries, and tools. His research interests include Ubiquitous Computing, Information Security, and Software Engineering. Dr. Spinellis is a member of the ACM, the IEEE Computer Society, the Greek Computer Society, and the Technical Chamber of Greece.

Footnotes:

1 You can see a GTWeb example at http://www.spinellis.gr/gtweb/Chalkidiki.

2 We use italic characters to denote hyperlinks.

3 http://dbwww.essc.psu.edu/dbndx/tree/data/dem/etopo5.html

4 http://www.ngdc.noaa.gov/seg/topo/

5 http://www.nima.mil/gns/html/index.html

6 http://www.confluence.org