This is an HTML rendering of a working paper draft that led to a publication. The publication should always be cited in preference to this draft using the following reference:
User Interface Development for Interactive Television: Extending a Commercial DTV Platform to the Virtual Channel API
Konstantinos Chorianopoulos, Diomidis Spinellis
Athens University of Economics & Business
47 Evelpidon & Lefkados Str., 113 62 Athens, Greece
We explore the generation of interactive computer graphics at digital set-top boxes in place of the fixed graphics that were embedded in the television video before the broadcast. This direction raises new requirements for user interface development, since the graphics are merged with video at each set-top box dynamically, without the traditional quality control from the television producers. Besides the technical issues, interactive computer graphics for television should be evaluated by television viewers. We employ an animated character in an interactive music television application that was evaluated by consumers and was developed using the Virtual Channel Control Library, a custom high-level API built using Microsoft Windows and TV technologies.
Digital set-top box, user interface, animated character, music video clip, TiVo
Computer graphics have played an instrumental role in delivering the TV experience. For example, computer graphics and animation have been used widely in the post-production of television content (Figure 1) for inserting the channel logo, animated intros/endings, and displaying various kinds of information (sport statistics, quiz show status, stock market ticker, news ticker, etc.). In most cases, computer graphics are merged with audiovisual content and are converted to video at the production studio or at the broadcast station. The final video is transmitted and displayed on TVs in its fixed form, without any opportunity for local dynamic update of the embedded computer graphics. To support high-level development of digital STB applications and local generation of dynamic computer graphics, we have defined the Virtual Channel model and implemented the respective Virtual Channel Control Library (VCCLib), which was built on top of the Microsoft Windows and TV (MSTV) platforms.
Up to now, the TV viewers’ interactive experience has been limited to Teletext, which is essentially an information tool and is usually unrelated to the running television content. Recent advances in STB technology have introduced real-time video capturing and rich multimedia at consumers’ homes. Digital STBs like TiVo store television content, while the user controls the television flow with an on-screen user interface. In addition to dynamically embedded graphics for advanced user interfaces, television content can now be enriched with computer-generated content, like animated characters and Internet information sources. To support consumer interactivity, we exploit locally stored music video clips and Internet information about artists and songs to build an easy-to-use interactive music television application. We have developed the music ITV application using the VCCLib and we report our experience at two levels: developing the application and evaluating its usability.
Figure 1 Computer Graphics applications for television post production: News ticker from Fox news (top left), Teletext information from CNN (top right), Music information from MTV (middle left), Channel Mosaic from Disney (middle right), Financial news from Bloomberg (bottom left), Sport news from EuroSportNews (bottom right)
With the advent of digital television (DTV) transmission, Internet-enabled set-top boxes (STBs) and digital video recorders (DVRs), consumers increasingly need a multimedia experience that seamlessly integrates diverse sources of information and entertainment content. A few years before the emergence and widespread adoption of the Internet and the Web, researchers were suggesting an integrated computer-television product. Nevertheless, immature technology, technical difficulties, reduced consumer demand, failed business models and the success of the Web have been postponing the convergence between the computer and the television. Previous research has separately addressed the integration of television a) with the Internet and b) with locally stored content, but there is still no integrated approach for both types of content.
Digital broadcast, persistent local storage and Internet resources should be used to augment the television as a medium of entertainment and passive information discovery. The television experience usually consists of two layers: 1) the background is reserved for video play-out, while 2) the foreground is used to display overlaid information. Both layers are traditionally created and controlled at the production studio or at the broadcasting station. For future ITV systems, we propose that the digital STB should be imagined as a virtual channel provider, where the perceived TV experience is produced from a combination of local storage, broadcast transmissions and Internet resources of audiovisual content, applications and text data/metadata.
Figure 2 A generic model of a system employing the virtual channel metaphor, contrasted to traditional broadcasting
Thus, the name Virtual Channel reflects that television channels are no longer static audiovisual experiences shared by all TV viewers in the same way, but a dynamic synthesis of discrete video, graphics, and data controlled at the consumer’s STB. The core idea behind the Virtual Channel proposal, as depicted in Figure 2, is that the decision-making about media programming has shifted from the media source to the STB. The television experience is now created and controlled at the STB. In the present paper, we introduce an implementation of the Virtual Channel model that supports a few of the model’s most important properties:
§ Local storage: A digital STB with a hard disk like TiVo allows the consumer to time-shift, pause and control the otherwise linear flow of the television broadcast.
§ Internet resources: Data broadcasting may be used to provide real-time updates of popular content like stock quotes, but the Internet is more flexible for providing personalized information to a diverse audience.
§ Video overlays: Traditional television content remains at the core of ITV services and can be optionally enhanced with interactive elements (user interfaces) and with additional personalized information that appear inside unobtrusive semi-transparent windows or at the edges of the screen.
§ Continuous video flow: A TV screen that stays still is beyond the previous experience of consumers and will feel unfamiliar. ITV applications should support continuous video flow by default, unless consumers actively select to stop the video.
§ Advertising breaks: The cost of TV production is very high and it has traditionally been supported, at least in part, by advertising, which is the case even for subscription schemes. ITV could also enhance the traditional advertising models with personalization and new advertising schemes.
§ Time-driven user interface: The appearance of an interactive element on screen can be triggered by the user, but for the most part it is the application logic and producers’ rules that define when the consumer may interact with additional content.
§ Relaxed control: Watching TV content has traditionally been a passive and low engagement activity. User interfaces for augmenting TV content should support relaxed use.
The ITV platform market is dominated by simple digital STBs, also known as integrated receivers/decoders (IRDs), each running its manufacturer’s real-time operating system and offering limited external programmability. IRDs’ market dominance is followed by a few competing, incompatible, and proprietary APIs (e.g. OpenTV, Liberate). In an industry that is driven by sheer volume, application developers have to develop the same application for multiple APIs. There are also a few independent organizations that define standards for ITV application development, like the TV-Anytime forum (http://www.tv-anytime.org), although member organizations are not obligated to conform. Despite the many alternative choices, researchers and engineers with an information technology background will find more flexibility and familiarity with MSTV or MHP, which we review next.
MHP is the most widely accepted standard for interactive television applications. Apart from Microsoft, all other technology providers are either opting for MHP application development or are developing their own MHP-compliant implementations. Nevertheless, the installed base of MHP set-top boxes is small, while early commercial implementations lack major features (like digital video recording, which has long been available from TiVo), have very slow response times and are not very stable. Moreover, MHP authoring environments are few and expensive, without realistic options for academic or research pricing. The above problems are natural for a new technology, but MHP is also facing regulatory problems in the European Union (EU) marketplace. Despite heavy support by many companies and groups, MHP’s reliance on Java has not allowed it to be pushed by the EU’s regulatory body as the continent’s standard for interactive television applications. Overall, MHP is superior because it has been built from the ground up as an extension of the commercially successful Digital Video Broadcasting (DVB) standard and is also supported by the respective research community and the manufacturers of broadcasting equipment.
We have chosen to implement the prototype using Microsoft TV (MSTV) technology for a number of reasons. Most importantly, the core components of MSTV are available in the Windows XP operating system and can be run on affordable personal computers (PCs). In addition to the pervasive availability of a television API, MSTV technology can be utilized within a familiar and mature Integrated Development Environment (IDE). Microsoft Visual Studio offers a multitude of tools for designing, developing, testing and deploying an application. We used the .NET edition of Visual Studio and programmed the prototype in the Visual Basic language, although using the C# language would have made no difference, since all .NET languages target the same Common Language Runtime (CLR). In contrast to MSTV, other proprietary implementations (like Liberate, Mediahighway, OpenTV) require their respective authoring environments, which consist of idiosyncratic and expensive IDE and STB technologies. Against the use of MSTV is the fact that Microsoft has a very limited installed base of STBs compared to competing implementations. Overall, the core components of MSTV have sufficient features for extending them into the Virtual Channel API and for developing ITV applications (Figure 3).
Figure 3 Using a high-level API to make ITV development more friendly to TV producers
A complete reference for the Microsoft controls that we used can be found on the Microsoft Developer Network (MSDN - http://msdn.microsoft.com/). In the present section, we briefly present the MSTV and Windows components used for the construction of a high level ITV API (Figure 4).
The Video Control (MSVidCtl) is an ActiveX control that is used to create and manage analog and digital TV filter graphs. In our design, we assumed a pre-recorded pool of music video clips residing in a directory and we used the Video Control for playing video files from local storage. The SetupMixerBitmap method configures the Video Mixing Renderer (VMR) filter to display an alpha-blended static bitmap on top of the video (video overlay). Time-driven user interfaces require methods for defining events, for handling defined events in the passage of time and for identifying the time-state of the ITV application. We used the Timer Control to construct an object that keeps track of time-instances and raises an event when the Timer reaches the threshold defined for each instance.
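The time-instance mechanism described above can be illustrated with a minimal sketch. It is written in Python rather than the Visual Basic of the prototype, and the names `TimeInstance`, `TimeInstanceTracker`, `tick` and `state` are hypothetical; they are not taken from the VCCLib or the Timer Control. The sketch assumes that a periodic timer calls `tick` and that each time-instance fires once when the elapsed time reaches its threshold.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TimeInstance:
    """A named point in time (seconds from channel start) with its handler."""
    name: str
    threshold: float
    handler: Callable[[str, float], None]
    fired: bool = False


@dataclass
class TimeInstanceTracker:
    """Hypothetical object driven by periodic timer ticks (not the VCCLib API)."""
    instances: List[TimeInstance] = field(default_factory=list)
    elapsed: float = 0.0

    def add(self, name: str, threshold: float,
            handler: Callable[[str, float], None]) -> None:
        self.instances.append(TimeInstance(name, threshold, handler))

    def tick(self, dt: float) -> None:
        """Advance the clock by dt seconds and raise every event that is now due."""
        self.elapsed += dt
        due = [i for i in self.instances
               if not i.fired and i.threshold <= self.elapsed]
        for inst in sorted(due, key=lambda i: i.threshold):
            inst.fired = True
            inst.handler(inst.name, self.elapsed)

    def state(self) -> List[str]:
        """Time-state of the application: names of the instances already fired."""
        return [i.name for i in self.instances if i.fired]


# Example: two time-driven events on one virtual channel (illustrative values).
log = []
tracker = TimeInstanceTracker()
tracker.add("show_song_info", 5.0, lambda name, t: log.append((name, t)))
tracker.add("advertising_break", 12.0, lambda name, t: log.append((name, t)))
for _ in range(10):          # ten one-second timer ticks
    tracker.tick(1.0)
```

After ten ticks only the first event has been raised, and `tracker.state()` identifies where the application stands in time. Keeping the thresholds in data rather than in handler code is one way to let producers' rules, not program logic, decide when an interactive element appears.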
We also investigate the use of animated characters as an integral part of ITV, using the Microsoft Agent Control (MS Agent). Animated characters research began in the human-computer interaction (HCI) discipline as an alternative to the desktop metaphor. Since then, animated characters research has spurred activity in many different disciplines, like computer graphics, while it maintains a strong following in the HCI and in the Intelligent User Interface (IUI) domain. Animated characters research for desktop computing has been very popular, but the respective commercial implementations (most notably the Microsoft Office paper clip assistant) are reported to be annoying to end users. The explanation might be that the attention-grabbing and interrupting nature of animated characters is inappropriate for productivity computing. On the other hand, television content has traditionally been about stories and character development. Therefore, animated characters might be viable for computer-mediated entertainment, like interactive television.
Previous failures of ITV systems have been attributed to immature technology, high costs and mainly to information technology driven features coupled with user interface design inspired by the personal computer practice. The VCC Library (VCCLib) is a higher-level ITV API that takes interactive computer graphics further away from the specifics of the underlying implementation and closer to the traditional TV production values (Figure 3). We provide a class diagram (Figure 4) of the VCCLib, so that it can be applied in ITV application development and realized in other contexts, with alternative implementation tools and platforms. The Virtual Channel Control is the central element of the VCCLib and provides methods and events for handling the flow of a virtual channel.
Figure 4 Class diagram for the Virtual Channel API with references to a MSTV implementation
Event-driven computer programming might feel familiar to the majority of developers who use object-oriented languages to build interactive applications. Nevertheless, event-driven programming for multimedia and ITV applications differs from that for productivity applications, in that it is more time-driven than user-action-driven. ITV applications have a greater need to organize the user interface and the consumer experience temporally, instead of spatially, which has been the norm for computer application development so far. Therefore, an API for ITV applications should support the programming of time-driven user interfaces. The Timers Control is based on the Timer Control and enables the definition and handling of time-driven events. The VCAgent Control is a simple wrapper class around the MS Agent Control.
An overview of the available ITV literature (http://itv.eltrun.aueb.gr/papers/) reveals that the majority of consumer-level applications are user interface and content recommendation engines for Electronic Program Guide (EPG) systems. Smyth and Cotter argue that EPG design is an important factor for selecting TV programs to watch, given a large channel repertoire and local storage of programming. On the other hand, Carey maintains that the enhancement of each type of television content and the introduction of new formats can actually drive ITV adoption by consumers. In accordance with the latter, we chose to study music TV, which is a widely available format of TV content.
Figure 5 A virtual music television channel that features video-clip skipping with dynamic advertisement insertion
We designed and implemented an application that allows a television viewer to skip a music video clip, an action that may come at the cost of watching a targeted advertisement if the viewer has chosen not to pay a subscription fee (Figure 5). In addition to music video clip track-skipping, we used the video overlays feature of the Virtual Channel to superimpose information over the music video (Figure 7). Music information contains trivia about the artist, biographical information and discography.
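The skip behaviour can be sketched as follows. This is an illustrative Python model only, not the prototype's Visual Basic code: the class `VirtualMusicChannel`, its fields and the file names are hypothetical, and the sketch assumes that advertisements are served in simple rotation, whereas the application inserts targeted advertisements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VirtualMusicChannel:
    """Hypothetical model of a channel built from locally stored clips."""
    clips: List[str]
    ads: List[str]
    subscriber: bool = False   # subscribers skip without an advertisement
    position: int = 0          # index of the clip that is currently playing
    ad_index: int = 0          # next advertisement in the rotation (assumption)

    def now_playing(self) -> str:
        # The clip pool wraps around so the video flow never stops.
        return self.clips[self.position % len(self.clips)]

    def skip(self) -> List[str]:
        """Return, in play-out order, the items shown after a skip request."""
        played = []
        if not self.subscriber and self.ads:
            played.append(self.ads[self.ad_index % len(self.ads)])
            self.ad_index += 1
        self.position += 1
        played.append(self.now_playing())
        return played


clips = ["clip_a.mpg", "clip_b.mpg", "clip_c.mpg"]
free_viewer = VirtualMusicChannel(clips, ["ad_1.mpg", "ad_2.mpg"])
paying_viewer = VirtualMusicChannel(clips, ["ad_1.mpg", "ad_2.mpg"], subscriber=True)

first_skip = free_viewer.skip()      # ['ad_1.mpg', 'clip_b.mpg']
second_skip = free_viewer.skip()     # ['ad_2.mpg', 'clip_c.mpg']
paid_skip = paying_viewer.skip()     # ['clip_b.mpg']
```

Because a skip always returns something to play, the model respects the continuous video flow property: the screen never stays still while the next clip is chosen.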
We provide an architecture diagram (Figure 6) and a few screenshots (Figure 7) that offer a visual walkthrough of the features and the events that are available in the current implementation of the VCCLib. The Music class (Figure 4) is domain specific for the current application (music) and was defined to hide the details of the music meta-data implementation. The implementation was based on information manually collected from Internet resources and stored in static text files, but future implementations may provide continuously updated music information about the running music video clip.
Figure 6 Architecture for an ITV application that employs the Virtual Channel API
Figure 7 Using the Virtual Channel Control Library to develop an ITV application for music video clips: The scheduled advertising break event has been handled (top left), the end of advertising break event has been handled (top right), currently playing and coming next video clip information using the MS Agent (middle left), music video clip related information using the MS Agent (middle right), currently playing and coming next video clip information using a transparent information box (bottom left), music video clip related information using a transparent information box (bottom right).
The second major argument in our research is the need for evaluating interactive graphics for ITV applications with consumers in a relaxed and natural setting. The central element of our experimental set-up was a portable PC. The ITV application was designed to run in full-screen, windowless mode and was set to display on the TV screen. The PC’s serial port was connected to an infrared sensor (http://www.evation.com/irman/) that receives the signals from the remote control. The whole set-up was unobtrusive and seamless to the television viewer (Figure 8).
Figure 8 Experimental set-up for unobtrusive and seamless usability evaluation of ITV applications
The objective of the study was to evaluate the use of informational overlays for presenting information and interacting with the television viewer. Music-related information was displayed over music video clips in two forms: a) using information pop-ups, and b) using an animated character (Figure 7). Moreover, we studied consumer opinions about simple interaction (video skipping and asking for information) with a TV program (Figure 5).
We ran usability tests with 30 users (university students, ages: 22-35, 18 men, 12 women); half of them used the animated character user interface, while the other half used the transparent video overlays (all of them could skip a video clip on demand). Using five or more users for usability testing has been established as a good trade-off between the cost of a study and the number of usability issues found. We used 15 users for each user interface, since we had to ‘waste’ a few testing sessions before we discovered issues worthy of in-depth investigation. The study was performed in a living-room setting, using a TV set and a remote control. We used multiple usability engineering methods: a) observation, b) log file analysis and c) interviews. In order to ensure selective exposure, the users were allowed a maximum watching time of one third of the total session duration. Users could press the power off button on the remote to end the testing session and they were told to watch as much as they liked, between 10 and 20 minutes.
The animated character user interface raised users’ interest and revealed issues that are worth further investigation. Those who had previously been exposed to the MS Agent technology (through the Microsoft Office suite of applications) recognized the similarity despite the use of a different character (the genie) and some of them were very negative towards the concept of the animated character. Therefore, we can argue that the animated character from the desktop application has a carry-over effect on the ITV application. Nevertheless, most of the users considered the character funny and less obtrusive compared with human presenters. Furthermore, users asked for more characters and the option to select their favorite presenter. Finally, most of the users disliked the solid balloon dialog that stands over the head of the character. The best place for the animated character balloon dialog would be across the bottom of the screen. The positive user evaluations and suggestions complement related research in the home infotainment domain. For example, a future implementation may include a sub-system for supporting emotion, synchronized with the video content or the user preferences, or based on additional meta-data provided by the hosting application.
The most interesting suggestions for future improvements concerned the augmentation of the music video skip feature. Users familiar with PC MP3 players asked for more options when skipping a music video, like repeating a song and playing a song from the same artist or from the same music genre. Moreover, a longer list of the upcoming music videos would be welcome and it would also allow them to organize their time better, since they could leave the TV on and plan to return when their favorite song is on. Using the television as a time tool to structure activities and organize time has also been documented in an ethnographic study of an STB trial. Therefore, providing on-demand information about the upcoming video clips would support the relaxed control of TV as a time management tool, while the ability to dynamically alter the upcoming play-list would support interactive behavior. For example, the user could bring up a play-list of 10 upcoming music videos and alter it dynamically along a number of parameters like genre and artist. Finally, the user could decide whether to skip directly to a music video by pressing the corresponding button on the numeric keypad.
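The suggested play-list feature could be prototyped along the following lines. This is a sketch under stated assumptions, since the feature is proposed future work and was not implemented in the prototype: `Clip`, `upcoming` and `select_by_key` are hypothetical names, the sample clips are invented, and the mapping of keypad digits to entries (1 to 9 for the first nine items, 0 for the tenth) is one possible choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Clip:
    title: str
    artist: str
    genre: str


def upcoming(queue: List[Clip], limit: int = 10,
             genre: Optional[str] = None,
             artist: Optional[str] = None) -> List[Clip]:
    """Return up to `limit` upcoming clips, optionally narrowed by genre/artist."""
    matches = [c for c in queue
               if (genre is None or c.genre == genre)
               and (artist is None or c.artist == artist)]
    return matches[:limit]


def select_by_key(playlist: List[Clip], key: int) -> Optional[Clip]:
    """Map a remote-control digit to a play-list entry (assumed mapping:
    1..9 select items 1..9, 0 selects item 10); None if no such entry."""
    if not 0 <= key <= 9:
        return None
    index = 9 if key == 0 else key - 1
    return playlist[index] if index < len(playlist) else None


# Invented sample data for illustration only.
queue = [
    Clip("Song A", "Artist X", "rock"),
    Clip("Song B", "Artist Y", "pop"),
    Clip("Song C", "Artist X", "rock"),
    Clip("Song D", "Artist Z", "pop"),
]
rock_list = upcoming(queue, genre="rock")
chosen = select_by_key(rock_list, 2)     # second entry of the filtered list
```

Limiting the list to ten entries keeps every item reachable with a single key press, which fits the relaxed control property better than scrolling through a long list.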
Next, we give an overview of additional important results:
§ The combination of the continuous video flow principle of the Virtual Channel model and an appropriate experimental set-up (television and remote control) may be used to create a seamless interactive television experience.
§ The video skipping feature was a favorite, despite the ad insertion, and provides relaxed control of the interactive music TV application, based on the local storage of the music video clips.
§ Users reported that they used the skip functionality mainly to by-pass a music video that they disliked and to a lesser extent to get to a favorite one. Interestingly, log file analysis revealed that some users tried to skip through advertisements, too.
§ Users had trouble recalling advertisements that were placed dynamically between music video clips, when skipping a video-clip.
§ Users would have liked the option to freely navigate the music-related information, but they would still prefer the auto-pace style of information presentation for most of their casual watching.
§ Images that support alpha blending for the background color of the information box should be used, thus leaving the font color opaque against the video background.
§ The ideal information box would be 2 or 3 lines long and it would span across the bottom of the screen.
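The recommendation about alpha blending in the list above can be made concrete with a per-pixel sketch. It uses the standard compositing rule result = alpha * overlay + (1 - alpha) * video for each color channel; the function names, the colors and the 50% box opacity are illustrative assumptions and do not describe how the Video Mixing Renderer is configured in the prototype.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def blend(overlay: RGB, video: RGB, alpha: float) -> RGB:
    """Composite one overlay pixel over one video pixel; alpha is in [0, 1]."""
    return tuple(round(alpha * o + (1.0 - alpha) * v)
                 for o, v in zip(overlay, video))


def information_box_pixel(video: RGB, is_text: bool,
                          box_color: RGB = (0, 0, 0),
                          text_color: RGB = (255, 255, 255),
                          box_alpha: float = 0.5) -> RGB:
    """Semi-transparent box background, fully opaque font (assumed values)."""
    if is_text:
        return blend(text_color, video, 1.0)     # text hides the video
    return blend(box_color, video, box_alpha)    # video shows through the box


video_pixel = (200, 100, 50)
background = information_box_pixel(video_pixel, is_text=False)  # (100, 50, 25)
glyph = information_box_pixel(video_pixel, is_text=True)        # (255, 255, 255)
```

The box darkens the video without hiding it, while the text keeps full contrast regardless of what is playing underneath.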
Previous commercial ITV application development has been done without clear direction and has been derivative of seemingly analogous media. Application developers have invested their efforts in trying to transfer Internet applications (like email and web browsing) to the TV audience, which has traditionally been seeking entertainment and relaxation. In contrast, we proposed the Virtual Channel as an appropriate model for extending TV entertainment into the interactive age of Internet, DTV, and TiVo. ITV researchers and practitioners should employ the Virtual Channel mentality in their implementations and perform usability evaluations with consumers, using a seamless experimental ITV set-up.
Apart from the music video clip content, the proliferation of other thematic channels (news, documentaries) gives many opportunities for applying the Virtual Channel model, given that the content items within this type of channel are alike. For the case of general-purpose channels that broadcast diverse types of content, the Virtual Channel API has to be applied on a per-segment basis. It is also obvious that the Virtual Channel is not appropriate for story-driven media content and dynamic synthesis of scenes for the creation of new content items, like movies, soaps and series. Strategies and tools for interactive storytelling have been studied by Agamanolis and the object-based media group at the MIT Media Lab. Responsive Television research is focusing on the dynamic synthesis of video at the scene level, while the Virtual Channel research defines a framework for dynamic synthesis of an integrated (local video, Internet resources, real-time broadcasts) television experience at the content item (e.g. music video clip, news story) level.
In conclusion, this research is based on the realization that despite the technical progress of the current ITV APIs, in terms of mentality, they are still closer to the IT developer than the TV producer. Since compelling ITV applications are most likely to be developed by TV producers, it makes sense to develop ITV production tools that make IT friendlier to them. We argue that user interface development for ITV applications will benefit from the commercial implementation of the Virtual Channel model and the respective set of principles. In the near future, the Virtual Channel model should find its way inside visual authoring environments and digital STBs.
We wish to express our gratitude to the users who participated in the usability evaluations and shared their opinions. We also thank George Kyriazis for reading early drafts and for providing invaluable comments and suggestions.
 Barton C, Rosendahl C, Brandel R, Elin L, Rugtiv S, Towey D. Animated computer graphics in television broadcasting (panel session). In Proceedings of the 12th annual conference on Computer graphics and interactive techniques, page 325. ACM Press, 1985.
 Chorianopoulos K. The digital set-top box as a virtual channel provider. In CHI '03 extended abstracts on Human factors in computing systems, pages 666–667. ACM Press, 2003.
 Chorianopoulos K, Spinellis D. A metaphor for personalized television programming. In N. Carbonell and C. Stephanidis, editors, User Interfaces for All, LNCS 2615, pages 187–194. Springer-Verlag, 2003.
 Drucker S.M, Glatzer A, Mar S.D, Wong C. Smartskip: consumer level browsing and skipping of digital video content. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 219–226. ACM Press, 2002.
 Guimarães N.M, Correia N.M, Carmo T.A. Programming time in multimedia user interfaces. In Proceedings of the 5th annual ACM symposium on User interface software and technology, pages 125–134. ACM Press, 1992.
 O'Brien J, Rodden T, Rouncefield M, Hughes J. At home with the technology: an ethnographic study of a set-top-box trial. ACM Transactions on Computer-Human Interaction (TOCHI), 6(3):282–308, 1999.
 Rickenberg R, Reeves B. The effects of animated characters on anxiety, task performance, and evaluations of user interfaces. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 49–56. ACM Press, 2000.
 Shoup R, Klimek T, Evans L, Black P, Bley H, Weise D. Computer graphics in television (panel session). In Proceedings of the 7th annual conference on Computer graphics and interactive techniques, page 170. ACM Press, 1980.
 Steinhart J, Burns D, Gosling J, McGeady S, Short R. Set-top boxes the next platform (panel). In Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, page 479. ACM Press, 1995.
 Corresponding author. Tel.: +30-210-8203663; fax: +30-210-8203664; Email: email@example.com
 Hereafter, italics will be used to highlight a property of the Virtual Channel model whenever it is discussed, implemented, or employed in an ITV application.
 Full source code is available for studying, changing, and applying to ITV application development at the Virtual Channel web site (http://itv.eltrun.aueb.gr/lab/virtualchannel/).
 MAD TV executives don’t think that human presenters are going to be replaced anytime soon, but they find the idea of animated characters promising for hosting a specific show and for presenting information during the night or for personalized play-lists, in the future.