The charter
of the Integrated Media Systems Center (IMSC) at the University
of Southern California (USC) is to investigate new methods and
technologies that combine multiple modalities into highly effective,
immersive technologies, applications and environments. One of
the results of these research efforts is the Remote Media Immersion
(RMI) system. The goal of the RMI is to create and develop a complete
aural and visual environment that places a participant or group
of participants in a virtual space where they can experience events
that occurred in different physical locations. RMI technology
can effectively overcome the barriers of time and space to enable,
on demand, the realistic recreation of visual and aural cues recorded
in widely separated locations [MNNS99]. The focus of the RMI effort
is to enable the most realistic recreation of an event possible
while streaming the data over the Internet. Therefore, we push
the technological boundaries well beyond what current video-on-demand
or streaming media systems can deliver.
As a consequence,
high-end rendering equipment and significant transmission bandwidth
are required. However, we trust that advances in electronics,
compression and residential broadband technologies will make such
a system feasible first in commercial settings and later at home
in the not too distant future. Several indicators support this
assumption. For example, the next generation of
the DVD specification calls for network access for DVD players.
Furthermore, Forrester Research forecasts that almost 15 per cent
of films will be viewed through on-demand services rather than on DVD
or video by 2005 [Ric03a]. The infrastructure necessary for these
services is gradually being built, as demonstrated in Utah,
where 17 cities are planning to construct an ultra-high-speed
network for both businesses and residents [Ric03b]. The RMI project
integrates several technologies that are the result of research
efforts at IMSC. The current operational version is based on four
major components that are responsible for the acquisition, storage,
transmission, and rendering of high quality media.
Acquisition of high-quality media streams.
This authoring component is an important part of the overall chain
to ensure the high quality of the rendering result as experienced
by users at a later time. As the saying "garbage in, garbage
out" implies, no amount of quality control in later stages
of the delivery chain can make up for poorly acquired media. In
the current RMI version, authoring is an offline process and involves
its own set of technologies. Because of space constraints, we
will not focus on this part.
Real-time digital storage and playback of multiple independent streams.
The Yima [SZFY02]
Scalable Streaming Media Architecture provides real-time storage,
retrieval and transmission capabilities. The Yima server is based
on a scalable cluster design. Each cluster node is an off-the-shelf
personal computer with attached storage devices and, for example,
a Fast or Gigabit Ethernet connection. The Yima server software
manages the storage and network resources to provide real-time
service to the multiple clients that are requesting media streams.
Media types include, but are not limited to, MPEG-2 at NTSC and
HDTV resolutions, multichannel audio (e.g., 10.2 channel immersive
audio), and MPEG-4.
Protocols for synchronized, efficient real-time transmission of multiple media streams.
A selective data retransmission scheme improves playback quality
while maintaining real-time properties. A flow control component
reduces network traffic variability and enables streams of various
characteristics to be synchronized at the rendering location.
Industry-standard networking protocols such as the Real-time Transport
Protocol (RTP) and the Real Time Streaming Protocol (RTSP) provide
compatibility with commercial systems.
Rendering of immersive audio and high-resolution video.
Immersive audio is a technique developed at IMSC for capturing
the audio environment at a remote site and accurately reproducing
the complete audio sensation and ambience at the client location
with full fidelity, dynamic range and directionality for a group
of listeners (16 channels of uncompressed linear PCM at a data
rate of up to 17.6 Mb/s). The RMI video is rendered in HDTV resolutions
(1080i or 720p format) and transmitted at a rate of up to 45 Mb/s.
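
As a rough sanity check on the rates quoted for the rendering component, the following sketch computes an uncompressed PCM bit rate and the combined audio and video rate of one RMI session. The sample rate (48 kHz) and sample size (24 bits) are our assumptions and are not stated above; under them, 16 channels come to 18,432,000 bit/s, which is roughly 17.6 Mb/s if a megabit is read as 2^20 bits. All function and constant names are hypothetical, and the link capacities are nominal rates that ignore protocol overhead.

```python
# Back-of-the-envelope bandwidth estimate for one RMI session.
# ASSUMPTIONS (not from the text): 48 kHz sampling, 24-bit samples,
# nominal link rates with no protocol overhead.

def pcm_bit_rate(channels: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of uncompressed linear PCM audio in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


# 16 channels of linear PCM under the assumed sampling parameters.
audio_bps = pcm_bit_rate(channels=16, sample_rate_hz=48_000, bits_per_sample=24)
audio_mb_binary = audio_bps / 2**20  # megabits counted as 2^20 bits

# Peak rates as quoted in the text, in Mb/s.
VIDEO_MBPS = 45.0
AUDIO_MBPS = 17.6
SESSION_MBPS = VIDEO_MBPS + AUDIO_MBPS


def streams_per_link(link_mbps: float) -> int:
    """Number of full-rate sessions that fit on a link, ignoring overhead."""
    return int(link_mbps // SESSION_MBPS)


if __name__ == "__main__":
    print(f"audio: {audio_bps} bit/s (~{audio_mb_binary:.1f} Mb/s)")
    print(f"session peak: {SESSION_MBPS:.1f} Mb/s")
    print(f"Fast Ethernet: {streams_per_link(100.0)} session(s)")
    print(f"Gigabit Ethernet: {streams_per_link(1000.0)} session(s)")
```

Under these assumptions a single full-rate session fits on a Fast Ethernet link, while a Gigabit Ethernet link has room for about fifteen. Real capacity would be lower once packet headers and retransmissions are taken into account.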
In this report we detail some of these components and the techniques
that are employed within each. We hope that our advances in
digital media delivery will enable new applications in the future,
whether in the entertainment sector (sports bars, digital cinemas,
and eventually the home theater), distance education, or other areas.
We will focus mainly on the transmission and rendering aspects.
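
To make the idea of selective retransmission under real-time constraints more concrete, here is a minimal sketch of one common approach. It is not the RMI implementation. It assumes that packets carry consecutive sequence numbers, that each packet has a known playout time, and that the receiver has an estimate of the round-trip time. A lost packet is requested again only if the retransmitted copy could still arrive before it is needed. The class, its fields, and the timing model are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Receiver:
    """Hypothetical receiver that requests only retransmissions that can arrive in time."""
    rtt_s: float              # estimated round-trip time to the server (assumed known)
    packet_interval_s: float  # media time covered by one packet
    playout_delay_s: float    # buffering delay before packet 0 is rendered
    highest_seq: int = -1
    received: Set[int] = field(default_factory=set)

    def playout_time(self, seq: int) -> float:
        """Time at which packet `seq` is due to be rendered."""
        return self.playout_delay_s + seq * self.packet_interval_s

    def on_packet(self, seq: int, now_s: float) -> List[int]:
        """Record an arrival and return the sequence numbers worth re-requesting."""
        self.received.add(seq)
        requests = []
        # Any sequence number skipped between the previous maximum and this
        # packet is treated as lost.
        for missing in range(self.highest_seq + 1, seq):
            if missing in self.received:
                continue
            # Request it only if a retransmitted copy can beat the playout deadline.
            if now_s + self.rtt_s < self.playout_time(missing):
                requests.append(missing)
        self.highest_seq = max(self.highest_seq, seq)
        return requests


if __name__ == "__main__":
    rx = Receiver(rtt_s=0.05, packet_interval_s=0.01, playout_delay_s=0.1)
    rx.on_packet(0, now_s=0.0)
    print(rx.on_packet(3, now_s=0.03))  # packets 1 and 2 are missing
```

With a short round-trip time the missing packets are requested again. With a round-trip time longer than the remaining buffer the receiver skips them, which preserves timely playback at the cost of a small quality loss.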