Chapter 17. MPEG-4 BINARY FORMAT FOR SCE... > Fundamental Differences with VRML97

Fundamental Differences with VRML97

In the second version of the MPEG-4 standard, BIFS supports all VRML97 nodes. BIFS also extends VRML97 nodes and capabilities with:

2D composition. 2D primitives can be composed in a 2D hierarchical scene graph. The same operations available for 3D nodes apply to 2D ones: translation, scaling, and rotation. In addition, MPEG-4 provides specific scrolling and automatic layout capabilities for 2D primitives, similar to those of a 2D windowing environment such as Microsoft Windows or X/Motif.

2D objects. To ease integration of 2D graphics primitives with 3D, the design of 2D primitives follows that of the 3D ones. For instance, the 2D Circle primitive corresponds to the 3D Sphere, and the 2D Rectangle corresponds to the 3D Box.
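The parallel can be seen in BIFS textual form. The following sketch pairs a 2D Circle (using the BIFS Material2D node) with its 3D Sphere counterpart; the field values are illustrative, not taken from the standard:

```vrml
# 2D shape: a filled red circle (2D coordinates, pixel-like units assumed)
Shape {
  appearance Appearance {
    material Material2D { emissiveColor 1 0 0  filled TRUE }
  }
  geometry Circle { radius 50 }
}

# The corresponding 3D construct, identical in structure (VRML97 syntax)
Shape {
  appearance Appearance {
    material Material { diffuseColor 1 0 0 }
  }
  geometry Sphere { radius 1 }
}
```

Because the node structure is the same, a renderer and a content author can treat 2D and 3D shapes uniformly.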

2D/3D scene composition. In addition to the 2D nodes, MPEG-4 defines how to mix 2D and 3D primitives. This is a major difference from VRML97 as well as other 3D graphic formats. Usually, 2D/3D composition is done by restricting 3D primitives to 2D environments or by interfacing a 3D engine with a 2D engine. Both approaches are inefficient in terms of rendering and often complex. MPEG-4 defines three ways to address this issue:

  • 2D primitives can be directly included in the 3D space by drawing them in a local x,y plane.

  • Transparent regions can be composed on top of one another as layers. MPEG-4 defines Layer2D and Layer3D nodes to render 2D or 3D scenes, respectively. Each layer can be viewed as a rendering of a subscene with its own viewpoint. This allows MPEG-4 to show different views of the same scene, add a 3D logo on a 2D video, or use a 2D interface, such as a menu, over a 3D scene.

  • Scenes can be rendered as textures and mapped onto any object with the CompositeTexture2D and CompositeTexture3D nodes. The first renders 2D scenes as textures; the second renders 3D scenes.
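The last two mechanisms can be sketched in BIFS textual form. The fragment below is a hedged illustration (node names are from the standard, but field values and scene content are invented for the example): a Layer2D holding a flat interface element composed with a Layer3D that has its own viewpoint, followed by a 2D subscene rendered as a texture on a 3D box.

```vrml
OrderedGroup {
  children [
    # 2D layer: a rectangle acting as a flat interface element
    Layer2D {
      size 1 1
      children [
        Shape {
          appearance Appearance { material Material2D { filled TRUE } }
          geometry Rectangle { size 0.5 0.2 }
        }
      ]
    }
    # 3D layer composed on top, rendered with its own viewpoint
    Layer3D {
      size 1 1
      children [
        Viewpoint { position 0 0 5 }
        Shape { geometry Box { size 1 1 1 } }
      ]
    }
  ]
}

# A 2D subscene used as a texture on a 3D object
Shape {
  appearance Appearance {
    texture CompositeTexture2D {
      children [
        Shape { geometry Circle { radius 0.3 } }
      ]
    }
  }
  geometry Box { size 1 1 1 }
}
```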

Facial and body animation nodes. Animating a face is a complex task that involves nonrigid deformations. This requires special libraries to render the displacements of vertices on the mesh. MPEG-4 defines facial nodes with a default face model (though the mesh itself is not standardized) or one that can be customized by a content creator. Current virtual worlds typically use a limited set of predefined models; customization allows developers to use their own. The body nodes define displacements of joints in a skeleton. In VRML, the Web3D Consortium's Humanoid Animation (H-Anim) Working Group has already defined specific nodes and scripts to constrain and control the displacements of a humanoid (see http://www.web3d.org/WorkingGroups). These nodes are included in MPEG-4, so animation scripts do not have to be included with the scene as they are with VRML, making humanoid animation more efficient in MPEG-4.

Extended sound composition. In VRML, the sound model supports attaching sounds to object geometry and the user's point of view. MPEG-4 enhances this model by attaching physical properties to material in the scene, by defining environmental sound-rendering parameters, and by composing sound based on physical and perceptual modeling. MPEG-4 defines an audio scene graph, in which sound composition is accomplished by applying signal-processing transformations to the input audio streams, which in turn produces the audio output stream. This approach allows simple yet flexible mixing control of sound sources. Advanced features allow the definition of "instruments" (real instruments, such as horns or drums, as well as conceptual instruments, such as rushing water or blowing wind), the definition of synthetic music, and special effects on natural audio streams.
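A minimal audio scene graph might mix two decoded streams into one output. The sketch below uses the BIFS AudioSource and AudioMix nodes; the URLs, gain values, and mix-matrix layout are assumptions for illustration, not values from the standard:

```vrml
# Two audio streams mixed down to a single output channel.
# The matrix field gives per-input gains (layout assumed here).
Sound2D {
  source AudioMix {
    numInputs 2
    matrix [ 0.7 0.3 ]          # 70% speech, 30% background music
    children [
      AudioSource { url [ "speech.aac" ] }   # hypothetical stream names
      AudioSource { url [ "music.aac" ] }
    ]
  }
}
```

The key point is that mixing is expressed as a node in the scene graph, so the terminal performs the signal processing as part of composition rather than requiring pre-mixed content.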

Scalability and scene control. When dealing with rich multimedia content, a challenging issue is ensuring that the content will be shown as the author intended on every client platform. The diversity of client platforms makes it difficult to predict how the content will behave. MPEG-4 defines mechanisms (sometimes called "graceful degradation" mechanisms) that allow content to be scaled down at the client terminal in order to adapt it to specific environment parameters:

  • The TermCap node contains a list of system capabilities such as frame rate, memory, or CPU load. At any time during scene-graph traversal, the current capabilities of the system can be estimated and the scene graph adapted accordingly. A typical example is to use different sets of models according to the frame rate.
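The frame-rate example might be sketched as follows. This is an illustration only: the capability code, the quantized levels, and the direct route from TermCap to a Switch are simplifying assumptions, not the standard's prescribed usage:

```vrml
# Pick a coarse or fine model based on the terminal's reported
# frame-rate level (capability code 5 is assumed, not normative).
DEF TC TermCap { capability 5 }

DEF LOD_SWITCH Switch {
  whichChoice 0
  choice [
    Shape { geometry Box { size 1 1 1 } }      # level 0: coarse stand-in
    Shape { geometry Sphere { radius 0.5 } }   # level 1: finer model
  ]
}

# TermCap's value eventOut emits a quantized capability level; routing it
# straight to whichChoice works only if the levels match the choice indices.
ROUTE TC.value TO LOD_SWITCH.whichChoice
```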

  • Enhancing VRML97's External Authoring Interface (EAI), MPEG-4 defines a Java layer called MPEG-J. An MPEG-J application can access system resources and act on the network, decoder, and composition interfaces.


