
Working with SMIL



The Synchronized Multimedia Integration Language (SMIL) is a recommendation from the World Wide Web Consortium (W3C) intended to make it easy to implement sophisticated time-based multimedia content on the Web. SMIL is an XML application, currently at version 1.0.

SMIL is currently supported for Windows, Unix and Macintosh in GRiNS from the Centrum voor Wiskunde en Informatica (CWI) and for Unix and Java in HPAS via the W3C. SMIL is also partially supported for Windows in the G2 Player beta at Real Networks, with the promise of full implementation in the final release. For this tutorial, I recommend using the GRiNS player.

Because of the nature of multimedia, you will need a high-speed connection to play the demos over the network; otherwise, download the example files and media and run them locally.

Five easy pieces

Just how easy is easy? We’ll begin with a simple slide show of my trip to the Great Wall of China, which takes just 16 lines.

The first three lines tell the application that this document is an XML 1.0 application and give a URI on the W3C’s server where the application can find the Document Type Definition (DTD), if necessary. The remainder of the document looks quite similar to HTML.
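A slide show of this kind might look like the following minimal sketch. The image file names, the title, and the five-second durations are assumptions made for illustration; the XML declaration and the SMIL 1.0 document type declaration occupy the first three lines.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE smil PUBLIC "-//W3C//DTD SMIL 1.0//EN"
  "http://www.w3.org/TR/REC-smil/SMIL10.dtd">
<smil>
  <head>
    <!-- Hypothetical title, for illustration only -->
    <meta name="title" content="A trip to the Great Wall"/>
  </head>
  <body>
    <!-- Each image is shown for five seconds, one after another -->
    <seq>
      <img src="wall1.jpg" dur="5s"/>
      <img src="wall2.jpg" dur="5s"/>
      <img src="wall3.jpg" dur="5s"/>
    </seq>
  </body>
</smil>
```

With no layout section, placement of the images is left to the player's defaults.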

This slide show is fairly boring, so let’s add some visual interest and a bit of layout.

This .smil document is not much larger than the first, but it contains almost all the elements you will ever need to use.

The <head> of a .smil document contains the non-time-based information about the document: the title, meta information for search engines, and all the layout parameters for the presentation. SMIL allows you to set the overall size of the display area, the position of each region relative to the left and top edges, the size of each region, and its layer, specified by the z-index. In the example above, the “background” region is given an explicit z-index value of 0, so when an image is placed in the “image 1” region, which has a z-index value of 1, it appears on top of the background image.
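A <head> along these lines could be sketched as follows. All pixel values, the title, and the background color are assumptions, and the second region id is written image1 because XML ID values cannot contain spaces.

```xml
<head>
  <meta name="title" content="A trip to the Great Wall"/>
  <layout>
    <!-- Overall size of the display area (assumed values) -->
    <root-layout width="400" height="300" background-color="black"/>
    <!-- Fills the display area and sits at the bottom of the stack -->
    <region id="background" left="0" top="0" width="400" height="300" z-index="0"/>
    <!-- Higher z-index, so media shown here is drawn over the background -->
    <region id="image1" left="40" top="30" width="320" height="240" z-index="1"/>
  </layout>
</head>
```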

The <body> of the document contains the instructions for the time-based elements and linking behavior of the document. This is where the two key elements of a SMIL document, the <seq> and <par> tags, are found. Media enclosed in a <seq> tag are presented in sequence. Media enclosed in a <par> tag are played simultaneously, or in parallel. It’s as easy as that. <seq> and <par> nodes can be nested to allow for complex, interrelated behavior.
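As a rough sketch of nesting, the <par> below shows a backdrop for the whole presentation while a <seq> steps through three slides on top of it. The file names, durations, and region ids (background and image1) are hypothetical and assume matching regions in the layout.

```xml
<body>
  <par>
    <!-- Plays in parallel with the sequence below -->
    <img src="backdrop.jpg" region="background" dur="15s"/>
    <!-- Slides play one after another inside the parallel group -->
    <seq>
      <img src="wall1.jpg" region="image1" dur="5s"/>
      <img src="wall2.jpg" region="image1" dur="5s"/>
      <img src="wall3.jpg" region="image1" dur="5s"/>
    </seq>
  </par>
</body>
```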

Media types and their temporal behavior are described within the nodes. For example, an <img> element with src set to “fortress.jpg”, region set to “image 1”, and both begin and dur set to five seconds means that the image contained in the file “fortress.jpg” will be displayed in the “image 1” region five seconds after that sequence begins and will disappear five seconds later.
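An element with that behavior might be written as in this sketch. The region id is spelled image1 here as an assumption, since XML ID values cannot contain spaces. For the first child of a <seq>, the begin delay is measured from the start of the sequence; for later children it is measured from the end of the previous element.

```xml
<seq>
  <!-- Appears five seconds after the sequence starts, then stays for five seconds -->
  <img src="fortress.jpg" region="image1" begin="5s" dur="5s"/>
</seq>
```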



The next example demonstrates more complex, relational behavior by incorporating music and timed text.

By placing an <audio> reference inside the first <par> node, along with the instructions for the background image, we create “audio wallpaper” for the presentation. Note that although it is not a visual element, the audio file still requires a region to be declared in the layout node.
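A sketch of the idea, with hypothetical file names and region ids. Following the note above, the audio element names a region called music, which is assumed to be declared in the layout as an empty region.

```xml
<par>
  <!-- Background image and music start together and run under the slides -->
  <img src="backdrop.jpg" region="background" dur="30s"/>
  <audio src="music.aiff" region="music"/>
  <seq>
    <!-- The slide sequence goes here -->
    <img src="wall1.jpg" region="image1" dur="10s"/>
    <img src="wall2.jpg" region="image1" dur="10s"/>
  </seq>
</par>
```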

This presentation also incorporates text items related to specific images. This is achieved by creating a sequence of <par> nodes, each containing an image element and a text element. For example:
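One such <par> node could be sketched as below. The caption text, the caption region, and the durations are hypothetical. The text is carried inside the document through a data: URI in the src attribute; whether a given player accepts data: URIs is an assumption.

```xml
<par>
  <img src="fortress.jpg" region="image1" dur="5s"/>
  <!-- Caption text embedded in the document itself; spaces are percent-encoded -->
  <text src="data:,A%20fortress%20on%20the%20Great%20Wall" region="caption" dur="5s"/>
</par>
```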



Note that the source of the text element is contained within the document itself. This could just as easily be done by referencing a text document via a URI instead. For example:
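A sketch using a relative URI, assuming a hypothetical file fortress.txt stored next to the .smil document and a hypothetical caption region:

```xml
<!-- The caption is read from a separate plain-text file -->
<text src="fortress.txt" region="caption" dur="5s"/>
```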



Or a remote document:
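A sketch using an absolute URI; the host and path are placeholders, not a real resource:

```xml
<!-- The caption is fetched over the network from a hypothetical server -->
<text src="http://www.example.com/captions/fortress.txt" region="caption" dur="5s"/>
```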



The next presentation incorporates an audio narration keyed to specific slides. It demonstrates the use of “clip-begin.”
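A narrated slide might be sketched like this. The file names, region ids, and the clip-end value are assumptions; clip-begin="npt=10s" asks the player to start the audio ten seconds into the file.

```xml
<par>
  <img src="fortress.jpg" region="image1" dur="10s"/>
  <!-- Play the stretch of narration.aiff between 10 and 20 seconds -->
  <audio src="narration.aiff" region="music"
         clip-begin="npt=10s" clip-end="npt=20s"/>
</par>
```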


This can also be stated in SMPTE format, which in this case would look like this:
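Assuming the ten-second offset this article uses and SMIL 1.0's smpte=hours:minutes:seconds:frames form, the attribute might read as follows. The file name and region id are hypothetical.

```xml
<!-- Ten seconds and zero frames from the start of the file -->
<audio src="narration.aiff" region="music" clip-begin="smpte=00:00:10:00"/>
```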


The use of SMPTE makes it easy to port an Edit Decision List (EDL) created in a video-editing environment such as Avid or Premiere to SMIL.

The narration consists of a single audio file containing five separate segments. In order to synchronize the appropriate narration with a specific image, the SMIL application is instructed to begin playing the audio file a specified interval from its beginning. In this case, the value is “npt=10s”, which means ten seconds from the beginning in normal playing time.

This is one of the most powerful features of SMIL, because it allows the reuse of a single sequential media file in a number of distributed presentations. A news site presenting a press conference could offer the entire conference or just edited highlights using the same media source.
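As a sketch of that kind of reuse, a highlights presentation could play two excerpts from one recording without editing the media file. The file name, region id, and clip times are hypothetical.

```xml
<seq>
  <!-- First highlight: from 2:00 to 2:30 of the full recording -->
  <video src="conference.rm" region="video"
         clip-begin="npt=120s" clip-end="npt=150s"/>
  <!-- Second highlight: from 10:00 to 10:45 of the same file -->
  <video src="conference.rm" region="video"
         clip-begin="npt=600s" clip-end="npt=645s"/>
</seq>
```

A presentation of the entire conference would reference the same file with no clip attributes.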

The final example takes the previous self-running presentation and turns it into a user-guided slide show that could be used in a presentation. Keep in mind that all of these presentations have made use of the same media files. Unlike formats such as Shockwave and ASF that “containerize” their media, SMIL allows presentations to be modified “on the fly.”

This presentation makes use of SMIL’s hyperlinking feature.





As you can see, the syntax is almost identical to HTML. In this document, all links reference nodes within the same document, but they can also link to distributed .smil documents. A SMIL browser could display an HTML page with the proper plug-in, or an HTML browser could use a plug-in to display an embedded SMIL document.
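A user-guided slide show along these lines could be sketched as follows. The ids, file names, and the button region are assumptions. Each slide stays up indefinitely until the viewer follows a link to another node in the same document.

```xml
<seq>
  <par id="slide1">
    <img src="wall1.jpg" region="image1" dur="indefinite"/>
    <!-- Clicking the button image jumps to the node with id="slide2" -->
    <a href="#slide2">
      <img src="next.gif" region="button" dur="indefinite"/>
    </a>
  </par>
  <par id="slide2">
    <img src="wall2.jpg" region="image1" dur="indefinite"/>
    <!-- Link back to the first slide -->
    <a href="#slide1">
      <img src="back.gif" region="button" dur="indefinite"/>
    </a>
  </par>
</seq>
```

Pointing href at another .smil file instead of a fragment such as #slide2 would link to a separate presentation.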

This just scratches the surface of what SMIL can do. SMIL also offers support for bandwidth management, alternate layout and content, and extensibility. These will be covered in future articles. A good source of further information about SMIL is the justsmil.com Web site, as well as the W3C itself.

John Maxwell Hobbs is a musician and has been working with computer multimedia for over fifteen years. He is currently in charge of multimedia development at Ericsson CyberLab New York. His interactive composition “Web Phases” was recently one of the winners of ASCI’s Digital ’98 competition and is currently on exhibit at the New York Hall of Science. He is also on the board of directors of Vanguard Visions, an organization dedicated to fostering the work of artists experimenting with technology. He is the former producing director for The Kitchen in New York City.
