Here’s a scenario. You start a banging Kendrick Lamar track in one of your many open browser tabs. You’re loving it, but someone walks into your space and you need to pause it. Which tab is it? Browsers try to help with that a little bit. You can probably mute the entire system audio. But wouldn’t it be nice to actually have control over the audio playback without necessarily needing to find your way back to that tab?
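The Media Session API makes this possible. It gives the user access to media playback controls outside the browser tab where the media is playing. If implemented, the controls are available in various places on the device, including the notification area on mobile devices, paired wearables such as smart watches, and the media hub in desktop browsers such as Chrome.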
In addition, the Media Session API allows us to control media playback with media keys and voice assistants like Siri, Google Assistant, Bixby, or Alexa.
The Media Session API mainly consists of two interfaces: MediaMetadata and MediaSession.
The MediaMetadata interface provides data about the playing media. It is responsible for letting us know the media’s title, album, artwork and artist (which is Kendrick Lamar in this example). The MediaSession interface is responsible for the media playback functionality.
Before we take a deep dive into the topic, a note on feature detection. It is good practice to check whether a browser supports a feature before using it. To check whether a browser supports the Media Session API, we test in our JavaScript file whether the mediaSession property exists on the navigator object.
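A minimal sketch of that check follows. The helper name supportsMediaSession is hypothetical (it is not part of the API), and the extra typeof guard is only there so the snippet also runs in runtimes that have no navigator object.

```javascript
// Returns true when the given navigator-like object exposes the Media Session API.
// `supportsMediaSession` is a hypothetical helper name, not part of the API itself.
function supportsMediaSession(nav) {
  return Boolean(nav) && 'mediaSession' in nav;
}

// `navigator` is not defined in every JavaScript runtime, so guard the lookup.
const currentNavigator = typeof navigator !== 'undefined' ? navigator : undefined;

if (supportsMediaSession(currentNavigator)) {
  // From here on it is safe to use navigator.mediaSession.
  console.log('The Media Session API is supported.');
}
```

In a page script, the shorter form 'mediaSession' in navigator does the same job.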
The MediaMetadata() constructor creates a new MediaMetadata object. After creating it, we can set the following properties: title, artist, album, and artwork.
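The value of the artwork property of the MediaMetadata object is an array of MediaImage objects. A MediaImage object contains details describing an image associated with the media. Each object has the following three properties: src (the URL of the image), sizes (the dimensions of the image, such as 96x96), and type (the MIME type of the image).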
Let’s create a MediaMetadata object for Kendrick Lamar’s “Alright” off his To Pimp a Butterfly album.
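Here is one way that object could look. The title, artist, and album come from the track itself; the artwork URLs and sizes are placeholders I made up for illustration, and the constant name alrightInit is hypothetical.

```javascript
// Plain init dictionary for the track.
// The artwork URLs below are placeholders, not real image locations.
const alrightInit = {
  title: 'Alright',
  artist: 'Kendrick Lamar',
  album: 'To Pimp a Butterfly',
  artwork: [
    { src: 'https://example.com/artwork/alright-96.png', sizes: '96x96', type: 'image/png' },
    { src: 'https://example.com/artwork/alright-512.png', sizes: '512x512', type: 'image/png' },
  ],
};

// Only touch the API where it exists (it is missing outside supporting browsers).
if (
  typeof navigator !== 'undefined' &&
  'mediaSession' in navigator &&
  typeof MediaMetadata !== 'undefined'
) {
  navigator.mediaSession.metadata = new MediaMetadata(alrightInit);
}
```

Providing the artwork in more than one size lets the device pick whichever image suits the place where it shows the controls.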
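As stated earlier, the MediaSession interface is what lets the user control the playback of the media. We can perform the following actions on the playing media through this interface: play, pause, previoustrack, nexttrack, seekbackward, seekforward, seekto, stop, and skipad.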
The MediaSessionAction enumerated type makes these actions available as strings. To support any of these actions, we have to use the MediaSession’s setActionHandler() method to define a handler for that action. The method takes the action and a callback that is called when the user invokes the action. Let us take a not-too-deep dive to understand it better.
To set handlers for the play and pause actions, we call setActionHandler() once for each action in our JavaScript file.
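A minimal sketch, assuming the page has a single audio element. The function name registerPlayPause is hypothetical; session stands for navigator.mediaSession and audio for the audio element.

```javascript
// Registers play and pause handlers on a media session.
// `session` is expected to be navigator.mediaSession and `audio` an <audio> element;
// `registerPlayPause` is a hypothetical helper name.
function registerPlayPause(session, audio) {
  session.setActionHandler('play', () => audio.play());
  session.setActionHandler('pause', () => audio.pause());
}
```

In the page, this would be called as registerPlayPause(navigator.mediaSession, document.querySelector('audio')).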
Here we set the track to play when the user plays it and pause when the user pauses it through the media interface.
For the previoustrack and nexttrack actions, we register handlers in the same way.
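The sketch below assumes the tracks live in a simple array of URLs, which is my own choice for illustration. The helper name registerTrackSkipping and the wrap-around behavior at either end of the playlist are assumptions too.

```javascript
// Registers previoustrack and nexttrack handlers for a playlist of URLs.
// `registerTrackSkipping` and the array-of-URLs playlist shape are hypothetical.
function registerTrackSkipping(session, audio, playlist) {
  let index = 0;

  // Loads the track at position i, wrapping around at either end of the playlist.
  const playAt = (i) => {
    index = (i + playlist.length) % playlist.length;
    audio.src = playlist[index];
    audio.play();
  };

  session.setActionHandler('previoustrack', () => playAt(index - 1));
  session.setActionHandler('nexttrack', () => playAt(index + 1));
}
```

A real player would usually also update navigator.mediaSession.metadata whenever the track changes, so the title and artwork shown to the user stay in sync.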
This might not be completely self-explanatory if you are not much of a Kendrick Lamar fan, but hopefully you get the gist. When the user wants to play the previous track, we set the previous track to play. When they want the next track, we play the next track.
The seekbackward and seekforward actions are implemented with handlers as well.
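A sketch of both handlers, assuming a 10-second fallback when the user agent does not supply an offset. The names registerSeekButtons and DEFAULT_SKIP_SECONDS are hypothetical.

```javascript
// Fallback skip length, in seconds, for when no seekOffset is provided.
const DEFAULT_SKIP_SECONDS = 10;

// Registers seekbackward and seekforward handlers.
// `registerSeekButtons` is a hypothetical helper name.
function registerSeekButtons(session, audio) {
  session.setActionHandler('seekbackward', (details) => {
    const skip = details.seekOffset || DEFAULT_SKIP_SECONDS;
    // Never seek before the start of the track.
    audio.currentTime = Math.max(audio.currentTime - skip, 0);
  });

  session.setActionHandler('seekforward', (details) => {
    const skip = details.seekOffset || DEFAULT_SKIP_SECONDS;
    const target = audio.currentTime + skip;
    // Clamp to the end of the track when the duration is known.
    audio.currentTime = Number.isFinite(audio.duration)
      ? Math.min(target, audio.duration)
      : target;
  });
}
```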
Given that I don’t consider any of this self-explanatory, I would like to give a concise explanation of the seekbackward and seekforward actions. The handlers for both actions are fired, as their names imply, when the user wants to seek backward or forward by a few seconds. The MediaSessionActionDetails dictionary provides that number of seconds in its seekOffset property. However, the seekOffset property is not always present because not all user agents act the same way. When it is not present, we should seek backward or forward by a number of seconds that makes sense to us. Hence, we use 10 seconds because it is quite a few. In a nutshell, we seek by seekOffset seconds if it is provided. If it is not provided, we seek by 10 seconds.
To add the seekto functionality to our media session, we register one more handler.
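A minimal sketch of a seekto handler. The helper name registerSeekTo is hypothetical, and the check for audio.fastSeek is there because not every browser implements that method on media elements.

```javascript
// Registers a seekto handler.
// `registerSeekTo` is a hypothetical helper name.
function registerSeekTo(session, audio) {
  session.setActionHandler('seekto', (details) => {
    // Prefer the faster, less precise seek when it is requested and available.
    if (details.fastSeek && typeof audio.fastSeek === 'function') {
      audio.fastSeek(details.seekTime);
      return;
    }
    // Otherwise jump straight to the requested time.
    audio.currentTime = details.seekTime;
  });
}
```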
Here, the MediaSessionActionDetails dictionary provides the fastSeek and seekTime properties. seekTime is the time, in seconds, that the track should seek to. fastSeek is an optional boolean hint: when it is true, the seek is one of a rapid series (for example, while the user drags a progress bar), so speed matters more than precision. While fastSeek is optional, the MediaSessionActionDetails dictionary always provides the seekTime property for the seekto action handler. So fundamentally, we call the track’s fastSeek() method with seekTime when the user fast seeks and the method is available, and we simply set the current time to seekTime otherwise.
Although I wouldn’t know why one would want to stop a Kendrick song, it won’t hurt to cover the stop action of the MediaSession interface. Its handler is registered like the others and runs when the user stops playback.
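What the stop handler should do is up to the application. The sketch below assumes that stopping means pausing and rewinding to the start, and registerStop is a hypothetical helper name.

```javascript
// Registers a stop handler.
// Treating "stop" as "pause and rewind" is an assumption, not something the API requires.
function registerStop(session, audio) {
  session.setActionHandler('stop', () => {
    audio.pause();
    audio.currentTime = 0;
  });
}
```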
The user invokes the skipad (as in, “skip ad” rather than “ski pad”) action handler when an advertisement is playing and they want to skip it so they can continue listening to Kendrick Lamar’s “Alright” track. If I’m being honest, the complete details of the skipad action handler are beyond my understanding of the Media Session API. Hence, you should probably look that up on your own after reading this article, if you actually want to implement it.
We should take note of something. Whenever the user plays the track, seeks, or changes the playback rate, we are supposed to update the position state on the interface provided by the Media Session API. We do this with the setPositionState() method of the mediaSession object, passing it the track’s duration, playback rate, and current position.
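A sketch of such an update, assuming the values come straight from an audio element. The helper name updatePositionState and its boolean return value are hypothetical additions for illustration.

```javascript
// Pushes the current playback position to the media session.
// Returns true when the state was updated; `updatePositionState` is a hypothetical helper.
function updatePositionState(session, audio) {
  // setPositionState() is not available in every browser that has mediaSession.
  if (typeof session.setPositionState !== 'function') return false;
  // The duration is unknown (NaN) until the track's metadata has loaded.
  if (!Number.isFinite(audio.duration)) return false;

  session.setPositionState({
    duration: audio.duration,
    playbackRate: audio.playbackRate,
    position: audio.currentTime,
  });
  return true;
}
```

One way to use it is to call it at the end of the play, seek, and track-change handlers, and from a ratechange listener on the audio element.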
In addition, I would like to remind you that not every user’s browser supports every action. Therefore, it is recommended to set each action handler inside a try...catch block, because setActionHandler() throws when the browser does not recognize the action.
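A small wrapper shows the idea. The name safelySetActionHandler and the wording of the warning are hypothetical.

```javascript
// Registers a handler and reports whether the browser accepted the action.
// `safelySetActionHandler` is a hypothetical helper name.
function safelySetActionHandler(session, action, handler) {
  try {
    session.setActionHandler(action, handler);
    return true;
  } catch (error) {
    console.warn(`The media session action "${action}" is not supported yet.`);
    return false;
  }
}
```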
Putting together everything we have done so far, we would have the following:
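The combined sketch below is one possible arrangement, not the only one. The function name setUpMediaSession, the shape of the tracks array (objects with src and metadata), and the createMetadata callback are all assumptions made so the pieces fit in one self-contained function.

```javascript
// Wires the metadata and the action handlers together.
// `setUpMediaSession`, the `tracks` shape, and `createMetadata` are hypothetical.
// Returns the list of actions the browser accepted.
function setUpMediaSession(session, audio, tracks, createMetadata) {
  let index = 0;

  // Loads track i (wrapping around), refreshes the metadata, and starts playback.
  const load = (i) => {
    index = (i + tracks.length) % tracks.length;
    audio.src = tracks[index].src;
    session.metadata = createMetadata(tracks[index].metadata);
    audio.play();
  };

  const handlers = {
    play: () => audio.play(),
    pause: () => audio.pause(),
    previoustrack: () => load(index - 1),
    nexttrack: () => load(index + 1),
    seekbackward: (details) => {
      audio.currentTime = Math.max(audio.currentTime - (details.seekOffset || 10), 0);
    },
    seekforward: (details) => {
      audio.currentTime = audio.currentTime + (details.seekOffset || 10);
    },
    seekto: (details) => { audio.currentTime = details.seekTime; },
    stop: () => { audio.pause(); audio.currentTime = 0; },
  };

  // Not every browser supports every action, so register each one defensively.
  const registered = [];
  for (const [action, handler] of Object.entries(handlers)) {
    try {
      session.setActionHandler(action, handler);
      registered.push(action);
    } catch (error) {
      console.warn(`The media session action "${action}" is not supported yet.`);
    }
  }

  session.metadata = createMetadata(tracks[index].metadata);
  return registered;
}
```

In a browser that passes the feature check, the call could look like setUpMediaSession(navigator.mediaSession, document.querySelector('audio'), tracks, (init) => new MediaMetadata(init)).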
Here’s a demo of the API:
I implemented six of the actions. Feel free to try the rest at your leisure.
If you view the Pen on your mobile device, notice how it appears in your notification area.
If your smart watch is paired to your device, take a sneak peek at it.
If you view the Pen in Chrome on desktop, navigate to the media hub and play with the media buttons there. The demo even has multiple tracks, so you can experiment with moving forward and back through them.
If you made it this far (or not), thanks for reading and please, on the next app you create with media functionality, implement this API.