This document defines how a stream of media can be captured from a DOM element, such as a <video>, <audio>, or <canvas> element, in the form of a MediaStream [[!GETUSERMEDIA]].
This document is not complete. It is subject to major changes and, while early experimentation is encouraged, it is therefore not intended for implementation.
This document describes an extension to both HTML media elements and the HTML canvas element that enables the capture of the output of the element in the form of streaming media.
The captured media is formed into a MediaStream [[GETUSERMEDIA]], which can then be consumed by the various APIs that process streams of media, such as WebRTC [[WEBRTC]] or Web Audio [[WEBAUDIO]].
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [[!WEBIDL]], as this specification uses that specification and terminology.
Method captureStream() is defined on HTML [[!HTML5]] media elements.
Both MediaStream and HTMLMediaElement expose the concept of a track. Since there is no common type used for HTMLMediaElement, this document uses the term track to refer to either VideoTrack or AudioTrack. MediaStreamTrack is used to identify the media in a MediaStream.
partial interface HTMLMediaElement {
  MediaStream captureStream();
};
captureStream
The captureStream() method produces a real-time capture of the media that is rendered to the media element.

The captured MediaStream comprises MediaStreamTracks that render the content from the set of selected (for VideoTracks, or other exclusively selected track types) or enabled (for AudioTracks, or other track types that support multiple selections) tracks from the media element. If the media element does not have any selected or enabled tracks of a given type, then no MediaStreamTrack of that type is present in the captured stream.
A <video> element can therefore capture a video MediaStreamTrack and any number of audio MediaStreamTracks. An <audio> element can capture any number of audio MediaStreamTracks. In both cases, the set of captured MediaStreamTracks could be empty.
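The following non-normative sketch shows one way a page might use this: the media rendered by one <video> element is captured and attached to a second element. The element IDs are illustrative.

  const sourceVideo = document.getElementById('source');   // a <video> element
  const preview = document.getElementById('preview');      // a second <video> element

  // Capture the media currently rendered by the source element.
  const stream = sourceVideo.captureStream();

  // The captured MediaStream can be consumed like any other stream,
  // for example by another media element or an RTCPeerConnection.
  preview.srcObject = stream;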
Unless and until there is a track of a given type that is selected or enabled, no MediaStreamTrack of that type is present in the captured stream. In particular, if the media element does not have a source assigned, then the captured MediaStream has no tracks. Consequently, a media element with a ready state of HAVE_NOTHING produces no captured MediaStreamTrack instances. Once metadata is available and the selected or enabled tracks are determined, new captured MediaStreamTrack instances are created and added to the MediaStream.
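A non-normative sketch of how an application could observe tracks becoming available on a stream captured before any source is assigned; the element ID and media URL are illustrative.

  const video = document.getElementById('source'); // illustrative element id
  const stream = video.captureStream();            // may initially contain no tracks

  // Tracks appear once metadata is available and tracks are selected or enabled.
  stream.addEventListener('addtrack', (event) => {
    console.log(`captured ${event.track.kind} track:`, event.track.id);
  });

  video.src = 'movie.webm'; // illustrative source; assigning it eventually populates the stream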
A captured MediaStreamTrack ends when playback ends (and the ended event fires) or when the track that it captures is no longer selected or enabled for playback. A track is no longer selected or enabled if the source is changed by setting the src or srcObject attributes of the media element.
The set of captured MediaStreamTracks changes if the source of the media element changes, such as when the source for the media element ends or a different source is selected.
If the selected VideoTrack or enabled AudioTracks for the media element change, an addtrack event with a new MediaStreamTrack is generated for each track that was not previously selected or enabled, and a removetrack event is generated for each track that ceases to be selected or enabled. A MediaStreamTrack MUST end prior to being removed from the MediaStream.
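A non-normative sketch of following these track changes on the captured stream; the logging is purely illustrative.

  const stream = document.querySelector('video').captureStream();

  // A track that becomes selected or enabled on the media element surfaces
  // as a new MediaStreamTrack on the captured stream.
  stream.addEventListener('addtrack', ({ track }) => {
    console.log('track added:', track.kind);
  });

  // A track that ceases to be selected or enabled ends and is then removed.
  stream.addEventListener('removetrack', ({ track }) => {
    console.log('track removed:', track.kind, track.readyState); // "ended"
  });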
Since a MediaStreamTrack can only end once, a track that is enabled, disabled, and re-enabled will be captured as two separate tracks. Similarly, restarting playback after playback ends causes a new set of captured MediaStreamTrack instances to be created. Seeking during playback without changing track selection does not generate events or cause a captured MediaStreamTrack to end.
The MediaStreamTracks that comprise the captured MediaStream become muted or unmuted as the tracks they capture change state. At any time, a media element might not have active content available for capture on a given track for a variety of reasons; for example, a MediaStreamTrack that is acting as a source could be muted or disabled.
Absence of content is reflected in captured tracks through the muted attribute. A captured MediaStreamTrack MUST have a muted attribute set to true if its corresponding source track does not have available and accessible content. A mute event is raised on the MediaStreamTrack when content availability changes.
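A non-normative sketch of observing content availability on a captured track via the standard mute and unmute events on MediaStreamTrack, assuming the element already has a selected video track.

  const stream = document.querySelector('video').captureStream();
  const [videoTrack] = stream.getVideoTracks();

  videoTrack.addEventListener('mute', () => {
    // The corresponding source track has no available and accessible content.
    console.log('capture muted');
  });

  videoTrack.addEventListener('unmute', () => {
    console.log('capture resumed');
  });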
What output a muted capture produces will vary based on the type of media: a VideoTrack ceases to capture new frames when muted, causing the captured stream to show the last captured frame; a muted AudioTrack produces silence.
Whether a media element is actively rendering content (e.g., to a screen or audio device) has no effect on the content of captured streams. Muting the audio on a media element does not cause the capture to produce silence, nor does hiding a media element cause captured video to stop.
Captured audio from an element with an effective playback rate other than 1.0 MUST be time-stretched. An unplayable playback rate causes the captured audio track to become muted.
The captureStream() method is added to the HTML [[!HTML5]] canvas element. The resulting CanvasCaptureMediaStreamTrack provides methods that allow for controlling when frames are sampled from the canvas.
partial interface HTMLCanvasElement {
  MediaStream captureStream(optional double frameRate);
};
captureStream
The captureStream() method produces a real-time video capture of the surface of the canvas. The resulting media stream has a single video CanvasCaptureMediaStreamTrack that matches the dimensions of the canvas element.
Content from a canvas that is not origin-clean MUST NOT be captured. This method throws a SecurityError exception if the canvas is not origin-clean.
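A non-normative sketch combining both behaviours: capturing a canvas at a requested frame rate and handling the SecurityError thrown when the canvas is not origin-clean. The frame rate value is illustrative.

  const canvas = document.querySelector('canvas');

  let stream;
  try {
    // Request that frames be captured roughly 25 times per second.
    stream = canvas.captureStream(25);
  } catch (e) {
    if (e.name === 'SecurityError') {
      // The canvas has been tainted by cross-origin content and cannot be captured.
      console.error('canvas is not origin-clean');
    } else {
      throw e;
    }
  }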
A captured stream MUST immediately cease to capture content if the origin-clean flag of the source canvas becomes false after the stream is created by captureStream(). The captured MediaStreamTrack MUST become muted, producing no new content while the canvas remains in this state.
Each track that captures a canvas has an internal frameCaptureRequested property that is set to true when a new frame is requested from the canvas. The value of the frameCaptureRequested property on all new tracks is set to true when the track is created.

On creation of the captured track with a specific, non-zero frameRate, the user agent starts a periodic timer at an interval of 1/frameRate seconds. At each activation of the timer, the frameCaptureRequested property is set to true.
In order to support manual control of frame capture with the requestFrame() method, browsers MUST support a value of 0 for frameRate. However, a captured stream MUST request capture of a frame when created, even if frameRate is zero.
This method throws a NotSupportedError if frameRate is negative.
A new frame is requested from the canvas when frameCaptureRequested is true and the canvas is painted. Each time that the captured canvas is painted, the following steps are executed:

1. If the frameCaptureRequested internal property of track is set, add a new frame to track containing what was painted to the canvas.
2. If a frameRate value was specified, set the frameCaptureRequested internal property of track to false.
This algorithm results in a captured track not starting until something changes in the canvas.
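A non-normative sketch of a typical pattern that accounts for this: the page repaints the canvas from an animation loop, so frames are captured at the timer-driven rate. The drawing itself and the frame rate are illustrative.

  const canvas = document.querySelector('canvas');
  const ctx = canvas.getContext('2d');
  const stream = canvas.captureStream(30); // illustrative frame rate

  function draw(now) {
    // Each paint of the canvas lets the user agent capture a frame
    // whenever frameCaptureRequested has been set by the timer.
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.fillText(`t = ${now.toFixed(0)} ms`, 10, 20);
    requestAnimationFrame(draw);
  }
  requestAnimationFrame(draw);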
Parameter | Type | Nullable | Optional | Description
--- | --- | --- | --- | ---
frameRate | double | ✘ | ✔ |

Return type: MediaStream
CanvasCaptureMediaStreamTrack
The CanvasCaptureMediaStreamTrack is an extension of MediaStreamTrack that provides a single requestFrame() method. Applications that depend on tight control over the rendering of content to the media stream can use this method to control when frames from the canvas are captured.
interface CanvasCaptureMediaStreamTrack : MediaStreamTrack {
  readonly attribute HTMLCanvasElement canvas;
  void requestFrame();
};
canvas of type HTMLCanvasElement, readonly
requestFrame
The requestFrame() method allows applications to manually request that a frame from the canvas be captured and rendered into the track. In cases where applications progressively render to a canvas, this allows applications to avoid capturing a partially rendered frame.
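A non-normative sketch of manually driven capture: the stream is created with a frameRate of 0 so that frames are only added when requestFrame() is called, after the canvas has been fully rendered. The drawing steps are illustrative.

  const canvas = document.querySelector('canvas');
  const ctx = canvas.getContext('2d');

  // With a frameRate of 0 there is no periodic timer; the application
  // decides when a complete frame is ready.
  const stream = canvas.captureStream(0);
  const [track] = stream.getVideoTracks(); // a CanvasCaptureMediaStreamTrack

  function renderScene() {
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.fillRect(10, 10, 100, 100); // ...further progressive rendering steps...
    // Capture only after the scene is completely drawn, so that a
    // partially rendered frame is never added to the track.
    track.requestFrame();
  }
  renderScene();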
As currently specified, this results in no SecurityError or other error feedback if the canvas is not origin-clean. In part, this is because we don't track where requests for frames come from. Do we want to highlight that?
Return type: void
Media elements can render media resources from origins that differ from the origin of the media element. In those cases, the contents of the resulting MediaStreamTrack MUST be protected from access by the document origin.

How this protection manifests will differ, depending on how the content is accessed. For instance, rendering inaccessible video to a canvas element [[2DCONTEXT]] causes the origin-clean property of the canvas to become false; attempting to create a Web Audio MediaStreamAudioSourceNode [[WEBAUDIO]] succeeds, but produces no information to the document origin (that is, only silence is transmitted into the audio context); attempting to transfer the media using WebRTC [[WEBRTC]] results in no information being transmitted.
The origin of the media that is rendered by a media element can change at any time. This is even the case for a single media resource. User agents MUST ensure that a change in the origin of media doesn't result in exposure of cross origin content.
This section will be removed before publication.
This document is based on the stream processing specification [[streamproc]] originally developed by Robert O'Callahan.