Global Metadata

[WIP] Global Metadata in WebM

Last updated: 2012-02-21



Background

WebM global metadata is based on the Matroska tag specification. This document describes where the formats differ.

Tag Elements

Element Name | Level | Class-ID | Mand. | Multi. | Range | Default | Element Type | Description
Tags | 1 | [12][54][C3][67] | - | * | - | - | sub-elements | Element containing elements specific to Tracks/Chapters. A list of valid tags is given under Official Tags.
Tag | 2 | [73][73] | * | * | - | - | sub-elements | Element containing elements specific to Tracks/Chapters.
Targets | 3 | [63][C0] | * | - | - | - | sub-elements | Contains all UIDs to which the specified metadata applies. It is left empty when the tags describe everything in the Segment.
TargetTypeValue | 4 | [68][CA] | - | - | - | 50 | u-integer | A number to indicate the logical level of the target (see TargetType).
TargetType | 4 | [63][CA] | - | - | - | - | string | An informational string that can be used to display the logical level of the target, like "ALBUM", "TRACK", "MOVIE", "CHAPTER", etc. (see TargetType).
TrackUID | 4 | [63][C5] | - | * | - | 0 | u-integer | A unique ID to identify the Track(s) the tags belong to. If the value is 0 at this level, the tags apply to all tracks in the Segment. This value SHOULD be 0.
EditionUID | 4 | [63][C9] | - | * | - | 0 | u-integer | A unique ID to identify the EditionEntry(s) the tags belong to. If the value is 0 at this level, the tags apply to all editions in the Segment.
ChapterUID | 4 | [63][C4] | - | * | - | 0 | u-integer | A unique ID to identify the Chapter(s) the tags belong to. If the value is 0 at this level, the tags apply to all chapters in the Segment.
AttachmentUID | 4 | [63][C6] | - | * | - | 0 | u-integer | A unique ID to identify the Attachment(s) the tags belong to. If the value is 0 at this level, the tags apply to all the attachments in the Segment.
SimpleTag | 3+ | [67][C8] | * | * | - | - | sub-elements | Contains general information about the target.
TagName | 4+ | [45][A3] | * | - | - | - | UTF-8 | The name of the Tag that is going to be stored.
TagLanguage | 4+ | [44][7A] | * | - | - | und | string | Specifies the language of the tag, in the Matroska languages form.
TagDefault | 4+ | [44][84] | * | - | 0-1 | 1 | u-integer (1 bit) | Indicates whether this is the default/original language to use for the given tag.
TagString | 4+ | [44][87] | - | - | - | - | UTF-8 | The value of the Tag.
TagBinary | 4+ | [44][85] | - | - | - | - | binary | The value of the Tag if it is binary. Note that this cannot be used in the same SimpleTag as TagString.
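
As a rough illustration of how the elements in the table above nest, the following Python sketch serializes a Tags element holding a single Tag with an empty Targets element and one SimpleTag per name/value pair. It is a minimal sketch, not a reference muxer: the helper names (`encode_size`, `element`, `simple_tag`, `tags_element`) are hypothetical, the element IDs are copied from the table, and the size field uses the standard EBML variable-length integer encoding. TagLanguage and TagDefault are written explicitly with their default values.

```python
# Element IDs taken from the Tag Elements table.
TAGS = bytes.fromhex("1254C367")
TAG = bytes.fromhex("7373")
TARGETS = bytes.fromhex("63C0")
SIMPLE_TAG = bytes.fromhex("67C8")
TAG_NAME = bytes.fromhex("45A3")
TAG_LANGUAGE = bytes.fromhex("447A")
TAG_DEFAULT = bytes.fromhex("4484")
TAG_STRING = bytes.fromhex("4487")


def encode_size(length: int) -> bytes:
    """Encode a payload length as an EBML variable-length integer."""
    for width in range(1, 9):
        # The all-ones value at each width is reserved for "unknown size".
        if length < (1 << (7 * width)) - 1:
            marker = 1 << (7 * width)
            return (marker | length).to_bytes(width, "big")
    raise ValueError("length does not fit in an 8-byte EBML size")


def element(element_id: bytes, payload: bytes) -> bytes:
    """Serialize one EBML element: ID, size, then payload."""
    return element_id + encode_size(len(payload)) + payload


def simple_tag(name: str, value: str) -> bytes:
    """Build a SimpleTag with TagName placed before the tag data."""
    payload = (
        element(TAG_NAME, name.encode("utf-8"))
        + element(TAG_LANGUAGE, b"und")   # default language
        + element(TAG_DEFAULT, b"\x01")   # default flag set
        + element(TAG_STRING, value.encode("utf-8"))
    )
    return element(SIMPLE_TAG, payload)


def tags_element(pairs) -> bytes:
    """Build Tags > Tag > (empty Targets, SimpleTag...)."""
    body = element(TARGETS, b"")  # empty: applies to the whole Segment
    body += b"".join(simple_tag(name, value) for name, value in pairs)
    return element(TAGS, element(TAG, body))


if __name__ == "__main__":
    print(tags_element([("TITLE", "Canon in D")]).hex())
```

The empty Targets element reflects the case where the tags describe everything in the Segment; a per-track tag would instead place a TrackUID inside Targets.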

Official Tags

TBD

Official Matroska tags
Nesting Information (tags containing other tags)
ORIGINAL - A special tag that is meant to have other tags inside (using nested tags) to describe the original work of art that this item is based on. All tags in this list can be used "under" the ORIGINAL tag like LYRICIST, PERFORMER, etc.
SAMPLE - A tag that contains other tags to describe a sample used in the targeted item taken from another work of art. All tags in this list can be used "under" the SAMPLE tag like TITLE, ARTIST, DATE_RELEASED, etc.
COUNTRY UTF-8 The name of the country (biblio ISO-639-2). This tag is meant to have other tags inside (using nested tags) to describe country-specific information about the item. All tags in this list can be used "under" the COUNTRY tag like LABEL, PUBLISH_RATING, etc.
Tag Name Type Description
Organizational Information
TOTAL_PARTS UTF-8 Total number of parts defined at the first lower level. (e.g. if TargetType is ALBUM, the total number of tracks of an audio CD)
PART_NUMBER UTF-8 Number of the current part of the current level. (e.g. if TargetType is TRACK, the track number of an audio CD)
PART_OFFSET UTF-8 A number to add to PART_NUMBER when the parts at that level don't start at 1. (e.g. if TargetType is TRACK, the track number of the second audio CD)
Titles
TITLE UTF-8 The title of this item. For example, for music you might label this "Canon in D", or for a video's audio track you might use "English 5.1". This is akin to the TIT2 tag in ID3.
SUBTITLE UTF-8 Subtitle of the entity.
Nested Information (tags contained in other tags)
URL UTF-8 URL corresponding to the tag it's included in.
SORT_WITH UTF-8 A child element to indicate what alternative value the parent tag can have to be sorted, for example "Pet Shop Boys" instead of "The Pet Shop Boys". Or "Marley Bob" and "Marley Ziggy" (no comma needed).
INSTRUMENTS UTF-8 The instruments that are being used/played, separated by a comma. It should be a child of the following tags: ARTIST, LEAD_PERFORMER or ACCOMPANIMENT.
EMAIL UTF-8 Email corresponding to the tag it's included in.
ADDRESS UTF-8 The physical address of the entity. The address should include a country code. It can be useful for a recording label.
FAX UTF-8 The fax number corresponding to the tag it's included in. It can be useful for a recording label.
PHONE UTF-8 The phone number corresponding to the tag it's included in. It can be useful for a recording label.
Entities
ARTIST UTF-8 A person or band/collective generally considered responsible for the work. This is akin to the TPE1 tag in ID3.
LEAD_PERFORMER UTF-8 Lead Performer/Soloist(s). This can sometimes be the same as ARTIST.
ACCOMPANIMENT UTF-8 Band/orchestra/accompaniment/musician. This is akin to the TPE2 tag in ID3.
COMPOSER UTF-8 The name of the composer of this item. This is akin to the TCOM tag in ID3.
ARRANGER UTF-8 The person who arranged the piece, e.g., Ravel.
LYRICS UTF-8 The lyrics corresponding to a song (in case audio synchronization is not known or as a duplicate of a subtitle track). Editing this value when subtitles are found should also result in editing the subtitle track for more consistency.
LYRICIST UTF-8 The person who wrote the lyrics for a musical item. This is akin to the TEXT tag in ID3.
CONDUCTOR UTF-8 Conductor/performer refinement. This is akin to the TPE3 tag in ID3.
DIRECTOR UTF-8 This is akin to the IART tag in RIFF.
ASSISTANT_DIRECTOR UTF-8 The name of the assistant director.
DIRECTOR_OF_PHOTOGRAPHY UTF-8 The name of the director of photography, also known as cinematographer. This is akin to the ICNM tag in Extended RIFF.
SOUND_ENGINEER UTF-8 The name of the sound engineer or sound recordist.
ART_DIRECTOR UTF-8 The person who oversees the artists and craftspeople who build the sets.
PRODUCTION_DESIGNER UTF-8 Artist responsible for designing the overall visual appearance of a movie.
CHOREGRAPHER UTF-8 The name of the choreographer.
COSTUME_DESIGNER UTF-8 The name of the costume designer.
ACTOR UTF-8 An actor or actress playing a role in this movie. This is the person's real name, not the character's name the person is playing.
CHARACTER UTF-8 The name of the character an actor or actress plays in this movie. This should be a sub-tag of an ACTOR tag in order not to cause ambiguities.
WRITTEN_BY UTF-8 The author of the story or script (used for movies and TV shows).
SCREENPLAY_BY UTF-8 The author of the screenplay or scenario (used for movies and TV shows).
EDITED_BY UTF-8 This is akin to the IEDT tag in Extended RIFF.
PRODUCER UTF-8 Produced by. This is akin to the IPRO tag in Extended RIFF.
COPRODUCER UTF-8 The name of a co-producer.
EXECUTIVE_PRODUCER UTF-8 The name of an executive producer.
DISTRIBUTED_BY UTF-8 This is akin to the IDST tag in Extended RIFF.
MASTERED_BY UTF-8 The engineer who mastered the content for a physical medium or for digital distribution.
ENCODED_BY UTF-8 This is akin to the TENC tag in ID3.
MIXED_BY UTF-8 DJ mix by the artist specified
REMIXED_BY UTF-8 Interpreted, remixed, or otherwise modified by. This is akin to the TPE4 tag in ID3.
PRODUCTION_STUDIO UTF-8 This is akin to the ISTD tag in Extended RIFF.
THANKS_TO UTF-8 A very general tag for everyone else that wants to be listed.
PUBLISHER UTF-8 This is akin to the TPUB tag in ID3.
LABEL UTF-8 The record label or imprint on the disc.
Search / Classification
GENRE UTF-8 The main genre (classical, ambient-house, synthpop, sci-fi, drama, etc). The format follows the infamous TCON tag in ID3.
MOOD UTF-8 Intended to reflect the mood of the item with a few keywords, e.g. "Romantic", "Sad" or "Uplifting". The format follows that of the TMOO tag in ID3.
ORIGINAL_MEDIA_TYPE UTF-8 Describes the original type of the media, such as, "DVD", "CD", "computer image," "drawing," "lithograph," and so forth. This is akin to the TMED tag in ID3.
CONTENT_TYPE UTF-8 The type of the item. e.g. Documentary, Feature Film, Cartoon, Music Video, Music, Sound FX, ...
SUBJECT UTF-8 Describes the topic of the file, such as "Aerial view of Seattle."
DESCRIPTION UTF-8 A short description of the content, such as "Two birds flying."
KEYWORDS UTF-8 Keywords to the item separated by a comma, used for searching.
SUMMARY UTF-8 A plot outline or a summary of the story.
SYNOPSIS UTF-8 A description of the story line of the item.
INITIAL_KEY UTF-8 The initial key that a musical track starts in. The format is identical to ID3.
PERIOD UTF-8 Describes the period that the piece is from or about. For example, "Renaissance".
LAW_RATING UTF-8 Depending on the country it's the format of the rating of a movie (P, R, X in the USA, an age in other countries or a URI defining a logo).
ICRA binary The ICRA content rating for parental control. (Previously RSACi)
Temporal Information
DATE_RELEASED UTF-8 The time that the item was originally released. This is akin to the TDRL tag in ID3.
DATE_RECORDED UTF-8 The time that the recording began. This is akin to the TDRC tag in ID3.
DATE_ENCODED UTF-8 The time that the encoding of this item was completed. This is akin to the TDEN tag in ID3.
DATE_TAGGED UTF-8 The time that the tags were done for this item. This is akin to the TDTG tag in ID3.
DATE_DIGITIZED UTF-8 The time that the item was transferred to a digital medium. This is akin to the IDIT tag in RIFF.
DATE_WRITTEN UTF-8 The time that the writing of the music/script began.
DATE_PURCHASED UTF-8 Information on when the file was purchased (see also purchase tags).
Spatial Information
RECORDING_LOCATION UTF-8 The location where the item was recorded. The string starts with the country code, using the same 2 octets as in Internet domains, or possibly ISO-3166. This code is followed by a comma, then more detailed information such as state/province, another comma, and then city. For example, "US, Texas, Austin". This will allow for easy sorting. It is okay to only store the country, or the country and the state/province. More detailed information can be added after the city through the use of additional commas. In cases where the province/state is unknown, but you want to store the city, simply leave a space between the two commas. For example, "US, , Austin".
COMPOSITION_LOCATION UTF-8 The location where the item was originally designed/written. The string starts with the country code, using the same 2 octets as in Internet domains, or possibly ISO-3166. This code is followed by a comma, then more detailed information such as state/province, another comma, and then city. For example, "US, Texas, Austin". This will allow for easy sorting. It is okay to only store the country, or the country and the state/province. More detailed information can be added after the city through the use of additional commas. In cases where the province/state is unknown, but you want to store the city, simply leave a space between the two commas. For example, "US, , Austin".
COMPOSER_NATIONALITY UTF-8 Nationality of the main composer of the item, mostly for classical music. The string holds the country code, using the same 2 octets as in Internet domains, or possibly ISO-3166.
Personal
COMMENT UTF-8 Any comment related to the content.
PLAY_COUNTER UTF-8 The number of times the item has been played.
RATING UTF-8 A numeric value defining how much a person likes the song/movie. The number is between 0 and 5 with decimal values possible (e.g. 2.7), 5(.0) being the highest possible rating. Other rating systems with different ranges will have to be scaled.
Technical Information
ENCODER UTF-8 The software or hardware used to encode this item. ("LAME" or "XviD")
ENCODER_SETTINGS UTF-8 A list of the settings used for encoding this item. No specific format.
BPS UTF-8 The average bits per second of the specified item. This is only the data in the Blocks, and excludes headers and any container overhead.
FPS UTF-8 The average frames per second of the specified item. This is typically the average number of Blocks per second. In the event that lacing is used, each laced chunk is to be counted as a separate frame.
BPM UTF-8 Average number of beats per minute in the complete target (e.g. a chapter). Usually a decimal number.
MEASURE UTF-8 In Western music, a measure is a unit of time such as "4/4". It represents a regular grouping of beats, a meter, as indicated in musical notation by the time signature. The majority of the contemporary rock and pop music you hear on the radio these days is written in the 4/4 time signature.
TUNING UTF-8 It is saved as a frequency in hertz to allow near-perfect tuning of instruments to the same tone as the musical piece (e.g. "441.34" in Hertz). The default value is 440.0 Hz.
REPLAYGAIN_GAIN binary The gain to apply to reach 89dB SPL on playback. This is based on the Replay Gain standard. Note that ReplayGain information can be found at all TargetType levels (track, album, etc).
REPLAYGAIN_PEAK binary The maximum absolute peak value of the item. This is based on the Replay Gain standard.
Identifiers
ISRC UTF-8 The International Standard Recording Code, excluding the "ISRC" prefix and including hyphens.
MCDI binary This is a binary dump of the TOC of the CDROM that this item was taken from. This holds the same information as the MCDI in ID3.
ISBN UTF-8 International Standard Book Number
BARCODE UTF-8 EAN-13 (European Article Numbering) or UPC-A (Universal Product Code) bar code identifier
CATALOG_NUMBER UTF-8 A label-specific string used to identify the release (TIC 01 for example).
LABEL_CODE UTF-8 A 4-digit or 5-digit number to identify the record label, typically printed as (LC) xxxx or (LC) 0xxxx on CD media or covers (only the number is stored).
LCCN UTF-8 Library of Congress Control Number
Commercial
PURCHASE_ITEM UTF-8 URL to purchase this file. This is akin to the WPAY tag in ID3.
PURCHASE_INFO UTF-8 Information on where to purchase this album. This is akin to the WCOM tag in ID3.
PURCHASE_OWNER UTF-8 Information on the person who purchased the file. This is akin to the TOWN tag in ID3.
PURCHASE_PRICE UTF-8 The amount paid for the entity. There should only be a numeric value in here. Only numbers, no letters or symbols other than ".". For instance, you would store "15.59" instead of "$15.59USD".
PURCHASE_CURRENCY UTF-8 The currency type used to pay for the entity. Use ISO-4217 for the 3 letter currency code.
Legal
COPYRIGHT UTF-8 The copyright information as per the copyright holder. This is akin to the TCOP tag in ID3.
PRODUCTION_COPYRIGHT UTF-8 The copyright information as per the production copyright holder. This is akin to the TPRO tag in ID3.
LICENSE UTF-8 The license applied to the content (like Creative Commons variants).
TERMS_OF_USE UTF-8 The terms of use for this item. This is akin to the USER tag in ID3.
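
Several of the tags above describe a value format in prose (RECORDING_LOCATION, RATING, PURCHASE_PRICE). The Python sketch below shows one way an application might handle those three formats. All function names are hypothetical, and the linear rescaling used for RATING is an assumption: the description only says that other rating systems have to be scaled to the 0 to 5 range.

```python
import re


def parse_location(value: str) -> dict:
    """Split a "country, state/province, city, ..." location string.

    Blank fields (as in "US, , Austin") are returned as None.
    """
    parts = [part.strip() for part in value.split(",")]
    return {
        "country": parts[0] or None,
        "region": parts[1] if len(parts) > 1 and parts[1] else None,
        "city": parts[2] if len(parts) > 2 and parts[2] else None,
        "extra": parts[3:],  # any more detailed fields after the city
    }


def scale_rating(value: float, low: float, high: float) -> float:
    """Map a rating from the range [low, high] onto the 0-5 scale.

    Linear scaling is an assumption made for this sketch.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if not low <= value <= high:
        raise ValueError("value is outside the source range")
    return round(5.0 * (value - low) / (high - low), 1)


# Digits with an optional fractional part; no currency symbols or letters.
_PRICE_RE = re.compile(r"\d+(\.\d+)?")


def is_valid_price(text: str) -> bool:
    """Check that a PURCHASE_PRICE value is purely numeric."""
    return _PRICE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(parse_location("US, , Austin"))
    print(scale_rating(7, 0, 10))
    print(is_valid_price("15.59"), is_valid_price("$15.59USD"))
```

The currency itself would go in PURCHASE_CURRENCY rather than in the price string.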

Format Guidelines

  • the Tags element SHOULD be placed at the end of the file to allow for trivial updates
  • TagName SHOULD appear before the tag data, i.e., TagString, TagBinary
  • a SimpleTag SHOULD NOT contain other SimpleTags
  • when concatenating multiple files, new Segments should be used to avoid merging tags common to each file
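
The SimpleTag guidelines above can be checked mechanically once a file has been parsed. The sketch below assumes a hypothetical in-memory form in which a SimpleTag is an ordered list of `(element_name, value)` pairs; it does not parse WebM itself, and `check_simple_tag` is an illustrative name rather than part of any library. It also checks the rule from the Tag Elements table that TagString and TagBinary cannot share a SimpleTag.

```python
def check_simple_tag(children):
    """Return a list of guideline problems for one SimpleTag.

    `children` is an ordered list of (element_name, value) pairs, a
    made-up representation used only for this sketch.
    """
    problems = []
    names = [name for name, _ in children]
    data_positions = [
        index for index, name in enumerate(names)
        if name in ("TagString", "TagBinary")
    ]

    if "TagName" not in names:
        problems.append("TagName is mandatory")
    elif data_positions and names.index("TagName") > min(data_positions):
        problems.append("TagName should appear before the tag data")

    if "TagString" in names and "TagBinary" in names:
        problems.append("TagString and TagBinary cannot be combined")

    if "SimpleTag" in names:
        problems.append("a SimpleTag should not contain other SimpleTags")

    return problems


if __name__ == "__main__":
    good = [("TagName", "TITLE"), ("TagString", "Canon in D")]
    bad = [("TagString", "Canon in D"), ("TagName", "TITLE")]
    print(check_simple_tag(good))
    print(check_simple_tag(bad))
```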