efm decoder - happycube/ld-decode GitHub Wiki
Overview
EFM-decoder is a collection of tools for handling EFM (Eight-to-Fourteen Modulation) data as used on Compact Discs and LaserDiscs.
Note: The EFM decoder tools are mostly untested - please consider using and reporting any issues found. If you need something more robust, use the ld-process-efm tool instead.
The supported EFM data and structures are defined by the ECMA-130 (issue 2) specification "Data interchange on read-only 120 mm optical data disks (CD-ROM)" which was written to enhance the original (audio-only) specification IEC 60908 (second edition 1999-02) "Audio recording - Compact disc digital audio system".
The EFM decoder is split into a number of individual tools, each providing functionality at different layers of the decoding process. In addition, the suite of tools includes an F2 Section stacker that is capable of performing error correction at the F2 Frame level based on multiple EFM input images.
TL;DR - How do I get digital audio from Disney's Bambi?
Assuming that you have decoded the LaserDisc using ld-decode (and you specified the output as bambi_1) you will have a EFM file called bambi_1.efm (for side 1).
The following commands will decode into a 44.1KHz Stereo 16-bit wav file along with an Audacity label file (metadata) that can be used to understand the contents of the resulting audio:
efm-decoder-f2 ./bambi_1.efm ./bambi_1.f2s
efm-decoder-d24 ./bambi_1.f2s ./bambi_1.d24
efm-decoder-audio ./bambi_1.d24 ./bambi.wav --audacity-labels --zero-pad
This process will leave you with two files, bambi_1.wav (which contains the audio) and bambi_1.txt (which contains the labels).
efm-decoder-f2
The efm-decoder-f2 tool takes T-values as input (supplied by a tool such as ld-decode) and decodes them into F2 sections. Each section contains 98 F2 frames.
Note that the reason for 'sections' of 98 frames is due to the ECMA-130 requirements around metadata (which contains addressing information in the form of time-stamps). The resolution of the time stamps is 1/75th of a second as the q-channel metadata repeats every 98 frames. So each section contains 98 frames with the same Q and P channel metadata. Correction of frame order at a higher resolution isn't possible due to the lack of addressing resolution.
The decoding sequence is:
- T-values
- Channel frames
- F3 Frames
- F2 Sections
- F2 Section Correction
Command line
Usage: efm-decoder-f2 [options] input output
efm-decoder-f2 - EFM T-values to F2 Section decoder
(c)2025 Simon Inns
GPLv3 Open-Source - github: https://github.com/happycube/ld-decode
Options:
-h, --help Displays help on commandline options.
--help-all Displays help including Qt specific options.
-v, --version Displays version information.
-d, --debug Show debug
-q, --quiet Suppress info and warning messages
--show-f3 Show F3 frame data
--show-f2 Show F2 frame data
--show-tvalues-debug Show T-values to channel decoding debug
--show-channel-debug Show channel to F3 decoding debug
--show-f3-debug Show F3 to F2 section decoding debug
--show-f2-correct-debug Show F2 section correction debug
--show-all-debug Show all debug
Arguments:
input Specify input EFM file
output Specify output F2 section file
efm-decoder-d24
The efm-decoder-d24 tool takes F2 sections as input (supplied by a tool such as efm-decoder-f2) and decodes them into Data24 sections. Each section contains 98 frames.
Note that Data24 sections are almost identical to F1 Frames (as defined by ECMA-130) except the byte order of the F1 Frame has been corrected (see clause 16 of ECMA-130). The reason for this extra data type is that Data24 frames can either be interpreted as audio (in accordance to IEC 60908-1999) or as sector-based data (in accordance to ECMA-130). So an intermediate type is required to support both possibilities.
The decoding sequence is:
- F2 Sections
- F1 Sections
- Data24 Sections
Note that the decoding from F2 to F1 includes the unscrambling and CIRC based error correction (known as C1 and C2 correction) according to ECMA-130 section 18. Section metadata from the F2 sections is preserved by the F1 sections.
Command line
Usage: efm-decoder-d24 [options] input output
efm-decoder-d24 - EFM F2 Section to Data24 Section decoder
(c)2025 Simon Inns
GPLv3 Open-Source - github: https://github.com/happycube/ld-decode
Options:
-h, --help Displays help on commandline options.
--help-all Displays help including Qt specific options.
-v, --version Displays version information.
-d, --debug Show debug
-q, --quiet Suppress info and warning messages
--show-f1 Show F1 frame data
--show-data24 Show Data24 frame data
--show-f2-debug Show F2 to F1 decoding debug
--show-f1-debug Show F1 to Data24 decoding debug
--show-all-debug Show all debug options
Arguments:
input Specify input F2 Section file
output Specify output Data24 Section file
efm-decoder-audio
The efm-decoder-audio tool takes Data24 sections as input (supplied by a tool such as efm-decoder-d24) and decodes them into wav-format audio as well as metadata.
Note that wav files do not contain any detailed metadata, so the efm-decoder-audio tool produces a separate CSV file containing the information provided by the q-channel addressing information. In addition, the CSV file contains the locations of any errors or erasures (missing samples) which have been silenced or concealed by the decoding process.
Note that, in accordance with IEC 60908-1999, the output wav format is 16-bit signed integer stereo at a sampling rate of 44.1KHz. The tool includes a standard wav format header in the output to allow tools such as Audacity to correctly interpret the format automatically.
The decoding sequence is:
- Data24 Sections
- 16-bit WAV audio
Note that the silencing and concealment of corrupt samples is optional. If a sample is corrupt the tool will look and see if the sample before and after is valid. If valid the sample will be concealed with the average of the two valid samples. If either the leading or trailing sample is corrupt, the tool will silence the sample with 0 amplitude.
Command line
Usage: efm-decoder-audio [options] input output
efm-decoder-audio - EFM Data24 to Audio decoder
(c)2025 Simon Inns
GPLv3 Open-Source - github: https://github.com/happycube/ld-decode
Options:
-h, --help Displays help on commandline options.
--help-all Displays help including Qt specific options.
-v, --version Displays version information.
-d, --debug Show debug
-q, --quiet Suppress info and warning messages
--audacity-labels Output WAV metadata as Audacity labels
--no-audio-concealment Do not conceal errors in the audio data
--zero-pad Zero pad the audio data from 00:00:00
--show-audio Show Audio frame data
--show-audio-debug Show Data24 to audio decoding debug
--show-audio-correction-debug Show Audio correction debug
--show-all-debug Show all decoding debug
Arguments:
input Specify input Data24 Section file
output Specify output wav file
Metadata format
The metadata output is formatted as an Audacity label file and can be directly loaded into Audacity in order to annotate the WAV output.
Note that the timestamps are in Audacity's millisecond time (with enough resolution to identify individual stereo-sample pairs at 44.1 KHz). Timestamps in the label text are CDDA format (to match the decoding process output).
Note: Track number metadata is only written if the track number changes
Example metadata
The following shows an example of the metadata output:
2038.365964 2038.366100 Silenced: 33:58:27
2038.366281 2038.366281 Concealed: 33:58:27
2038.366327 2038.366327 Concealed: 33:58:27
2038.366372 2038.366372 Concealed: 33:58:27
2038.366531 2038.366531 Concealed: 33:58:27
2038.366576 2038.366576 Concealed: 33:58:27
2038.366621 2038.366621 Concealed: 33:58:27
2038.366825 2038.366825 Concealed: 33:58:27
2038.366871 2038.366871 Concealed: 33:58:27
2038.366916 2038.366916 Concealed: 33:58:27
0.000000 175.333333 Track: 01 [00:00:00-02:53:25]
175.346667 470.520000 Track: 02 [00:00:00-04:55:13]
470.533333 736.786667 Track: 03 [00:00:00-04:26:19]
efm-decoder-data
The efm-decoder-data tool takes Data24 sections as input (supplied by a tool such as efm-decoder-d24) and decodes them into ECMA-130 compliant data sectors.
The output is a binary file that represents the original input data to the ECMA-130 encoding process (with the exception of unrecoverable errors).
Note that Data24 sections contain 24x98 = 2352 bytes. The Data24 sections are stripped of parity and other error correction data when converted into sectors. Each sector therefore contains 2048 bytes of data (see ECMA-130 for details). Any sector metadata is preserved and output in the metadata output file.
The decoding sequence is:
- Data24 Sections
- Raw Sectors (2048 bytes per sector)
- Sectors (after unscramble and error correction)
- Binary data + metadata
Note that the decoding process from Data24 to binary data includes RSPC error correction using Q and P parity according to ECMA-130 section 14.
Note that the output of metadata is optional.
Command line
Usage: efm-decoder-data [options] input output
efm-decoder-data - EFM Data24 to data decoder
(c)2025 Simon Inns
GPLv3 Open-Source - github: https://github.com/happycube/ld-decode
Options:
-h, --help Displays help on commandline options.
--help-all Displays help including Qt specific options.
-v, --version Displays version information.
-d, --debug Show debug
-q, --quiet Suppress info and warning messages
--output-metadata Output bad sector map metadata
--show-rawsector Show Raw Sector frame data
--show-rawsector-debug Show Data24 to raw sector decoding debug
--show-sector-debug Show raw sector to sector decoding debug
--show-sector-correction-debug Show sector correction decoding debug
--show-all-debug Show all decoding debug
Arguments:
input Specify input Data24 Section file
output Specify output data file
Metadata format
Format: Address
Each address represents a 2048 byte sector. Since error correction is performed on the whole sector there are no individual error/erasure flags; the resulting sector is either completely valid or invalid. Only invalid sectors are written to the metadata.
Example metadata
28
29
30
31
32
33
34
efm-stacker-f2
The efm-stacker-f2 tool takes F2 section files (produced by efm-decoder-f2) and stacks them together. The stacking process examines differences between the available F2 section sources and corrects based on a "most likely correct" voting system. The more available sources, the more error-free the output.
Command line
Usage: efm-stacker-f2 [options] inputs output
efm-stacker-f2 - EFM F2 Section stacker
(c)2025 Simon Inns
GPLv3 Open-Source - github: https://github.com/happycube/ld-decode
Options:
-h, --help Displays help on commandline options.
--help-all Displays help including Qt specific options.
-v, --version Displays version information.
-d, --debug Show debug
-q, --quiet Suppress info and warning messages
Arguments:
inputs Specify input F2 section files
output Specify output F2 section file
Example usage
./efm-stacker-f2 --debug ../efm-decoder-f2/DS2_NatA_PP.f2s ../efm-decoder-f2/DS4_NatA_PP.f2s ../efm-decoder-f2/DS6_NatA_PP.f2s ../efm-decoder-f2/DS7_NatA_PP.f2s ../efm-decoder-f2/DS8_NatA_PP.f2s ../efm-decoder-f2/DS10_NatA_PP.f2s ./DSstacked_NatA_PP.f2s
vfs-verifier
The VFS verifier is a specific testing tool for BBC Domesday AIV data images produced by EFM decoding. The VFS verifier takes the output from the efm-decoder-data tool and the valid sector metadata and attempts to read the underlying VFS file structure within the data. By reading the VFS structure the tool can work out where the actual VFS file data is in the EFM image. This is compared to the EFM valid sector metadata to verify that all required data for the VFS image is valid in the resulting data output file.
Command line
Usage: vfs-verifier [options] input bad-sector-map
vfs-verifier - Acorn VFS (Domesday) image verifier
(c)2025 Simon Inns
GPLv3 Open-Source - github: https://github.com/happycube/ld-decode
Options:
-h, --help Displays help on commandline options.
--help-all Displays help including Qt specific options.
-v, --version Displays version information.
-d, --debug Show debug
-q, --quiet Suppress info and warning messages
Arguments:
input Specify input EFM file
bad-sector-map Specify bad sector map metadata file
EFM data structure
In order to understand the EFM decoding it is necessary to understand the underlying structure of the EFM data. The various EFM data types are as follows:
- T-Values
- Channel Frames
- F3 Frames
- F2 Frame Sections
- F1 Frame Sections
- Data24 Sections
- ECMA-130 Sectors (data only)
T-Values
The initial EFM data (supplied by a tool such as ld-decode or another extraction method) consists of a file containing unsigned bytes. Each byte represents a 'T' value. T-values range from T3 to T11 with T3 being the shortest 'event' in the EFM and T11 being the longest (these actually represent the period of the EFM from the original source).
The T-values are first converted into a bit-stream. Each T value represents a section of the bit-stream as shown below:
T3 = 100
T4 = 1000
T5 = 10000
T6 = 100000
T7 = 1000000
T8 = 10000000
T9 = 100000000
T10 = 1000000000
T11 = 10000000000
The initial bit-stream is formed by simply concatenating the T-value bit equivalents together. For example, T3+T6+T9 would be 100100000100000000
.
Channel frames
The bit-steam is then split into "channel frames" - each channel frame consists of 27 sync pattern bits and 33 EFM data symbols (of 17 bits per symbol) totalling 588 bits per channel frame.
A channel frame has the following structure:
- Sync Header : 24 Channel bits
- Merging bits : 3 Channel bits
- Control byte : 14 Channel bits
- Merging bits : 3 Channel bits
Bytes 1 to 32, each followed by Merging bits : 32 x (14+3) = 544 Channel bits
Thus, each Channel Frame representing a F3-Frame comprises 588 Channel bits.
These Channel bits are recorded on the (CD) disk along a Physical Track. A ONE Channel bit shall be represented by a change of pit to land or land to pit in the reflective layer. A ZERO Channel bit shall be represented by no change in the reflective layer.
F3 Frames
The F3 Frame consists of 33 symbols each of 8-bits in length which result from the initial EFM data processing. The total of 33 symbols of 8-bits in length gives a total of 264 bits per frame.
Each F3 frame consists of (since each symbol is 8-bits, one symbol is equivalent to one byte):
- 1 Subcode byte
- 24 user-data bytes
- 8 parity bytes
Recording of the F3 Frames on the disk
In order to record the F3-Frames on the disk each 8-bit byte shall be represented by 14 so-called Channel bits. Each F3-Frame is represented by a so-called Channel Frame comprising a Sync Header, Merging bits and 33 14-Channel bit bytes.
8-to-14 Encoding
All 33 bytes of the F3-Frames of each Section are 8-bit bytes. They shall be converted into 14-bit bytes according to the table of annex D. The bits of these 14-bit bytes are called Channel bits. These bytes of 14 Channel bits are characterized by the fact that between two ONEs there are at least two and at most ten ZEROs.
The first byte of the first two F3-Frames of each Section, i.e. the Control byte of these frames, is not converted according to this table but is given a specific synchronisation pattern of 14 Channel bits that is not included in the table of valid EFM codes. These two patterns shall be:
- 1st Frame, byte 0, called SYNC 0 : 00100000000001
- 2nd Frame, byte 0, called SYNC 1 : 00000000010010
The left-most Channel bit is sent first in the data stream.
Sync Header
A Sync Header shall be the following sequence of 24 Channel bits:
100000000001000000000010
Merging Channel bits
Merging Channel bits are sequences of three Channel bits set according to ECMA-130 annex E and inserted between the bytes of 14 Channel bits as well as between the Sync Header and the adjacent bytes of 14 Channel bits.
F2 Frame Sections
The result of decoding 98 F3 Frames (along with the associated subcode data) produces an F2 Frame Section (containing 98 F2 Frames). F2 Frames also contain CIRC parity data allowing basic error correction (and detection) to be performed.
F1 Frame Sections
Once an F2 Frame has passed through error correction (as well as delay-lines and interleaving, etc) the parity data is stripped leaving an F1 Frame Section that contains 98 F1 Frame (of 24 bytes each) and the associated subcode metadata. In accordance to the ECMA-130 spec, all F1 frames have the byte order reversed.
Data24
When handing raw data (such as input data or input WAV data) the tools use data24 to represent chunks of data. The reason for this is that F1 Frames take 24 bytes as input, so the data24 is a convenient way of handling incoming data. Data24 is not part of the ECMA-130 definitions. Data24 sections consist of 98 data24 frames. A Data24 frame is basically an F1 Frame with the byte order of the data corrected.
Note: data24 * 98 frames represents 1/75th of a second
Sections of 98 frames
A section is a group of 98 F2 Frames representing 1/75th of a second of user-data (audio or other data) or a lead-in/lead-out section.
Each section has 98 bytes of subcode data (one byte per F2 Frame).
These requirements mean you need a minimum of 98 F2 Frames in order to produce an F3 Frame (or vice-versa, 98 F3 Frames in order to produce subcode data).
Why 98 Frames per Section?
IEC Sample requirements:
- Sample Rate: 44.1 kHz = 44,100 samples per second (per channel)
- Bit Depth: 16 bits = 2 bytes (per sample)
- Channels: Stereo = 2 channels
Calculation:
- Data per sample = 2 bytes
- Data per second per channel = 44,100 samples/second} * 2 bytes/sample = 88,200 bytes/second
- Total data for stereo = 88,200 bytes/second * 2 channels = 176,400 bytes/second
A 44.1 kHz, 16-bit, stereo sample requires 176,400 bytes per second.
Since the original IEC specification is audio based and the supported sample type is 16-bit, 44.1 KHz every second of audio requires 176,400 bytes. The minimum time window allowed is a "section" which is 98x24 bytes = 2352 bytes. 176,400 / 2,352 bytes = 75 (i.e. a section represents a time period of 1/75th of a second).