MPEG 4 AVC H264 - uupaa/H264.js GitHub Wiki
This entry mainly covers the terminology needed to extract MPEG-2 TS from HLS and split it into H.264 and AAC. It does not describe MPEG-2 PS.
See also the MPEG-2 description.
MPEG-4 part 10 (MPEG-4 AVC/H264)
The MPEG-4 part 10 specification can be downloaded free of charge from ITU-T (ISO/IEC sells the same document for a fee).
Because of how MPEG-4 came about, terms defined in MPEG-2 and MPEG-1 appear frequently.
MPEG-4
-
ByteStream
(ByteStream format)
- A byte data format defined in Annex B of the MPEG-4 part 10 AVC/H264 specification. It is byte data made of a continuous sequence of StartCode + NALUnit (NALUnitHeader + EBSP).
- To generate an MP4 file from the H.264 data carried in MPEG-2 TS, the data has to be converted in the following steps:
- fetch M3U8 file
- parse M3U8 file
- fetch *.ts file
- parse TSPacket
- parse PESPacket
- parse ByteStream
- parse NAL file format
- parse AUD, SPS, PPS, SEI, IDR, ...
- build AccessUnit
- Mux
- build mp4 file
-
VCL
(Video Coding Layer)
- The video coding layer. This is the first layer, which handles the coding of the video itself.
-
NAL
(Network Abstraction Layer)
- The layer next to the VCL. It handles transmission and storage of the coded information.
- A Stream made of one or more consecutive NAL Units.
| NALUnit | NALUnit | ...
-
VCL-NALUnit
- Refers to the compressed slice data (the video data) itself.
- The value of nal_unit_type is 1 to 5.
-
Non-VCL-NALUnit
- Metadata (auxiliary information) rather than video data. AUD, SPS, PPS, SEI and others are Non-VCL-NALUnits.
- The value of nal_unit_type is 6 or greater.
-
NALUnit
(Network Abstraction Layer Unit)
- The NALUnit is the basic unit of data in the AVC world. Logically it consists of a NALHeader and an RBSP. As bytes it is stored as NALUnit = NALUnitHeader + EBSP.
- NALUnits are byte-aligned.
- NALUnits fall into two broad groups: VCL-NALUnits, which carry coded picture data, and Non-VCL-NALUnits, which do not.
- Whether a NALUnit is a VCL-NALUnit can be determined from nal_unit_type.
- There are two Streams (binary data formats) for NALUnits.
ByteStream
(NAL Units in Byte-stream Format) (StartCode + NALUnitHeader + EBSP)
- This format does not store the NALUnit length explicitly. Scanning from the beginning for the StartCode (00 00 01) gives the NALUnit boundaries and therefore the byte size.
- The StartCode is 00 00 01. Extra leading 00 bytes are allowed, so 00 00 00 01 and 00 00 00 00 01 are also valid.
- The EP3B (00 00 03 0x) stays in place when a NALUnit is repackaged into the other Stream. It only has to be decoded to 00 00 0x when the RBSP itself is parsed.
- The byte data stored in MPEG-2 TS uses this format.
- The raw data that ffmpeg writes when the output file extension is .264 also uses this format.
NAL file format
(NALUnitSize + NALUnitHeader + EBSP)
- The byte data stored in the mdat box of an mp4 file uses this format.
- NALUnitSize holds the byte length of the NALUnit in big-endian order.
- The MPEG-4 part 10 AVC specification describes the pseudo code that extracts the NALUnitHeader and the RBSP from a NALUnit as the nal_unit function. The emulation_prevention_three_byte is decoded inside nal_unit.
-
StartCode
- StartCode = zero or more 00 bytes + 00 00 01
- A byte sequence made of 00 00 01. It may be preceded by several extra 00 bytes.
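As a rough illustration of the StartCode rule, the sketch below scans a byte stream for 00 00 01 and returns the NALUnits found between the start codes. The function name `splitByteStream` is made up for this example and is not taken from H264.js. It assumes the input is a Uint8Array and relies on the fact that a NALUnit never ends with a 00 byte.

```javascript
// Split an Annex B byte stream into NALUnits (NALUnitHeader + EBSP).
// Start codes and any extra 00 bytes in front of them are dropped.
function splitByteStream(bytes) {
  const starts = []; // [index of 00 00 01, index of first NALUnit byte]
  for (let i = 0; i + 2 < bytes.length; i++) {
    if (bytes[i] === 0 && bytes[i + 1] === 0 && bytes[i + 2] === 1) {
      starts.push([i, i + 3]);
      i += 2; // continue after this start code
    }
  }
  const units = [];
  for (let n = 0; n < starts.length; n++) {
    let end = n + 1 < starts.length ? starts[n + 1][0] : bytes.length;
    // Extra 00 bytes before the next start code do not belong to this unit.
    while (end > starts[n][1] && bytes[end - 1] === 0) end--;
    units.push(bytes.subarray(starts[n][1], end));
  }
  return units;
}
```

Because the result is built with `subarray`, the returned NALUnits are views on the input buffer rather than copies.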
-
NALUnitSize
, NumBytesInNALunit
- Information that gives the size of a NALUnit.
- If the NALUnit is 4 bytes long, the data is | 00 00 00 04 | NALUnit |.
- The size field is usually 4 bytes wide, but 1-byte and 2-byte fields also occur.
- If AVCDecoderConfigurationRecord.lengthSizeMinusOne + 1 is 2, the NALUnitSize field is 2 bytes wide.
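The length-prefixed layout can be sketched as follows. `readLengthPrefixed` and `writeLengthPrefixed` are hypothetical helper names, and `lengthSize` stands for AVCDecoderConfigurationRecord.lengthSizeMinusOne + 1. This is only a minimal sketch that does no validation of the input.

```javascript
// Read NALUnits from "NAL file format" data: | NALUnitSize | NALUnit | ...
function readLengthPrefixed(bytes, lengthSize = 4) {
  const units = [];
  let pos = 0;
  while (pos + lengthSize <= bytes.length) {
    let size = 0;
    for (let i = 0; i < lengthSize; i++) {
      size = size * 256 + bytes[pos + i]; // big-endian size field
    }
    pos += lengthSize;
    units.push(bytes.subarray(pos, pos + size));
    pos += size;
  }
  return units;
}

// Write NALUnits back with a big-endian size field in front of each one.
function writeLengthPrefixed(units, lengthSize = 4) {
  const total = units.reduce((sum, u) => sum + lengthSize + u.length, 0);
  const out = new Uint8Array(total);
  let pos = 0;
  for (const u of units) {
    for (let i = 0; i < lengthSize; i++) {
      // Most significant byte first.
      out[pos + i] = Math.floor(u.length / Math.pow(256, lengthSize - 1 - i)) % 256;
    }
    pos += lengthSize;
    out.set(u, pos);
    pos += u.length;
  }
  return out;
}
```

Combined with a start-code splitter, `writeLengthPrefixed` is roughly what the "parse ByteStream" to "parse NAL file format" step amounts to, since the NALUnit bytes themselves are copied unchanged.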
-
NALUnitHeader
, NALHeader
- The first byte of a NALUnit. Its most significant bit is always 0b0. It carries the information needed to decode the content of the NALUnit.
- To find out what a NALUnit contains, read the first byte (NALUnitHeader) and split it into 0 (fixed value) + nal_ref_idc (2 bit) + nal_unit_type (5 bit).
|ABBCCCCC| bits field
|~       | -> 1 always zero `0`
| ~~     | -> 2 nal_ref_idc (whether a slice of a reference picture is included)
|   ~~~~~| -> 5 nal_unit_type (identifier for the type of the NAL Unit)
-
nal_ref_idc
(2 bit)
- Indicates whether the NALUnit contains a slice of a reference picture.
- If nal_ref_idc is 00, the NALUnit is not referenced for picture prediction (in other words, it can be discarded right after it has been read). If it is not 00, the NALUnit contains an SPS, a PPS or slice data that other NALUnits refer to for picture prediction (in other words, it cannot be discarded right after reading).
-
nal_unit_type
(5 bit)
- Identifier for the type of the NALUnit.
- If nal_unit_type is 5, the NALUnit is an IDR picture. It is a VCL-NALUnit.
- If nal_unit_type is 9, the NALUnit is an AUD, which marks the boundary between AccessUnits. It is a Non-VCL-NALUnit.
- If nal_unit_type is 7, the NALUnit is an SPS, which holds metadata for the whole coded video sequence. It is a Non-VCL-NALUnit.
- If nal_unit_type is 8, the NALUnit is a PPS, which holds metadata for pictures. It is a Non-VCL-NALUnit.

| nal_unit_type | Subject |
|---|---|
| 0 | Unspecified |
| 1 | Part (slice) of a non-IDR picture |
| 2 | Slice coded with data partitioning A |
| 3 | Slice coded with data partitioning B |
| 4 | Slice coded with data partitioning C |
| 5 | IDR picture |
| 6 | SEI |
| 7 | SPS |
| 8 | PPS |
| 9 | AU delimiter |
| 10 | End of Sequence |
| 11 | End of Stream |
| 12 | Filler Data (padding or dummy data inserted to keep the format in shape) |
| 13 | SPS extension, used by FRExt |
| 14 .. 18 | Reserved |
| 19 | Auxiliary slice, used by FRExt |
| 20 .. 23 | Reserved |
| 24 .. 31 | Unspecified |
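The split of the header byte into its three fields can be written in a few lines. `parseNALUnitHeader` and `isVCL` are illustrative names chosen for this sketch, not functions from H264.js.

```javascript
// Split the first byte of a NALUnit into |A|BB|CCCCC|.
function parseNALUnitHeader(byte) {
  return {
    forbidden_zero_bit: (byte >> 7) & 0x01, // A: always 0
    nal_ref_idc: (byte >> 5) & 0x03,        // BB
    nal_unit_type: byte & 0x1f,             // CCCCC
  };
}

// VCL-NALUnits use nal_unit_type 1 to 5; everything from 6 up is Non-VCL.
function isVCL(nal_unit_type) {
  return nal_unit_type >= 1 && nal_unit_type <= 5;
}
```

For example, the byte 0x67 (0b01100111) yields nal_ref_idc 3 and nal_unit_type 7 (SPS), and 0x09 yields nal_ref_idc 0 and nal_unit_type 9 (AUD).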
-
AccessUnit
, AU
- An AccessUnit is the set of NALUnits needed to produce one picture (frame).
- AccessUnits can be found by searching for the AUD (Access Unit Delimiter).
+-----+-----+-----+-----+---------++-----+-----------++-----+-----------+
| AUD | SPS | PPS | SEI | IDR ... || AUD | ...       || AUD | ...       |
+-----+-----+-----+-----+---------++-----+-----------++-----+-----------+
<---------- Access Unit ----------><-- Access Unit --><-- Access Unit -->
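Grouping by AUD, as in the diagram, could look like the sketch below. `groupAccessUnits` is a hypothetical name. It assumes every AccessUnit starts with an AUD (nal_unit_type 9), which holds for streams that came through MPEG-2 TS but not for every byte stream.

```javascript
// Group a flat list of NALUnits into AccessUnits, starting a new group at each AUD.
function groupAccessUnits(nalUnits) {
  const accessUnits = [];
  let current = null;
  for (const nal of nalUnits) {
    const nal_unit_type = nal[0] & 0x1f;
    if (nal_unit_type === 9 || current === null) {
      current = []; // an AUD (or the very first NALUnit) opens an AccessUnit
      accessUnits.push(current);
    }
    current.push(nal);
  }
  return accessUnits;
}
```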
-
EBSP
(Encapsulated Byte Sequence Payload), EP3B (Emulation Prevention Three Byte), emulation_prevention_three_byte
- Byte data that contains (or may contain) EP3B (00 00 03 0x).
- If a pattern that can be confused with the StartCode 00 00 01 appeared inside the ByteStream, the stream could no longer be parsed correctly.
- To avoid this, the ByteStream (EBSP) stores the byte patterns below replaced by different byte sequences.
- To convert EBSP to RBSP, 00 00 03 0x has to be decoded to 00 00 0x, where x is a value from 0 to 3.

| EBSP | RBSP |
|---|---|
| 00 00 03 00 | 00 00 00 |
| 00 00 03 01 | 00 00 01 |
| 00 00 03 02 | 00 00 02 |
| 00 00 03 03 | 00 00 03 |
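The table above translates into a single pass over the bytes. The sketch below uses a made-up function name, `ebspToRbsp`, and drops every 03 that directly follows two 00 bytes.

```javascript
// Convert EBSP to RBSP by removing each emulation_prevention_three_byte.
function ebspToRbsp(ebsp) {
  const rbsp = [];
  let zeros = 0; // number of consecutive 00 bytes just copied
  for (const byte of ebsp) {
    if (zeros >= 2 && byte === 0x03) {
      zeros = 0; // 00 00 03 -> drop the 03 and start counting again
      continue;
    }
    rbsp.push(byte);
    zeros = byte === 0x00 ? zeros + 1 : 0;
  }
  return Uint8Array.from(rbsp);
}
```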
-
RBSP
(Raw Byte Sequence Payload), Syntax Element
- When described in the context of NALUnitHeader + RBSP, it is the data from the second byte of the NALUnit onward.
- Byte data that does not contain EP3B.
- The end of an RBSP is byte-aligned (padded to 8 bits). Where less than one byte remains, rbsp_trailing_bits are appended to fill the byte.
- If the last byte holds only 3 bits of RBSP data, 1 + 0000 is appended after xxx, giving xxx10000.
- The decoder scans the last byte from the LSB towards the MSB (right to left). Everything up to and including the first 1 is padding.
-
rbsp_trailing_bits
- Padding placed at the end of an RBSP.
- A bit string that starts with 1 and continues with 0s.
- A 7-bit gap is filled with 0b1000000.
- A 4-bit gap is filled with 0b1000.
- A 2-bit gap is filled with 0b10.
- A 1-bit gap is filled with 0b1.
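Locating the stop bit from the right gives the number of real data bits (the SODB length). The following is a minimal sketch with a hypothetical name, `sodbBitLength`. Skipping trailing 00 bytes is an assumption made here so that zero padding after the RBSP does not confuse the search.

```javascript
// Return the number of data bits in an RBSP, excluding rbsp_trailing_bits.
function sodbBitLength(rbsp) {
  let last = rbsp.length - 1;
  while (last >= 0 && rbsp[last] === 0) last--; // tolerate trailing 00 bytes
  if (last < 0) return 0; // no stop bit found
  let trailing = 1; // the stop bit itself
  let byte = rbsp[last];
  while ((byte & 1) === 0) { // walk from LSB towards MSB until the 1
    byte >>= 1;
    trailing++;
  }
  return (last + 1) * 8 - trailing;
}
```

With the xxx10000 example from the RBSP entry, a single byte 0b10110000 carries 3 data bits followed by 5 trailing bits.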
-
SODB
(string of data bits)
- The raw binary data contained in a NALUnit.
- The binary data viewed in bits is the SODB, and viewed in bytes it is the RBSP.
-
Stream
, MPEG Stream
- In MPEG, a Stream is a binary data format. MPEG handles binary data in many formats, so each format has a name that identifies it.
ES
(Elementary Stream)
- Raw audio data and raw video data are called ES.
- The data obtained by muxing ES is called a System Stream (MPEG-1) or a Program Stream (MPEG-2 PS).
PES
(Packetized Elementary Stream)
- An ES that has been split into packets so that it is easy to send over a network is called PES.
- A PES packet can carry playback timing information (PTS, DTS). The clock references PCR and OPCR are carried in the adaptation field of the TS packet.
- PCR and OPCR are clocks driven at 27MHz (27 * 1000 * 1000). One second of real time is 27000000 in PCR units.
- When timing information is present, the decoder can use it to synchronize audio and video.
- Depending on the use, there are TS and PS.
TS
(Transport Stream)
- Data packetized so that it is easy to send over a network is called Transport Stream (MPEG-2 TS).
PS
(Program Stream)
- Data packetized in a form suited for storage on HDD, tape, disc media and so on is called Program Stream (MPEG-2 PS).
- TS and PS can be converted into each other losslessly.
NALUnit in Byte Stream Format
(H264 Byte stream format)
- A binary stream of NAL Units delimited by StartCodes (00 00 00 01 or 00 00 01).
| 00 00 00 01 | NALUnit | 00 00 01 | NALUnit | ...
NAL
, NAL Stream
- A Stream made of one or more consecutive NAL Units.
| NALUnitSize | NALUnit | NALUnitSize | NALUnit | ...
-
marker_bit
- marker_bit is inserted here and there as a guard so that a bit string does not happen to match a start code (0x000001 and so on). marker_bit itself carries no information and is skipped when reading.
-
AU
(Access Unit)
- The set of NALUnits that together hold the information needed to build one picture.
- A typical AU consists of consecutive NALUnits (AUD, SPS, PPS, SEI, IDR...).
- An AU may omit the SPS or PPS. The most recent SPS and PPS therefore have to be kept so that they can be reused for such an AU (originally a TODO, now confirmed).
-
AUD
(Access unit delimiter) (AU Delimiter)
- Marks the beginning of an access unit.
- Stores the kinds of slices contained in the access unit.
- nal_unit_type is 9.
- It is a byte sequence such as 00 00 01 09 F0.
-
End of Sequence
- Marks the end of a sequence.
-
End of Stream
- Marks the end of the whole stream.
-
Filler Data
- Meaningless data inserted when there is too little data to satisfy the requirements of the specification.
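-
SPS
(Sequence parameter set)
- nal_unit_type is 7.
- It can be omitted. When it is omitted, the most recent (latest) SPS seen so far is reused.
- Holds important metadata for the whole coded video sequence.
- seq_parameter_set_id, the unique ID of this SPS. A PPS refers to the SPS by this ID, so it is important.
- profile_idc, the H.264 profile. Baseline profile is 0x42(66). The value 0xE0(224) often seen next to it is the following byte, which holds the constraint_set flags.
- level_idc, the H.264 Level. Level 3.0 is 0x1E(30).
- num_ref_frames, the maximum number of reference frames that the decoder may use for inter prediction.
- pic_width_in_mbs_minus1, the picture width in macroblocks minus 1. Adding 1 gives the width in units of 16-pixel macroblocks.
- pic_height_in_map_units_minus1, the picture height in map units minus 1. Adding 1 gives the height in map units (one map unit is one macroblock row when frame_mbs_only_flag is 1).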
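To get from these two fields to a size in pixels, the macroblock counts are multiplied by 16. The sketch below assumes an already decoded SPS passed in as a plain object (`sps` and `pictureSizeFromSPS` are hypothetical names) and ignores the frame cropping fields, so the result is the coded size, which can be slightly larger than the displayed size.

```javascript
// Derive the coded picture size in pixels from decoded SPS fields.
function pictureSizeFromSPS(sps) {
  const widthInMbs = sps.pic_width_in_mbs_minus1 + 1;
  const heightInMapUnits = sps.pic_height_in_map_units_minus1 + 1;
  // When frame_mbs_only_flag is 0, one map unit covers two macroblock rows.
  const heightInMbs = (2 - sps.frame_mbs_only_flag) * heightInMapUnits;
  return { width: widthInMbs * 16, height: heightInMbs * 16 };
}
```

For a 640x480 Baseline stream the fields would be pic_width_in_mbs_minus1 = 39 and pic_height_in_map_units_minus1 = 29 with frame_mbs_only_flag = 1.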
-
PPS
(Picture parameter set)
- nal_unit_type is 8.
- It can be omitted. When it is omitted, the most recent (latest) PPS seen so far is reused.
- Holds parameters (metadata) that apply to whole pictures.
- pic_parameter_set_id, the unique ID of this PPS. Slices refer to the PPS by this ID, so it is important.
- seq_parameter_set_id, the ID of the SPS that this PPS is associated with.
- entropy_coding_mode_flag, selects CAVLC (0) or CABAC (1). Baseline Profile always uses CAVLC. Main Profile and above can use CABAC.
-
SEI
(Supplemental enhancement information)
- nal_unit_type is 6.
- SEI is a NALUnit that is not required to decode the pictures but provides information useful for display and buffer management.
- It stores things such as the Recovery point, which is used to resume display after a seek.
- One SEI NALUnit may contain several SEI Messages.
- User data can also be stored in SEI.
- For example, the encoder settings used by ffmpeg can be found in the User Data.
-
IDR
picture (Instantaneous Decoding Refresh picture), Multiple Reference Frames
- A picture that can be decoded entirely on its own. It is a kind of I frame.
- The first frame is always an I frame. Sending I frames periodically makes it possible to recover from transmission errors.
- Because it is comparatively large, it may be split across several packets.
- An IDR is a kind of I frame (or SI frame). In addition to the behaviour of a normal I frame, it prohibits the frames that follow it from referring to anything before the IDR frame.
- The first picture of a video sequence is always an IDR picture. A stream that does not start with one cannot be played as video.
- It is what is usually called a keyframe. Players seek in units of IDR.
- In MPEG-4 ASP, the predecessor of H.264, only the immediately preceding frame could be referenced. H.264 removed this limit, so a frame can refer to several frames and to distant frames, including frames before the preceding I or P frame. This is a nuisance for decoders, and a new kind of I frame that forbids unrestricted references to earlier frames, the IDR, was needed to solve it.
- An IDR picture does not need to refer to other pictures or slices, so POC and the frame number are reset to 0.
- Normally IDR pictures are inserted every 0.5 to 2 seconds to produce a movie that can be sought to the intended position.
-
seq_parameter_set_id
- The ID of the SPS referenced from a PPS. The value is 0 to 31.
-
pic_parameter_set_id
- The ID of the PPS referenced from a slice. The value is 0 to 255.
-
log2_max_frame_num_minus4
- The parameter used to derive MaxFrameNum.
- MaxFrameNum can be computed as MaxFrameNum = Math.pow(2, log2_max_frame_num_minus4 + 4).
-
MMCO5
(memory_management_control_operation equal to 5)
- When memory_management_control_operation is 5, frame_num is handled slightly differently.
Any coded slice NAL unit or coded slice data partition A NAL unit of the primary coded picture of the current access unit shall be different from any coded slice NAL unit or coded slice data partition A NAL unit of the primary coded picture of the previous access unit in one or more of the following ways:
- frame_num differs in value. The value of frame_num used to test this condition is the value of frame_num that appears in the syntax of the slice header, regardless of whether that value is inferred to have been equal to 0 for subsequent use in the decoding process due to the presence of memory_management_control_operation equal to 5.
NOTE 1 - A consequence of the above statement is that a primary coded picture having frame_num equal to 1 cannot contain a memory_management_control_operation equal to 5 unless some other condition listed below is fulfilled for the next primary coded picture that follows after it (if any).
-
slice_header
- TODO: slice_header has to be read in two stages.
- Stage 1:
- Stage 2: refer to the SPS ID and PPS ID obtained in stage 1 and continue decoding.
-
slice_id
- The ID assigned to each slice. It is needed when nal_unit_type is 2 to 4.
- The range of the value is subject to certain restrictions.
if (separate_colour_plane_flag === 0) { ... }
if (separate_colour_plane_flag === 1) { ... }
if (MbaffFrameFlag === 0) { ... }
if (MbaffFrameFlag === 1) { ... }
-
Baseline profile
- Only I and P slices can be used.
- nal_unit_type values 2 to 4 (Coded slice data partition A/B/C) do not appear in NALUnits.
- SPS.frame_mbs_only_flag is always 1.
  - frame_mbs_only_flag equal to 1 specifies that every coded picture of the coded video sequence is a coded frame containing only frame macroblocks.
- The following SPS parameters are restricted:
  - chroma_format_idc is not used
  - bit_depth_luma_minus8 is not used
  - bit_depth_chroma_minus8 is not used
  - qpprime_y_zero_transform_bypass_flag is not used
  - seq_scaling_matrix_present_flag is not used
- The following PPS parameters are restricted:
  - weighted_pred_flag is not used
  - weighted_bipred_idc is not used
  - entropy_coding_mode_flag is not used
  - num_slice_groups_minus1 takes a value from 0 to 7
  - transform_8x8_mode_flag is not used
  - pic_scaling_matrix_present_flag is not used
  - second_chroma_qp_index_offset is not used
- Some CAVLC parameters are restricted (details omitted).
- Some Macroblock layer parameters are restricted (details omitted).
- Set profile_idc to 66, constraint_set0_flag to 1 and constraint_set1_flag to 1.
  - In other words, set profile_idc to 0x42(66) and the next byte to 0xE0(224). Note that 0xE0 (0b11100000) also sets constraint_set2_flag to 1.
-
Level
- The bit rate is limited according to the Level.
- For Level 3.0, the maximum frame size (MaxFS) is 1620 macroblocks and the maximum bit rate (MaxBR) is 10000 (in units of 1000 bit/s for Baseline).
-
primary_pic_type
- A value that indicates what kinds of slices an AccessUnit contains. Looking at it gives a rough idea of how the AccessUnit should be processed.
- However, data generated by ffmpeg always has primary_pic_type = 7, so it seems difficult (or impossible) to use this value as an indicator.

| primary_pic_type | slice_type values that may be present in the primary coded picture | Name of slice_type |
|---|---|---|
| 0 | 2, 7 | I |
| 1 | 0, 2, 5, 7 | P, I |
| 2 | 0, 1, 2, 5, 6, 7 | P, B, I |
| 3 | 4, 9 | SI |
| 4 | 3, 4, 8, 9 | SP, SI |
| 5 | 2, 4, 7, 9 | I, SI |
| 6 | 0, 2, 3, 4, 5, 7, 8, 9 | P, I, SP, SI |
| 7 | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 | P, B, I, SP, SI |
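The AUD payload is small enough to decode by hand: primary_pic_type is the top 3 bits of the byte after the NALUnitHeader, followed by the stop bit and zero padding. The sketch below uses hypothetical names (`parseAUD`, `PRIMARY_PIC_TYPE_NAMES`) and expects a NALUnit without its StartCode. It shows why the common byte pair 09 F0 means primary_pic_type = 7.

```javascript
// Slice type names allowed for each primary_pic_type value (0 to 7).
const PRIMARY_PIC_TYPE_NAMES = [
  "I", "P, I", "P, B, I", "SI", "SP, SI", "I, SI", "P, I, SP, SI", "P, B, I, SP, SI",
];

// Decode an AUD NALUnit: header byte, then primary_pic_type in the top 3 bits.
function parseAUD(nal) {
  if ((nal[0] & 0x1f) !== 9) throw new Error("not an AUD (nal_unit_type must be 9)");
  const primary_pic_type = (nal[1] >> 5) & 0x07;
  return { primary_pic_type, sliceTypes: PRIMARY_PIC_TYPE_NAMES[primary_pic_type] };
}
```

Here 0xF0 is 0b11110000, that is 111 (primary_pic_type 7), then 1 (stop bit), then 0000 (padding).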
-
slice_type
- In AVC one picture contains multiple slices, and slices of different types can be mixed.
- If slice_type is 5 or greater, all slices in the picture are of the same type.
- If slice_type = 7, all slices in the picture are I slices.

| slice_type | Name of slice_type |
|---|---|
| 0 | P (P slice) |
| 1 | B (B slice) |
| 2 | I (I slice) |
| 3 | SP (SP slice) |
| 4 | SI (SI slice) |
| 5 | P (P slice) |
| 6 | B (B slice) |
| 7 | I (I slice) |
| 8 | SP (SP slice) |
| 9 | SI (SI slice) |
-
CAVLC
(Context-Adaptive Variable Length Coding)
- Context-adaptive variable length coding. Its compression ratio is not particularly high.
- Used in Baseline profile.
-
CABAC
(Context-Adaptive Binary Arithmetic Coding), UVLC
- Context-adaptive binary arithmetic coding. It takes more time than CAVLC, but the compression ratio is about 10 to 15% higher.
- Available in Main profile and above.
-
Exp-Golomb
(Exponential-Golomb coding)
- One way of coding integers.
- 0 is coded as 1, and 1 is coded as 010.
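An unsigned Exp-Golomb value is read by counting leading zero bits, then reading the same number of bits after the terminating 1. The bit reader and `readUE` below are a minimal sketch with made-up names, not the H264.js implementation.

```javascript
// Minimal MSB-first bit reader over a byte array.
function createBitReader(bytes) {
  let pos = 0; // current bit position from the start of the data
  return {
    readBit() {
      if (pos >= bytes.length * 8) throw new RangeError("read past end of data");
      const bit = (bytes[pos >> 3] >> (7 - (pos & 7))) & 1;
      pos++;
      return bit;
    },
  };
}

// Decode one unsigned Exp-Golomb code: 1 -> 0, 010 -> 1, 011 -> 2, 00100 -> 3, ...
function readUE(reader) {
  let leadingZeros = 0;
  while (reader.readBit() === 0) leadingZeros++;
  let suffix = 0;
  for (let i = 0; i < leadingZeros; i++) {
    suffix = suffix * 2 + reader.readBit();
  }
  return Math.pow(2, leadingZeros) - 1 + suffix;
}
```

The bytes A6 40 hold the bit string 1 010 011 00100 and therefore decode to 0, 1, 2 and 3 on four successive `readUE` calls.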
-
Six-layer structure of MPEG compressed data
- Data compressed with MPEG is divided into six layers.
  - Sequence layer: handles packets in which a GOP and an SH form a pair.
  - GOP layer: handles the I/B/P pictures contained in a GOP.
  - Picture layer: handles one image in units called slices, cut from the top in strips of about 16 lines.
  - Slice layer: handles a slice in units called macroblocks, cut horizontally at a width of 16 lines.
  - Macroblock layer: handles the blocks obtained by splitting a macroblock into four, separately for each of Y, U and V.
  - Block layer (the DCT processing unit): handles a block.
-
Sequence
- A group of about 15 consecutive Pictures is called a Sequence.
- A Sequence is a data structure made of | GOP + SH | GOP + SH ...
- A GOP consists of about 15 I/B/P Pictures.
- The basic MPEG strategy is to extract the differences between images per GOP and build I, B and P pictures from them to gain compression.
- SH stands for Sequence Header. Playback can start at an SH boundary.
- I, B and P pictures are in turn made of slices, which are horizontal strips of the image.
-
Picture (Frame)
- One image (picture, frame) consists of multiple slices.
- There are several kinds, such as I Picture, B Picture and P Picture.
- In MPEG-2/MPEG-4 the Picture is the basic unit, whereas in AVC the slice is the basic unit.
- In MPEG-2/MPEG-4 one Picture corresponds to one Frame and the coding mode is chosen per Picture. In AVC the coding mode is chosen per slice.
- In AVC, slices of different types can be mixed within one Picture.
- To indicate that one Picture contains only one type of slice, set slice_type to 5 to 9 or state it with primary_pic_type in the AUD.
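-
Picture boundary (start of a Picture)
- When AVC data travels through MPEG-2 System (MPEG-2 TS or MPEG-2 PS), an AUD is always attached, so there is little need to care about whether a NALUnit starts a new picture.
- When handling a byte stream without AUDs, it is necessary to check whether a slice starts a new Picture.
- A slice starts a new Picture if, compared with the previous slice, any of the following holds:
  - frame_num differs
  - field_pic_flag differs
  - frame_num is the same but the POC (Picture Order Count) value differs
  - nal_ref_idc differs (with one of the two values being 0)
  - both are IDR pictures and idr_pic_id differs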
-
Slice
frame_num
- Found in the slice header. It tells which frame the slice belongs to.
- I Picture (Intra Picture)
  - A picture that can be reconstructed without referring to any other picture. It is the most basic kind of picture.
- P Picture (Predictive Picture)
  - A picture reconstructed by referring to a preceding picture.
- B Picture (Bi-directional predictive Picture)
  - In H.264, a picture reconstructed by referring to two pictures, which may lie in the past or in the future.
  - Not available in Baseline profile.
- SI slice
  - A special I slice used for switching between streams. ffmpeg does not generate it.
- SP slice
  - A special P slice used for switching between streams. ffmpeg does not generate it.
-
YUV Colour Formats
- Formats such as yuv4:2:0 can be specified.
- With the x264 codec available in ffmpeg, only yuv4:2:0 can be specified.
-
MP4 file format version 2. ".mp4"
- The MP4 file format is specified in ISO/IEC 14496-14:2003. That document revises ISO/IEC 14496-1:2001, published in 2001, and the community calls it MPEG-4 file format version 2 or MP4v2.
- A file with the ".264" extension is a raw H.264 video stream that is not wrapped in an MP4 container (the video data that an MP4 file would keep in mdat).
-
ISOBMFF
(ISO Base Media File Format)
- The box structure used by MP4.
- Defined in ISO/IEC 14496-12.
-
VLC
(Variable Length Coding)
- Variable length coding based on Huffman codes.
-
IOD
(Initial Object Descriptor)
- Defined in 14496-14. TODO: describe the details
-
OD
(Object Descriptors)
- Defined in 14496-14, section 3.1.3.
-
PCR
(Program clock reference), OPCR (Original Program clock reference)
- Timing information in MPEG-2 TS.
- If the PCR of MPEG-2 TS could be carried over as the PTS for AVC, the PTS calculation could be skipped (TODO: needs checking -> it turned out not to be possible).
- PCR manages its clock at 27MHz (33bit * 300 + 9bit).
-
PCR Wrap-around
- The 33-bit part has a resolution of 90kHz. Adding the 9-bit value (0 to 299) gives a precision of 90kHz * 300 = 27MHz.
- When the 33-bit part wraps around to 0, it is called PCR Wrap-around. It happens after 0x1FFFFFFFF = 8589934591 (26:30:43.717).
- If PCR Wrap-around is not handled, audio and video lose synchronization and the picture is disturbed.
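The arithmetic behind these numbers can be checked with a short sketch. `pcrToSeconds` and `formatHMS` are hypothetical helpers. The 33-bit base and 9-bit extension are taken as plain numbers, which is safe because base * 300 stays well below 2^53.

```javascript
const PCR_CLOCK_HZ = 27000000; // 90kHz * 300

// Combine the 33-bit base (90kHz) and the 9-bit extension (0 to 299) into seconds.
function pcrToSeconds(base, extension) {
  return (base * 300 + extension) / PCR_CLOCK_HZ;
}

// Format seconds as hh:mm:ss.mmm, truncating to whole milliseconds.
function formatHMS(seconds) {
  const totalMs = Math.floor(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, "0");
  const h = Math.floor(totalMs / 3600000);
  const m = Math.floor((totalMs % 3600000) / 60000);
  const s = Math.floor((totalMs % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(totalMs % 1000, 3)}`;
}
```

Passing a base of 2^33 (one past the largest 33-bit value) to `pcrToSeconds` and formatting the result reproduces the 26:30:43.717 wrap-around period quoted above.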
-
DTS
(decoding time stamp), PTS (presentation time stamp)
- In MPEG-4, the PTS specifies the time at which I, P and B pictures are displayed, and the DTS specifies the decoding order.
- Reportedly there are situations in which the PTS has to be derived from the DTS and surrounding information.
-
PTS
(Presentation Time Stamp)
- Time stamp information (33 bits) that specifies when content should be output. It is needed to synchronize video and audio.
- The decoder outputs the decoded result as video or audio when the STC reaches the time given by the PTS.
-
STC
(System Time Clock)
- The system reference time. It is an internal clock.
- It runs at 90 kHz.
- All MPEG-1 clocks are in units of 90kHz. STC, SCR and PTS are timing information based on the 90kHz unit.
-
SCR
(System Clock Reference)
- System clock information (33 bits) in the stream.
-
LipSync
- In MPEG-1, when audio and video drift apart, the drift is corrected using the audio as the reference.
- This is because humans are more sensitive to glitches in sound.
-
Mux
(Multiplexing, Multiplexer)
- Multiplexing an Audio Elementary Stream and a Video Elementary Stream is called Multiplexing (Mux).
-
Demux
(Demultiplexing, Demultiplexer)
- Restoring a muxed stream to its original streams is called Demultiplexing (Demux).
- Technical information
- Related tools
  - MP4Box
  - ffmpeg
-
ESDescriptor
(Elementary Stream Descriptor)
- Defined in 14496-14, section 3.1.2.
- The bit streams of the Audio Object Descriptor and the Video Object Descriptor are Elementary Streams.
- There is an ESDescriptor for each object that makes up a scene.
-
BIFS
(BInary Format for Scene description)
- A scene description language that extends VRML (Virtual Reality Modeling Language).
- It exists in the specification, but there is no intention at all to implement it.
-
POC
(Picture Order Count)
-
MMCO5
(memory_management_control_operation equal to 5)
-
MP4Boxes
- List of MP4 box types: http://www.mp4ra.org/atoms.html
- "ctts" atom, which has to be written when muxing b-frames into MP4
  - The ctts box is needed when B frames are written. In other words, if ctts is present, B frames exist.
- "vol", which is placed on every keyframe in AVI, but has to be separated from the movie data in MP4
  - The vol property is needed for every keyframe. TODO: ??
-
ES
(Elementary Stream)
- Raw audio data and raw video data are called ES.
-
PS
(Parameter Set)
- The sequence parameter set and the picture parameter set are collectively called Parameter Sets.
-
Parameter Set Elementary Stream
- elementary stream containing samples made up of only sequence and picture parameter set NAL units synchronized with the video elementary stream.
- This type of elementary stream contains only SPS and PPS NALUnits synchronized with the video stream. It contains no other kind of data.
-
VideoES
(VIDEO Elementary Stream)
- elementary stream containing access units made up of NAL units for coded picture data
- This type of elementary stream contains coded picture data. It contains no other kind of data.
-
- The CRC32 of MPEG-2 is called CRC32/MPEG2. It is computed differently from the ordinary CRC32: the bits are not reflected and there is no final XOR.
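For reference, CRC32/MPEG2 uses the polynomial 0x04C11DB7 with an initial value of 0xFFFFFFFF, processes each byte MSB first, and applies no final XOR. The bitwise version below is a minimal sketch. `crc32mpeg2` is a name chosen here, and a table-driven variant would be faster for large inputs.

```javascript
// CRC32/MPEG2: poly 0x04C11DB7, init 0xFFFFFFFF, no reflection, no final XOR.
function crc32mpeg2(bytes) {
  let crc = 0xffffffff;
  for (const byte of bytes) {
    crc ^= byte << 24; // feed the next byte into the top of the register
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x80000000) ? (crc << 1) ^ 0x04c11db7 : crc << 1;
    }
  }
  return crc >>> 0; // convert to an unsigned 32-bit value
}
```

A useful property when checking PSI sections such as PAT and PMT: running the function over a section including its trailing 4-byte CRC gives 0 if the section is intact.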