[Bug]: Transcoding HDR video failed with GPU HANG #1821

Open
Goodwu opened this issue Jun 22, 2024 · 6 comments

Goodwu commented Jun 22, 2024

Which component impacted?

Decode

Is it regression? Good in old configuration?

None

What happened?

  1. Hardware is J3455
  2. Immich docker in OMV8 in PVE container
  3. Transcoding HDR video with HW decoding and encoding on
  4. Transcoding failed with GPU HANG

What's the usage scenario when you are seeing the problem?

Others

What impacted?

Transcoding failed

Debug Information

/usr/lib/jellyfin-ffmpeg/vainfo

Trying display: drm
libva info: VA-API version 1.21.0
libva info: Trying to open /usr/lib/jellyfin-ffmpeg/lib/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_21
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.21 (libva 2.21.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 24.2.1 (0593864)
vainfo: Supported profile and entrypoints
      VAProfileNone                   :	VAEntrypointVideoProc
      VAProfileNone                   :	VAEntrypointStats
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointEncSlice
      VAProfileH264Main               :	VAEntrypointFEI
      VAProfileH264Main               :	VAEntrypointEncSliceLP
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointEncSlice
      VAProfileH264High               :	VAEntrypointFEI
      VAProfileH264High               :	VAEntrypointEncSliceLP
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileJPEGBaseline           :	VAEntrypointVLD
      VAProfileJPEGBaseline           :	VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline:	VAEntrypointFEI
      VAProfileH264ConstrainedBaseline:	VAEntrypointEncSliceLP
      VAProfileVP8Version0_3          :	VAEntrypointVLD
      VAProfileVP8Version0_3          :	VAEntrypointEncSlice
      VAProfileHEVCMain               :	VAEntrypointVLD
      VAProfileHEVCMain               :	VAEntrypointEncSlice
      VAProfileHEVCMain               :	VAEntrypointFEI
      VAProfileHEVCMain10             :	VAEntrypointVLD
      VAProfileVP9Profile0            :	VAEntrypointVLD
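The capability list above can be checked mechanically. The following is a minimal sketch, not part of the original report, that parses a `vainfo` excerpt and asks whether 10-bit HEVC decode is advertised. The helper name `parse_vainfo` and the variable names are made up for illustration; the excerpt is copied from the output above.

```python
from collections import defaultdict

# Excerpt of the vainfo output above (HEVC and VP9 rows only).
VAINFO_EXCERPT = """\
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointFEI
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
"""


def parse_vainfo(text):
    """Map each VAProfile name to the set of entrypoints listed for it."""
    caps = defaultdict(set)
    for line in text.splitlines():
        profile, sep, entrypoint = line.partition(":")
        profile, entrypoint = profile.strip(), entrypoint.strip()
        # Skip header lines such as "libva info: ..." that are not capability rows.
        if sep and profile.startswith("VAProfile") and entrypoint.startswith("VAEntrypoint"):
            caps[profile].add(entrypoint)
    return dict(caps)


caps = parse_vainfo(VAINFO_EXCERPT)
main10 = caps.get("VAProfileHEVCMain10", set())
can_decode_main10 = "VAEntrypointVLD" in main10
can_encode_main10 = any(e.startswith("VAEntrypointEnc") for e in main10)
print("HEVC Main10 decode:", can_decode_main10)
print("HEVC Main10 encode:", can_encode_main10)
```

On this excerpt the driver lists a decode entrypoint (`VAEntrypointVLD`) for `VAProfileHEVCMain10` and no encode entrypoint, which matches the pipeline in this report: HEVC is decoded in hardware and the output is encoded as H.264.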

error log from ffmpeg:

[Nest] 7  - 06/20/2024, 2:05:59 PM     LOG [Microservices:MediaService] Started encoding video 7de4b8c4-6638-4d00-bf1d-7b731e4e9bb8 {"inputOptions":["-hwaccel qsv","-hwaccel_output_format qsv","-async_depth 4","-threads 1"],"outputOptions":["-c:v h264_qsv","-c:a copy","-movflags faststart","-fps_mode passthrough","-map 0:0","-map 0:1","-bf 7","-refs 5","-g 256","-v verbose","-vf scale_qsv=720:-1:async_depth=4:mode=hq,hwmap=derive_device=opencl,tonemap_opencl=desat=0:format=nv12:matrix=bt709:primaries=bt709:range=pc:tonemap=hable:transfer=bt709,hwmap=derive_device=qsv:reverse=1,format=qsv","-preset 7","-global_quality:v 23","-maxrate 4000k","-bufsize 8000k"],"twoPass":false}
[Nest] 7  - 06/20/2024, 2:06:06 PM   ERROR [Microservices:MediaRepository] [AVHWDeviceContext @ 0x49a52032040] Initialised VAAPI connection: version 1.21
[AVHWDeviceContext @ 0x49a52032040] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 24.2.1 (0593864).
[AVHWDeviceContext @ 0x49a52032040] Driver not found in known nonstandard list, using standard behaviour.
[AVHWDeviceContext @ 0x49a52030180] Use Intel(R) oneVPL to create MFX session, API version is 2.10, the required implementation version is 1.3
libva info: VA-API version 1.21.0
libva info: Trying to open /usr/lib/jellyfin-ffmpeg/lib/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_21
libva info: va_openDriver() returns 0
[AVHWDeviceContext @ 0x49a52030180] Initialize MFX session: implementation version is 1.35
Stream mapping:
  Stream #0:0 -> #0:0 (hevc (hevc_qsv) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
[AVHWDeviceContext @ 0x49a52033a80] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 24.2.1 (0593864).
[AVHWDeviceContext @ 0x49a52033a80] Driver not found in known nonstandard list, using standard behaviour.
[hevc_qsv @ 0x49a521e1d80] Decoder: output is video memory surface
[hevc_qsv @ 0x49a521e1d80] Use Intel(R) oneVPL to create MFX session with the specified MFX loader
[AVHWDeviceContext @ 0x49a52032dc0] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 24.2.1 (0593864).
[AVHWDeviceContext @ 0x49a52032dc0] Driver not found in known nonstandard list, using standard behaviour.
[hevc_qsv @ 0x49a521e1d80] Decoder: output is video memory surface
[hevc_qsv @ 0x49a521e1d80] Use Intel(R) oneVPL to create MFX session with the specified MFX loader
[graph 0 input from stream 0:0 @ 0x49a52174140] w:1920 h:1080 pixfmt:qsv tb:1/600 fr:30000/1001 sar:0/1
[AVHWDeviceContext @ 0x49a52031680] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 24.2.1 (0593864).
[AVHWDeviceContext @ 0x49a52031680] Driver not found in known nonstandard list, using standard behaviour.
[Parsed_scale_qsv_0 @ 0x49a52174080] Use Intel(R) oneVPL to create MFX session with the specified MFX loader
[Parsed_scale_qsv_0 @ 0x49a52174080] VPP: input is video memory surface
[Parsed_scale_qsv_0 @ 0x49a52174080] VPP: output is video memory surface
[AVHWDeviceContext @ 0x49a52034200] 0.0: Intel(R) OpenCL Graphics / Intel(R) HD Graphics 500
[AVHWDeviceContext @ 0x49a52034200] Intel QSV to OpenCL mapping function found (clCreateFromVA_APIMediaSurfaceINTEL).
[AVHWDeviceContext @ 0x49a52034200] Intel QSV in OpenCL acquire function found (clEnqueueAcquireVA_APIMediaSurfacesINTEL).
[AVHWDeviceContext @ 0x49a52034200] Intel QSV in OpenCL release function found (clEnqueueReleaseVA_APIMediaSurfacesINTEL).
[AVHWDeviceContext @ 0x49a52034f80] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 24.2.1 (0593864).
[AVHWDeviceContext @ 0x49a52034f80] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x49a521e1180] Using input frames context (format qsv) with h264_qsv encoder.
[h264_qsv @ 0x49a521e1180] Encoder: input is video memory surface
[h264_qsv @ 0x49a521e1180] Use Intel(R) oneVPL to create MFX session with the specified MFX loader
[h264_qsv @ 0x49a521e1180] Using the constant quality with VBR algorithm (QVBR) ratecontrol method
[h264_qsv @ 0x49a521e1180] profile: avc high; level: 30
[h264_qsv @ 0x49a521e1180] GopPicSize: 256; GopRefDist: 8; GopOptFlag: closed; IdrInterval: 0
[h264_qsv @ 0x49a521e1180] TargetUsage: 7; RateControlMethod: QVBR
[h264_qsv @ 0x49a521e1180] NumSlice: 1; NumRefFrame: 5
[h264_qsv @ 0x49a521e1180] RateDistortionOpt: OFF
[h264_qsv @ 0x49a521e1180] RecoveryPointSEI: OFF
[h264_qsv @ 0x49a521e1180] VDENC: OFF
[h264_qsv @ 0x49a521e1180] Entropy coding: CABAC; MaxDecFrameBuffering: 5
[h264_qsv @ 0x49a521e1180] NalHrdConformance: ON; SingleSeiNalUnit: ON; VuiVclHrdParameters: OFF VuiNalHrdParameters: ON
[h264_qsv @ 0x49a521e1180] FrameRateExtD: 1001; FrameRateExtN: 30000
[h264_qsv @ 0x49a521e1180] IntRefType: 0; IntRefCycleSize: 0; IntRefQPDelta: 0
[h264_qsv @ 0x49a521e1180] MaxFrameSize: 224640; MaxSliceSize: 0
[h264_qsv @ 0x49a521e1180] BitrateLimit: ON; MBBRC: OFF; ExtBRC: OFF
[h264_qsv @ 0x49a521e1180] Trellis: auto
[h264_qsv @ 0x49a521e1180] RepeatPPS: OFF; NumMbPerSlice: 0; LookAheadDS: 2x
[h264_qsv @ 0x49a521e1180] AdaptiveI: OFF; AdaptiveB: OFF; BRefType:pyramid
[h264_qsv @ 0x49a521e1180] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0; MinQPB: 0; MaxQPB: 0
[h264_qsv @ 0x49a521e1180] DisableDeblockingIdc: 0
[h264_qsv @ 0x49a521e1180] SkipFrame: no_skip
[h264_qsv @ 0x49a521e1180] QVBRQuality: 23
[h264_qsv @ 0x49a521e1180] PRefType: default
[h264_qsv @ 0x49a521e1180] TransformSkip: unknown
[h264_qsv @ 0x49a521e1180] IntRefCycleDist: 0
[h264_qsv @ 0x49a521e1180] LowDelayBRC: OFF
[h264_qsv @ 0x49a521e1180] MaxFrameSizeI: 0; MaxFrameSizeP: 0
[h264_qsv @ 0x49a521e1180] ScenarioInfo: 0
Output #0, mp4, to 'upload/encoded-video/0ef79aea-c9d6-4f20-b7d2-bc3a01408cb6/7d/e4/7de4b8c4-6638-4d00-bf1d-7b731e4e9bb8.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 1
    compatible_brands: isommp41mp42
    copyright-eng   :
    copyright       :
    encoder         : Lavf60.3.100
  Stream #0:0(und): Video: h264, 1 reference frame (avc1 / 0x31637661), qsv(pc, bt709, progressive), 720x405 (0x0), q=2-31, 1000 kb/s, 29.97 fps, 30k tbn (default)
    Metadata:
      creation_time   : 2022-07-08T04:16:39.000000Z
      handler_name    : Core Media Video
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.3.100 h264_qsv
    Side data:
      cpb: bitrate max/min/avg: 4000000/0/1000000 buffer size: 8000000 vbv_delay: N/A
      DOVI configuration record: version: 1.0, profile: 8, level: 4, rpu flag: 1, el flag: 0, bl flag: 1, compatibility id: 4
      displaymatrix: rotation of -0.00 degrees
  Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 175 kb/s (default)
    Metadata:
      creation_time   : 2022-07-08T04:16:39.000000Z
      handler_name    : Core Media Audio
      vendor_id       : [0][0][0][0]
frame=    0 fps=0.0 q=0.0 size=       0kB time=00:00:01.22 bitrate=   0.3kbits/s speed=0.381x
frame=    0 fps=0.0 q=0.0 size=       0kB time=00:00:01.25 bitrate=   0.3kbits/s speed=0.236x
[Parsed_tonemap_opencl_2 @ 0x49a52174380] Failed to finish command queue: -5.
Error while filtering: Input/output error
Failed to inject frame into filter network: Input/output error
Error while processing the decoded data for stream #0:0
[out#0/mp4 @ 0x49a52171440] All streams finished
[out#0/mp4 @ 0x49a52171440] Terminating muxer thread
[AVIOContext @ 0x49a521d0540] Statistics: 52 bytes written, 0 seeks, 1 writeouts
Terminating demuxer thread 0
[AVIOContext @ 0x49a52190180] Statistics: 1472270 bytes read, 35 seeks
Conversion failed!

[Nest] 7  - 06/20/2024, 2:06:06 PM   ERROR [Microservices:MediaService] Error: ffmpeg exited with code 1: Conversion failed!

[Nest] 7  - 06/20/2024, 2:06:06 PM   ERROR [Microservices:MediaService] Error occurred during transcoding. Retrying with QSV acceleration disabled.
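The failing log line is tagged `Parsed_tonemap_opencl_2`. ffmpeg names filter instances `Parsed_<filter>_<index>`, so the tag can be mapped back to a stage of the `-vf` chain. The sketch below does that for the chain from the log above. The function names are hypothetical, and the splitting assumes a simple linear chain with no escaped commas, which holds here.

```python
# The -vf value from the "Started encoding video" log line above.
VF = (
    "scale_qsv=720:-1:async_depth=4:mode=hq,"
    "hwmap=derive_device=opencl,"
    "tonemap_opencl=desat=0:format=nv12:matrix=bt709:primaries=bt709:"
    "range=pc:tonemap=hable:transfer=bt709,"
    "hwmap=derive_device=qsv:reverse=1,"
    "format=qsv"
)


def split_chain(vf):
    """Split a linear -vf chain into (index, filter_name, args) tuples."""
    stages = []
    for index, stage in enumerate(vf.split(",")):
        name, _, args = stage.partition("=")
        stages.append((index, name, args))
    return stages


def stage_for_log_tag(vf, tag):
    """Resolve a log tag such as 'Parsed_tonemap_opencl_2' to its chain stage."""
    prefix, _, index_text = tag.rpartition("_")
    stage = split_chain(vf)[int(index_text)]
    if prefix != "Parsed_" + stage[1]:
        raise ValueError(f"tag {tag!r} does not match stage {stage[1]!r}")
    return stage


for index, name, args in split_chain(VF):
    print(index, name)
print("failing stage:", stage_for_log_tag(VF, "Parsed_tonemap_opencl_2")[1])
```

The failing stage is the third filter in the chain (index 2), the OpenCL tone-mapping step that sits between the two `hwmap` stages.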

/sys/class/drm/card1/error:

root@omv-new:/srv/dev-disk-by-uuid-2bf642cf-adb7-4ec8-9291-13bc77208fb0/immich-app/library# cat /sys/class/drm/card1/error
GPU HANG: ecode 9:1:00280000, in ffmpeg [3608236]
Kernel: 6.8.4-3-pve x86_64
Driver: 20230929
Time: 1718892364 s 788081 us
Boottime: 460358 s 901832 us
Uptime: 460338 s 421457 us
Capture: 4755025978 jiffies; 152542026 ms ago
Active process (on ring rcs0): ffmpeg [3608236]
Reset count: 0
Suspend count: 0
Platform: BROXTON
Subplatform: 0x0
PCI ID: 0x5a85
PCI Revision: 0x0b
PCI Subsystem: 8086:2067
IOMMU enabled?: 1
DMC initialized: yes
DMC loaded: yes
DMC fw version: 1.7
RPM wakelock: yes
PM suspended: no
IER: 0x08080000
DERRMR: 0x2077efef
GT awake: yes
CS timestamp frequency: 0 Hz, 0 ns
EIR: 0x00000000
PGTBL_ER: 0x00000000
GTIER[0]: 0x09090909
GTIER[1]: 0x09090909
GTIER[2]: 0x00000000
GTIER[3]: 0x00000909
  fence[0] = 00000000
  fence[1] = 00000000
  fence[2] = 00000000
  fence[3] = 00000000
  fence[4] = 00000000
  fence[5] = 00000000
  fence[6] = 00000000
  fence[7] = 00000000
  fence[8] = 00000000
  fence[9] = 00000000
  fence[10] = 00000000
  fence[11] = 00000000
  fence[12] = 00000000
  fence[13] = 00000000
  fence[14] = 00000000
  fence[15] = 00000000
  fence[16] = 00000000
  fence[17] = 00000000
  fence[18] = 00000000
  fence[19] = 00000000
  fence[20] = 00000000
  fence[21] = 00000000
  fence[22] = 00000000
  fence[23] = 00000000
  fence[24] = 00000000
  fence[25] = 00000000
  fence[26] = 00000000
  fence[27] = 00000000
  fence[28] = 00000000
  fence[29] = 00000000
  fence[30] = 00000000
  fence[31] = 00000000
FORCEWAKE: 0xffff0001
ERROR: 0x00000000
DONE_REG: 0x038ff2ff
FAULT_TLB_DATA: 0x00000006 0x12003528
GTT_CACHE_EN: 0xf0007fff
rcs0 command stream:
  CCID:  0x00000000
  START: 0x00f7f000
  HEAD:  0x000006d0 [0x00000678]
  TAIL:  0x00000728 [0x000006d8, 0x00000730]
  CTL:   0x00003001
  MODE:  0x00000000
  HWS:   0x00002000
  ACTHD: 0x000074af 3c038aec
  IPEIR: 0x00000000
  IPEHR: 0xffffffff
  ESR:   0x00000001
  INSTDONE: 0xffd7ffff
  SC_INSTDONE: 0xffffffff
  SAMPLER_INSTDONE[0][1]: 0xffffffff
  SAMPLER_INSTDONE[0][2]: 0xffffffff
  ROW_INSTDONE[0][1]: 0xffffffff
  ROW_INSTDONE[0][2]: 0xffffffff
  batch: [0x000074af_3c01b000, 0x000074af_3c02b000]
  BBADDR: 0x000074af_3c02f241
  BB_STATE: 0x00000020
  INSTPS: 0x00009001
  INSTPM: 0x00000000
  FADDR: 0x000074af 3c02dbc0
  RC PSMI: 0x00000010
  FAULT_REG: 0x00000000
  GFX_MODE: 0x00008000
  PDP0: 0x000000001df0e000
  PDP1: 0x0000000000000000
  PDP2: 0x0000000000000000
  PDP3: 0x0000000000000000
  ELSP[0]:  pid 3608236, seqno      4e9:00000014, prio 0, head 00000678, tail 00000730
  hung: 1
  engine reset count: 0
  Active context: ffmpeg[3608236] prio 0, guilty 1 active 0, runtime total 12068104ns, avg 943400ns
  context timeline seqno 19
rcs0 --- WA context = 0x00000000 ffffe000
:bjCpF0RsTe^YskXJJ^sQ"KZ!Q&dhS`LqllT!?`?sZm[=Qm@An]?KK(QB)qC@!P_os`:ll:@kq7R)@%s<J%I8GE?4Uo?-eS1;tBs2j<^9u3=TPkWD-/OLkpkEaG>Y.D36HC!!'P>
rcs0 --- HW Status = 0x00000000 00002000
:cL%-H+92kU0Ymi"YPSb/VU0.rgsH-kU,t%$D9<l20]-au!Iau[ScA`j0W`m\!!!#%
rcs0 --- ring = 0x00000000 00f7f000
:R(]#7;%laY8nWnnMQkl=`*bb>0R_g`'/:oT`\EE^UDV!-\1Z]q3XCV$<==_p%E/Qk'Qt9-mQ7B[r"It16l4Vh';_^k:/YtMC41S,GO=dp9[;G-V5CjhhS[9G<-gZIk0qS)*&C=!fr;<H4`=U&Sm'i_m+lg=$tM5p/b)Rg5+QW^qW&]XIF-rM0DsL=Qf%Z(D9\PEe.6jnBg5_?kF[.Ug^o6][iRWN8fbtnHDR*M$p04qE)%FpmQ>&"#Bq9rk:9bgEWgT+h7+YQ5gJdk[fe>ONoRrY-$OA4*_MF%h8f]-Wn-*fJ_f++enn:>#SX=<!?L:2+!5iL>0$QXpJ8K9jkC8N7/s%L/oM^LHdskrr+Kl^HhEqZOj-;lj-n)OYii63\=nCkgfUjle,BAT;8<)EF*MqNs7utJEk?<W!0,c9rsJ#j*''u_RQ7I[Oam@\`K)ACh8gQ6o/gqcS!Mi4lZ8`Ah.>KGDCgJh>IgtBfVuLV*7Q/#Fr]?kMGX>GilkQ[LN3J,*Zj*8\N<3'jGBHWo$K)c`:JPFP8hgKfm`?;+F*<*]8C.nS]-@-Ue]tN(.,F=pE11ETY$(\%6_[digi6R\?K%XB=S"TC<OAT+>!7&54WOaS]->ZVG?1T(0Z^-pE1IMAq0AV*d`ec!5ecoI+/&5'RtW'!5<dFGp.\$il17j5O\e^0n8.YZ[.\Cq>j.Lhk!/A9AEX5.0ohQcQb+@3NU=eQi;P61qNAgKinA*]n."!5Nj+?Xk0dEU*'.5YY)\"s$3"(DS:XiTteB#YXZCsq?="Qle!Rl&,ZO\+Q)ro!]KN"pEFPSAUCTi;\/d,r_25ECial^9&KIG#@8A;5O\m3Rm5k:k49VkDuB]UUHe2pc3-clJ)CFB%Y)?K?V*aF5e.lcqoLr"pD,`sOoKqa"hXdATH<Y#rIV`_H]T=oc9G^O$g2O4J-4j*Z!'?ip'M>fs.BGSq)s<s;np+%@#4jAoEmch0<e-"O8K%q/jK,.!)K*:lkH)jrr>L4:];)=e)p?1"D@a"J6DJjrsEcG\\3U6<57W"!LGaLpfEOSc0Q6"Q2a)a$G!j%_Ld9qhSD5P^ZZ_d?T52m;ig\Rlk-Grrdbm1o6UXVGlN:?_)hUMq?H<CS(K:,YIOY:CZY6sK'VhiO9YN/:LL'shf8t)\&mtL"F&XP+C4O7I_e2ArQ$(.HP`nO"V9C=+=$E'5/Cb'rQ#->57att(I)pBJ1nh,qZSY.pE2!Ur#W[if)<YooCh"WE=)\q!!&h/
rcs0 --- HW context = 0x00000000 fffaf000
:h!LVVop]n2n+"GHrK6=8r1(&Tm"rfZgEXfg:`-#0]L+A<6)_lqe0ke:S%=q#d*GER\#jmopT@^LG!JpG%OWVt1)qEX[$3+pc)s2e+Xh6<[fN%*FOt0p@C5$HRCZS_Z.]U=+%\\2&ck;-f<4M^P4--^*fkc:8"P#iE_,fJ>Fc?)P]<kLT31-d+&8!X/,B&qk,BqMqo*i_HumQ139-#CN]?'Zj,Q+ZK#O_7,"Ah"LkQ2,,-7##eO)t:RSp7UUtV;d5H'rA6*9cMrL-A'UaFNM_#dcYd!W.iORViP-rVkMj)%F_OmgW0M7:7G%b6a+:Fp?$Zo#SAKU=+Fkb)mIO-8kR>`/Kd-dUe+'s]GI[hmWDB8sF(rGkWe<q0chjpSDe>$*OghND_>DPQUODI8sF^_kV8pCp;M?'!d%Z[#Ercn"u4=.b2*F?5%Dd!Yp^K[9f3>6HY)?q(u@(5$ERlo;h?"kX-7ljmO:+0^0IE!(W.VVrj5hp(u-KBWgDs7[_k>$L\O?2Km*50&pFBE<nP:e538_fRql@=`5+YcWHn"u'3i5+HVjQbndkjmS5m"Uj5,JG"q+%iciU)kkb"IY6hkN%i+^#1utlXt?"m`f%<%n?ZH+^'B$2hlqG8pZ'&]iMIP]iUYdfr-4b,Es:>%YPT-?\FIQL?%A;;4-'qlQ[]ccfo5P"]XMr^%lWeYkDSXT:+jGbrL$4(H/bjhXf1<\fW(I;UpNIB-sJS,a_GiEr:7&>\>m.t-0:-)=?_6g<0e=*WT/H,6W0^qHgu@"*$'(>1UXo:[Fs;Q1d@R3/A+qVR<_\GKmI#PJ?b*`!Jfl&(hUWu$ns8b[M?"k@)`R?"N#=%Yhaot5/R!u^c)eVP6a,K@C-\e.,\\U3,ugZ*5Qh"_Y4Mp4pl7Wn&+l#"?\f_n?_)O*XR3")8e+]J=[a*GZhh#hZLS*gAH;g*,r1R^i+apm;dGS,*0e/JB4@DA/u,,a]HM!O!Qfkk2a&j_t)fd0J.kj'sJnk`X>)@_KHo,h1Ftr_Y+5=2\4P)#WMk#ZkIDBi+<m9EPfpJ_05o>\tI<S>L2!`D<;H10(TADX*O"nY;2V>ST=i&J1c,&MFF?"5Dr6B\b(1BSn$q&kd%]6mfJbP90oY)jdg_C?LnHQ/L`=Y=jJHojoo6I>$_KHQ@lhrQ$E"oNq^Z.XChAHk)e^%k"Hb)j=_82iKhrLBAOT#_EVeK,jt[*n%JKT&DH\\<q0)_T\7$S]5\o]jnZlC1/hP,3)o`UoTZ=.?IA#\l:\:F?t.Q"<LqK4GKTV?5>@r8k9?m=K9&3OFm%Y^RVb*aS+(n)HS$.MZl3N5ag\A>o;Uk@`-u@Q`W%V>/K$RQ8bN+#na^_2Xj`2D@fCe^C$V`*Y=jB-2?bs.ir3Y`-lPPK1__FrKb_GgJ?b,dTh;&n/RT&re+X,V+][T)Q2_pNl&7Um\mq[fqUg69c`Sb!m;;Fa8:tW$"V',)paO#fR['BGjP`G)[I`tZg`(QK%PVAtllTXt%,`([^eYG&K3pY,04+q`Hi&HMmLOedSbg;IEn1-"b8]jGq68>sR=4YGYb>g+ps5t!Vgn]WFd23RSlj^Fqb9_\'Pj">nmh],bFb,oEg\@nqiqj[b7,[>?W,qKn9Mc_po@Ve>8ge+@_6+bkJ)M:c?bR**RNC7XB4_od]V;o?*g_bf2-kp&c(lqgCi@edRMgd:q''fIT^35esa2+L[8657\8.pNS"`[A*qT+jsQKWV6')Le1r1B<8;lj%J*^4B*]#J;m=#QL.q7j3qu8n\]#?/hP"rCWT_9J1ZGn2IOTBrk2SOs<9U;cQHo'7"K,P4hf7*sANectm:9c+C13?`.TUEF1$JYlSLI31iSo:?^\t.3aj0q#iVcQ3h#6].pcT_!<BXF(I138GFS_VA'5c@n&`oJA-`q'MBh?^ET92]pP<CK!2mI1T\uff["2enecUdrN);+>[O/7NX8N:Xu[AU0%DK]O%S]GQY($I4U'8?*cX(QQM?"Ye'(FVm
CVlq.K&hf\(*ZB#3Fu:;F=VJ6ljS*;u]T?J,1UK9j@V]!im2gNVmZcUpms/$(;;FkQmBE.Jp=s?RjSI!b(r1?X_JLWp-b$_]1])"Rg:]qAK5g3SYf9C(AZr9"YBI%U>n2[eKa@T<?>"_7p(;32cb#;<H9n!rVt7=EX6]_BIc!@s*H0![Ydebg*\tZ-jD-@9dokoI+@,O;S&Zl71-"as+8:nnTdF^3T]:Dr5'@_SEpihm^Ek'irXL!SZYcCoC(-sT6Ok]Dp%IW7UlG+aqMeElgS_ZD!TY1il\79YRQ-XdJ#=>cenq&3;S"2VB!PuN]!.EPa.GkEl*F<+[uf2Rdq>R5;"G=.KDSQrAQ.6rh36g_,@4tcBe7OYHrrYn^OZ"S;BFGjICbfLCs'8$)$J=Zm'"m)0?Tn;\u^<-G%ZWN:F9hPZ.YB*Q=]5mYBfcJJ]7#j"od=2_g@".J%)g9s!>SQhc88!kC9(qnSs&"]H;'m7]#<aitcOtkGE=CIJHau4ma1DimA3L'^mXZgH1ru7"#7/3m@fL/e&mqWM,RQjh(IK4u59DQ4t@CrHB`8&Q+N.DQWY+s#C7il=fK+Lofq@cCLqPdt!!Q>8c\LUP6<R([%^F3&4Ou5'9tV`T6m^'<)D9'@WSI3ZOrd$mfT6Ut;d'\nO=jV4',5`[\H11rJ-^l$IOmI!29s;L)Wa._b.sjs]l@RJqsFE+h^Gg;4&Qmc8FQ/jK!LfQU>k%J^=m+o_#A2Z1_0B"fD$W`iH"h0.Y\VXifT_tCcg!S>LV#@BLc%Z[sJWHZ]02j#AM4$7fVZgr6,N&_8?^X&>`hLgg\=P*KkCWgokZ.NoJ!ifJ)BfgOf=0h9Okc-i=5;It3hEX,h<h7O\n&T"=l](C=[R%XZbk8++kV[(QjDi+f,5e[VQ.\^C^Bp>&''igo)D#@j14"8J9hBV?NrG297&9'-Zk%94VaqoMms?Q">?ppI]kBPFns%W"*pb^]qpkA^Bl18Hebdb'l?aWAl2FbhIX^2CrE#9MjEiDtm^j_<^GV9@b'N?qe$dnFkUM^sbHTX-"t5M&A\>#,Ps!iZk'B^UhL"!Hl#!3%j!Tr^I<$Bi]&+\Z>**&3n#3ZeB%?VG-A:+u;3#6X,IBKW;Q?hqf<'[email 
protected]:LfjOo:>Uk:U>8<r(u(Wbcag)G=#,b.D=7a*o8r+0dg[Mki:R.kiujN)et.=F#nQAIl$k\25]#i;6EYhT?QrG[gC=4mL*i$^8h^mP/Y/gJXHOWa=#][h`pBerC&D&k+C?@h_@PoRe5/Gg4*g$*C$j;)u+Y=pc`j_2Uu1*om%SnJ":L2dVmX&Z_m&R#`]^\6bVBlY'G+KSD)iKDZ9lK_jo`^kK)j@Z6nc,>JjbC;J^,<SOh@t?fq74<ETu&GKGSgH!H]44I<!7I^_ml<[KG)2A[c2pr2W;U/`kug15N:Sii]QPO,U/l$\fi4\p4f7TXiA$$%)GO;CmN9P@6>Is^"J75_Oog1.f@Cc=XE]t4s/__pZKc:%W'`-UXF5"V.5SfA/5Rhe)j;Y$T*Cg1V=2U<t7W0A0Cg8%cu>YYYIpTWd=2"uDJ(#'Aa?QBXa0Te<"cU.:urs-7Gs$(/[hlig091@)&HXDPVOlsLTI(;2=8CO@kA3*-E(R3,.8\_BGma1J0BB3k*oc62.b'n+b>pf:/EbU^PJ'OMT.JL,JegKVSQ2B*Edmo5Q-?g!,]I8\:o@$,\IC:rs2<KhXq-NFR2Z(/Da/6<Fb"_Ru,=[s\C!Z1TJ#N-iV4XL.[Ilj)r7B#5a)bEt6]D!aEIJ0EMPm!Q(%0Tos//f4?1@Z1:paHaZ;Qk@mAl"kLaK@!f4+k/,PA#ol?o9[_T(sj%E.c."S3k+lh"9#U#FCMQi3XG>r-1!o9-&dS'kW1)=McJgtV5dZN%?e\MNo>?_<8Jf&%ln:u)VS_t5$9$t-FB9IF8g%G3;Kdg_ss@7LDRoKW-dc237LST!:>&)R<E'Y06N?"m/IR@1DJgY]O$m5j;J8Xh;EV08G+(op34G@MBRN)WouIq'4A0i0jKc,M7l`DGS0*+GbS.Rt>Yit\]9<O%FiN)blmcJoZ"@\Rk"mP._s<;.64`[cCpro4`AMMCYYGf;r<XINb((ohKkY7@Jb^Gqu88D)EaC5"*A`o(Xnnl&eBr^c<h.WYO^'=ko#roXU0c5O(jIre__CB_W#<f,ViE;o#.4.JKa$d_$+'=kmpKJ!b>M;7$Ii#<20U3AS+O_+L:1`:8/=Xa7m\?]spT8G_TVjB<jDMM&%q@B5IYaPQ#$Xe],C^@oeK$\@gSVJ%:@(QFn]n,N/_tFVF%KLGW"Pb9>P_PQB*C6GQ5(K?<BE<nPJf&+:,7H7X6+.N4*+6bolib1K#I*$9fKR1G!PgW)E0CPe)I]ud`a>u_<SR`9nV8lC_M+00)kcD45K@i/`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=)o$C5Zp4R*=b#O]=b^tMOrj3f]DXYZeS1NLPJ>7^)i[$\1%a!`/M0OlCql4XbKkt,EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>EQ9&=*-'"=`uK*>J+)04GhWK'!!(Dp
rcs0 --- batch = 0x000074af 3c01b000
:dd<QL:r>o=YMk!K:o_rXSaCK;ah[@IWQiQ063n]c`c!4/GV/,`%22W*Je"<<cOL&Kr.O,e6qI-f'i'b%Nk4E(mV&XSFBu<NrV+L):*&)pAk">j&:Vg[M!Hk78CNC8(Y?_#g\(*mjSeE?`PKfEhs<Bgea<;\FDUua)>%\<19^)[P;2h+m?_2UhsP-L:$qm*F#s<Ro-IUj5MId)s7=NnDu#UC1,&@EQnS4.Cn^mQs)TsqT5WsHAKUWgA=IN*qdn+QDtc&K?i8"]r1E]*!!"+?zzzzzzzzzzzzzzrVuouX\fV$!!)`!
available engines: 0
slice total: 0, mask=0000
subslice total: 0
EU total: 0
EU per subslice: 0
has slice power gating: no
has subslice power gating: no
has EU power gating: no
Unavailable
graphics version: 9
media version: 9
graphics stepping: C0
media stepping: C0
display stepping: C0
base die stepping: **
gt: 0
memory-regions: 0x21
page-sizes: 0x11000
platform: BROXTON
ppgtt-size: 48
ppgtt-type: 2
dma_mask_size: 39
is_mobile: no
is_lp: yes
require_force_probe: no
is_dgfx: no
has_64bit_reloc: yes
has_64k_pages: no
gpu_reset_clobbers_display: no
has_reset_engine: yes
has_3d_pipeline: yes
has_flat_ccs: no
has_global_mocs: no
has_gmd_id: no
has_gt_uc: yes
has_heci_pxp: no
has_heci_gscfi: no
has_guc_deprivilege: no
has_guc_tlb_invalidation: no
has_l3_ccs_read: no
has_l3_dpf: no
has_llc: no
has_logical_ring_contexts: yes
has_logical_ring_elsq: no
has_media_ratio_mode: no
has_mslice_steering: no
has_oa_bpc_reporting: no
has_oa_slice_contrib_limits: no
has_oam: no
has_one_eu_per_fuse_bit: no
has_pxp: no
has_rc6: yes
has_rc6p: no
has_rps: yes
has_runtime_pm: yes
has_snoop: yes
has_coherent_ggtt: no
tuning_thread_rr_after_dep: no
unfenced_needs_alignment: no
hws_needs_physical: no
has_pooled_eu: no
rawclk rate: 266667 kHz
display version: 9
cursor_needs_physical: no
has_cdclk_crawl: no
has_cdclk_squash: no
has_ddi: yes
has_dp_mst: yes
has_dsb: no
has_fpga_dbg: yes
has_gmch: no
has_hotplug: yes
has_hti: no
has_ipc: yes
has_overlay: no
has_psr: yes
has_psr_hw_tracking: yes
overlay_needs_physical: no
supports_tv: no
has_hdcp: yes
has_dmc: yes
has_dsc: no
Has logical contexts? yes
scheduler: 0x1f
i915.modeset=-1
i915.enable_guc=0
i915.guc_log_level=-1
i915.guc_firmware_path=(null)
i915.huc_firmware_path=(null)
i915.dmc_firmware_path=(null)
i915.gsc_firmware_path=(null)
i915.memtest=no
i915.mmio_debug=0
i915.reset=3
i915.inject_probe_failure=0
i915.force_probe=
i915.request_timeout_ms=20000
i915.lmem_size=0
i915.lmem_bar_size=0
i915.enable_hangcheck=yes
i915.error_capture=yes
i915.enable_gvt=no
i915.vbt_firmware=
i915.lvds_channel_mode=0
i915.panel_use_ssc=-1
i915.vbt_sdvo_panel_type=-1
i915.enable_dc=-1
i915.enable_dpt=yes
i915.enable_sagv=yes
i915.disable_power_well=1
i915.enable_ips=yes
i915.invert_brightness=0
i915.edp_vswing=0
i915.enable_dpcd_backlight=-1
i915.load_detect_test=no
i915.force_reset_modeset_test=no
i915.disable_display=no
i915.verbose_state_checks=yes
i915.nuclear_pageflip=no
i915.enable_dp_mst=yes
i915.enable_fbc=0
i915.enable_psr=-1
i915.psr_safest_params=no
i915.enable_psr2_sel_fetch=yes

dmesg:

[460353.897449] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[460353.897645] i915 0000:00:02.0: [drm] ffmpeg[3608236] context reset due to GPU hang
[460353.906167] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [3608236]
[460441.801317] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[460441.801421] i915 0000:00:02.0: [drm] ffmpeg[3609065] context reset due to GPU hang
[460441.807658] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [3609065]
[479666.577426] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[479666.577537] i915 0000:00:02.0: [drm] ffmpeg[3827358] context reset due to GPU hang
[479666.588557] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00295fff, in ffmpeg [3827358]
[479704.830315] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[479704.830421] i915 0000:00:02.0: [drm] ffmpeg[3827710] context reset due to GPU hang
[479704.840767] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [3827710]
[479739.436018] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
[479739.436132] i915 0000:00:02.0: [drm] ffmpeg[3828065] context reset due to GPU hang
[479739.441867] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8ed1fff2, in ffmpeg [3828065]
[479780.658127] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[479780.658331] i915 0000:00:02.0: [drm] ffmpeg[3828451] context reset due to GPU hang
[479780.666494] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [3828451]
[479827.283232] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[479827.283364] i915 0000:00:02.0: [drm] ffmpeg[3828912] context reset due to GPU hang
[479827.288876] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00295fff, in ffmpeg [3828912]
[479866.100681] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[479866.100787] i915 0000:00:02.0: [drm] ffmpeg[3829287] context reset due to GPU hang
[479866.107034] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [3829287]
[494595.260958] BTRFS info (device dm-11): device stats zeroed by btrfs (3947864)
[530927.635270] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
[530927.635376] i915 0000:00:02.0: [drm] ffmpeg[47795] context reset due to GPU hang
[530927.644808] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8ed1fff2, in ffmpeg [47795]
[530964.882985] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
[530964.883088] i915 0000:00:02.0: [drm] ffmpeg[48202] context reset due to GPU hang
[530964.890020] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8ed1fff2, in ffmpeg [48202]
[530983.441819] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
[530983.441925] i915 0000:00:02.0: [drm] ffmpeg[48405] context reset due to GPU hang
[530983.452043] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8ed1fff2, in ffmpeg [48405]
[531032.747598] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[531032.747704] i915 0000:00:02.0: [drm] ffmpeg[48902] context reset due to GPU hang
[531032.755920] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [48902]
[531055.584224] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[531055.584326] i915 0000:00:02.0: [drm] ffmpeg[49112] context reset due to GPU hang
[531055.602631] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [49112]
[531078.927740] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
[531078.927849] i915 0000:00:02.0: [drm] ffmpeg[49382] context reset due to GPU hang
[531078.935235] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:e757fefe, in ffmpeg [49382]
[531113.717433] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[531113.717537] i915 0000:00:02.0: [drm] ffmpeg[49734] context reset due to GPU hang
[531113.724791] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [49734]
[581042.737142] BTRFS info (device dm-11): device stats zeroed by btrfs (495467)
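The dmesg output above records fifteen hangs, all on `rcs0`: ten reset for "CS error" and five for "preemption time out". A small script can produce that kind of summary from a longer log. This is a rough sketch with hypothetical names (`summarize_hangs`, `DMESG_EXCERPT`); the regular expressions are written against the line format shown above and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Three consecutive hang events copied from the dmesg output above.
DMESG_EXCERPT = """\
[479666.577426] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[479666.577537] i915 0000:00:02.0: [drm] ffmpeg[3827358] context reset due to GPU hang
[479666.588557] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00295fff, in ffmpeg [3827358]
[479704.830315] i915 0000:00:02.0: [drm] Resetting rcs0 for CS error
[479704.830421] i915 0000:00:02.0: [drm] ffmpeg[3827710] context reset due to GPU hang
[479704.840767] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:00280000, in ffmpeg [3827710]
[479739.436018] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
[479739.436132] i915 0000:00:02.0: [drm] ffmpeg[3828065] context reset due to GPU hang
[479739.441867] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:8ed1fff2, in ffmpeg [3828065]
"""

RESET_RE = re.compile(r"Resetting (\w+) for (.+)$")
HANG_RE = re.compile(r"GPU HANG: ecode ([0-9a-f:]+), in (\S+) \[(\d+)\]")


def summarize_hangs(text):
    """Count hangs keyed by (engine, reset reason, ecode).

    Each "GPU HANG" line is paired with the most recent "Resetting" line.
    """
    counts = Counter()
    engine = reason = None
    for line in text.splitlines():
        reset = RESET_RE.search(line)
        if reset:
            engine, reason = reset.group(1), reset.group(2).strip()
            continue
        hang = HANG_RE.search(line)
        if hang:
            counts[(engine, reason, hang.group(1))] += 1
            engine = reason = None
    return counts


summary = summarize_hangs(DMESG_EXCERPT)
for (engine, reason, ecode), count in sorted(summary.items()):
    print(f"{engine}  {reason:<20}  {ecode}  x{count}")
```

Grouping by reset reason and ecode makes it easier to see whether the hangs fall into a few repeating signatures, as they appear to in this log.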

Do you want to contribute a patch to fix the issue?

None

Goodwu changed the title from "[Bug]: Transcoding failed with GPU HANG" to "[Bug]: Transcoding HDR video failed with GPU HANG" on Jun 22, 2024

Goodwu commented Jun 23, 2024

https://pan.baidu.com/s/1DR_2FTUGH5gGJX2luQqSOQ?pwd=v5pp
This is the video file that causes the problem.


Goodwu commented Jun 23, 2024

ffmpeg hangs at random during transcoding.
This is the transcoding command line (the input path is quoted here because it contains a space):
/usr/bin/ffmpeg -hwaccel qsv -hwaccel_output_format qsv -async_depth 4 -threads 1 -i "/media/photos/Camera Uploads/DCIM/Camera/VID_20240109_121531.mp4" -y -c:v h264_qsv -c:a copy -movflags faststart -fps_mode passthrough -map 0:0 -map 0:1 -bf 7 -refs 5 -g 256 -v verbose -vf scale_qsv=720:-1:async_depth=4:mode=hq,hwmap=derive_device=opencl,tonemap_opencl=desat=0:format=nv12:matrix=bt709:primaries=bt709:range=pc:tonemap=hable:transfer=bt709,hwmap=derive_device=qsv:reverse=1,format=qsv -preset 7 -global_quality:v 23 -maxrate 4000k -bufsize 8000k upload/encoded-video/0ef79aea-c9d6-4f20-b7d2-bc3a01408cb6/28/49/28491f52-ccf9-4073-aace-b7919fffd528.mp4

@XinfengZhang
Contributor

@xhaihao, it looks like the hang is in RCS, so I suppose it is caused by tonemap_opencl, which is not a media driver feature. Should this be reported to ffmpeg or OpenCL instead?
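The reasoning here rests on the engine name in the error state ("Active process (on ring rcs0)"). As a rough illustration, the sketch below extracts the engine from that line and looks up its class. The role descriptions are a simplified summary written for this example, not text from the i915 driver, and the function names are hypothetical.

```python
import re

# Simplified roles for i915 engine-name prefixes (an illustrative summary).
ENGINE_CLASSES = {
    "rcs": "render (3D and GPGPU work such as OpenCL kernels)",
    "bcs": "blitter (copy)",
    "vcs": "video codec (fixed-function decode/encode)",
    "vecs": "video enhancement",
    "ccs": "compute",
}


def active_engine(error_state_line):
    """Pull the engine name out of an 'Active process (on ring X)' line."""
    match = re.search(r"\(on ring (\w+)\)", error_state_line)
    if match is None:
        raise ValueError("no engine name found in line")
    return match.group(1)


def engine_class(engine):
    """Return (prefix, role) for an engine instance name such as 'rcs0'."""
    match = re.fullmatch(r"([a-z]+)(\d+)", engine)
    if match is None or match.group(1) not in ENGINE_CLASSES:
        raise ValueError(f"unknown engine name: {engine!r}")
    prefix = match.group(1)
    return prefix, ENGINE_CLASSES[prefix]


# Line copied from the /sys/class/drm/card1/error dump above.
line = "Active process (on ring rcs0): ffmpeg [3608236]"
engine = active_engine(line)
print(engine, "->", engine_class(engine)[1])
```

A hang on `rcs0` rather than on a `vcs` engine is what points away from the fixed-function decode path and toward work submitted to the render engine.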


Goodwu commented Jun 28, 2024

> @xhaihao, it looks like the hang is in RCS, so I suppose it is caused by tonemap_opencl, which is not a media driver feature. Should this be reported to ffmpeg or OpenCL instead?

Is it similar to #1456?

@chao-camect

I have the same issue. I suspect it was caused by a change made after intel-media-23.2.4.
It ran fine for me for years until I upgraded to a newer version.
I'll revert to intel-media-23.2.4, keep the other dependencies the same, and see whether that fixes it.

@chao-camect

It has been running well with intel-media-23.2.4 on kernel 5.19.17 for almost a month.
I'll switch back to the latest release to see whether I can reproduce the hang.
