[SRU][Noble][PATCH 0/1] i915: Fixup regressions introduced with enabling single CCS engine
Matthew Ruffell
matthew.ruffell at canonical.com
Mon Jul 15 02:45:32 UTC 2024
BugLink: https://bugs.launchpad.net/bugs/2072755
[Impact]
Recently, the Intel i915 susbsystem underwent a change that limited the number
of CCS engines that were initialised by default, and exposed to the user.
Different chipsets have differing amounts of CCS engines, but most available in
the market have 4 CCS engines. The new change just starts a single engine only,
and allocates all CCS slices to this single engine. This single engine is then
exposed to userspace. This effort is to workaround a hardware bug.
This all happened in:
commit 6db31251bb265813994bfb104eb4b4d0f44d64fb
Author: Andi Shyti <andi.shyti at linux.intel.com>
Date: Thu Mar 28 08:34:05 2024 +0100
Subject: drm/i915/gt: Enable only one CCS for compute workload
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6db31251bb265813994bfb104eb4b4d0f44d64fb
which landed in:
$ git describe --contains 67f164e8510b16bda18642464863dba87a33d8cb
Ubuntu-6.8.0-38.38~525
There have been some side effects as a result of these changes, leading to
failure of userspace applications, namely in video transcoding with ffmepg,
resulting in fence expiration errors in dmesg like:
[ 81.026591] Fence expiration time out i915-0000:01:00.0:ffmpeg[521]:2!
There has also been a performance impact introduced by this change, which
dropped performance of the GPU to 1/4 of what it was previously. This is likely
due to most ARC GPUs usually having 4 CCS engines, and going down to 1 only
without actually allocating the other three.
There are no workarounds. Users are suggested to downgrade to 6.8.0-36-generic
while the fix is coming.
[Fix]
The regression was fixed by these two commits:
commit aee54e282002a127612b71255bbe879ec0103afd
Author: Andi Shyti <andi.shyti at linux.intel.com>
Date: Fri Apr 26 02:07:23 2024 +0200
Subject: drm/i915/gt: Automate CCS Mode setting during engine resets
Link: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?id=aee54e282002a127612b71255bbe879ec0103afd
commit ee01b6a386eaf9984b58a2476e8f531149679da9
Author: Andi Shyti <andi.shyti at linux.intel.com>
Date: Fri May 17 11:06:16 2024 +0200
Subject: drm/i915/gt: Fix CCS id's calculation for CCS mode setting
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ee01b6a386eaf9984b58a2476e8f531149679da9
"drm/i915/gt: Automate CCS Mode setting during engine resets" is already applied
to noble/master-next through upstream stable v6.8.10.
We just need "drm/i915/gt: Fix CCS id's calculation for CCS mode setting". It is
queued up for v6.9.4, but that could still be another SRU cycle or two away. So
send it now.
"drm/i915/gt: Fix CCS id's calculation for CCS mode setting" restores another
1/4 performance, but some performance issues still remain, and will hopefully
be addressed in a future patch.
[Testcase]
This affects video transcoding with ffmpeg, on machines equipped with Intel ARC
GPUs.
An example ffmpeg command might be:
/usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -probesize 1G -ss 00:00:03.000 -noaccurate_seek -init_hw_device vaapi=va:,kernel_driver=i915,driver=iHD -init_hw_device qsv=qs at va -filter_hw_device qs -hwaccel vaapi -hwaccel_output_format vaapi -noautorotate -i file:"/path/to/1080_video.mkv" -noautoscale -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 av1_qsv -preset veryfast -b:v 3616000 -maxrate 3616000 -bufsize 7232000 -g:v:0 72 -keyint_min:v:0 72 -vf "setparams=color_primaries=bt709:color_trc=bt709:colorspace=bt709,scale_vaapi=w=1280:h=720:format=nv12:extra_hw_frames=24,hwmap=derive_device=qsv,format=qsv" -codec:a:0 libfdk_aac -ac 2 -vbr:a 5 -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 3 -hls_segment_type fmp4 -hls_fmp4_init_filename "c30716eb121448346fcc00a2440071a3-1.mp4" -start_number 1 -hls_segment_filename "/var/lib/jellyfin/transcodes/c30716eb121448346fcc00a2440071a3%d.mp4" -hls_playlist_type vod -hls_list_size 0 -y "/var/lib/jellyfin/transcodes/c30716eb121448346fcc00a2440071a3.m3u8
Another user on bug 2072933 came up with this minimalist reproducer:
#include <cstdio>
#include <sycl/sycl.hpp>
int main() {
// auto selector = sycl::cpu_selector_v; // Works fine
auto selector = sycl::gpu_selector_v;
auto queue = sycl::queue(selector);
printf("Hello\n");
queue.submit([&](sycl::handler &cgh) {
cgh.parallel_for(sycl::range(1), [=](sycl::item<1> item) {});
});
queue.wait();
printf("Bye\n");
return 0;
}
$ icpx -fsycl sycltest.cpp -o sycltest
$ ./sycltest
These commands should run successfully to completion. On failure, they will
emit in dmesg:
[ 81.026591] Fence expiration time out i915-0000:01:00.0:ffmpeg[521]:2!
A test kernel is available in the following ppa:
https://launchpad.net/~mruffell/+archive/ubuntu/lp2072755-test
If you install the test kernel, things should work correctly.
[Where problems could occur]
This issue affects users of i915, which is a pretty universal integrated GPU
present on Intel processors. While these patches are unlikely to cause outages
that stop the primary display from functioning, any further regressions may add
additional performance impact or prevent workloads from executing correctly.
These patches are all accepted into upstream -stable, and we would consume them
in due course anyway.
If a regression were to occur, there are no workarounds, and users would need to
select an older kernel until a fix is available.
[Other info]
Upstream Bug:
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
Andi Shyti (1):
drm/i915/gt: Fix CCS id's calculation for CCS mode setting
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 6 ++++++
drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 2 +-
drivers/gpu/drm/i915/gt/intel_gt_types.h | 8 ++++++++
3 files changed, 15 insertions(+), 1 deletion(-)
--
2.45.2
More information about the kernel-team
mailing list