[Pull Request][Lucid] Fix Xorg hangs with some i915 chipsets

Seth Forshee seth.forshee at canonical.com
Wed Jun 22 15:44:10 UTC 2011


On Wed, Jun 22, 2011 at 04:50:34PM +0200, Stefan Bader wrote:
> On 17.06.2011 18:38, Seth Forshee wrote:
> > This is a rather large series of non-trivial backports to fix a Xorg
> > hang caused by the i915 driver. Patches 5 and 6 are the key patches that
> > implement an LRU-based eviction algorithm; patches 1-4 are
> > prerequisites, patch 7 fixes some of the clean-up, and patch 8 adds a
> > periodic flush to retire active buffers.
> > 
> > This is a pretty severe bug for those affected by it, but it's also a
> > huge amount of change for a stable kernel, so let me know if it's beyond
> > hope for an SRU.
> > 
> > SRU Justification:
> > 
> >   Impact: When the GPU aperture becomes sufficiently fragmented two or
> >   more mmapped buffers in active use can repeatedly push each other out
> >   of the aperture. This results in a sort of livelock situation, with
> >   Xorg appearing to be hung with high CPU utilization.
> > 
> >   Fix: Backport of a series of patches that convert the i915 eviction
> >   algorithm from a best-fit approach to one based on evicting the least
> >   recently used objects and a patch that adds a periodic flush requests
> >   to retire active buffers when no client is active.
> > 
> >   Testcase: Without these patches this scenario can be triggered
> >   ocassionally when visiting certain web pages that utilize Flash or
> >   viewing certain images in Firefox, as described on bugs #599017 and
> >   #330460. Testing the same pages/images with these patches does not
> >   induce the hang.
> > 
> > 
> > The following changes since commit 9cc333d02cb4620561e91a63ed86a2159ec32638:
> > 
> >   UBUNTU: Ubuntu-2.6.32-33.67 (2011-06-16 11:26:04 -0500)
> > 
> > are available in the git repository at:
> >   git://kernel.ubuntu.com/sforshee/ubuntu-lucid.git lp599017
> > 
> > Chris Wilson (5):
> >       drm/i915: Move the eviction logic to its own file.
> >       drm/i915: Implement fair lru eviction across both rings.  (v2)
> >       drm/i915: Maintain LRU order of inactive objects upon access by CPU (v2)
> >       drm/i915/evict: Ensure we completely cleanup on failure
> >       drm/i915: Periodically flush the active lists and requests
> > 
> > Daniel Vetter (3):
> >       drm_mm: extract check_free_mm_node
> >       drm: implement helper functions for scanning lru list
> >       drm/i915: prepare for fair lru eviction
> > 
> >  drivers/gpu/drm/drm_mm.c              |  236 +++++++++++++++++++++++++-----
> >  drivers/gpu/drm/i915/Makefile         |    1 +
> >  drivers/gpu/drm/i915/i915_drv.h       |   13 ++
> >  drivers/gpu/drm/i915/i915_gem.c       |  228 ++++-------------------------
> >  drivers/gpu/drm/i915/i915_gem_evict.c |  253 +++++++++++++++++++++++++++++++++
> >  include/drm/drm_mm.h                  |   15 ++-
> >  6 files changed, 510 insertions(+), 236 deletions(-)
> >  create mode 100644 drivers/gpu/drm/i915/i915_gem_evict.c
> > 
> 
> I also tend to say the bug sounds serious enough to make an exception to the
> rule of limiting the change.
> 
> I would probably add those to the 2.6.32+drm33 tree. Though there are two
> details which I think should be resolved before:
> 1. In the bug report (#599017) the last request for testing points to a place
> which only shows 6 patches. Does the kernel provided actually contain all of them?

No, I guess I didn't post a link to that build on the bug, although I
just did so. I have an affected machine though and have tested the full
series myself for several days. I was able to trigger the bug very
ocassionally without the last patch but have never triggered it with
that patch.

> 2. It would be good to get some feedback about the test kernel in the bug.

I have received some offline feedback, but I'll ask for the feedback to
be posted to the bug.




More information about the kernel-team mailing list