[PATCH 1/1] UBUNTU: SAUCE: add tracing for user initiated readahead requests

Stefan Bader stefan.bader at ubuntu.com
Thu Jul 29 11:48:10 UTC 2010


On 07/29/2010 11:41 AM, Andy Whitcroft wrote:
> On Thu, Jul 29, 2010 at 11:20:40AM +0200, Stefan Bader wrote:
>> On 07/29/2010 10:46 AM, Andy Whitcroft wrote:
>>> Track pages which undergo readahead and for each record which were
>>> actually consumed, via either read or faulted into a map.  This allows
>>> userspace readahead applications (such as ureadahead) to track which
>>> pages in core at the end of a boot are actually required and generate an
>>> optimal readahead pack.  It also allows pack adjustment and optimisation
>>> in parallel with readahead, allowing the pack to evolve to be accurate
>>> as userspace paths change.  The status of the pages are reported back via
>>> the mincore() call using a newly allocated bit.
>>>
>>> Signed-off-by: Andy Whitcroft <apw at canonical.com>
>>> ---
>>>  include/linux/page-flags.h |    3 +++
>>>  mm/filemap.c               |    3 +++
>>>  mm/memory.c                |    7 ++++++-
>>>  mm/mincore.c               |    2 ++
>>>  mm/readahead.c             |    1 +
>>>  5 files changed, 15 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>>> index 5b59f35..89dc94f 100644
>>> --- a/include/linux/page-flags.h
>>> +++ b/include/linux/page-flags.h
>>> @@ -108,6 +108,7 @@ enum pageflags {
>>>  #ifdef CONFIG_MEMORY_FAILURE
>>>  	PG_hwpoison,		/* hardware poisoned page. Don't touch */
>>>  #endif
>>> +	PG_readaheadunused,	/* user oriented readahead as yet unused*/
>>>  	__NR_PAGEFLAGS,
>>>  
>>>  	/* Filesystems */
>>> @@ -239,6 +240,8 @@ PAGEFLAG(MappedToDisk, mappedtodisk)
>>>  PAGEFLAG(Reclaim, reclaim) TESTCLEARFLAG(Reclaim, reclaim)
>>>  PAGEFLAG(Readahead, reclaim)		/* Reminder to do async read-ahead */
>>>  
>>> +PAGEFLAG(ReadaheadUnused, readaheadunused)
>>> +
>>>  #ifdef CONFIG_HIGHMEM
>>>  /*
>>>   * Must use a macro here due to header dependency issues. page_zone() is not
>>> diff --git a/mm/filemap.c b/mm/filemap.c
>>> index 20e5642..26e5e15 100644
>>> --- a/mm/filemap.c
>>> +++ b/mm/filemap.c
>>> @@ -1192,6 +1192,9 @@ int file_read_actor(read_descriptor_t *desc, struct page *page,
>>>  	if (size > count)
>>>  		size = count;
>>>  
>>> +	if (PageReadaheadUnused(page))
>>> +		ClearPageReadaheadUnused(page);
>>> +
>>>  	/*
>>>  	 * Faults on the destination of a read are common, so do it before
>>>  	 * taking the kmap.
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index 119b7cc..97ca21b 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -2865,10 +2865,15 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>>>  	else
>>>  		VM_BUG_ON(!PageLocked(vmf.page));
>>>  
>>> +	page = vmf.page;
>>> +
>>> +	/* Mark the page as used on fault. */
>>> +	if (PageReadaheadUnused(page))
>>> +		ClearPageReadaheadUnused(page);
>>> +
>>>  	/*
>>>  	 * Should we do an early C-O-W break?
>>>  	 */
>>> -	page = vmf.page;
>>>  	if (flags & FAULT_FLAG_WRITE) {
>>>  		if (!(vma->vm_flags & VM_SHARED)) {
>>>  			anon = 1;
>>> diff --git a/mm/mincore.c b/mm/mincore.c
>>> index 9ac42dc..a4e573a 100644
>>> --- a/mm/mincore.c
>>> +++ b/mm/mincore.c
>>> @@ -77,6 +77,8 @@ static unsigned char mincore_page(struct address_space *mapping, pgoff_t pgoff)
>>>  	page = find_get_page(mapping, pgoff);
>>>  	if (page) {
>>>  		present = PageUptodate(page);
>>> +		if (present)
>>> +			present |= (PageReadaheadUnused(page) << 7);
>>>  		page_cache_release(page);
>>>  	}
>>>  
>>> diff --git a/mm/readahead.c b/mm/readahead.c
>>> index 77506a2..6948b92 100644
>>> --- a/mm/readahead.c
>>> +++ b/mm/readahead.c
>>> @@ -181,6 +181,7 @@ __do_page_cache_readahead(struct address_space *mapping, struct file *filp,
>>>  		list_add(&page->lru, &page_pool);
>>>  		if (page_idx == nr_to_read - lookahead_size)
>>>  			SetPageReadahead(page);
>>> +		SetPageReadaheadUnused(page);
>>>  		ret++;
>>>  	}
>>>  
>>
>> I think it looks good. Just out of interest, the last hunk sounds a bit like it
>> only sets PageReadahead on one page while PageREadaheadUnused is set on all of
>> them. Which seems a bit odd.
> 
> Thats because the PageReadahead flag is a marker, a pointer into the
> memory space, we read the block marked with that for real we know its
> time to schedule more readahead as we are close to consume all of the
> previous readhead.
> 
> -apw

Ah, thanks for the explanation. It sounded rather like a marker which pages came
from readahead. But it makes sense and probably is hard to find a good name for it.


Acked-by: Stefan Bader <stefan.bader at canonical.com>




More information about the kernel-team mailing list