Fwd: TRANSPARENT_HUGEPAGE_ALWAYS Analysis (LP:# LP:743688) and Trusty Kernel team work item

Tim Gardner tim.gardner at canonical.com
Thu Dec 12 13:38:54 UTC 2013


APPLIED

-------- Original Message --------
Subject: TRANSPARENT_HUGEPAGE_ALWAYS Analysis (LP:#  LP:743688) and
Trusty Kernel team work item
Date: Mon, 09 Dec 2013 14:38:04 +0000
From: Colin Ian King <colin.king at canonical.com>
To: Tim Gardner <rtg.canonical at gmail.com>,  Andy Whitcroft
<apw at canonical.com>, Leann Ogasawara <leann.ogasawara at canonical.com>

Hi there,

I've been faffing around with various tests to see how transparent
hugepages perform and I've now got some results.

Attached are the results and a write up of the testing. From what I can
see, there is no good reason why we shouldn't enable
TRANSPARENT_HUGEPAGE_ALWAYS for Trusty.

Colin





-------------- next part --------------
A non-text attachment was scrubbed...
Name: huge-pages.ods
Type: application/vnd.oasis.opendocument.spreadsheet
Size: 77819 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20131212/c65fd1a9/attachment.ods>
-------------- next part --------------
TRANSPARENT_HUGEPAGE_ALWAYS vs TRANSPARENT_HUGEPAGE_MADVISE

In this set of tests, a recent 3.12-rc1 kernel was built and tested in 3
different configurations:

1. TRANSPARENT_HUGEPAGE_MADVISE
	transparent hugepages only for regions that request them with madvise()

2. TRANSPARENT_HUGEPAGE_ALWAYS
	transparent hugepages enabled by default, without madvise()

3. No huge pages (just 4K pages).
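
The two config options above select the kernel's default transparent
hugepage mode; on a running system the active mode is shown in brackets in
/sys/kernel/mm/transparent_hugepage/enabled. A minimal sketch for checking
which mode a test machine is actually in (the helper names are illustrative
and were not part of the original testing):

```python
from pathlib import Path

# Standard sysfs location on Linux kernels built with THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode from e.g. 'always [madvise] never'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the file cannot be read
    (for example on a kernel built without transparent hugepages)."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(current_thp_mode())
```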

The tests were run on a 64-bit installation of Ubuntu Saucy with 3.12-rc1
on a 4-core 2.3GHz i5-4670T (Haswell, 6M cache) with 8GB of memory. The CPU
has a 4-way set associative, 64-entry TLB.
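
To see why the TLB size matters here, it helps to work out how much memory
the TLB can map at once. The sketch below assumes 2MB huge pages (the usual
transparent hugepage size on x86-64) and, purely for illustration, that all
64 entries can hold either page size; the write-up does not state the real
per-page-size entry counts, so treat the figures as rough.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_size):
    """Bytes of address space covered when every TLB entry maps one page."""
    return entries * page_size


ENTRIES = 64  # from the hardware description above

# Assumption: the same 64 entries are usable for both page sizes.
reach_4k = tlb_reach(ENTRIES, 4 * KIB)
reach_2m = tlb_reach(ENTRIES, 2 * MIB)

print(reach_4k // KIB, "KiB with 4K pages")   # 256 KiB
print(reach_2m // MIB, "MiB with 2MB pages")  # 128 MiB
```

Under these assumptions, working sets well beyond 256KB start missing the
TLB with 4K pages, which fits with the tests below looking at multi-megabyte
regions.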

Originally I hoped to exercise all the tests described in Mel Gorman's LWN
article [1]. Unfortunately two of these were SPEC tests, which require a
license for the benchmarks.

== Stress VM ==

In this test various sized memory regions were allocated from 4K to 1GB and 
then memory was written in 4K strides in a 30 second period. The number of
page write operations was measured for this 30 second duration. No madvise()
MADV_HUGEPAGE advice was given to the kernel in this test.

This test will rapidly walk through all the pages in the allocation and
exercise the TLB.  The results are in the "stress vm" tab of the spreadsheet.
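
The shape of this test can be sketched as follows. This is not the tool that
produced the results, only an illustration of the access pattern: map a
region, touch one byte in every 4K page, and count the writes until the time
is up. The function name and the use_madvise switch are made up for this
sketch, and Python interpreter overhead will swamp any TLB effect, so the
counts it produces are not comparable to the spreadsheet.

```python
import mmap
import time

PAGE = 4096  # 4K stride, as in the test description


def stress_vm(size, duration, use_madvise=False):
    """Write one byte per 4K page of an anonymous private mapping until
    `duration` seconds have elapsed; return the number of page writes."""
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    if use_madvise and hasattr(mmap, "MADV_HUGEPAGE"):
        try:
            buf.madvise(mmap.MADV_HUGEPAGE)  # not used in the original test
        except OSError:
            pass  # kernel without THP support; carry on with 4K pages
    writes = 0
    value = 0
    deadline = time.monotonic() + duration
    try:
        # The deadline is only checked between full passes over the region.
        while time.monotonic() < deadline:
            for offset in range(0, size, PAGE):
                buf[offset] = value
                writes += 1
            value = (value + 1) & 0xFF
    finally:
        buf.close()
    return writes


if __name__ == "__main__":
    # 16MB region for 1 second; the write-up used 4K to 1GB for 30 seconds.
    print(stress_vm(16 * 1024 * 1024, 1.0))
```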

The first graph covers the entire data set from 4K to 1GB and generally the
different kernel configurations follow the same trend.

The second graph covers the data set from 16MB to 1GB and clearly shows that
TRANSPARENT_HUGEPAGE_ALWAYS outperforms both TRANSPARENT_HUGEPAGE_MADVISE
and the non-hugepage configuration.  TRANSPARENT_HUGEPAGE_ALWAYS in the 16MB
to 1GB case seems to improve performance by ~4.7 to 6.3%.

== STREAM test ==

STREAM is a synthetic memory bandwidth benchmark that measures the
performance of four long vector operations: Copy, Scale, Add, and Triad.
It is used to calculate the number of floating point operations that can 
be performed during the time for the “average” memory access. Basically
speaking, the more bandwidth the better the result.
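
For reference, the four operations can be sketched like this. It is a toy
illustration of what each kernel does and how a bandwidth figure is derived
from the bytes moved, not the real STREAM benchmark (which is compiled code
with repeated trials); the function name, array size and scalar are
arbitrary choices for this sketch.

```python
import time
from array import array


def stream_kernels(n=100_000, q=3.0):
    """Run Copy, Scale, Add and Triad once each over n doubles.

    Returns (rates, a, b, c), where rates maps kernel name to MB/s.
    """
    a = array("d", [1.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [0.0]) * n

    def copy():
        for i in range(n):
            c[i] = a[i]

    def scale():
        for i in range(n):
            b[i] = q * c[i]

    def add():
        for i in range(n):
            c[i] = a[i] + b[i]

    def triad():
        for i in range(n):
            a[i] = b[i] + q * c[i]

    # (name, kernel, number of arrays read or written per element)
    kernels = [("Copy", copy, 2), ("Scale", scale, 2),
               ("Add", add, 3), ("Triad", triad, 3)]
    rates = {}
    for name, kernel, arrays_touched in kernels:
        start = time.perf_counter()
        kernel()
        elapsed = max(time.perf_counter() - start, 1e-9)
        rates[name] = arrays_touched * a.itemsize * n / elapsed / 1e6
    return rates, a, b, c


if __name__ == "__main__":
    for name, mb_per_s in stream_kernels()[0].items():
        print(f"{name:6s} {mb_per_s:10.1f} MB/s")
```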

Mel Gorman originally tested this on a PPC64, whereas we are testing this
on x86-64. His notes in [1] suggest that x86-64 will see a
0 to 4% improvement.  The 3 kernel configurations follow the same trends.
In the 4-8MB size range TRANSPARENT_HUGEPAGE_ALWAYS seems to show a small
advantage, whereas in the Add and Triad tests above 8MB there is a small
impact from using TRANSPARENT_HUGEPAGE_ALWAYS.  I am not entirely sure what
to deduce from this.

== postgres test ==

Originally I wanted to reproduce Mel's Postgres tests as described in [1] but
I was unable to get a working configuration.  Instead, I populated a table
with 10 million rows of randomly sized text strings from 1 to 128 chars along
with a unique row id. I then selected the first N rows of the table and
summed the lengths of the first N strings and also the unique IDs, hence
traversing the data.  This was run 11 times per test: the first run
populated the table into memory, and the subsequent 10 runs measured the
total run time of the select.  These 10 times were then averaged and used as
the test metric.  This was repeated for N from 10000 to 5120000 and the
results were plotted.
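
The metric described above (discard the first, cache-warming run and average
the remaining ones) amounts to the following. The helper name is made up for
this sketch and the timings in the example are invented placeholders, not
measurements from these tests.

```python
from statistics import mean


def average_run_time(timings, warmup=1):
    """Average the timings after dropping the first `warmup` runs."""
    if len(timings) <= warmup:
        raise ValueError("need at least one run after the warm-up runs")
    return mean(timings[warmup:])


if __name__ == "__main__":
    # Hypothetical timings in seconds: 1 warm-up run plus 10 measured runs.
    runs = [2.50, 1.02, 1.00, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97, 1.00, 1.00]
    print(round(average_run_time(runs), 3))
```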

TRANSPARENT_HUGEPAGE_ALWAYS shows around a 0% to 1.1% performance increase
compared to no huge pages for N > 10000, and a 0% to 1.9% performance increase
over TRANSPARENT_HUGEPAGE_MADVISE.

== Conclusion ==

Although the set of tests was not large, they all show that
TRANSPARENT_HUGEPAGE_ALWAYS can improve memory performance with large data
sets.  We do not see any massive detrimental effects from using this by
default, so enabling TRANSPARENT_HUGEPAGE_ALWAYS is advisable for Trusty.

References:
 [1] http://lwn.net/Articles/378641/  benchmarking with huge pages



