kworker eating up cpu time when doing dd on 3.2.0 kernels

Christian Brandes christian.brandes at forschungsgruppe.de
Tue Feb 7 11:53:17 UTC 2012


Hi Andy and Stefan,

I am glad! Thanks for your help!

Yes, /tmp is on / and
no, / is not ext4, but xfs.

I installed Ubuntu 12.04 on two physical machines:
HP Compaq 6005 Pro Business PCs

I also tested it on other hardware and got the same result.
8.6 MB/s is very poor, even for desktop hardware!
That is only the average. The first 8 GB go in a few seconds at 95 MB/s and the rest only at a bit more than 1 MB/s.

What can the problem be?
Can it be that xfs on 3.2 becomes this slow once the filesystem is more than 75% full?

This time I tested with 10 GB so as not to fill the filesystem too much, as I know very full filesystems get slow. But they should not be slow at 80%, and xfs is not on 3.0 kernels.

Best regards
Christian

----------------------------------------

~# cat /proc/version
Linux version 3.2.0-12-generic (buildd at crested) (gcc version 4.6.2 (Ubuntu/Linaro 4.6.2-12ubuntu1) ) #21-Ubuntu SMP Tue Jan 31 18:48:57 UTC 2012

The first 8 GB go in a few seconds at about 50-80 MB/s. Top and bwm-ng look like this:

~# top
top - 12:01:06 up 1 min,  2 users,  load average: 1.64, 0.68, 0.25
Tasks: 121 total,   1 running, 120 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  6.9%sy,  0.0%ni,  0.7%id, 91.9%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   1791108k total,  1237680k used,   553428k free,     1144k buffers
Swap:  7812092k total,        0k used,  7812092k free,  1101664k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2067 root      20   0 11500  632  516 D   15  0.0   0:01.64 dd
 1352 root      20   0     0    0    0 D    1  0.0   0:00.08 flush-8:0
   11 root      20   0     0    0    0 S    1  0.0   0:00.06 kworker/0:1
    1 root      20   0 24160 2192 1264 S    0  0.1   0:00.46 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    4 root      20   0     0    0    0 S    0  0.0   0:00.00 kworker/0:0
    5 root      20   0     0    0    0 S    0  0.0   0:00.20 kworker/u:0

~# bwm-ng
 bwm-ng v0.6 (probing every 0.500s), press 'h' for help
  input: libstatdisk type: avg (30s)
  |         iface                   Rx                   Tx                Total
  ==============================================================================
              sda:          26.61 KB/s        95667.68 KB/s        95694.29 KB/s
  ------------------------------------------------------------------------------
            total:          26.61 KB/s        95667.68 KB/s        95694.29 KB/s

Then it drops to only 1-2 MB/s, and top and bwm-ng look like this:

~# top
top - 12:02:26 up 2 min,  2 users,  load average: 1.75, 0.96, 0.38
Tasks: 119 total,   2 running, 117 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.8%us, 11.2%sy,  0.0%ni, 53.6%id, 34.2%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   1791108k total,  1720964k used,    70144k free,      716k buffers
Swap:  7812092k total,        4k used,  7812088k free,  1556516k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    4 root      20   0     0    0    0 S   47  0.0   0:04.04 kworker/0:0
    9 root      20   0     0    0    0 R   45  0.0   0:04.07 kworker/1:0
 2067 root      20   0 11500  632  516 D    4  0.0   0:24.56 dd
 2098 root      20   0 17216 1248  908 R    2  0.1   0:00.01 top
    1 root      20   0 24160 2080 1152 S    0  0.1   0:00.46 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    5 root      20   0     0    0    0 S    0  0.0   0:00.20 kworker/u:0
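
To put a single number on the kworker load in a snapshot like the one above, the %CPU column can be summed over all kworker threads. The following is only a sketch: the function name kworker_cpu_sum is made up for illustration, and it assumes the default top column layout shown above (field 9 is %CPU, field 12 is COMMAND). For the snapshot above it adds 47 + 45 + 0.

```shell
#!/bin/sh
# Sum the %CPU column of every kworker thread found in top's batch output.
# Assumption: default top layout, i.e. field 9 = %CPU and field 12 = COMMAND.
# kworker_cpu_sum is a hypothetical helper name, not part of any tool.
kworker_cpu_sum() {
    # LC_ALL=C keeps awk's number handling independent of the user's locale.
    LC_ALL=C awk '$12 ~ /^kworker\// { sum += $9 } END { print sum + 0 }'
}

# Possible usage (reads one batch-mode snapshot from top):
#   top -b -n 1 | kworker_cpu_sum
```

Running this once per second during the dd run would show at which point the kworker threads start to dominate.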

~# bwm-ng
bwm-ng v0.6 (probing every 0.500s), press 'h' for help
  input: libstatdisk type: avg (30s)
  \         iface                   Rx                   Tx                Total
  ==============================================================================
              sda:           0.00 KB/s         1471.86 KB/s         1471.86 KB/s
  ------------------------------------------------------------------------------
            total:           0.00 KB/s         1471.86 KB/s         1471.86 KB/s
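
One way to check whether the slow phase coincides with a writeback stall would be to sample the Dirty and Writeback counters from /proc/meminfo while dd is running. This was not part of the tests above; it is a minimal sketch, and the helper name dirty_writeback and its optional path argument are assumptions made for illustration.

```shell
#!/bin/sh
# Print the Dirty and Writeback counters from a meminfo-style file.
# Defaults to /proc/meminfo; the optional path argument only exists so the
# function can be tried against a saved copy. dirty_writeback is a
# hypothetical helper name.
dirty_writeback() {
    # Exact field match, so WritebackTmp is not picked up by accident.
    LC_ALL=C awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }' \
        "${1:-/proc/meminfo}"
}

# Possible usage, in a second terminal while dd is running:
#   while sleep 1; do dirty_writeback; done
```

If Dirty stays high while the Tx rate in bwm-ng stays near 1-2 MB/s, that would point at the flush path rather than at dd itself.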


~# dd if=/dev/zero of=/tmp/zero1.tmp bs=1024 count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 1196,41 s, 8,6 MB/s

~# df -m /tmp
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/sda5          15248 12312      2937  81% /

~# mount
/dev/sda5 on / type xfs (rw)


<---------->


With 3.0.0 all 10 GB go through at full speed (about 108 s in total, see the dd output below).

~# cat /proc/version
Linux version 3.0.0-13-generic (buildd at crested) (gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) ) #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011

~# top
top - 11:51:19 up 1 min,  2 users,  load average: 0.62, 0.21, 0.08
Tasks: 113 total,   3 running, 110 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.8%us, 13.6%sy,  0.0%ni,  5.1%id, 80.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1794104k total,  1730888k used,    63216k free,      504k buffers
Swap:  7812092k total,        0k used,  7812092k free,  1576080k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2134 root      20   0 11500  632  516 R   27  0.0   0:04.10 dd
 1341 root      20   0     0    0    0 D    3  0.0   0:00.48 flush-8:0
   10 root      20   0     0    0    0 S    2  0.0   0:00.24 kworker/0:1
   46 root      20   0     0    0    0 R    1  0.0   0:00.04 kswapd0
 2136 root      20   0     0    0    0 S    0  0.0   0:00.06 kworker/0:2

~# bwm-ng
 bwm-ng v0.6 (probing every 0.500s), press 'h' for help
  input: libstatdisk type: avg (30s)
  |         iface                   Rx                   Tx                Total
  ==============================================================================
              sda:           0.00 KB/s        91107.56 KB/s        91107.56 KB/s
  ------------------------------------------------------------------------------
            total:           0.00 KB/s        91107.56 KB/s        91107.56 KB/s

~# dd if=/dev/zero of=/tmp/zero1.tmp bs=1024 count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 107,759 s, 95,0 MB/s
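
The two dd summaries can be cross-checked with a bit of arithmetic. dd reports MB/s as 10^6 bytes per second, and the commas in its output are decimal separators from the locale, so 1196,41 s is 1196.41 s. The sketch below uses a made-up helper name, throughput_mbs, and reproduces both figures. The elapsed times differ by a factor of about 11.

```shell
#!/bin/sh
# Throughput in MB/s (10^6 bytes per second), rounded to one decimal.
# Arguments: bytes copied, elapsed seconds (with "." as decimal separator).
# throughput_mbs is a hypothetical helper name.
throughput_mbs() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

throughput_mbs 10240000000 1196.41   # 3.2.0-12 run, prints 8.6
throughput_mbs 10240000000 107.759   # 3.0.0-13 run, prints 95.0
```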


/proc/cpuinfo:
--------------
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II X2 B24 Processor
stepping        : 3
microcode       : 0x10000b6
cpu MHz         : 800.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
bogomips        : 5985.14
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II X2 B24 Processor
stepping        : 3
microcode       : 0x10000b6
cpu MHz         : 800.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
bogomips        : 5985.02
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate





