[devel] perlbench: -Os vs -O2

From: Alexey Tourbin <at@altlinux.ru>
To: devel@altlinux.ru
Subject: [devel] perlbench: -Os vs -O2
Date: Mon, 22 May 2006 21:57:53 +0400
Message-ID: <20060522175753.GE23861@localhost.localdomain>

Before I forget for good.
perl58.spec contains the following comment:

# Custom optimization.  This will affect all perl binary modules.  Why really?
# Because the whole perlapi(1) is heavily based on macros.  Here is an example:
# $ echo "SvPV(x,y)" | gcc -include /usr/lib/perl5/i386-linux/CORE/perl.h -E - | tail -1
# (((x)->sv_flags & (0x00040000)) == 0x00040000 ? ((y = ((XPV*) (x)->sv_any)->xpv_cur), ((XPV*) (x)->sv_any)->xpv_pv) : Perl_sv_2pv_flags(my_perl, x,&y,2))
%define _optlevel s
%add_optflags -D_GNU_SOURCE -momit-leaf-frame-pointer

Now for the quantitative side of the experiment.  It is based on the
perlbench package, http://search.cpan.org/dist/perlbench/  This package
measures, with high precision, the execution time of some primitives in
libperl.

I built perl-5.8.7 on last year's Sisyphus.  The measurements were taken on
my new computer, a gift from Dima Levin.  The processor is an AMD
Athlon(tm) 64 Processor 3200+, 2050.195 MHz, 512 KB cache size, 4108.95 bogomips.

Results for -O2:

arith/mixed               34
arith/trig                42
array/copy               113
array/foreach             39
array/index               55
array/pop                 57
array/shift               58
array/sort-num            70
array/sort                36
call/0arg                 64
call/1arg                 62
call/2arg                 56
call/9arg                 91
call/empty                42
call/fib                  72
call/method               65
call/wantarray            87
hash/copy                 83
hash/each                 81
hash/foreach-sort         43
hash/foreach              52
hash/get                  35
hash/set                  37
loop/for-c                38
loop/for-range-const      73
loop/for-range            48
loop/getline              63
loop/while-my             45
loop/while                43
re/const                  40
re/w                      41
startup/fewmod             8
startup/lotsofsub         20
startup/noprog            11
string/base64            110
string/htmlparser         90
string/index-const        76
string/index-var          68
string/ipol               47
string/tr                 54

AVERAGE                   56

Results for -Os:

arith/mixed               38
arith/trig                50
array/copy               115
array/foreach             46
array/index               60
array/pop                 69
array/shift               69
array/sort-num            87
array/sort                48
call/0arg                 71
call/1arg                 69
call/2arg                 65
call/9arg                103
call/empty                45
call/fib                  74
call/method               70
call/wantarray            97
hash/copy                103
hash/each                 91
hash/foreach-sort         51
hash/foreach              63
hash/get                  42
hash/set                  44
loop/for-c                41
loop/for-range-const      75
loop/for-range            50
loop/getline              76
loop/while-my             49
loop/while                45
re/const                  60
re/w                      49
startup/fewmod            10
startup/lotsofsub         25
startup/noprog            15
string/base64            135
string/htmlparser        108
string/index-const        99
string/index-var          85
string/ipol               54
string/tr                 67

AVERAGE                   65

A larger AVERAGE number corresponds to a faster perl.  Building with -Os
gives a performance gain of more than 10% (AVERAGE 65 versus 56).
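
As a quick check of the "more than 10%" figure, here is a minimal Python
sketch that computes the relative gain from the two AVERAGE values reported
above.  The function name relative_gain is only an illustrative helper, not
part of perlbench.

```python
def relative_gain(baseline: float, candidate: float) -> float:
    """Return how much higher `candidate` is than `baseline`, as a fraction."""
    return (candidate - baseline) / baseline


# AVERAGE scores copied from the two result tables above
# (a higher score means a faster perl).
AVG_O2 = 56
AVG_OS = 65

gain = relative_gain(AVG_O2, AVG_OS)
print(f"-Os AVERAGE is {gain:.1%} higher than -O2")
```

On these numbers the gain comes out at roughly 16%, which is consistent with
the "more than 10%" claim.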

Next, guided by certain considerations, I built pp_hot.o with -O2 and
everything else with -Os (pp_hot.o contains the most frequently used
internal functions).  Here is the result:

arith/mixed               37
arith/trig                49
array/copy               119
array/foreach             43
array/index               64
array/pop                 60
array/shift               62
array/sort-num            74
array/sort                41
call/0arg                 68
call/1arg                 66
call/2arg                 62
call/9arg                 98
call/empty                46
call/fib                  73
call/method               67
call/wantarray            88
hash/copy                 99
hash/each                 85
hash/foreach-sort         47
hash/foreach              58
hash/get                  39
hash/set                  44
loop/for-c                41
loop/for-range-const      79
loop/for-range            51
loop/getline              76
loop/while-my             47
loop/while                46
re/const                  52
re/w                      52
startup/fewmod            10
startup/lotsofsub         24
startup/noprog            13
string/base64            129
string/htmlparser        105
string/index-const        89
string/index-var          80
string/ipol               51
string/tr                 66

AVERAGE                   63

The result is clearly somewhat worse (AVERAGE 63 versus 65 for the all -Os
build).  From this one can conclude that, practically without exception,
all of the perl code is better compiled with -Os rather than -O2.
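
To see how the mixed build compares with the pure -Os build benchmark by
benchmark, and not only on AVERAGE, the scores from the three tables above
can be tallied with a short Python sketch.  The data is copied from the
tables; the helper names parse and tally are illustrative only.

```python
# Columns: benchmark, -O2, -Os, mixed (pp_hot.o with -O2, the rest with -Os).
DATA = """
arith/mixed 34 38 37
arith/trig 42 50 49
array/copy 113 115 119
array/foreach 39 46 43
array/index 55 60 64
array/pop 57 69 60
array/shift 58 69 62
array/sort-num 70 87 74
array/sort 36 48 41
call/0arg 64 71 68
call/1arg 62 69 66
call/2arg 56 65 62
call/9arg 91 103 98
call/empty 42 45 46
call/fib 72 74 73
call/method 65 70 67
call/wantarray 87 97 88
hash/copy 83 103 99
hash/each 81 91 85
hash/foreach-sort 43 51 47
hash/foreach 52 63 58
hash/get 35 42 39
hash/set 37 44 44
loop/for-c 38 41 41
loop/for-range-const 73 75 79
loop/for-range 48 50 51
loop/getline 63 76 76
loop/while-my 45 49 47
loop/while 43 45 46
re/const 40 60 52
re/w 41 49 52
startup/fewmod 8 10 10
startup/lotsofsub 20 25 24
startup/noprog 11 15 13
string/base64 110 135 129
string/htmlparser 90 108 105
string/index-const 76 99 89
string/index-var 68 85 80
string/ipol 47 54 51
string/tr 54 67 66
"""


def parse(text):
    """Map each benchmark name to its (O2, Os, mixed) scores."""
    rows = {}
    for line in text.strip().splitlines():
        name, o2, os_, mixed = line.split()
        rows[name] = (int(o2), int(os_), int(mixed))
    return rows


def tally(rows):
    """Split benchmarks by whether the mixed build beats, ties, or trails -Os."""
    better = [n for n, (_, s, m) in rows.items() if m > s]
    equal = [n for n, (_, s, m) in rows.items() if m == s]
    worse = [n for n, (_, s, m) in rows.items() if m < s]
    return better, equal, worse


rows = parse(DATA)
better, equal, worse = tally(rows)
print(f"mixed > -Os: {len(better)}  equal: {len(equal)}  mixed < -Os: {len(worse)}")
print("benchmarks where the mixed build wins:", ", ".join(better))
```

A per-benchmark tally of this kind shows where the few exceptions are,
which a single AVERAGE figure hides.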

An explanation of sorts for these results can be found here:
http://www.faqs.org/docs/artu/ch12s03.html  "The most effective way to
optimize your code is to keep it small and simple."  This applies to
binary code as well.  (The text goes on to give a technically plausible
explanation: "Processor cycles are almost free", whereas any cache miss
wipes out all the gains from instruction reordering.)
