From: Alexey Tourbin <at@altlinux.ru>
To: ALT Linux Team development discussions <devel@lists.altlinux.org>
Subject: Re: [devel] rpm: rsyncable deflate vs LZMA
Date: Fri, 30 May 2008 13:27:26 +0400
Message-ID: <20080530092725.GT7996@solemn.turbinal>
In-Reply-To: <20080529215609.GA20209@wo.int.altlinux.org>
On Fri, May 30, 2008 at 01:56:10AM +0400, Dmitry V. Levin wrote:
> On Fri, May 30, 2008 at 01:31:14AM +0400, Alexey Tourbin wrote:
> [...]
> > I have an idea. To choose the synchronization points (gzflush) we
> > can use not only the "blind" rsync hint but also a cpio hint: as
> > soon as we see the cpio magic "070707", we know that the mtime
> > follows a few bytes later, and then come the file name and the
> > file contents. So the sync can be done where each cpio header ends.
>
> Won't this noticeably reduce the compression ratio when the archive contains many small files?
We can estimate how much compression degrades as the deflate block gets smaller.
$ rpm2cpio /ALT/Sisyphus/files/x86_64/RPMS/glibc-core-2.5.1-alt5.x86_64.rpm |gzip -9 |wc -c
1488757
$ gcc -DBUFSIZE=$((8*1024)) gztest.c -lz
$ rpm2cpio /ALT/Sisyphus/files/x86_64/RPMS/glibc-core-2.5.1-alt5.x86_64.rpm |./a.out |wc -c
1488040
$ gcc -DBUFSIZE=$((4*1024)) gztest.c -lz
$ rpm2cpio /ALT/Sisyphus/files/x86_64/RPMS/glibc-core-2.5.1-alt5.x86_64.rpm |./a.out |wc -c
1506758
$ gcc -DBUFSIZE=$((2*1024)) gztest.c -lz
$ rpm2cpio /ALT/Sisyphus/files/x86_64/RPMS/glibc-core-2.5.1-alt5.x86_64.rpm |./a.out |wc -c
1544170
$ gcc -DBUFSIZE=$((1*1024)) gztest.c -lz
$ rpm2cpio /ALT/Sisyphus/files/x86_64/RPMS/glibc-core-2.5.1-alt5.x86_64.rpm |./a.out |wc -c
1598928
$
Here a deflate block is created for every 8K, 4K, 2K, and 1K of the
input stream. Relative to plain gzip -9, the losses are 0%, 1.2%, 3.7%,
and 7.4%, respectively. On data that compresses better, the losses may
be higher.
[-- Attachment #1.2: gztest.c --]
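#include <zlib.h>
#include <stdio.h>
#include <assert.h>

/* Input chunk size; override with -DBUFSIZE=... at compile time. */
#ifndef BUFSIZE
#define BUFSIZE BUFSIZ
#endif

int main(void)
{
	char buf[BUFSIZE];
	/* Write a gzip stream to stdout at compression level 9. */
	gzFile gz = gzdopen(fileno(stdout), "w9");
	assert(gz);
	int n;
	while ((n = fread(buf, 1, sizeof(buf), stdin))) {
		int m = gzwrite(gz, buf, n);
		assert(n == m);
		/* End the current deflate block after each chunk of input. */
		gzflush(gz, Z_SYNC_FLUSH);
	}
	/* gzclose() writes the gzip trailer; report failure via exit status. */
	return gzclose(gz) == Z_OK ? 0 : 1;
}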