ALT Linux Team development discussions
* [devel] [PATCH] pack.c: downgrade XZ->LZMA automatically for small payloads
From: Alexey Tourbin @ 2020-12-19 22:43 UTC
  To: devel; +Cc: Alexey Tourbin

Recent changes left downgradeLzmaLevel dysfunctional (because it does
not recognize 'T').  In my view, the only advantage of XZ over LZMA is
that XZ can split input into blocks and compress them in parallel
(resulting in a speed-up).  Other advantages, such as a checksum, are
immaterial for our purpose (because the package manager must verify the
integrity of a package beforehand).  Therefore, downgradeLzmaLevel will
also downgrade XZ->LZMA automatically, for payloads smaller than
5*dictSize (the default blockSize being 3*dictSize, so that there are
at least two blocks, the second block not too small).
---
 build/pack.c | 54 ++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 12 deletions(-)

diff --git a/build/pack.c b/build/pack.c
index 23ad89e72..a779e7918 100644
--- a/build/pack.c
+++ b/build/pack.c
@@ -416,30 +416,60 @@ static uint64_t calcArchiveSize(TFI_t fi)
 // which is 8-64M.  For smaller inputs, levels 7-9 are downgraded automatically.
 static void downgradeLzmaLevel(char *mode, uint64_t archiveSize)
 {
-#define C(c) if (!(c)) return
-    C(mode[1] == '7' || mode[1] == '8' || mode[1] == '9');
-    C(mode[2] == '.');
-    C(mode[3] == 'l' || mode[3] == 'x');
-    C(mode[4] == 'z');
+    if (!(mode[1] >= '0' && mode[1] <= '9'))
+	return;
+    char *p = &mode[2];
+    if (*p == 'T') {
+	do
+	    p++;
+	while (*p >= '0' && *p <= '9');
+    }
+    if (!(*p == '.' && (p[1] != 'l' || p[1] == 'x') && p[2] == 'z'))
+	return;
 #define S(m) ((m << 20) + (m << 10))
+    // For small payloads, downgrade XZ->LZMA automatically.  XZ only makes
+    // sense in multi-threaded mode (to speed up compression).  The default
+    // block size in multi-threaded mode is three times the dictionary size.
+    // There has to be at least two blocks, the second block not too small,
+    // so we require at least five times the dictionary size.
+#define T(m) ((m << 20) * 5)
+#define ForceLzma memcpy(&mode[2], ".lzdio", 7)
     switch (mode[1]) {
     case '9':
-	if (archiveSize > S(32))
-	    break;
+	if (archiveSize < T(64)) ForceLzma;
+	if (archiveSize > S(32)) break;
 	mode[1] = '8';
 	/*@fallthrough@*/
     case '8':
-	if (archiveSize > S(16))
-	    break;
+	if (archiveSize < T(32)) ForceLzma;
+	if (archiveSize > S(16)) break;
 	mode[1] = '7';
 	/*@fallthrough@*/
     case '7':
-	if (archiveSize > S(8))
-	    break;
+	if (archiveSize < T(16)) ForceLzma;
+	if (archiveSize > S(8)) break;
 	mode[1] = '6';
+	/*@fallthrough@*/
+    case '6':
+    case '5':
+	if (archiveSize < T(8)) ForceLzma;
+	break;
+    case '4':
+    case '3':
+	if (archiveSize < T(4)) ForceLzma;
+	break;
+    case '2':
+	if (archiveSize < T(2)) ForceLzma;
+	break;
+    case '1':
+	if (archiveSize < T(1)) ForceLzma;
+	break;
+    case '0':
+	// Dictionary size is 256K, but block size is 1M.
+	if (archiveSize < (7<<18)) ForceLzma;
     }
-#undef C
 #undef S
+#undef T
 }
 
 int writeRPM(Header *hdrp, const char *fileName, int type,
-- 
2.25.4
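
The thresholds in the patch can be checked with a few lines of standalone C. This is only a sketch and not part of the patch: the names dict_mib and xz_min_payload are hypothetical, and the per-level dictionary sizes are the ones implied by the T() arguments above (1M for level 1 up to 64M for level 9), with level 0 using the 7<<18 special case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: dictionary size in MiB for levels 1-9, as implied
 * by the T() arguments in the patch.  Level 0 (256K) is handled below. */
static uint64_t dict_mib(int level)
{
    static const uint64_t mib[10] = { 0, 1, 2, 4, 4, 8, 8, 16, 32, 64 };
    return mib[level];
}

/* Smallest payload, in bytes, for which the requested level keeps XZ.
 * Mirrors T(m) = (m << 20) * 5, i.e. five times the dictionary size. */
static uint64_t xz_min_payload(int level)
{
    assert(level >= 0 && level <= 9);
    if (level == 0)
        return (uint64_t)7 << 18;   /* 1.75M: block size 1M plus 3/4 of a block */
    return (dict_mib(level) << 20) * 5;
}
```

For example, xz_min_payload(9) is 320M (335544320 bytes), so under this rule a level-9 XZ request for a 100M payload would be rewritten to LZMA.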




* Re: [devel] [PATCH] pack.c: downgrade XZ->LZMA automatically for small payloads
From: Dmitry V. Levin @ 2020-12-20  0:38 UTC
  To: Alexey Tourbin; +Cc: devel

On Sun, Dec 20, 2020 at 01:43:47AM +0300, Alexey Tourbin wrote:
> Recent changes left downgradeLzmaLevel dysfunctional (because it does
> not recognize 'T').  In my view, the only advantage of XZ over LZMA is
> that XZ can split input into blocks and compress them in parallel
> (resulting in a speed-up).  Other advantages, such as a checksum, are
> immaterial for our purpose (because the package manager must verify the
> integrity of a package beforehand).  Therefore, downgradeLzmaLevel will
> also downgrade XZ->LZMA automatically, for payloads smaller than
> 5*dictSize (the default blockSize being 3*dictSize, so that there are
> at least two blocks, the second block not too small).
> ---
>  build/pack.c | 54 ++++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 42 insertions(+), 12 deletions(-)
> 
> diff --git a/build/pack.c b/build/pack.c
> index 23ad89e72..a779e7918 100644
> --- a/build/pack.c
> +++ b/build/pack.c
> @@ -416,30 +416,60 @@ static uint64_t calcArchiveSize(TFI_t fi)
>  // which is 8-64M.  For smaller inputs, levels 7-9 are downgraded automatically.
>  static void downgradeLzmaLevel(char *mode, uint64_t archiveSize)
>  {
> -#define C(c) if (!(c)) return
> -    C(mode[1] == '7' || mode[1] == '8' || mode[1] == '9');
> -    C(mode[2] == '.');
> -    C(mode[3] == 'l' || mode[3] == 'x');
> -    C(mode[4] == 'z');
> +    if (!(mode[1] >= '0' && mode[1] <= '9'))
> +	return;
> +    char *p = &mode[2];
> +    if (*p == 'T') {
> +	do
> +	    p++;
> +	while (*p >= '0' && *p <= '9');
> +    }
> +    if (!(*p == '.' && (p[1] != 'l' || p[1] == 'x') && p[2] == 'z'))

Did you mean p[1] == 'l' here?


-- 
ldv



* Re: [devel] [PATCH] pack.c: downgrade XZ->LZMA automatically for small payloads
From: Alexey Tourbin @ 2020-12-20  0:49 UTC
  To: ALT Linux Team development discussions

On Sun, Dec 20, 2020 at 3:38 AM Dmitry V. Levin <ldv@altlinux.org> wrote:
> > +    if (!(*p == '.' && (p[1] != 'l' || p[1] == 'x') && p[2] == 'z'))
>
> Did you mean p[1] == 'l' here?

Yes, thanks for spotting.



* Re: [devel] [PATCH] pack.c: downgrade XZ->LZMA automatically for small payloads
From: Dmitry V. Levin @ 2020-12-20  8:11 UTC
  To: devel

On Sun, Dec 20, 2020 at 03:49:42AM +0300, Alexey Tourbin wrote:
> On Sun, Dec 20, 2020 at 3:38 AM Dmitry V. Levin wrote:
> > > +    if (!(*p == '.' && (p[1] != 'l' || p[1] == 'x') && p[2] == 'z'))
> >
> > Did you mean p[1] == 'l' here?
> 
> Yes, thanks for spotting.

Applied with this amendment, thanks!


-- 
ldv
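
For reference, the mode-string check with the amendment discussed in this thread (p[1] == 'l') can be sketched as a standalone predicate. This is an illustration, not the committed code: the function name is_lzma_or_xz_mode is hypothetical, and the extra guards for a NULL or empty string are additions that the patch itself does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical standalone version of the check at the top of
 * downgradeLzmaLevel: a mode such as "w9.xzdio" or "w9T8.lzdio" is
 * accepted, anything else is rejected. */
static bool is_lzma_or_xz_mode(const char *mode)
{
    assert(mode != NULL);
    /* mode[1] must be the compression level digit. */
    if (mode[0] == '\0' || !(mode[1] >= '0' && mode[1] <= '9'))
        return false;
    const char *p = &mode[2];
    /* Skip the optional thread specifier: 'T' followed by digits. */
    if (*p == 'T') {
        do
            p++;
        while (*p >= '0' && *p <= '9');
    }
    /* Amended condition: ".lz" or ".xz" must follow. */
    return *p == '.' && (p[1] == 'l' || p[1] == 'x') && p[2] == 'z';
}
```

With the original p[1] != 'l', a mode like "w9.gzdio" would have passed this check while "w9.lzdio" would have been rejected, which is what the review comment caught.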

