ALT Linux Team development discussions
From: Alexey Tourbin <alexey.tourbin@gmail.com>
To: devel@lists.altlinux.org
Cc: Alexey Tourbin <alexey.tourbin@gmail.com>
Subject: [devel] [PATCH] pack.c: downgrade XZ->LZMA automatically for small payloads
Date: Sun, 20 Dec 2020 01:43:47 +0300
Message-ID: <20201219224347.230119-1-alexey.tourbin@gmail.com>

Recent changes left downgradeLzmaLevel dysfunctional (because it does
not recognize 'T').  In my view, the only advantage of XZ over LZMA is
that XZ can split input into blocks and compress them in parallel
(resulting in a speed-up).  Other advantages, such as a checksum, are
immaterial for our purpose (because the package manager must verify the
integrity of a package beforehand).  Therefore, downgradeLzmaLevel will
also downgrade XZ->LZMA automatically, for payloads smaller than
5*dictSize (the default blockSize being 3*dictSize, so that there are
at least two blocks, the second block not too small).
---
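A minimal sketch of the threshold arithmetic described above, not part of
the patch.  The helper names presetDictSize and xzThreshold are
hypothetical; the dictionary sizes are the standard xz preset values
(256K for level 0, then 1M up to 64M for level 9), and the level 0
threshold is simply copied from the patch, where the block size is 1M
rather than 3*dictSize.

```c
#include <stdint.h>

// Dictionary size in bytes for xz preset levels 0..9.
// The caller must pass a level in the range 0..9.
static uint64_t presetDictSize(int level)
{
    static const uint64_t kib[10] = {
        256, 1024, 2048, 4096, 4096, 8192, 8192, 16384, 32768, 65536
    };
    return kib[level] << 10;
}

// Payloads smaller than this many bytes are downgraded XZ->LZMA.
// For levels 1..9 this is 5*dictSize: one full block of 3*dictSize,
// plus a second block that is not too small.
static uint64_t xzThreshold(int level)
{
    if (level == 0)
        return 7 << 18;  // taken from the patch: block size is 1M here
    return 5 * presetDictSize(level);
}
```

With these assumptions, xzThreshold(9) is 320M, xzThreshold(6) is 40M,
and xzThreshold(0) is 1835008 bytes (1.75M), which matches T(64), T(8)
and (7<<18) in the diff below.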
 build/pack.c | 54 ++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 12 deletions(-)

diff --git a/build/pack.c b/build/pack.c
index 23ad89e72..a779e7918 100644
--- a/build/pack.c
+++ b/build/pack.c
@@ -416,30 +416,60 @@ static uint64_t calcArchiveSize(TFI_t fi)
 // which is 8-64M.  For smaller inputs, levels 7-9 are downgraded automatically.
 static void downgradeLzmaLevel(char *mode, uint64_t archiveSize)
 {
-#define C(c) if (!(c)) return
-    C(mode[1] == '7' || mode[1] == '8' || mode[1] == '9');
-    C(mode[2] == '.');
-    C(mode[3] == 'l' || mode[3] == 'x');
-    C(mode[4] == 'z');
+    if (!(mode[1] >= '0' && mode[1] <= '9'))
+	return;
+    char *p = &mode[2];
+    if (*p == 'T') {
+	do
+	    p++;
+	while (*p >= '0' && *p <= '9');
+    }
+    if (!(*p == '.' && (p[1] == 'l' || p[1] == 'x') && p[2] == 'z'))
+	return;
 #define S(m) ((m << 20) + (m << 10))
+    // For small payloads, downgrade XZ->LZMA automatically.  XZ only makes
+    // sense in multi-threaded mode (to speed up compression).  The default
+    // block size in multi-threaded mode is three times the dictionary size.
+    // There have to be at least two blocks, the second block not too small,
+    // so we require at least five times the dictionary size.
+#define T(m) ((m << 20) * 5)
+#define ForceLzma memcpy(&mode[2], ".lzdio", 7)
     switch (mode[1]) {
     case '9':
-	if (archiveSize > S(32))
-	    break;
+	if (archiveSize < T(64)) ForceLzma;
+	if (archiveSize > S(32)) break;
 	mode[1] = '8';
 	/*@fallthrough@*/
     case '8':
-	if (archiveSize > S(16))
-	    break;
+	if (archiveSize < T(32)) ForceLzma;
+	if (archiveSize > S(16)) break;
 	mode[1] = '7';
 	/*@fallthrough@*/
     case '7':
-	if (archiveSize > S(8))
-	    break;
+	if (archiveSize < T(16)) ForceLzma;
+	if (archiveSize > S(8)) break;
 	mode[1] = '6';
+	/*@fallthrough@*/
+    case '6':
+    case '5':
+	if (archiveSize < T(8)) ForceLzma;
+	break;
+    case '4':
+    case '3':
+	if (archiveSize < T(4)) ForceLzma;
+	break;
+    case '2':
+	if (archiveSize < T(2)) ForceLzma;
+	break;
+    case '1':
+	if (archiveSize < T(1)) ForceLzma;
+	break;
+    case '0':
+	// Dictionary size is 256K, but block size is 1M.
+	if (archiveSize < (7<<18)) ForceLzma;
     }
-#undef C
 #undef S
+#undef T
 }
 
 int writeRPM(Header *hdrp, const char *fileName, int type,
-- 
2.25.4
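
For reference, the mode-string handling in the patch can be exercised in
isolation with a sketch like the one below.  It is not part of the patch:
findLzmaSuffix and forceLzma are hypothetical names, and the sketch
assumes mode strings of the form "w9.lzdio" or "w9T16.xzdio" held in a
writable buffer of at least 9 bytes.

```c
#include <stddef.h>
#include <string.h>

// Return a pointer to the ".lz" or ".xz" suffix of a mode string,
// skipping an optional 'T' followed by digits (the thread count).
// Return NULL if the mode does not look like an LZMA/XZ mode.
static char *findLzmaSuffix(char *mode)
{
    if (mode[0] == '\0' || !(mode[1] >= '0' && mode[1] <= '9'))
        return NULL;
    char *p = &mode[2];
    if (*p == 'T') {
        do
            p++;
        while (*p >= '0' && *p <= '9');
    }
    if (*p == '.' && (p[1] == 'l' || p[1] == 'x') && p[2] == 'z')
        return p;
    return NULL;
}

// Rewrite an LZMA/XZ mode to plain LZMA, keeping the level digit and
// dropping any 'T' option.  Returns 1 if rewritten, 0 if left alone.
// Assumes the buffer behind mode holds at least 9 bytes.
static int forceLzma(char *mode)
{
    if (findLzmaSuffix(mode) == NULL)
        return 0;
    memcpy(&mode[2], ".lzdio", 7);  // 6 characters plus the terminating NUL
    return 1;
}
```

Calling forceLzma on a buffer holding "w9T16.xzdio" leaves "w9.lzdio" in
it, while a mode such as "w9.gzdio" is not touched.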


