ALT Linux Team development discussions
From: "Vladimir D. Seleznev" <vseleznv@altlinux.org>
To: devel@lists.altlinux.org
Cc: vseleznv@altlinux.org
Subject: [devel] [PATCH] gb: add gb-task-build-post, optimize packages with identical rebuild
Date: Thu,  4 Jun 2020 22:58:11 +0300
Message-ID: <20200604195811.3881130-2-vseleznv@altlinux.org>
In-Reply-To: <20200604195811.3881130-1-vseleznv@altlinux.org>

Introduce post-build processing for tasks. It finds subtasks that
rebuild packages and, if the rebuilt packages are identical to the same
packages in the target repo, optimizes them by replacing the rebuilt
files with the copies already in the repo.

Whether subtasks in a particular repository are checked for such
optimization is controlled by the GB_OPTIMIZE_IDENTICAL_REBUILD
option.

* gb/gb-build-task: Add gb-task-build-post.
* gb/gb-sh-conf: Add GB_OPTIMIZE_IDENTICAL_REBUILD option.
* gb/gb-task-build-post: New file.
* gb/gb-task-build-post-i: Likewise.
---
 gb/gb-build-task        |   1 +
 gb/gb-sh-conf           |   1 +
 gb/gb-task-build-post   |   9 ++++
 gb/gb-task-build-post-i | 114 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 125 insertions(+)
 create mode 100755 gb/gb-task-build-post
 create mode 100755 gb/gb-task-build-post-i
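
The option gating described in the commit message can be sketched in
isolation. This is a minimal stand-alone illustration, not part of the
patch: only the GB_OPTIMIZE_IDENTICAL_REBUILD name and the ": ${VAR=}"
default idiom come from the patch, while optimization_state is a
hypothetical helper added for the example.

```shell
#!/bin/sh
# Stand-alone sketch, not part of the patch.
set -u

# Keep a value inherited from the environment; otherwise default to empty,
# which also keeps later expansions safe under "set -u".
: "${GB_OPTIMIZE_IDENTICAL_REBUILD=}"

# Hypothetical helper: report whether the optimization would run.
optimization_state()
{
	if [ -n "$GB_OPTIMIZE_IDENTICAL_REBUILD" ]; then
		echo enabled
	else
		echo disabled
	fi
}

echo "identical-rebuild optimization: $(optimization_state)"
```

Any non-empty value enables the check, so a repository would opt in
with something like GB_OPTIMIZE_IDENTICAL_REBUILD=1 in its
configuration, assuming that configuration is read before gb-sh-conf
applies the empty default.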

diff --git a/gb/gb-build-task b/gb/gb-build-task
index 0fb076e..5f4b23e 100755
--- a/gb/gb-build-task
+++ b/gb/gb-build-task
@@ -61,6 +61,7 @@ gb-task-set-summary
 	gb-task-copy-packages
 	gb-task-build-prep
 	gb-task-build
+	gb-task-build-post
 	fail_if_task_abort_requested
 	gb-x-girar task-make-index-html "$id" ||:
 
diff --git a/gb/gb-sh-conf b/gb/gb-sh-conf
index f1366fe..52bc79e 100644
--- a/gb/gb-sh-conf
+++ b/gb/gb-sh-conf
@@ -38,3 +38,4 @@ GB_AREPO_DIR="$TMPDIR/gb-arepo-$USER-$GB_REPO_NAME"
 	remote_host="build-$USER-$GB_REPO_NAME-$arch"
 
 : ${GB_ALLOW_SAME_NEVR=}
+: ${GB_OPTIMIZE_IDENTICAL_REBUILD=}
diff --git a/gb/gb-task-build-post b/gb/gb-task-build-post
new file mode 100755
index 0000000..e0af1ff
--- /dev/null
+++ b/gb/gb-task-build-post
@@ -0,0 +1,9 @@
+#!/bin/sh -efu
+
+. gb-sh-functions
+
+fail_if_task_abort_requested
+
+for i in $(src_nums); do
+	$0-i "$i"
+done
diff --git a/gb/gb-task-build-post-i b/gb/gb-task-build-post-i
new file mode 100755
index 0000000..104c294
--- /dev/null
+++ b/gb/gb-task-build-post-i
@@ -0,0 +1,114 @@
+#!/bin/sh -efu
+
+. gb-sh-functions
+. gb-sh-tmpdir
+
+[ -n "$GB_OPTIMIZE_IDENTICAL_REBUILD" ] ||
+	exit 0
+
+i="$1"; shift
+query_filename='%{name}-%{version}-%{release}.%{arch}.rpm'
+
+is_rebuild=
+st_is_rebuild()
+{
+	[ -z "$is_rebuild" ] ||
+		return 0
+
+	local arch="$1"; shift
+
+	srpm="$(find "build/$i/$arch/srpm/" \
+		-mindepth 1 -maxdepth 1 -type f \
+		-name '*.rpm' -printf '%f')"
+
+	[ -f "$GB_REPO_DIR/files/SRPMS/$srpm" ] ||
+		return 1
+
+	is_rebuild=1
+
+	return 0
+}
+
+mixed_arches=
+narchs=
+get_narchs()
+{
+	[ -z "$narchs" ] ||
+		return 0
+
+	narchs="$(find "build/$i/$arch/rpms/" \
+		-mindepth 1 -maxdepth 1 \
+		-type f -name '*.rpm' \
+		-execdir rpmquery --qf '%{arch}\n' -p '{}' '+' |
+		sort -u |wc -l)"
+
+	[ "$narchs" -eq 1 ] ||
+		mixed_arches=1
+}
+
+stamp_echo >&2 "#$i: Starting post build processing"
+
+first_arch=1
+for arch in $GB_ARCH; do
+	[ ! -s "build/$i/$arch/excluded" ] ||
+		continue
+
+	# exit if subtask is not a rebuild
+	st_is_rebuild "$arch" || {
+		stamp_echo >&2 "#$i: subtask is not a rebuild, skipping"
+		exit 0
+	}
+
+	get_narchs
+
+	find "build/$i/$arch/rpms/" -mindepth 1 -maxdepth 1 -type f -name '*.rpm' \
+		-execdir rpmquery --qf "$query_filename %{arch}\n" \
+		-p '{}' '+' >"$tmpdir/built.$arch.pkgs"
+
+	while read -r brpm barch; do
+		# debuginfo rpms are of no interest here
+		case "$brpm" in
+		*-debuginfo-*) continue ;;
+		esac
+
+		# skip non-first built noarch rpm in mixed arches subtask
+		[ -z "$mixed_arches" ] || [ -z "$first_arch" ] || [ "$barch" != "noarch" ] ||
+			continue
+
+		bid="$(rpmidentity -p "build/$i/$arch/rpms/$brpm")"
+		printf "%s %s %s\n" "$brpm" "$barch" "$bid" >>"$tmpdir/built.identity"
+		rid="$(rpmidentity -p "$GB_REPO_DIR/files/$barch/RPMS/$brpm")"
+		printf "%s %s %s\n" "$brpm" "$barch" "$rid" >>"$tmpdir/repo.identity"
+	done <"$tmpdir/built.$arch.pkgs"
+
+	first_arch=
+done
+
+sort -k 3 -o "$tmpdir/built.identity" "$tmpdir/built.identity"
+sort -k 3 -o "$tmpdir/repo.identity" "$tmpdir/repo.identity"
+join -j 3 -v2 "$tmpdir/built.identity" "$tmpdir/repo.identity" |
+	cut -d' ' -f2 >"$tmpdir/nonidentical.arches"
+
+# Mixed noarch and arches build is a special case: we have to optimize either
+# all packages of all architectures or none of them
+if [ -n "$mixed_arches" -a -s "$tmpdir/nonidentical.arches" ]; then
+	echo >&2 "#$i: non-identical rebuild with mixed arches, skip optimizing"
+	exit 0
+fi
+
+for arch in $GB_ARCH; do
+	[ ! -s "build/$i/$arch/excluded" ] ||
+		continue
+
+	if grep -q "$arch" "$tmpdir/nonidentical.arches"; then
+		echo >&2 "[$arch] #$i rebuild is non-identical, skip optimizing"
+		continue
+	fi
+
+	echo >&2 "[$arch] #$i rebuild is identical, optimize packages"
+	while read -r brpm barch; do
+		cp -p "$GB_REPO_DIR/files/$barch/RPMS/$brpm" "build/$i/$arch/rpms/$brpm"
+		echo >&2 "[$arch] #$i $brpm is optimized"
+	done <"$tmpdir/built.$arch.pkgs"
+done
+
-- 
2.25.4
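
The identity comparison in gb-task-build-post-i relies on sort(1) and
join(1). Below is a minimal stand-alone sketch of that technique, under
stated assumptions: the package names and identity strings are made up,
rpmidentity is not invoked, and nonidentical is a hypothetical helper.
It only shows how "join -v 2" on the third column reports repo entries
that have no built counterpart with the same identity.

```shell
#!/bin/sh
# Stand-alone sketch; file names and identity strings are made up.
set -eu
LC_ALL=C; export LC_ALL	# keep sort(1) and join(1) collation consistent

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

# Columns: package file name, package arch, identity string.
cat >"$workdir/built.identity" <<'EOF'
foo-1.0-alt1.x86_64.rpm x86_64 id-aaa
libfoo-1.0-alt1.x86_64.rpm x86_64 id-changed
EOF
cat >"$workdir/repo.identity" <<'EOF'
foo-1.0-alt1.x86_64.rpm x86_64 id-aaa
libfoo-1.0-alt1.x86_64.rpm x86_64 id-bbb
EOF

# Hypothetical helper: print entries of the second file whose identity
# (column 3) does not occur in the first file.
nonidentical()
{
	sort -k 3,3 -o "$1" "$1"
	sort -k 3,3 -o "$2" "$2"
	# join prints the join field first, then the remaining fields,
	# so an unpaired line from file 2 reads: identity, file name, arch.
	join -1 3 -2 3 -v 2 "$1" "$2"
}

nonidentical "$workdir/built.identity" "$workdir/repo.identity"
```

With the sample data this prints
"id-bbb libfoo-1.0-alt1.x86_64.rpm x86_64". In this output layout the
file name is the second column and the architecture is the third.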



Thread overview: 14+ messages
2020-06-04 19:58 [devel] Optimize rebuilt subtask packages identical to the packages in the target repo Vladimir D. Seleznev
2020-06-04 19:58 ` Vladimir D. Seleznev [this message]
2020-06-05 13:40   ` [devel] [PATCH] gb: add gb-task-build-post, optimize packages with identical rebuild Alexey Tourbin
2020-06-05 14:22     ` Vladimir D. Seleznev
2020-06-06 13:42       ` Alexey Tourbin
2020-06-13 17:45         ` Dmitry V. Levin
2020-06-13 18:50           ` Andrey Savchenko
2020-06-13 20:48             ` Dmitry V. Levin
2020-06-13 20:57               ` Andrey Savchenko
2020-06-13 21:10                 ` Dmitry V. Levin
2020-06-13 22:30                   ` Andrey Savchenko
2020-06-13 19:32           ` Vladimir D. Seleznev
2020-06-13 21:03             ` Dmitry V. Levin
2020-06-13 22:03               ` Vladimir D. Seleznev
