From: Alexey Tourbin
Date: Sat, 11 Apr 2020 14:29:55 +0300
To: ALT Linux Team development discussions
Subject: Re: [devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo

On Sat, Apr 11, 2020 at 2:11 AM Vladimir D. Seleznev wrote:
> +osrpm_identity=
> +osrpm="$GB_REPO_DIR/files/SRPMS/$srpmsu"
> +if [ -f "$osrpm" ]; then
> +	echo >&2 "$I: Found $srpmsu in the repo, this means the package was rebuilt"
> +	osrpm_identity="$(pkg_identity "$osrpm")"
> +fi
> +
>  for arch in $GB_ARCH; do
>  	[ -d "$arch/srpm" -o ! -s "$arch/excluded" ] || continue
>  	f="$arch/srpm/$srpmsu"
>  	[ -f "$f" ] || continue
> +	srpm_identity="$(pkg_identity "$f")"
> +	echo >&2 "$I: $arch $srpmsu identity = $srpm_identity"
> +	# non-empty $osrpm_identity means the NEVR was rebuilt
> +	# optimize rebuilt sourcerpm if identities of original and rebuilt sourcerpms are equal
> +	if [ -n "$osrpm_identity" ] &&
> +	   [ "$osrpm_identity" = "$srpm_identity" ]; then
> +		echo >&2 "$I: $arch: optimize rebuilt $srpmsu cause its identity is equal to $srpmsu in the repo"
> +		install -p "$osrpm" "$f"
> +	fi
>  	built_pkgname="$(rpmquery --qf '%{name}' -p -- "$f")"
>  	echo "$built_pkgname" > pkgname
>  	break

So how does it work in practice? Suppose I first uploaded a .src.rpm package. Do we store the original .src.rpm, the one with the uploader's signature? When it gets rebuilt, this should not affect the original .src.rpm (as if it were uploaded again). No special handling is required in this case.

Then suppose I build a gearified package from Sisyphus for p9.
But your code only handles GB_REPO_DIR, not the NEIGHBOUR_REPO_DIR the package comes from. To be clear, that information is lost: when you request to build a signed tag from /gears, it does not imply that there is a corresponding .src.rpm in any REPO_DIR.

There is already a problem with cross-repo copying: if done in earnest, both repos need to be locked. And of course this is deadlock-prone. You can do better without any locking if you identify every package in all repos with your new identity hash. This can be done relatively easily, since you already have that big content-addressable storage. You can hardlink it into a shadow identity-addressable storage. Once you've done that, you obtain the global / beatific vision: given a package, you instantly know if you have already seen something like this.

(On second thought: you don't need locking because the -f test is atomic and files cannot be removed from the storage, but there will still be race conditions. It's not too bad in practice. Further, those race conditions can be detected at the task-commit stage.)

There is one specific problem with the outlined approach: the notion of identity is flawed, because the disttag may or may not matter. Sometimes you cannot substitute a package for another package with the same identity but a different disttag. Specifically, this is the case with strict dependencies between subpackages. You cannot substitute a subpackage unless you also substitute all the other subpackages. This is further complicated by noarch subpackages: you need to coordinate substitution across architectures.
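
Going back to the shadow identity-addressable storage, here is a rough sketch of what I mean. It is only an illustration and has not been tried against the real builder. The directory layout and the names identity_link and identity_seen are made up for the example, and pkg_identity below is a placeholder that simply hashes the file; the real one would be the function from your patch.

```shell
# Placeholder for pkg_identity from the patch (assumption: the real
# function hashes whatever defines package identity, not the raw file).
pkg_identity()
{
	sha256sum < "$1" | cut -d' ' -f1
}

# Hardlink one package into the shadow identity-addressable storage.
# $1: package file in the content-addressable storage
# $2: root of the identity storage (hypothetical layout: XX/<identity>)
identity_link()
{
	local f="$1" root="$2" id dir
	id="$(pkg_identity "$f")"
	dir="$root/$(printf %s "$id" | cut -c1-2)"
	mkdir -p -- "$dir"
	# Without -f, ln refuses to replace an existing name, so the first
	# package with a given identity wins and later ones are left alone.
	ln -- "$f" "$dir/$id" 2>/dev/null ||
		[ -f "$dir/$id" ]
}

# Succeeds if a package with the same identity has been seen already.
identity_seen()
{
	local f="$1" root="$2" id
	id="$(pkg_identity "$f")"
	[ -f "$root/$(printf %s "$id" | cut -c1-2)/$id" ]
}
```

Both storages have to live on the same filesystem for the hardlink to work. The lookup is a single -f test and entries are never removed, so no locking is involved; two tasks may still race to add the same identity, and in this sketch whichever links first simply wins.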