From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 11 Apr 2020 18:21:01 +0300
From: "Vladimir D. Seleznev"
To: ALT Linux Team development discussions
Message-ID: <20200411152101.GA1624106@portlab>
References: <20200410231044.1436970-1-vseleznv@altlinux.org>
 <20200411110425.GG3341@glebfm.cloud.tilaa.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200411110425.GG3341@glebfm.cloud.tilaa.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
Subject: Re: [devel] RFC: girar: optimize rebuild
X-BeenThere: devel@lists.altlinux.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: ALT Linux Team development discussions
List-Id: ALT Linux Team development discussions
X-List-Received-Date: Sat, 11 Apr 2020 15:21:02 -0000

On Sat, Apr 11, 2020 at 02:04:25PM +0300, Gleb Fotengauer-Malinovskiy wrote:
> Hi,
>
> On Sat, Apr 11, 2020 at 02:10:42AM +0300, Vladimir D. Seleznev wrote:
> >
> > Hi!
> >
> > This is the first part of the rebuilt packages optimization for girar.
> > It introduces pkg_identity() and a simple optimization of the rebuilt
> > sourcerpm.
>
> Why do we rebuild the source rpm at all when we already have one? I mean,
> when we use hasher with --query-repackage this new rebuilt source rpm is
> no better than the original one.
>
> I think we can always save the original source rpm when we rebuild
> a package or copy it from branch to branch (like we actually do for
> packages originally built from src.rpm-s).

I'm sorry, I was not clear. Sure, when a package is built from a
sourcerpm, no optimization is required, since girar saves only the
original sourcerpm. Things are different when a package is built from
gear. When a package is rebuilt from gear, girar produces new source and
binary rpms, and when the rebuild task is done it saves all of these new
source and binary rpms.
The proposed optimization is aimed at that case.

> > pkg_identity() takes an RPM package and returns a value called the
> > package identity, a hash of a subset of the RPM package header. That
> > subset is the entire header without some nonessential artifacts like
> > buildhost, buildtime, header hashsum, etc.
> >
> > Two package builds of the same NEVR might have equal or different
> > package identities. Equal identities mean that the build results of
> > these packages are equal too, which allows build optimization. A
> > practical example, a simple rebuilt sourcerpm optimization, is also
> > introduced.
>
> Did you consider adding all this identity logic on the rpm's side (as a
> standalone helper, maybe)? I personally don't like the whole idea
> of tracking rpm tags status on the girar side. Also, this helper may be
> useful outside of girar.

I did, but it's a bit complicated. The RPM community likes the idea, but
there is no consensus on how it should work. Of course, each project can
implement it in its own way.

So, should we calculate the package identity on the girar side or on the
rpm side? If it should be on the rpm side, should it support rpm 4.0.4?

> > Future work can cover replacing a sourcerpm "copied" to another
> > branch with the sourcerpm retrieved from the archive, and optimization
> > of binary packages (this case has an issue when binary subpackages are
> > of mixed archs, i.e. arch and noarch; it probably could work only with
> > single-arch builds).
>
> Looks like a good plan. I think optimization of binary packages is more
> important than optimization which looks for archived packages.
> We may want to take binary packages from the archive too anyway.

Ok.

-- 
WBR,
Vladimir D. Seleznev
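
To illustrate the pkg_identity() description quoted above, here is a
minimal sketch in Python. It is not the girar implementation: the header
is assumed to be a plain tag-to-value mapping rather than a real RPM
header, the tag names (including "header_sha1") are hypothetical, and
the choice of SHA-256 over a repr()-based encoding is only an assumption
made for the example.

```python
import hashlib

# Hypothetical set of "nonessential" tags. The thread names buildhost,
# buildtime and the header hashsum; the full list is left open ("etc.").
NONESSENTIAL_TAGS = frozenset({"buildhost", "buildtime", "header_sha1"})


def pkg_identity(header):
    """Return a hex digest over every header tag except nonessential ones."""
    digest = hashlib.sha256()
    # Sort the tags so the result does not depend on insertion order.
    for tag in sorted(header):
        if tag in NONESSENTIAL_TAGS:
            continue
        # repr() of the (tag, value) pair keeps boundaries unambiguous
        # in this toy encoding.
        digest.update(repr((tag, header[tag])).encode("utf-8"))
    return digest.hexdigest()


# Two builds of the same NEVR that differ only in build artifacts.
first = {
    "name": "foo",
    "version": "1.0",
    "release": "alt1",
    "requires": ["libc.so.6"],
    "buildhost": "a.example",
    "buildtime": 1586560000,
}
second = dict(first, buildhost="b.example", buildtime=1586620000)

# A build of the same NEVR whose dependencies changed.
changed = dict(first, requires=["libc.so.6", "libz.so.1"])

print(pkg_identity(first) == pkg_identity(second))   # True
print(pkg_identity(first) == pkg_identity(changed))  # False
```

In this sketch, equal identities for "first" and "second" are what would
let a rebuild keep the already stored package, while "changed" gets a
different identity and would have to be saved as a new build.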