LookasideRepos - daudo/ojdkbuild GitHub Wiki

This page describes the structure of the Git repositories used in ojdkbuild.

Sources in RPM

RPM is a package management system used by, among others, the RHEL, CentOS and Fedora GNU/Linux distributions.

Package source code consists of two parts:

  • sources from the upstream project
  • a set of patches for these sources, specific to the chosen distribution (and its version)

Patches are stored in a Git repository along with the .spec file and any other auxiliary files used during the build process.

Upstream sources are not stored inside the same Git repository. They are bundled into .tar.gz or .tar.xz tarball(s) and stored in compressed form in a separate storage called the "lookaside cache". Such tarballs are called "pristine sources".

Pristine sources

RPM is designed around the notion of "upstream projects" and "package maintainers".

Upstream projects (zlib as an example) are generally created to be used on a much wider scale than a single Linux distribution. Such a project is then packaged for a concrete distro (zlib in CentOS 7 example).

Upstream project maintainers and distro package maintainers are usually different people with different (and often conflicting) priorities for the development of that project.

The notion of "pristine sources", obtained from the upstream project untouched, draws a clean line of separation between upstream and distro maintainers. If a distro maintainer wants to make a small change to the project source code (or build scripts etc.), they can add that change as an RPM patch. Depending on the project, a package can have dozens of such patches (nss example).

The "pristine sources + patches" model allows the upstream maintainer to move the project forward and release new upstream versions that may be incompatible with some of the downstream distros. At the same time it allows distro maintainers to adapt new upstream versions to the needs of the distro.

It also facilitates the development of the project for distro needs. If patches become too big or complicated to maintain, or can no longer be applied to newer upstream versions, the distro maintainer must work with the upstream maintainer to move some of the required logic from the patches into upstream.

Besides that, this model makes troubleshooting easier thanks to "traceability": for each change in the final (patched) RPM sources it is always clear whether it came from an upstream change or from a local patch.

Lookaside cache in CentOS 7

"Pristine sources" tarballs are fetched from a "lookaside cache" at the beginning of the RPM build process (or during the bundling of a source RPM, SRPM). Tarballs are usually fetched over HTTP(S). Different distributions use different addressing schemes for storing tarballs in the lookaside cache, different formats for tarball meta-descriptors (name + hashsum files) and different tools for fetching tarballs.

The ojdkbuild project uses sources from CentOS 7, so all the following details are specific to CentOS. Details for Fedora can be found in its package maintenance guide.

CentOS 7 (official guide link) stores package sources in Git repositories hosted at git.centos.org. Each repository has a .<package_name>.metadata file in its root that contains "hashsum"->"name" pairs (nspr example) for all tarballs that should be fetched from the lookaside storage.
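
As an illustration of how such a metadata file can be read, here is a minimal Python sketch. It assumes that each non-empty line holds one hashsum followed by whitespace and the tarball name; that layout, the function name parse_metadata and the sample content are assumptions made for this example, not something defined by CentOS tooling.

```python
from typing import List, Tuple


def parse_metadata(text: str) -> List[Tuple[str, str]]:
    """Return (hashsum, name) pairs from the text of a .<package_name>.metadata file.

    Assumed layout (not verified against every package):
    one "<hashsum><whitespace><name>" entry per line.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        hashsum, name = line.split(None, 1)
        entries.append((hashsum, name.strip()))
    return entries


if __name__ == "__main__":
    # Hypothetical sample: the hashsum and the file name are made up.
    sample = "0123456789abcdef0123456789abcdef01234567 SOURCES/example-1.0.tar.gz\n"
    for hashsum, name in parse_metadata(sample):
        print(hashsum, "->", name)
```

Each returned pair names one tarball: the hashsum identifies it in the lookaside storage and the name says how it should be stored locally.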

The lookaside storage is hosted at git.centos.org/sources/, and tarballs can be fetched from there directly over HTTP(S) without any additional tooling (nspr example; each hashsum link on that page corresponds to one tarball).

Fetching tarballs manually is inconvenient (for example, to restore the original tarball name, its hashsum has to be matched against the corresponding metadata file). To make the process easier, the CentOS project developed the centos-git-common tools for common packaging operations. Among these tools is the get_sources.sh script, which should be run from the package sources root. The script reads the .<package_name>.metadata file, fetches all the tarballs from git.centos.org/sources/ and stores them under the specified names.

As an alternative to get_sources.sh, the centpkg tool can be used for the same purpose.
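
The fetch-and-rename step described above can be sketched in Python roughly as follows. This is not a port of get_sources.sh, only an illustration under stated assumptions: the hashsum is assumed to be a SHA-1 hex digest, and the exact URL layout under git.centos.org/sources/ is not described on this page, so the caller supplies it through a hypothetical url_for function. The names store_tarball, fetch_sources and http_get are invented for this example.

```python
import hashlib
import os
import urllib.request
from typing import Callable, Iterable, List, Tuple


def http_get(url: str) -> bytes:
    """Download the body of a URL over HTTP(S)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def store_tarball(data: bytes, hashsum: str, name: str, root: str) -> str:
    """Verify the data against its hashsum and write it under its original name."""
    # Assumption: the metadata hashsum is a SHA-1 hex digest.
    digest = hashlib.sha1(data).hexdigest()
    if digest != hashsum.lower():
        raise ValueError(f"hashsum mismatch for {name}: got {digest}")
    path = os.path.join(root, name)
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as out:
        out.write(data)
    return path


def fetch_sources(
    entries: Iterable[Tuple[str, str]],
    url_for: Callable[[str], str],
    root: str = ".",
    read_url: Callable[[str], bytes] = http_get,
) -> List[str]:
    """Fetch every (hashsum, name) entry and store it under root.

    url_for maps a hashsum to its download URL; it is supplied by the
    caller because the lookaside URL layout is an assumption here.
    """
    paths = []
    for hashsum, name in entries:
        data = read_url(url_for(hashsum))
        paths.append(store_tarball(data, hashsum, name, root))
    return paths
```

Verifying the hashsum before writing the file means a corrupted or mismatched download is rejected instead of being stored silently under the expected name.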

RPM and ojdkbuild

By design, ojdkbuild uses sources from the CentOS 7 lookaside storage. But having Windows as the main target platform makes using the RPM build infrastructure non-trivial.

While RPM can in theory be ported to non-Linux platforms (and has been ported to some Unix ones), it is designed for Linux. It makes a number of assumptions about the tools (usually GNU tools) that must be available on the build system: bash, tar, patch etc. Such tools are not available on Windows (excluding the various "*nix compatibility" layers with their varying levels of "compatibility").

The process of fetching and patching lookaside tarballs can be "mimicked" on Windows using scripting and additional tools such as curl and Windows ports of some GNU tools. Such "mimicking" was tried in early (non-public) versions of ojdkbuild but was scrapped, because "rewriting RPM in batch files" turned out to be a bad idea.

Tarballs vs Git

Historically, the RPM design decision to store "pristine sources" as tarballs was made a decade before the widespread use of DVCS (Git et al.). At that time an upstream project might or might not have had a public source code repository (or any version control at all), but most projects provided source releases as tarballs (zlib example).

At present the majority of upstream projects have a public DVCS repository (usually Git) available (zlib git example), and the majority of lookaside tarballs are produced by bundling sources from upstream repositories. Thus tarballs in RPM are effectively used as a "transport format" for importing upstream sources. For projects that use Git, the same effect can be achieved by including upstream repositories as Git submodules of the packaging project.

Lookaside repos structure in ojdkbuild

For its repository structure ojdkbuild chooses a middle ground between "tarballs in a lookaside storage" and "upstream repos as git submodules".

For each dependency library ojdkbuild creates a lookaside_<libname> repository (nspr example) with the following features:

  • tarballs from the CentOS 7 lookaside cache are unpacked and stored in the lookaside_<libname> git branch of the repository
  • each version of the tarball is stored on top of the previous version, preserving the directory structure, deleted files etc.
  • the change between two tarball versions becomes an ordinary "git commit", effectively allowing the lookaside_<libname> repo to be a "10000 foot view" of the history of the original upstream repository
  • RPM patches are applied to the master branch of the lookaside_<libname> repository
  • any other patches that are not included in the CentOS RPM are also applied to the master branch
  • the master branch is merged with the lookaside branch after each import of a new tarball version
  • builds are always done from the master branch
  • the lookaside_<libname> repository is used as a git submodule of the main ojdkbuild project (see lookaside directory)
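
The import of a new tarball version described in the list above could look roughly like the following Python sketch, which only builds the git command lines and leaves running them (and unpacking the tarball) to the caller. The function names, the commit message format and the choice to clear the tree with git rm before unpacking are assumptions made for illustration; the page does not state which commands ojdkbuild maintainers actually use.

```python
from typing import List


def prepare_commands(libname: str) -> List[List[str]]:
    """Commands to run before unpacking the new tarball into the work tree."""
    branch = f"lookaside_{libname}"
    return [
        ["git", "checkout", branch],
        # Remove tracked files so that files deleted upstream also
        # disappear from the tree once the new version is unpacked.
        ["git", "rm", "-r", "-q", "--ignore-unmatch", "."],
    ]


def commit_and_merge_commands(libname: str, version: str) -> List[List[str]]:
    """Commands to run after the new tarball has been unpacked."""
    branch = f"lookaside_{libname}"
    return [
        ["git", "add", "-A"],
        # Hypothetical commit message format.
        ["git", "commit", "-m", f"{libname} {version}"],
        ["git", "checkout", "master"],
        # Bring the new upstream version under the patches kept on master.
        ["git", "merge", branch],
    ]


if __name__ == "__main__":
    # Hypothetical library name and version, printed as a dry run.
    for cmd in prepare_commands("example"):
        print(" ".join(cmd))
    print("# unpack the new tarball into the work tree here")
    for cmd in commit_and_merge_commands("example", "1.2.3"):
        print(" ".join(cmd))
```

Keeping the commands as plain lists makes the sequence easy to inspect or dry-run before anything is executed, for example by passing each list to subprocess.run inside a clone of the lookaside_<libname> repository.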