www.infradead.org Git - users/dwmw2/linux.git/commitdiff
setlocalversion: work around "git describe" performance
author Rasmus Villemoes <linux@rasmusvillemoes.dk>
Mon, 18 Nov 2024 11:01:54 +0000 (12:01 +0100)
committer Masahiro Yamada <masahiroy@kernel.org>
Wed, 27 Nov 2024 23:11:56 +0000 (08:11 +0900)
Contrary to expectations, passing a single candidate tag to "git
describe" is slower than not passing any --match options.

  $ time git describe --debug
  ...
  traversed 10619 commits
  ...
  v6.12-rc5-63-g0fc810ae3ae1

  real    0m0.169s

  $ time git describe --match=v6.12-rc5 --debug
  ...
  traversed 1310024 commits
  v6.12-rc5-63-g0fc810ae3ae1

  real    0m1.281s

In fact, the --debug output shows that git traverses all or most of
history. For some repositories and/or git versions, those 1.3s are
actually 10-15 seconds.

This has been acknowledged as a performance bug in git [1], and a fix
is on its way [2]. However, no solution is yet in git.git, and even
when one lands, it will take quite a while before it finds its way to
a release and for $random_kernel_developer to pick that up.

So rewrite the logic to use plumbing commands. For each of the
candidate values of $tag, we ask: (1) Is $tag even an annotated
tag? (2) Is it eligible to describe HEAD, i.e. an ancestor of
HEAD? (3) If so, how many commits are in $tag..HEAD?
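
As a rough sketch of those three questions (not the code in this
patch, which folds (2) and (3) into a single rev-list call), each can
be answered with one plumbing command. The helper name
count_since_tag, the tag names and the throwaway repository below are
made up for illustration.

```shell
#!/bin/sh
# Illustrative only: count_since_tag is a hypothetical helper, not part of
# scripts/setlocalversion.
count_since_tag() {
    candidate="$1"

    # (1) An annotated tag is an object of type "tag"; a lightweight tag
    #     yields "commit" and a missing ref yields nothing.
    [ "$(git cat-file -t "$candidate" 2>/dev/null)" = tag ] || return 1

    # (2) Exit status 0 means $candidate is an ancestor of HEAD.
    git merge-base --is-ancestor "$candidate" HEAD 2>/dev/null || return 1

    # (3) Commits reachable from HEAD but not from $candidate.
    git rev-list --count "$candidate"..HEAD
}

# Throwaway repository to exercise the helper.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git commit -q --allow-empty -m one
git tag -a -m "annotated demo tag" v0-demo
git commit -q --allow-empty -m two
git commit -q --allow-empty -m three
git tag light-demo    # lightweight tag on HEAD, fails check (1)

annotated_count=$(count_since_tag v0-demo)
if count_since_tag light-demo >/dev/null; then light_ok=yes; else light_ok=no; fi
echo "v0-demo: $annotated_count commits behind HEAD; light-demo usable: $light_ok"

cd / && rm -rf "$repo"
```

Keeping the ancestry check separate makes each step easy to read, at
the cost of one extra git invocation per candidate tag.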

I have tested that this produces the same output as the current script
for ~700 random commits in the range v6.9..v6.10. For those ~700
commits, in my git repo, the 'make -s kernelrelease' command is on
average ~4 times faster with this patch applied (geometric mean of
ratios).
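
For readers unfamiliar with the metric: the geometric mean of
per-commit speedup ratios is the exponential of the mean of their
logarithms. A minimal sketch with awk follows; the timings are
invented placeholders, not the measurements behind the ~4x figure, and
geomean_speedup is a hypothetical helper name.

```shell
#!/bin/sh
# Reads "before after" timing pairs (seconds), one pair per line, and
# prints the geometric mean of the before/after ratios.
geomean_speedup() {
    LC_ALL=C awk '{ sum += log($1 / $2); n++ }
                  END { if (n) printf "%.2f\n", exp(sum / n) }'
}

# Placeholder data: ratios of 2 and 8. Their arithmetic mean would be 5,
# while the geometric mean is sqrt(2 * 8) = 4.
speedup=$(printf '%s\n' "1.0 0.5" "4.0 0.5" | geomean_speedup)
echo "geometric mean speedup: $speedup"
```

The geometric mean is the usual choice for averaging ratios because a
2x slowdown and a 2x speedup cancel out to 1, which an arithmetic mean
would not give.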

For the commit mentioned in Josh's original report [3], the
time-consuming part of setlocalversion goes from

  $ time git describe --match=v6.12-rc5 c1e939a21eb1
  v6.12-rc5-44-gc1e939a21eb1

  real    0m1.210s

to

  $ time git rev-list --count --left-right v6.12-rc5..c1e939a21eb1
  0       44

  real    0m0.037s
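
The patch itself calls rev-list with the three-dot (symmetric
difference) form, "$tag"...HEAD, so that the left count doubles as the
ancestry check. A small sketch of what the two counts mean, using a
throwaway repository; the tag names on-main and on-side are made up
for illustration.

```shell
#!/bin/sh
# Throwaway repository; every name below is illustrative.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .

# History: A - B - C (tip), with annotated tag on-main at A.
git commit -q --allow-empty -m A
git tag -a -m "tag on A" on-main
git commit -q --allow-empty -m B
git commit -q --allow-empty -m C
tip=$(git rev-parse HEAD)

# Side commit D with parent A, tagged on-side, not reachable from the tip.
git checkout -q --detach on-main
git commit -q --allow-empty -m D
git tag -a -m "tag on D" on-side
git checkout -q "$tip"

# Left count: commits reachable only from the tag.
# Right count: commits reachable only from HEAD.
# shellcheck disable=SC2046 # word splitting is intended
set -- $(git rev-list --count --left-right on-main...HEAD)
ancestor_result="$1 $2"
# shellcheck disable=SC2046
set -- $(git rev-list --count --left-right on-side...HEAD)
diverged_result="$1 $2"

echo "on-main...HEAD: $ancestor_result"   # left is 0: tag is an ancestor
echo "on-side...HEAD: $diverged_result"   # left is non-zero: tag is not

cd / && rm -rf "$repo"
```

With the two-dot form the tag side is excluded entirely, so the left
count is always 0 and only the right count carries information.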

[1] https://lore.kernel.org/git/20241101113910.GA2301440@coredump.intra.peff.net/
[2] https://lore.kernel.org/git/20241106192236.GC880133@coredump.intra.peff.net/
[3] https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/

Reported-by: Sean Christopherson <seanjc@google.com>
Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@google.com/
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
Tested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
scripts/setlocalversion

index 38b96c6797f4080ee07fc40e074638994d5ba3d0..5818465abba98461817bc86372bdcdb8b006e137 100755 (executable)
@@ -30,6 +30,27 @@ if test $# -gt 0 -o ! -d "$srctree"; then
        usage
 fi
 
+try_tag() {
+       tag="$1"
+
+       # Is $tag an annotated tag?
+       [ "$(git cat-file -t "$tag" 2> /dev/null)" = tag ] || return 1
+
+       # Is it an ancestor of HEAD, and if so, how many commits are in $tag..HEAD?
+       # shellcheck disable=SC2046 # word splitting is the point here
+       set -- $(git rev-list --count --left-right "$tag"...HEAD 2> /dev/null)
+
+       # $1 is 0 if and only if $tag is an ancestor of HEAD. Use
+       # string comparison, because $1 is empty if the 'git rev-list'
+       # command somehow failed.
+       [ "$1" = 0 ] || return 1
+
+       # $2 is the number of commits in the range $tag..HEAD, possibly 0.
+       count="$2"
+
+       return 0
+}
+
 scm_version()
 {
        local short=false
@@ -61,33 +82,33 @@ scm_version()
        # stable kernel:    6.1.7      ->  v6.1.7
        version_tag=v$(echo "${KERNELVERSION}" | sed -E 's/^([0-9]+\.[0-9]+)\.0(.*)$/\1\2/')
 
+       # try_tag initializes count if the tag is usable.
+       count=
+
        # If a localversion* file exists, and the corresponding
        # annotated tag exists and is an ancestor of HEAD, use
        # it. This is the case in linux-next.
-       tag=${file_localversion#-}
-       desc=
-       if [ -n "${tag}" ]; then
-               desc=$(git describe --match=$tag 2>/dev/null)
+       if [ -n "${file_localversion#-}" ] ; then
+               try_tag "${file_localversion#-}"
        fi
 
        # Otherwise, if a localversion* file exists, and the tag
        # obtained by appending it to the tag derived from
        # KERNELVERSION exists and is an ancestor of HEAD, use
        # it. This is e.g. the case in linux-rt.
-       if [ -z "${desc}" ] && [ -n "${file_localversion}" ]; then
-               tag="${version_tag}${file_localversion}"
-               desc=$(git describe --match=$tag 2>/dev/null)
+       if [ -z "${count}" ] && [ -n "${file_localversion}" ]; then
+               try_tag "${version_tag}${file_localversion}"
        fi
 
        # Otherwise, default to the annotated tag derived from KERNELVERSION.
-       if [ -z "${desc}" ]; then
-               tag="${version_tag}"
-               desc=$(git describe --match=$tag 2>/dev/null)
+       if [ -z "${count}" ]; then
+               try_tag "${version_tag}"
        fi
 
-       # If we are at the tagged commit, we ignore it because the version is
-       # well-defined.
-       if [ "${tag}" != "${desc}" ]; then
+       # If we are at the tagged commit, we ignore it because the
+       # version is well-defined. If none of the attempted tags exist
+       # or were usable, $count is still empty.
+       if [ -z "${count}" ] || [ "${count}" -gt 0 ]; then
 
                # If only the short version is requested, don't bother
                # running further git commands
@@ -95,14 +116,15 @@ scm_version()
                        echo "+"
                        return
                fi
+
                # If we are past the tagged commit, we pretty print it.
                # (like 6.1.0-14595-g292a089d78d3)
-               if [ -n "${desc}" ]; then
-                       echo "${desc}" | awk -F- '{printf("-%05d", $(NF-1))}'
+               if [ -n "${count}" ]; then
+                       printf "%s%05d" "-" "${count}"
                fi
 
                # Add -g and exactly 12 hex chars.
-               printf '%s%s' -g "$(echo $head | cut -c1-12)"
+               printf '%s%.12s' -g "$head"
        fi
 
        if ${no_dirty}; then