Development Environment

Using Mercurial’s bisect extension to find bugs

One week ago Openbravo ERP code was moved from Subversion to Mercurial. I am completely new to Mercurial and to distributed SCMs since I always worked with Subversion but, apart from the new concepts it incorporates, the transition was very smooth, at least till the moment. I’ve spent some time during the last days looking the extensions Mercurial has and, for me, one of the nicest ones is bisect. Bisect can be very useful to find the changeset when a bug was introduced in the code.

A real example

Recently I was assigned this bug. I discovered that it was not present in  r2.40 but it was in development branch, furthermore, I find out the bug was caused because a line had been removed from the code. At this point I wanted to know which commit removed that line, just to know if it was a mistake or it was done on purpose trying to fix another bug, so the tedious work started: given two revisions one that has the bug (head of development branch) and another one that has not it (r2.40 tag) try different revisions in between to find which one removed the line that causes the bug. This is not only tedious but also very time consuming.

A good solution: bisect

Bisect is an extension for Mercurial that makes this kind of work much faster. Bisect’s behavior is pretty simple, you tell it the good and bad changesets and it updates your working copy to another one between them, then you can test if in that one the bug is present and you mark it as good or bad, then the process is run again until it finds the changeset that introduces the bug.  In fact bisect just decides for you which is the next changeset to test in. It sounds not to be a very high improvement, but if you combine it with some simple (at least in some occasions) automated test the results can be awesome.

Let me explain it through a simple example:

  • First prepare the environment: a file with a lot of lines, and a commit somewhere there removing a line which we’ll look for afterwards.
$ hg init testBisect
$ cd testFile
$ cat testFile
This is
a file
with some feature
and this
line here
is needed to work fine
$ hg ci -A -m "init file"
adding testFile
$   for (( i = 0; i < 335; i++ )); do echo "line"$i >> testFile; hg commit -m "change here"; done
$ sed -i 's/line here//' testFile
$ hg ci -m "this commit is buggy"
$ for (( i = 0; i < 872; i++ )); do      echo "line"$i >> testFile;      hg commit -m "change here";      done
$ hg parents
changeset:   1208:7568a581b554
tag:         tip
user:        Asier Lostalé
date:        Mon Mar 02 11:13:39 2009 +0100
summary:     change here
  • Now we have 1208 changests! Let’s make the script to decide whether a revision is buggy or not and to continue looking in case it is not:
$  cat test1.sh
#!/bin/sh
MIN_ARGS=2
if [ $# -lt $MIN_ARGS ]; then
  echo "Usage: $(basename $0) FILE TEXT_TO_FIND" >&2
  exit 1
fi
FILE=$1
shift
TEXT_TO_FIND=$*
check() {
   grep -q "$TEXT_TO_FIND" $FILE && RESULT=good || RESULT=bad
   echo $RESULT

   hg bisect --$RESULT
}
while :
do
  if check | grep -q 'Testing changeset'
then
  echo
  hg bisect
else
  hg bisect
  exit 0
fi
done
  • Now we are ready to start testing, first of all reset bisect and tell it which is the bad and good known revisions. Bad is current one and good is the first one.
$ hg bisect --reset
$ hg bisect --bad
$ hg bisect --good 1
Testing changeset 604:9d6a42635e81 (1207 changesets remaining, ~10 tests)
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  • Finally just execute the test to find out who removed the line.
$ time ./test1.sh testFile 'line here'
Testing changeset 302:132a5339324e (603 changesets remaining, ~9 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 453:aa92eb899545 (302 changesets remaining, ~8 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 377:5c8e69bdb1ce (151 changesets remaining, ~7 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 339:05f7bb18e505 (75 changesets remaining, ~6 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 320:3107aee2dbd2 (37 changesets remaining, ~5 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 329:0d907ee53cdb (19 changesets remaining, ~4 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 334:8a0d38375333 (10 changesets remaining, ~3 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved

Testing changeset 336:dc2037e24dfc (5 changesets remaining, ~2 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 335:939ca611ae0f (2 changesets remaining, ~1 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
The first bad revision is:
changeset:   336:dc2037e24dfc
user:        Asier Lostalé
date:        Mon Mar 02 11:11:52 2009 +0100
summary:     this commit is buggy
real    0m1.374s
user    0m1.128s
sys    0m0.204s

And we are done: in less than 1.5s we know which is the commit that removed the line!

Though this example is quite theoretical I think bisect is a very good solution for this kind of searches. Do you have experience with this extension? Any comment about it is welcome.

Posted in Openbravo Tagged: mercurial, scm

Previous post

Mercurial ready for Openbravo ERP

Next post

Quick status update on ERP 2.50

No Comment

Leave a reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>