Bisecting Jsoup
This page uses git bisect to identify which commit introduced a bug, using the parsing problem reported in jsoup issue 249 as the example.
jsoup-1.5.2 is a known good tag. Check out jsoup-1.5.2 and verify that the exit code returned is zero (good).
git checkout jsoup-1.5.2 ;\
./build.sh
Repeat the same with jsoup-1.7.1, the latest tag, and verify that the exit code returned is non-zero (bad).
git checkout jsoup-1.7.1 ;\
./build.sh
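The two checks above depend on the exit code of ./build.sh, which the shell exposes as $? right after the command finishes. The following is a minimal sketch of how that check could be made explicit. report_status is a hypothetical helper, not part of git or jsoup, and the two demo calls use stand-in commands instead of the real build script.

```shell
#!/bin/sh
# report_status is a hypothetical helper (an assumption for illustration).
# It runs the given command, hides its output, and prints whether the
# exit status was zero (good) or not.
report_status() {
    status=0
    "$@" > /dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "good (exit 0)"
    else
        echo "not good (exit $status)"
    fi
}

# Stand-ins for ./build.sh on a good and on a bad checkout.
report_status true
report_status sh -c 'exit 1'
```

On a real checkout this would be used as: git checkout jsoup-1.5.2 && report_status ./build.sh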
With both a good and bad commit found, we're ready to start bisecting.
git bisect start ;\
git bisect good jsoup-1.5.2 ;\
git bisect bad jsoup-1.7.1
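The run command must be ./build.sh, not build.sh. Without the ./ prefix the script is looked up on PATH, which normally does not include the current directory, so the build script will not be found.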
git bisect run ./build.sh
After bisecting, remember to reset.
git bisect reset
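git bisect run decides what to do with each commit purely from the exit code of the script it runs. As documented for git bisect, 0 marks the commit good, 125 skips it, any other code from 1 to 127 marks it bad, and anything else aborts the bisect. The sketch below only restates that mapping; bisect_verdict is a hypothetical helper written for illustration.

```shell
#!/bin/sh
# bisect_verdict is a hypothetical helper (an assumption for illustration).
# It prints how git bisect run interprets a given exit code.
bisect_verdict() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo good
    elif [ "$code" -eq 125 ]; then
        echo skip
    elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
        echo bad
    else
        echo abort
    fi
}

for code in 0 1 125 127 128; do
    echo "$code -> $(bisect_verdict "$code")"
done
```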
Bisecting succeeded.
8749726a79c22451b1f01b14fb2137f734e926b4 is the first bad commit
commit 8749726a79c22451b1f01b14fb2137f734e926b4
Author: Jonathan Hedley <[email protected]>
Date: Tue May 10 22:13:23 2011 +1000
Reimplementation of parser and tokeniser, to make jsoup a HTML5 conformat parser, against the
http://whatwg.org/html spec.
:100644 100644 bffc3f44e5d6209de7b3776fbb92eeda79c9c1f4 d7804d0c0e6360521774e7c4688a767ee9b613a9 M CHANGES
:040000 040000 d4a4bba4819036fb82482124a719f10165334a2c 677c41662eb3a6619fbfd6687a1fa0bd80e2209f M src
bisect run success
Let's find out which tag includes this commit.
$ git tag --contains 8749726a79c22451b1f01b14fb2137f734e926b4
jsoup-1.6.2
jsoup-1.6.3
jsoup-1.7.1
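When the list of tags is long, the earliest release that contains the commit can be picked out by sorting the tag names as versions. This is a small sketch that assumes a sort with the -V (version sort) option, as in GNU coreutils. earliest_tag is a hypothetical helper that reads tag names on standard input.

```shell
#!/bin/sh
# earliest_tag is a hypothetical helper (an assumption for illustration).
# It reads tag names, one per line, and prints the lowest version.
# Requires a sort that supports -V (version sort).
earliest_tag() {
    sort -V | head -n 1
}

# The tag list printed above, fed in by hand.
printf '%s\n' jsoup-1.7.1 jsoup-1.6.3 jsoup-1.6.2 | earliest_tag
```

With a repository at hand the input would come from git instead: git tag --contains 8749726a79c22451b1f01b14fb2137f734e926b4 | earliest_tag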
The commit is on the master branch.
$ git branch --contains 8749726a79c22451b1f01b14fb2137f734e926b4
master
The commit can be referenced relative to the jsoup-1.6.2 tag: it is 76 commits behind it.
$ git describe --contains 8749726a79c22451b1f01b14fb2137f734e926b4
jsoup-1.6.2~76
$ git show jsoup-1.6.2~76
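A name such as jsoup-1.6.2~76 reads as "the 76th ancestor of jsoup-1.6.2, following first parents". If a script needs the tag and the distance separately, shell parameter expansion is enough for this simple form. The sketch below is an illustration only: split_describe is a hypothetical helper, and it does not handle more complex names that also contain ^.

```shell
#!/bin/sh
# split_describe is a hypothetical helper (an assumption for illustration).
# It splits a simple "tag~N" name into "tag N". A name without "~" is
# reported with distance 0. Names that also contain "^" are not handled.
split_describe() {
    name=$1
    case $name in
        *"~"*)
            tag=${name%%"~"*}       # everything before the first "~"
            distance=${name#*"~"}   # everything after the first "~"
            echo "$tag $distance"
            ;;
        *)
            echo "$name 0"
            ;;
    esac
}

split_describe "jsoup-1.6.2~76"
```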
Has the commit been cherry-picked? If not, the following command produces no output.
$ git cherry -v 8749726a79c22451b1f01b14fb2137f734e926b4
+ a28fb8eae225ac16279a6086420f310145bf6100 Added test to verify that solidus as end of unquoted attribute in tag is handled as part of attribute, and not a self-closing tag, which was the old behaviour of jsoup.
...
The change is in jsoup-1.6.0 even though git tag --contains doesn't say so. Now that we know it has been cherry-picked, let's look for that second commit.
$ git log --grep="Reimplementation of parser and tokeniser" --all
commit 8749726a79c22451b1f01b14fb2137f734e926b4
Author: Jonathan Hedley <[email protected]>
Date: Tue May 10 22:13:23 2011 +1000
Reimplementation of parser and tokeniser, to make jsoup a HTML5 conformat parser, against the
http://whatwg.org/html spec.
commit 45a3cc68a2d44b9be2cfac65d075f399060cf65b
Author: Jonathan Hedley <[email protected]>
Date: Tue May 10 22:13:23 2011 +1000
Reimplementation of parser and tokeniser, to make jsoup a HTML5 conformat parser, against the
http://whatwg.org/html spec.
Finally, we have our answer that the commit is indeed present in 1.6.0.
$ git tag --contains 45a3cc68a2d44b9be2cfac65d075f399060cf65b
jsoup-1.6.0
jsoup-1.6.1
There is no link from 8749726a79c22451b1f01b14fb2137f734e926b4 to 45a3cc68a2d44b9be2cfac65d075f399060cf65b in the history, so the second commit can only be found by searching the log with --grep. That is a good reason to avoid cherry-pick; read more on Stack Overflow.
JSoupTest.java
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class JSoupTest {

    // HTML code from dma https://github.com/jhy/jsoup/issues/249
    public static String html() {
        StringBuilder _builder = new StringBuilder();
        _builder.append("<html>");
        _builder.append("\n");
        _builder.append("<body> ");
        _builder.append("\n");
        _builder.append(" ");
        _builder.append("<table>");
        _builder.append("\n");
        _builder.append(" ");
        _builder.append("<form action=\"/hello.php\" method=\"post\">");
        _builder.append("\n");
        _builder.append(" ");
        _builder.append("<tr><td>User:</td><td> <input type=\"text\" name=\"user\" /></td></tr>");
        _builder.append("\n");
        _builder.append(" ");
        _builder.append("<tr><td>Password:</td><td> <input type=\"password\" name=\"pass\" /></td></tr>");
        _builder.append("\n");
        _builder.append(" ");
        _builder.append("<tr><td><input type=\"submit\" value=\"login\" /></td></tr>");
        _builder.append("\n");
        _builder.append(" ");
        _builder.append("</form>");
        _builder.append("\n");
        _builder.append(" ");
        _builder.append("</table>");
        _builder.append("\n");
        _builder.append("</body>");
        _builder.append("\n");
        _builder.append("</html>");
        _builder.append("\n");
        return _builder.toString();
    }

    // Modified version of dma's test code https://github.com/jhy/jsoup/issues/249
    public static void main(final String[] args) throws IOException {
        Document doc = Jsoup.parse(html());
        Elements forms = doc.select("form");
        System.out.println(forms.toString());
        // Exit with 1 (bad) when the form was parsed as empty, otherwise 0 (good).
        int exitCode = forms.toString().contains("<form action=\"/hello.php\" method=\"post\"></form>") ? 1 : 0;
        System.out.println(exitCode + " exit code");
        System.exit(exitCode);
    }
}
build.sh
# clean
mvn clean > /dev/null 2>&1
# build jsoup and skip tests
mvn package --quiet -Dmaven.test.skip=true > /dev/null 2>&1
# Separate build failures from code producing invalid results.
EXIT=$?
# If mvn package did not exit with 0
if [ $EXIT -ne 0 ];
then
  # Log the build failure and exit with 0.
  # This prevents git bisect from marking
  # the commit as bad; a commit that fails
  # to build is treated as good instead.
  echo $EXIT " mvn build failure"
  exit 0
fi
# remove old jar if it exists
if [ -f ./target/jsoup.jar ];
then
rm ./target/jsoup.jar
fi
# rename jar
# force move to overwrite
mv -f ./target/jsoup*.jar ./target/jsoup.jar
# run test
javac -classpath ./target/jsoup.jar JSoupTest.java
java -classpath .:./target/jsoup.jar JSoupTest
# Git will use the exit code of the last command.
# 0 = success
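The script above exits with 0 when the Maven build fails, so git bisect counts such commits as good. git bisect run also reserves exit code 125 for "this commit cannot be tested, skip it", which may be a closer fit for build failures. The following is a sketch of that alternative, not the script used for the bisect on this page. bisect_step is a hypothetical wrapper, and the demo calls use stand-in commands instead of mvn and java.

```shell
#!/bin/sh
# bisect_step is a hypothetical wrapper (an assumption for illustration).
# $1 is the build command and $2 is the test command, both run via sh -c.
# A failed build returns 125 so that git bisect run skips the commit;
# otherwise the exit code of the test command is returned unchanged.
bisect_step() {
    build_cmd=$1
    test_cmd=$2
    if ! sh -c "$build_cmd" > /dev/null 2>&1; then
        echo "build failure, skipping commit" >&2
        return 125
    fi
    sh -c "$test_cmd"
}

# Stand-in commands: a failing build, then a passing build with a failing test.
bisect_step "false" "true" || echo "exit code $?"
bisect_step "true" "exit 1" || echo "exit code $?"
```

In build.sh the same idea amounts to replacing exit 0 in the build-failure branch with exit 125. Skipped commits can make the result less precise, because git bisect may end up reporting a range of candidate commits instead of a single one.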