I have been playing with TAP for some time and even implemented a Java API that lets TestNG, JUnit and other test frameworks produce and consume TAP. TAP is a standard format for test output that first appeared with Perl 1 in 1987. It is human and machine readable, easy to serialize, language independent and extensible¹ through the use of YAML.
Some days ago, while I was designing a plug-in to show TAP test results in Jenkins, I stumbled across a message on the Jenkins dev-list in which Max Magee and Nick Wiesmueller were discussing a way of showing more details about test executions. I thought that the TAP Plug-in would fit perfectly, until one of the users, Robert Collins, mentioned SubUnit.
Shame on me, but I hadn’t heard of SubUnit until that message. Max Magee and I exchanged some messages after that, talking about an initial design and analysis for the TAP Plug-in². Here is the initial idea:
- The plug-in will be able to parse one or more test formats (maybe SubUnit, TAP and the formats available in xUnit?).
- The test results will be displayed the same way JUnit tests are displayed in Jenkins (I think Jenkins supports JUnit format by default, but you can use objects and create test results data, independently of the test framework that you are using).
- There will be a table containing the Test Name, Description and Status and an expandable section.
- This expandable section will contain all the details about the test.
- In case there are images within the test details, they should be displayed as a lightbox gallery.³
Although I have worked with TAP and spent a good amount of time writing tap4j, the TAP port for Java, I am not yet convinced it is the best solution for this issue. Hence I am posting this initial comparison between TAP and SubUnit, hoping that more people will contribute to the design of this solution. My goal is not only to have a super cool plug-in for Jenkins, but also to ease the integration of test results across different tools and to collaborate with both the TAP and SubUnit communities. Another objective I have in mind is improving the way test results are displayed in Jenkins, enabling it to be an alternative to tools like Smolder, TestRepository or Tribunal. I believe the tasks done by these tools could all be done in my favorite CI server, and that this would increase the productivity of Build & Release professionals.
Comparison table⁴
I believe tables are a good way to compare different technologies. However, if anybody can recommend a different way of doing it, I would be glad to give it a try. In case there are missing items or you have other suggestions, please do not hesitate to get in touch.
| | TAP | SubUnit |
|---|---|---|
| Human and Machine readable format | Yes | Yes |
| Language independent | Yes | Yes |
| Programming languages supported⁵ | Perl, Python, PHP, Java, C, C++, C#, Lua, Shell, Ruby, Javascript, Pascal, PostgreSQL, Haskell, Lisp, Forth, Limbo | Python, C, C++ and Shell |
| Since | 1987 | 2006 |
| Grouping tests in some category/tag style | Proposal⁶ ⁷ | Yes |
| Extensible | Yes, YAML | N/A? |
| Documentation | Good, but old. | Few examples, blogs or Wikis for beginners. |
| Used in real-world? | Yes, an enormous number of modules in CPAN use it | Yes (e.g.: Samba) |
| Format specification | Draft at IETF | Information on Python Package Index⁸ |
| Show time of tests | Yes, with YAML | Yes, natively |
| Use custom test status | No | No |
| Attach files to test result | Yes, Base64 encoded in YAML | Yes, Base64 encoded in test output |
Examples
TAP:
[css]
1..1
not ok 1 Wrong length
  ---
  wanted: 5
  found: 4
  time: 2011-02-01 00:09:01-07
  extensions:
    files:
      1.txt:
        name: 1.txt
        file-type: text/plain
        file-size: 43
        content: c2FtcGxl
  ...
[/css]
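To make the structure of such a stream more concrete, here is a minimal Python sketch of a consumer. It is not how tap4j works; `parse_tap` and `attachment_bytes` are hypothetical names made up for this illustration, the indentation of the sample YAML block is an assumption, and the YAML is kept as raw lines instead of being parsed, so nesting is ignored.

```python
import base64
import re

# Sample modelled on the TAP example; the YAML indentation is an assumption.
TAP_STREAM = """\
1..1
not ok 1 Wrong length
  ---
  wanted: 5
  found: 4
  extensions:
    files:
      1.txt:
        content: c2FtcGxl
  ...
"""

PLAN_RE = re.compile(r"^1\.\.(\d+)$")
RESULT_RE = re.compile(r"^(not ok|ok)\b\s*(\d+)?\s*-?\s*(.*)$")


def parse_tap(text):
    """Return (planned_count, results); each result keeps its raw YAML lines."""
    planned, results, in_yaml = None, [], False
    for line in text.splitlines():
        stripped = line.strip()
        if in_yaml:
            if stripped == "...":          # end of the YAML block
                in_yaml = False
            else:
                results[-1]["yaml"].append(stripped)
        elif stripped == "---" and results:  # start of the YAML block
            in_yaml = True
        elif PLAN_RE.match(stripped):
            planned = int(PLAN_RE.match(stripped).group(1))
        elif RESULT_RE.match(stripped):
            status, number, description = RESULT_RE.match(stripped).groups()
            results.append({
                "ok": status == "ok",
                "number": int(number) if number else len(results) + 1,
                "description": description,
                "yaml": [],
            })
    return planned, results


def attachment_bytes(result):
    """Decode the first Base64 'content:' entry among a result's YAML lines."""
    for entry in result["yaml"]:
        if entry.startswith("content:"):
            return base64.b64decode(entry.split(":", 1)[1].strip())
    return None


if __name__ == "__main__":
    planned, results = parse_tap(TAP_STREAM)
    print(planned, results[0]["description"], attachment_bytes(results[0]))
```

A real consumer would hand the YAML block to a proper YAML parser; the flat scan here only serves to show where the attachment lives in the stream.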
SubUnit (using Python + nose):
[css]
time: 2011-05-23 22:49:38.856075Z
test: my_test.SampleTestCase.runTest
failure: my_test.SampleTestCase.runTest [
Traceback (most recent call last):
  File "/media/windows/dev/java/qa_workspace/python_nose_tests/src/my_test.py", line 11, in runTest
    self.assertEqual(len(s), 4, 'Wrong length')
AssertionError: Wrong length
]
time: 2011-05-23 22:49:38.858163Z
[/css]
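For comparison, a text stream like this one can be read with a few lines of Python. This is only a sketch under my own assumptions, not the python-subunit API: `parse_subunit_v1` is a hypothetical name, a details block is assumed to end at the first line that consists of a single `]`, and the multipart form of details is not handled.

```python
# Shortened copy of the SubUnit example (traceback trimmed for brevity).
SUBUNIT_STREAM = """\
time: 2011-05-23 22:49:38.856075Z
test: my_test.SampleTestCase.runTest
failure: my_test.SampleTestCase.runTest [
Traceback (most recent call last):
AssertionError: Wrong length
]
time: 2011-05-23 22:49:38.858163Z
"""

# Outcome keywords; the trailing colon is stripped before the lookup.
OUTCOMES = ("success", "successful", "failure", "error",
            "skip", "xfail", "uxsuccess")


def parse_subunit_v1(text):
    """Collect outcome lines and their bracketed details from a text stream."""
    results = []
    lines = iter(text.splitlines())
    for line in lines:
        keyword, _, rest = line.partition(" ")
        keyword = keyword.rstrip(":")
        if keyword not in OUTCOMES:
            continue  # 'test:', 'time:', 'tags:' and so on are skipped here
        details = []
        if rest.endswith(" ["):
            rest = rest[:-2]
            # Consume lines from the same iterator until the closing bracket.
            for detail in lines:
                if detail == "]":
                    break
                details.append(detail)
        results.append({"test": rest, "outcome": keyword, "details": details})
    return results


if __name__ == "__main__":
    for result in parse_subunit_v1(SUBUNIT_STREAM):
        print(result["outcome"], result["test"], result["details"][-1])
```

Note that with this simple rule a traceback line containing nothing but `]` would end the details block early.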
Final considerations
Although I posted this comparison on my blog, my intention is to hand it over to the community somehow, probably by putting it on Wikipedia. Perhaps my thoughts here were biased by my proximity to TAP; however, I am open to suggestions, ideas and criticism (as proof that I am open to SubUnit, I added it to the list of ‘Other test protocols’ in the TAP Wiki, as there was only TST).
Initially I wrote this post as a draft and sent it to Max, Nick, Cesar Fernandes and Robert Collins for review (I haven’t heard back from Robert, unfortunately). Later I plan to send it to the TAP development team and to the people responsible for the Automake GSoC TAP/SubUnit project. Then I will decide which protocol to stick with to develop the TAP Plug-in (or SubUnit). When this analysis is finished I will write an alpha version of the plug-in and send it to the Jenkins dev-list. Let me know if you would like to give it a try too.
I believe that the easiest way to spread TAP or SubUnit as the de facto standard is to use it, and to ask the maintainers of test frameworks such as TestNG and JUnit to add support for these formats in their tools or to make one of them the default output format.
Edit
As pointed out by Renormalist, some tools that generate TAP also use another kind of diagnostics to extend the test protocol. In this approach, the line after a test result starts with a ‘#’ followed by a message. A test result may have several comment lines with diagnostic information, and these comments belong to the test result above them. The Perl Test::More module produces diagnostics in this way by default. Below is an example of these diagnostics.
[css]
1..1
not ok 1 - There's a foo user
# Failed test 'There's a foo user'
# at /home/kinow/perl/workspace/tests_with_testmore/main.pl line 2.
# Since there's no foo, check that /etc/bar is set up right
# Looks like you failed 1 test of 1.
[/css]
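The rule that a comment belongs to the test result above it can be sketched as follows. This is an illustration only: `attach_diagnostics` is a hypothetical name, the file path in the sample is shortened, and the approach is deliberately naive.

```python
# Sample modelled on the Test::More output (path shortened for the sketch).
TAP_WITH_DIAGNOSTICS = """\
1..1
not ok 1 - There's a foo user
# Failed test 'There's a foo user'
# at main.pl line 2.
# Looks like you failed 1 test of 1.
"""


def attach_diagnostics(text):
    """Attach each '#' line to the closest preceding test result."""
    results, loose = [], []
    for line in text.splitlines():
        if line.startswith(("ok", "not ok")):
            results.append({"line": line, "diagnostics": []})
        elif line.startswith("#"):
            message = line[1:].strip()
            # Comments seen before any test result have no owner.
            target = results[-1]["diagnostics"] if results else loose
            target.append(message)
    return results, loose


if __name__ == "__main__":
    results, loose = attach_diagnostics(TAP_WITH_DIAGNOSTICS)
    for message in results[0]["diagnostics"]:
        print(message)
```

One weakness of this rule is visible in the sample: the final summary line describes the whole run, yet it ends up attached to the last test result.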
1 Available in TAP version 13, http://testanything.org/wiki/index.php/YAMLish.
2 Throughout this article I use TAP Plug-in to refer to the plug-in that displays detailed test results, though after talking with Max we agreed that perhaps it would be a good idea to implement it in some generic manner, not specific to TAP. We also agreed it would be good to check other plug-ins like xUnit to see if we can extend them or use some of their code as a basis.
3 I still have to think more about it. Probably the images will be enclosed in the TAP stream (or another format), Base64 encoded. Perhaps we would have to decode each attachment in the test result and display it according to its MIME type (zips, pdf, etc). But a lightbox gallery for attachments would be awesome!
4 For the first version of this comparison table I added the items that I could think of, and other items retrieved from the comparison done in Automake GSoC discussion list about the choice between TAP and SubUnit.
5 Here we considered languages that have at least one producer for the protocol.
6 Test Blocks proposal, http://testanything.org/wiki/index.php/Test_Blocks.
7 Test Groups proposal, http://testanything.org/wiki/index.php/Test_Groups.
8 Python Package Index for python-subunit 0.0.6, http://pypi.python.org/pypi/python-subunit/0.0.6. This was the only one that I could find, but there may be another specification somewhere else.
References
Jenkins dev-list discussion where the TAP Plug-in idea was sent to http://jenkins.361315.n4.nabble.com/Re-Additional-Test-Result-Display-Idea-tt3510669.html.
Another discussion in Jenkins dev-list about Test Result refactoring http://jenkins.361315.n4.nabble.com/Review-requested-Test-Result-Refactoring-tt978100.html.
Test Anything Protocol Wiki – http://www.testanything.org.
Perl Wikipedia Article, History section – http://en.wikipedia.org/wiki/Perl#History.
automake – Interfacing with a test protocol like TAP or subunit (GSoC) http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/slattarini/1.











What’s the status of that? Did you go with TAP or SubUnit?
Your comparison table together with the examples is very helpful. Thanks.
What happens syntactically in the SubUnit example when the traceback message itself contains a closing bracket as the first character of a line? How is that unbalanced bracket nesting parsed?
To complement the TAP example with similar diagnostics you can use comment lines, starting with ‘#’, e.g.:
1..2
ok - foo
ok - bar
# Traceback (most recent call last):
# File "/src/my_test.py", line 11,
# in runTest self.assertEqual [...
# ] <-- some lonely bracket
# AssertionError: Wrong length
In TAP consumers like TAP::DOM, comment lines semantically belong to the respective ok line just before them. However, this is not strictly part of the standard's semantics.
OK, that's just my 2 cents.
Thanks again for the comparison of TAP vs. SubUnit.
Renormalist
Hi Renormalist!
Sorry for taking so long to reply, I’ve been updating the testlink-plugin (which also uses/supports TAP) documentation.
I went with TAP. I thought about implementing an abstraction layer and adding support for both test protocols. However, as SubUnit currently has no port for Java, it was too hard to accomplish. Check the result here: https://wiki.jenkins-ci.org/display/JENKINS/TAP+Plugin.
I created a sample project on GitHub (https://github.com/kinow/nose_and_subunit). If you run ‘nosetests --with-subunit’ in the src folder, with the modules nose, subunit and nose-subunit installed, you will get SubUnit output containing the following:
time: 2011-10-12 04:12:50.431871Z
test: test_me.test_b
failure: test_me.test_b [
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/kinow/python/workspace/nose_and_subunit/src/test_me.py", line 4, in test_b
    assert a[0] == 'c'
AssertionError
]
time: 2011-10-12 04:12:50.433577Z
This was the closest that I could get to testing your hypothesis. Nose and nose-subunit control how the SubUnit stream is generated, thus I’m not able to put a ‘[’ as the first character in a line, at least I think so.
I’m very happy that you liked the comparison. I looked for something similar on the Internet and thought it was a shame when I found nothing.
I will add the part about diagnostics with ‘#’ to the post, hope you don’t mind. Thanks for the heads up!
In tap4j each comment belongs to the entity on its left, so a line that contains only a comment is just a loose comment. Perhaps that is a future feature for tap4j.
Thanks for your comment!
Hey, sorry that I did not get back to you… mid this year was a very busy time – changing jobs and employers.
Anyhow, I’d like to add to your comparison a couple of tweaks, if I may.
Firstly, subunit handles binaries without base64 encoding – it supports HTTP style length-prefix chunking – and these files are introspectable in the protocol.
Secondly, subunit is more delimiting even when streaming – last I checked, with tap if you don’t specify a test count, tap cannot tell if a test run being passed as a stream stops early due to a test crashing. subunit can tell when a test that starts crashes (but not when a process crashes between two tests – for that you need to see the test process exit status).
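To illustrate the point about test counts on the TAP side, here is a minimal Python sketch of the check a consumer can make. `check_plan` is a hypothetical name used only for this illustration, and the check covers nothing beyond comparing the plan line with the number of result lines.

```python
import re


def check_plan(text):
    """Return 'complete', 'truncated', 'mismatch' or 'unknown' for a TAP stream."""
    planned = None
    seen = 0
    for line in text.splitlines():
        plan = re.match(r"^1\.\.(\d+)", line)
        if plan:
            planned = int(plan.group(1))
        elif re.match(r"^(not )?ok\b", line):
            seen += 1
    if planned is None:
        # Without a plan there is nothing to compare against, so a stream
        # that stopped early looks the same as one that finished.
        return "unknown"
    if seen < planned:
        return "truncated"
    return "complete" if seen == planned else "mismatch"


if __name__ == "__main__":
    print(check_plan("1..3\nok 1\nok 2\n"))
    print(check_plan("ok 1\nok 2\n"))
```

With a plan of three tests and only two results the stream is reported as truncated, while the same two results without a plan cannot be judged either way.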
You may be interested to know that Openstack are mid-migration to subunit (using testrepository to coordinate test execution), and there are a number of projects in the mysql space using subunit – you can check with stewart for details, I only know because we were comparing tooling for coordinating subunit processes at OSDC
Hi Robert! Thanks for the reply, and no worries about the delay.
I will update the post later with your information. Thanks!
> Firstly, subunit handles binaries without base64 encoding – it supports HTTP style length-prefix chunking – and these files are introspectable in the protocol.
That’s very interesting. I’ll take a look at this (would you have an example?).
>You may be interested to know that Openstack are mid-migration to subunit (using testrepository to coordinate test execution), and there are a number of projects in the mysql space using subunit – you can check with stewart for details, I only know because we were comparing tooling for coordinating subunit processes at OSDC
Thanks for sharing this. My main interest is integrating tests in Jenkins. The one thing I missed in subunit was examples, though the testanything.org website, from which I took most of the examples while parsing TAP, has been down for weeks too.
There are some examples in the README in the source tree, but that doesn’t include the HTTP chunking stuff. That is included in the BNF though (attached).
I’d be delighted to give you a gzip of a subunit stream if you’d like, that’s trivial to get my hands on. Should I mail that to the address you use on the Jenkins lists?
test|testing|test:|testing: test LABEL
success|success:|successful|successful: test LABEL
success|success:|successful|successful: test LABEL DETAILS
failure: test LABEL
failure: test LABEL DETAILS
error: test LABEL
error: test LABEL DETAILS
skip[:] test LABEL
skip[:] test LABEL DETAILS
xfail[:] test LABEL
xfail[:] test LABEL DETAILS
uxsuccess[:] test LABEL
uxsuccess[:] test LABEL DETAILS
progress: [+|-]X
progress: push
progress: pop
tags: [-]TAG …
time: YYYY-MM-DD HH:MM:SSZ
LABEL: UTF8*
DETAILS ::= BRACKETED | MULTIPART
BRACKETED ::= '[' CR UTF8-lines ']' CR
MULTIPART ::= '[ multipart' CR PART* ']' CR
PART ::= PART_TYPE CR NAME CR PART_BYTES CR
PART_TYPE ::= Content-Type: type/sub-type(;parameter=value,parameter=value)
PART_BYTES ::= (DIGITS CR LF BYTE{DIGITS})* '0' CR LF
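As a rough illustration of the PART_BYTES rule, the Python sketch below decodes length-prefixed chunks until the terminating zero-length chunk. `decode_part_bytes` is a hypothetical name and not part of python-subunit, and reading DIGITS as a hexadecimal length (as in HTTP chunked encoding) is an assumption, since the grammar does not state the base.

```python
def decode_part_bytes(data):
    """Decode PART_BYTES from bytes; return (payload, remaining_bytes)."""
    payload = bytearray()
    pos = 0
    while True:
        end = data.index(b"\r\n", pos)       # end of the DIGITS field
        # Assumption: lengths are hexadecimal, as in HTTP chunked encoding.
        length = int(data[pos:end], 16)
        pos = end + 2
        if length == 0:                      # the final '0' CR LF
            return bytes(payload), data[pos:]
        chunk = data[pos:pos + length]
        if len(chunk) != length:
            raise ValueError("stream ended inside a chunk")
        payload += chunk
        pos += length


if __name__ == "__main__":
    payload, rest = decode_part_bytes(b"3\r\nsam3\r\nple0\r\n]\r\n")
    print(payload, rest)
```

Because each chunk is read by length, the payload may contain any bytes, including line breaks and brackets, without confusing the parser.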
Hi Robert!
I’ll take a look at the BNF during the holidays. Thanks a lot! If you could send me a gzip of a subunit stream that would be great. I would like to learn more about subunit and write a new post about TAP and SubUnit. I’ll think about how to add subunit support to Jenkins.
Thanks a lot and happy holidays,
-Bruno