Package testing, resumed
I left the topic of package testing dangling last time. Detecting bugs before we commit them to the platform is the foremost goal of our testing operations, and nearly all of that goal falls on package testing.
We need the package owners to give us the tests that will qualify a package drop for platform commit. They’ve got the competence and resource to develop the kind of no-kidding tests needed for this quality gate; we haven’t. We’ll expect a package drop to have passed its own tests when we get it, but we’ll want to run them again to be sure: the baseline platform build that the drop was tested against might be older than the latest one we’ve reached.
If we haven’t got the capability in-house to shoulder package testing development ourselves, how are we going to know if the package tests we get are good enough? Well, we’ll have the expertise to tell if the tests that a package comes with are obviously inadequate. But if they’re inadequate and it’s not obvious, it will become apparent because the package gets a bad record for being at the bottom of platform bugs that turn up down the line. Then we’ll need a conversation with the package owner about the quality of those tests.
I want a package to contain the recipe and ingredients for testing it, in automatable form. Package drops will get built automatically. If a build succeeds, I want the automation to carry right on and run the package tests. In GNU autotools terms, the package testing will be a make check step.
We need to define the way in which the recipe and ingredients for package testing can be expressed in the package, so that our automation can understand it. Our automation isn’t actually autotools-based; it’s ANT-based. So the package testing recipes could be ANT files, if we think it’s fair enough to insist that package owners will also use ANT, and therefore Java, for their own testing. That’s not a trifling commitment, so some XML protocol of our own devising, with lighter baggage, could be the answer. An open question.
But the problem of how the package testing recipes can be expressed in an automatable fashion can’t be isolated from the question of the automation technology and infrastructure. To push the culinary metaphor, it’s great if we can understand the instruction “Put the tomatoes in the blender”; but we need a blender to carry it out. And if we tell the world “We’ve got some blenders. We’ve got no pressure cookers”, then anybody who’s got a pressure cooker but no blender can’t send us recipes to cook.
As well as defining how package testing recipes can be expressed, we’ll have to specify the testing technologies at our disposal, so that contributors can give us recipes we’re able to carry out. We are not going to be in the business of originating new testing technologies. The ones at our disposal are going to be ones that are out there open-source already, or else ones that are contributed to Symbian (in which case they’ll be open-source once contributed).
Any automated testing technology we get is going to come with its own provisions about how to express the recipe and ingredients for a test. But I don’t want to bind us to a single testing technology by standardising on the test specification protocol that it lays out. If a prospective contributor is using a testing technology that we don’t support, I don’t want to be telling them there’s nothing doing unless they adopt one that we do support. I want the option that they can contribute their testing technology to us, and that we can hook it up and add it to our repertoire without a whole lot of disruption. I want this extensibility built into our testing automation.
So I’m thinking of one frontend, potentially many backends; like compiler architecture. The automation that we develop here should be a pretty thin despatching layer that can identify the type of backend a package elects to be tested with, and hand it off to a backend of that type to do the real work. The backends will be contributed technologies and resources. The package testing recipes should be expressed in a frontend protocol that we define, and they’ll be “meta-recipes” essentially like this:
- Recipe: Test this package with Gnomotest
- Ingredients: One Gnomotest test specification, from package.
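To make the one-frontend, many-backends idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration: the real automation is ANT-based and the frontend protocol is still an open question, so every name here (`MetaRecipe`, `register_backend`, `dispatch`, the `.gnomo` file extension and the stand-in `fake_gnomotest` backend) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class MetaRecipe:
    """Frontend description of how a package elects to be tested (hypothetical)."""
    backend: str    # the "Recipe" line, e.g. "Gnomotest"
    spec_path: str  # the "Ingredients" line: a test spec shipped in the package


# A backend takes a meta-recipe and reports whether the package tests passed.
Backend = Callable[[MetaRecipe], bool]

_BACKENDS: Dict[str, Backend] = {}


def register_backend(name: str, backend: Backend) -> None:
    """Hook a contributed testing technology into the dispatcher."""
    _BACKENDS[name.lower()] = backend


def dispatch(recipe: MetaRecipe) -> bool:
    """Thin frontend: find the backend the package asked for and hand off to it."""
    try:
        backend = _BACKENDS[recipe.backend.lower()]
    except KeyError:
        raise LookupError(f"no backend registered for {recipe.backend!r}") from None
    return backend(recipe)


def fake_gnomotest(recipe: MetaRecipe) -> bool:
    """Stand-in backend. A real one would run the tool against recipe.spec_path;
    this one only checks for an assumed file extension."""
    return recipe.spec_path.endswith(".gnomo")


register_backend("Gnomotest", fake_gnomotest)
```

In this sketch, supporting a newly contributed testing technology comes down to one more `register_backend` call; the dispatching layer itself doesn't change, which is the extensibility I'm after.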
Anticipations of the testing strategy
Up until today I was not only the lead test engineer here at the Symplex, I was the only test engineer. Today I’m going to be joined by Louis Nayegon, who is an old hand at testing Symbian OS and has a lot more technical depth in it than I have. We’ve got another test engineer in the bag, to join next month, and recruitment goes on urgently. Eventually we should run to eight test engineers.
That might sound like plenty to test any body of software, but to test an OS distribution properly, it’s a tiny fraction of what it takes. It’s a tiny fraction of the resource that went into testing Symbian OS in its proprietary past; likewise a tiny fraction of the resource that went into testing S60, or UIQ or MOAP, which are all going to come together in the Symbian platform.
We need to test the hell out of the platform, and eight people don’t approach the magnitude of the task. But we knew that when we started. We’re in the same boat as the core team of any Linux distro. We can’t ourselves mount anything like the whole testing effort that’s needed. We have to count on our upstream contributors for most of it. We need each package maintainer and that package team to be able and willing to test the hell out of their package, and we need ourselves to be able to certify that a package drop is tested to hell before we commit it to our platform repository. We need to get this contract understood and supported by all our contributors.
Fortunately all our initial contributors have a formidably strong testing culture and they already possess huge assets of test sets, test systems and testing expertise. They’re well able to test the hell out of package drops. We only need to enable that muscle to work with us and sustain the same culture as new contributors come in.
I see the division of testing labour like this. The Symbian test team can and should develop platform testing capability: that is nobody else’s responsibility. We should also provide a protocol by which package owners can communicate to us the tests that they use to certify the quality of their package drops, and the tooling to let us get those tests and replicate them for gating commits to the platform. The package owner’s end of the deal is to specify, implement and perform package tests that convincingly show that the package behaves just as it’s supposed to in the setting of the latest good platform build they could get in time to prep the package drop. Our end of the deal is (a) to be really convinced that a package drop behaves the way it’s supposed to before we commit it to the platform; (b) to specify, implement and perform tests that convincingly show that the platform as a whole makes the grade for a daily baseline, or for a fortnightly or 6-monthly release.
Package testing is our only operational defense against bugs getting into the platform. That will be where we win or lose platform stability each day.
A thumbnail of the Symbian platform lifecycle.
I see that some scene-setting articles will be in order to ground this blog. Here’s my first.
What is the process by which the Symbian platform will evolve? – It will be a cycle within a cycle within a cycle.
(I’m talking in the future tense because right now we’re still getting this act together. Which also means that I might have to correct what I say here later on, but I’m pretty sure this won’t be drastically wrong.)
The Symbian platform is made up of packages, like a Linux distribution. Like a Linux distribution too, we (Symbian) own the platform, but we don’t own the packages. They are owned by upstream contributors, who could be any person or organisation. The platform will evolve by the contribution of new packages, or new releases of old ones, and occasionally the retirement of packages. We’ll be geared to deal with a daily inflow of fresh package drops.
The longest cycle is our official release cycle. We’ll make an official release of the platform every 6 months. A release will deliver a planned set of new features or improvements (and possibly removals of obsolete features). An official release will be for manufacturers to build devices on and developers to build apps on.
Inside the official release cycle comes our fortnightly release cycle. Every fortnight we’ll make a public release of the platform that represents the most stable, usable drop of the platform that we’ve achieved en route to the next official release. A fortnightly release will be deeply tested. It will let device makers and developers take stock of how we’re getting on, allow them to keep their own developments in step with ours, and let them find bugs or gaps to feed back for fixing.
Inside the fortnightly release cycle comes our daily build and test cycle. This cycle has two threads that will run in parallel: platform build & test, and package build & test.
Every day (at least every day when any change has been committed to the platform) we will build the whole lot, to make dead sure that we still can. We’ll also test the daily builds lightly, to make sure we can at least still boot a device or an emulator and that it’s not useless when we do. That’s the daily platform build & test thread.
The essential function of the daily platform builds will be to generate every day a rev of the platform that’s stable enough to serve as the baseline build for the second thread: building and testing the daily inflow of package drops. We’ll publish the daily builds, including logs and reports, so that package owners can see the latest status and get hold of the latest good baseline to build their packages against.
When a package owner sends us a drop, the critical question for us is whether it builds and tests successfully in the context of the latest good daily platform build that we’ve got. So we’ll build the fresh package source together with the unchanged remainder of that baseline build. If the build is good, we’ll then go on to test it deeply. If the package also tests successfully, then we’ll commit the fresh package source to our mainline platform repository. That night, the whole platform will be rebuilt from that repository, with fresh source from the package drops that have been committed that day. All being well, we get a new good baseline to publish the next morning, and so it goes…
All not being well, we’ve got build breaks or we’ve got test failures in the platform. We’ll raise defects to the owners of the faulty packages, and our baseline will mark time until the fixes come in.
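The package-gating step in that daily cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the real automation works: the platform is modelled as a plain mapping from package name to source revision, and `gate_package_drop`, `Verdict`, `build` and `run_package_tests` are all hypothetical names, with the last two standing in for the real build system and the package’s own tests.

```python
from enum import Enum
from typing import Callable, Dict

# Illustrative model of a platform: package name -> source revision.
Sources = Dict[str, str]


class Verdict(Enum):
    COMMITTED = "committed"
    BUILD_FAILED = "build failed"
    TESTS_FAILED = "tests failed"


def gate_package_drop(
    baseline: Sources,
    package: str,
    new_rev: str,
    build: Callable[[Sources], bool],
    run_package_tests: Callable[[Sources, str], bool],
    mainline: Sources,
) -> Verdict:
    """Build and test one package drop against the latest good baseline,
    and commit it to the mainline only if both steps succeed."""
    # Fresh package source plus the unchanged remainder of the baseline.
    candidate = dict(baseline)
    candidate[package] = new_rev

    if not build(candidate):
        return Verdict.BUILD_FAILED
    if not run_package_tests(candidate, package):
        return Verdict.TESTS_FAILED

    # Only a drop that builds and tests cleanly reaches the mainline,
    # from which the whole platform is rebuilt that night.
    mainline[package] = new_rev
    return Verdict.COMMITTED
```

Note that the baseline itself is never modified in this sketch: a failed drop leaves both the baseline and the mainline exactly as they were, which is the point of gating.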
It’s the daily cycle, of course, that will set our pace. Keeping good builds rolling out steadily is what contributors need from us to keep good new code rolling safely in.
Introducing myself
I’m Mike Kinghan, founder of this blog. I’ve been at the Symbian Foundation since Day 1 (April 1, ’09) and for now, I’m the lead test engineer. For the previous 6-and-a-half years I was an integration engineer at Symbian Software Ltd., our previous corporate incarnation. For the last 4 years of that I led the Automated Build & Test team, so I’m a willing and able fit for the challenge of building an automated test capability here, from square one. I’ll be blogging mostly about that mission as long as I have it, but the engineering stories here are all connected and I’ll blog them as the spirit leads.