
Testing add-on startup performance

Our add-on performance initiative is getting lots of attention for, let's say, various reasons. There have been objections about our transparency and testing methods, so I decided to add something valuable to the discussion and document my own testing process.

I revisited my old add-on performance article and noticed that the contents of the Measuring Startup wiki page have changed substantially since I originally linked to it. It now recommends installing an add-on to measure startup performance. I haven't tried it, but there are a few reasons why I thought this might not be the best approach. (Update: I've been informed that the add-on is only a display for data that is gathered and stored locally. You can make test runs and then install the add-on to look at the data. That dispels my previous doubts about this approach.) Regardless, I'm documenting the old testing method here, because it's the one I have been using for a while, and it's also very similar to the one implemented in our automated Talos testing framework.

I have been doing lots of add-on startup testing recently, mainly to double-check that the results of the Talos tests are sound. We also correlate them with real-world usage data that we have been collecting since early versions of Firefox 4. This data, along with manual testing and source code review, is what gives us a good confidence level in the results we display on our infamous performance page (it has been linked enough).

Here’s what I do.

Setup

  1. Create a new profile dedicated to testing add-on performance (I called it startuptest).
  2. Download this HTML page and save it somewhere convenient. The page is blank if you open it directly. All it does is run some JS that extracts a timestamp after the # character in the URL, compares it against the timestamp at the moment the script runs, and shows the difference on the page (a sketch of this page appears after these steps).
  3. Set up a console command that launches Firefox with your testing profile and opens the downloaded file, with the current timestamp appended after the # character. On my system (Mac OS), the command is the following:
/Applications/Firefox.app/Contents/MacOS/firefox-bin -P startuptest -no-remote file:///Users/jorge/startup.html#`python -c 'import time; print int(time.time() * 1000);'`

The old version of the Measuring Startup page explains how to set this up on Windows.
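
As for the page from step 2: since the original file may no longer be available, here is a minimal sketch of what it does. This is an approximation of the page I use, not the original file:

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Startup timer</title>
</head>
<body>
<script>
// The launch command appends the launch time after the # character,
// e.g. startup.html#1302600000000 (milliseconds since the epoch).
var launched = parseInt(window.location.hash.substring(1), 10);
if (!isNaN(launched)) {
  // Show how long it took from launch until this script ran.
  document.body.appendChild(
    document.createTextNode((Date.now() - launched) + " ms"));
}
// Without a #timestamp, nothing is written and the page stays blank.
</script>
</body>
</html>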

Testing

  1. Locate the testing profile folder and delete all files in it, if there are any.
  2. Open Firefox on this profile. You can use the console command or any other shortcut if you prefer.
  3. Copy the add-on's listing page URL, paste it into the new profile's location bar, and open the page.
  4. Install the add-on using the install button and restart if necessary.
  5. Optionally, set up the add-on in a realistic way. For example, if this is a Facebook add-on, it may make sense to log in to a Facebook account since otherwise most of the add-on’s functionality would be inactive.
  6. Quit Firefox.
  7. Run Firefox using the console command.
  8. Note the result in the startup page.
  9. Quit Firefox. I prefer using the Quit key shortcut, to interact with Firefox as little as possible.
  10. Repeat steps 7-9. I discard the first 2 runs, which are normally much slower than the rest, and measure the 10 runs after that. (A small shell loop that automates the relaunching follows this list.)
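
If you'd rather not retype the command every run, steps 7-9 can be wrapped in a small shell loop. This is only a sketch, assuming the same profile name and file path as my command above; since -no-remote keeps Firefox in the foreground, each pass waits until you quit before launching the next run:

FIREFOX=/Applications/Firefox.app/Contents/MacOS/firefox-bin
PAGE=file:///Users/jorge/startup.html
# 12 runs: discard the first 2, measure the remaining 10.
for i in $(seq 1 12); do
  NOW=`python -c 'import time; print int(time.time() * 1000);'`
  echo "Run $i"
  "$FIREFOX" -P startuptest -no-remote "$PAGE#$NOW"
done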

Interpreting results

For your results to make any sense, it's necessary to make a test run without any add-ons installed and use that as your baseline. It's also a good idea to run all the tests consecutively, to have some certainty that they are all running under similar conditions. I record and compare my results on a spreadsheet, like this one where I tested both of my add-ons.
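
If you prefer to skip the spreadsheet, a few lines of Python will do the same math. The numbers below are made up purely for illustration; substitute your own runs:

# A quick way to summarize the runs (the data here is hypothetical).
def mean(xs):
    return sum(xs) / float(len(xs))

def stdev(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

# First 2 warm-up runs already discarded; 10 measured runs each.
baseline = [420, 415, 430, 418, 422, 425, 419, 421, 428, 417]
with_addon = [455, 460, 448, 452, 458, 450, 461, 449, 457, 453]

print("baseline: %.1f ms (stdev %.1f)" % (mean(baseline), stdev(baseline)))
print("with add-on: %.1f ms (stdev %.1f)" % (mean(with_addon), stdev(with_addon)))
# The overhead only means something if it is clearly larger than the
# run-to-run noise captured by the standard deviations.
print("overhead: %.1f ms" % (mean(with_addon) - mean(baseline)))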

Looking at the results, Fire.fm has a somewhat noticeable impact on startup. This is not surprising, because it is a complex add-on with a heavy overlay and startup process. I covered improving startup code in my old blog post, and we're planning on greatly simplifying its overlay soon(ish). I doubt we'll make the coveted 5%, but we'll see. Remote XUL Manager is clearly simpler, and it shows why the results should not be taken at face value. Since all it does in the overlay is add a menu item that opens a separate window, it's understandable that its impact is negligible. But does it really improve startup, as the numbers suggest? No, of course not. It just means that the error margin is larger than its real performance impact.

The key takeaway here is that the results of manual tests shouldn't be taken literally, but they're still a good indicator of an add-on's performance impact. Even if the error margin is not ideal (or even measurable under these conditions), you can still get a good idea of who's fast and who's slow. These results have been very valuable to us when compared against Talos results.

How does this compare to Talos?

On one hand, these tests are influenced by how the testing system is set up. I have several applications open at all times, and I don't close them all for testing. I do take care not to run anything heavy simultaneously, like Time Machine or MobileMe Sync. And then there's the fact that I have to spend some time setting things up and running the tests; the longer the tests take, the more likely it is that some other process affects the results.

On the other hand, it's easier for me to recognize errors during testing. Many of the complaints we've received about the testing system are that it makes silly mistakes, like trying to install an add-on from an incorrect URL, or trying to install an add-on that is not compatible with the Firefox version being tested. These are things that one can clearly see when testing manually, but they weren't obvious when running the tests automatically. Those add-ons have been getting very good performance rankings because they're not really being loaded, so those results are not reliable.

Luckily, the people complaining about our testing are also filing bugs and talking to us directly, so we’re looking into the issues and trying to get them resolved as soon as possible. Special thanks to Wladimir and Nils, who have been very helpful filing and categorizing bugs. More details coming up in the Add-ons Blog.

As always, the developer community proves itself an invaluable asset for Mozilla (well, you are Mozilla). Even if our discussions can become harsh and are generally very public, the outcome is almost always a set of improvements on both our technical and communication fronts. Getting things right takes a lot of work and a lot of patience, and I hope we can quickly get to a place where we're all satisfied.

4 Comments

  1. Wladimir Palant | 2011/04/12 at 3:36 AM | Permalink

    Jorge, to get results that are comparable to Talos it is a good idea to actually use Talos ;)

    It's not like it is complicated to set up. You download http://hg.mozilla.org/build/talos/archive/tip.zip, unpack it, and add a symbolic link called "firefox" in that directory pointing to your Firefox directory (on Windows I recommend using the junction utility for that). You make a copy of sample.config and remove all tests but "ts" from it - that's your baseline. Then you make another copy where you replace "extensions: {}" with "extensions: [ 'path/to/addon.xpi' ]". Done. Now you can run the tests with:

    python run_tests.py -s baseline.config
    python run_tests.py -s addon.config

    It will create and initialize the Firefox profile for you, do 20 runs according to your config, print the individual results and even calculate the average (but not standard deviation for some reason).

  2. Wladimir Palant | 2011/04/12 at 3:42 AM | Permalink

    PS: To test the profile that Talos sets up, I usually kill the script after the browser window shows up the first time. You can then find this profile in your temp directory, and use "firefox -no-remote --profile /tmp/tmpXXXXX/profile" to run Firefox with that profile.

    PPS: Changing extra_args: '' into extra_args: '-no-remote' in the Talos config files is also recommended; this will allow the tests to run while another Firefox instance is open.

  3. Nils Maier | 2011/04/12 at 4:42 AM | Permalink

    Nice article.

    about:startup is a simple tool to display the internal startup measurements via getStartupInfo:
    http://mxr.mozilla.org/mozilla-central/source/toolkit/components/startup/public/nsIAppStartup.idl#137
    The real world data (amo pings) you mentioned is based on getStartupInfo, if I’m not completely mistaken.

    The add-on, being bootstrapped, has a small performance impact itself.
    You can get these counters not only through that add-on, but from any chrome privileged code, off the startup path, so you can avoid any additional startup perf impact.
    You can even hack the Talos page and also return the numbers there and compare it to what Talos measures.
    It should be noted, however, that Talos ts and getStartupInfo values measure different things. The closest thing to ts should be sessionRestored (not firstPaint!), but only if you don't have (many) tabs to restore. Hence real-world data and Talos ts are not as strongly correlated as you might expect, though the correlation may still show a trend, at least.

    Setting up Talos can be a little more “demanding” than Wladimir described, as you also need to install some prerequisites, such as Python. But it is indeed not very hard.

  4. Xavier | 2011/04/13 at 2:06 AM | Permalink

    Hi and thanks for this useful article !

    In the command line to launch Firefox, you can avoid calling Python by replacing:
    `python -c 'import time; print int(time.time() * 1000);'`
    with
    `t=$(date +%s%N);echo ${t%??????}`

    Okay I’m not sure this is a major breakthrough, but it still looks cool and saves about 30ms on my machine (wow).
