One of the things I did for our new 4.0 release was write some GUI tests with AutomatedQA's TestComplete for our new and improved AJAX Image Viewer and Thumbnail Viewer. Along the way, I learned a lot about writing and maintaining automated GUI scripts, and I'd like to share it. This is not a review of TestComplete -- these tips should apply to any GUI scripting solution.

For those not familiar with products like this, TestComplete allows you to record and play back GUI events and then compare screenshots against the original run. It plays a role analogous to NUnit's in unit testing, but with specific features to drive a GUI. We are using the web product, which allows us to address the DOM inside a page shown in IE for fine-grained access to the controls of the GUI.

  1. "Write" tests rather than record them. When you are just getting started with the tool, you're going to record tests and play them back, just to see how it works.  It's extremely important that you break yourself out of that habit as soon as possible and start writing scripts directly.  Otherwise you'll have scripts that look like this:

    function Test1()
    {
      var  w1;
      var  w2;
     
      w1 = Sys["Process"]("iexplore")["Window"]("IEFrame", "*");
      w2 = w1["Window"]("Shell DocObject View")["Window"]
         ("Internet Explorer_Server");
     
      w2["ToURL"]("http://localhost/ThinClientViewerTests/");
      w2["Wait"]();
      w1 = w2["Page"]
        ("http://localhost/ThinClientViewerTests/")["document"];
      w1["all"]["Item"]("DropDown_MouseTool")
        ["ClickItem"]("Selection");
      w1["frames"]["frame"]("WebImageViewerMain_ov")
        ["document"]["all"]["Item"](6)["Drag"](77, 86, 354, 254);
      if(!Regions.Compare("AfterSelection.bmp",
          Sys["Process"]("iexplore")["Window"]
          ("IEFrame", "*")["Window"]("Shell DocObject View")
          ["Window"]("Internet Explorer_Server")["Page"]
          ("http://localhost/ThinClientViewerTests/")
          ["document"]["all"]["Item"]("WebImageViewerMain_om")))
              Log.Error("The regions are not identical.");
    }
     

    instead of this:

    function TestSelection ()
    {
      var w = GetIEProcess();  
      GotoTestPage(w);
     
      var doc = GetDocument(w, GetUrl());
     
      ChooseMouseTool(doc, "Selection");
      GetImage(doc)["Drag"](77, 86, 354, 254);
      AssertRegionIsEqual("AfterSelection.bmp",
            GetImageViewerControl(doc));
    }

    Granted, you're likely to want to rewrite the recorded code a little bit, but I'm advocating a full refactoring so that the scripts share as much code as possible. Once you've done this a few times, you will find yourself just writing the tests directly instead of recording them. Some of the following tips are only possible if you take this step.
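
    The helpers that TestSelection calls are not shown above. Here is a rough sketch of what they could look like, assuming each one simply wraps the matching expression from the recorded script. The function names come from the refactored test, but the bodies are a guess rather than the actual helper code, and Sys is the object the TestComplete runtime provides.

    ```javascript
    // Sketch only: each body is lifted from the recorded script above.
    // Sys is supplied by the TestComplete runtime.

    function GetUrl()
    {
      return "http://localhost/ThinClientViewerTests/";
    }

    function GetIEProcess()
    {
      // The browser window that hosts the page under test.
      return Sys["Process"]("iexplore")["Window"]("IEFrame", "*")
        ["Window"]("Shell DocObject View")
        ["Window"]("Internet Explorer_Server");
    }

    function GotoTestPage(w)
    {
      w["ToURL"](GetUrl());
      w["Wait"]();
    }

    function GetDocument(w, url)
    {
      return w["Page"](url)["document"];
    }

    function ChooseMouseTool(doc, toolName)
    {
      doc["all"]["Item"]("DropDown_MouseTool")["ClickItem"](toolName);
    }

    function GetImage(doc)
    {
      // Index 6 is what the recorder captured. Keeping it in one accessor
      // means only this function changes if the viewer's markup moves.
      return doc["frames"]["frame"]("WebImageViewerMain_ov")
        ["document"]["all"]["Item"](6);
    }

    function GetImageViewerControl(doc)
    {
      return doc["all"]["Item"]("WebImageViewerMain_om");
    }
    ```

    With these in place, every test shares the same navigation and lookup code, and a renamed control is a one-line fix.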

  2. Make tests resilient to change. One of the most frustrating aspects of GUI testing is that the look and components of the GUI are not stable until late in the development cycle. GUIs are also more likely than software APIs to change in a new version; for APIs, we have developed many practices and patterns to keep them backwards compatible. Changes in a GUI, by their nature, are not compatible at the level an automated test needs, so you must make your tests easy to update. There are two things you should do:

    1. Take snapshots carefully. Do not take a full window screenshot to compare unless it is necessary -- instead take a snapshot of the relevant area.  If this part of the window moves around, most tools will know how to find it (they poke into the underlying form, or in my case, DOM, to find its location on the screen). If your tool supports it, use its image comparison tolerance and masking features to compare just the important parts.

    2. Access controls through accessor functions. Each piece of the GUI that you control or snapshot should be accessed via a function that you write.  That way, if its name changes, or if something has to be done before accessing it, you have a hook. If you've refactored the recorded scripts, you're probably already doing this.

  3. Make failures easy to figure out. The easiest way to do this is to log breadcrumbs along the way.  Again, this is easy to do if you've refactored the code. In my case, AssertRegionIsEqual logs what it is comparing so that I know how to interpret any failures.
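
    As a rough sketch of the breadcrumb idea, a comparison helper could log what it is about to compare before it does so. This is an assumption about the shape of AssertRegionIsEqual, not its actual source. It relies on the Log and Regions objects that TestComplete provides, and the boolean return value is added here only for convenience.

    ```javascript
    // Sketch only. Log and Regions are supplied by the TestComplete runtime.
    function AssertRegionIsEqual(expectedImageName, control)
    {
      // Breadcrumb first, so the log names the image being checked even if
      // the comparison itself fails or throws.
      Log.Message("Comparing region against " + expectedImageName);

      if (!Regions.Compare(expectedImageName, control))
      {
        Log.Error("The region does not match " + expectedImageName);
        return false;
      }
      return true;
    }
    ```

    When a run fails, the last breadcrumb in the log tells you which stored image the failure refers to.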

  4. Make tests individually callable. In TestComplete, this means each test function takes no arguments and returns nothing. This is the default when recording, but when I was rewriting the tests, it was very tempting to make them object-oriented or add arguments. Doing so makes it hard to just right-click a test and choose "Run" from the context menu, which I consider a must-have feature. So, even though the code could be better organized, I won't reorganize it if that means the tests can't be called from the editor.
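
    One way to share code and still keep every test runnable from the editor is to put the arguments in a helper and keep the tests themselves as thin, zero-argument wrappers. The sketch below assumes that layout; CheckTool, TestMagnifier, the "Magnifier" tool name, and AfterMagnifier.bmp are hypothetical names used only for illustration.

    ```javascript
    // Hypothetical shared helper: it takes arguments, so it is never run
    // directly from the editor.
    function CheckTool(toolName, expectedImageName)
    {
      // Log is supplied by the TestComplete runtime.
      Log.Message("Checking tool " + toolName + " against " + expectedImageName);
      // The GUI steps and the snapshot comparison would go here.
    }

    // Entry points: no arguments and no return value, so each one can be
    // run on its own.
    function TestSelection()
    {
      CheckTool("Selection", "AfterSelection.bmp");
    }

    function TestMagnifier()
    {
      CheckTool("Magnifier", "AfterMagnifier.bmp");
    }
    ```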

  5. Have the tests put the GUI into known states. One thing I noticed early on is that I'd sometimes get an image comparison error because a control had the mouse over it or had focus during playback, but not when the snapshot was taken. This is because the snapshot tool requires you to move the mouse and take focus away, but playback does not. It's important to put the mouse exactly where you want it and give focus to a control or area that won't change its look (for example, by clicking the background) before you take a snapshot.  Once I realized this, I wrote a function to do it and called it in my snapshot comparison function.
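
    A minimal sketch of such a reset function follows. It assumes that clicking the top-left corner of the page body is a safe way to park the mouse and drop focus, which may not hold for every page. GetBackground, PutGuiInKnownState, and CompareInKnownState are hypothetical names, and Click(x, y) is assumed to be available on the object the accessor returns.

    ```javascript
    // Hypothetical accessor for an area whose look never changes.
    function GetBackground(doc)
    {
      return doc["body"];
    }

    function PutGuiInKnownState(doc)
    {
      // One click parks the mouse at a fixed spot and takes focus away
      // from whichever control had it.
      GetBackground(doc)["Click"](1, 1);
    }

    function CompareInKnownState(expectedImageName, control, doc)
    {
      // Always reset before comparing, so playback matches the conditions
      // under which the stored snapshot was taken.
      PutGuiInKnownState(doc);
      // Regions is supplied by the TestComplete runtime.
      return Regions.Compare(expectedImageName, control);
    }
    ```

    Because the reset lives inside the comparison function, no individual test can forget to call it.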

  6. Write focused tests. Do not have one giant function that exercises your entire GUI. Break it down into individual, stand-alone tests.

  7. If you use a bug-tracker, write GUI tests to confirm a fix. While developing, it's great to work test-first and write an automated unit test that reproduces a bug before you fix it.  While confirming those fixes (in acceptance testing), it is equally important to write a repeatable test that confirms the bug is fixed. Of course, you should note the case number in a comment in the test script and the script name in your bug-tracker.

In general, I find it's harder to make a reliable, repeatable GUI test than a unit test. Having a maintainable code base makes it possible to add code later that makes the tests more reliable. Also, it's easy for me to change my code from a testing script to a snapshotting one when I want to build a separate snapshot repository for different initial conditions (in my case, when the image viewer is showing a different test image). These things would be impossible if I hadn't rewritten the recorded script, which I think is the key to making tools like these more useful.
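
To illustrate the switch between testing and snapshotting, here is a rough sketch that assumes all comparisons go through a single function. SNAPSHOT_MODE, SNAPSHOT_FOLDER, SnapshotPath, and CheckRegion are hypothetical names, the folder path is made up, and the sketch assumes that the control exposes Picture() with a SaveToFile() method and that Regions.Compare accepts a file path. Check both against your tool's documentation.

```javascript
// Hypothetical switches: set SNAPSHOT_MODE to true to record new baseline
// images instead of comparing against them.
var SNAPSHOT_MODE = false;
var SNAPSHOT_FOLDER = "C:\\Snapshots\\TestImage1"; // hypothetical path

function SnapshotPath(imageName)
{
  return SNAPSHOT_FOLDER + "\\" + imageName;
}

function CheckRegion(imageName, control)
{
  var path = SnapshotPath(imageName);

  if (SNAPSHOT_MODE)
  {
    // Assumed calls: Picture() captures the control, SaveToFile() writes it.
    control["Picture"]()["SaveToFile"](path);
    Log.Message("Saved snapshot " + path);
    return;
  }

  Log.Message("Comparing against " + path);
  if (!Regions.Compare(path, control))
    Log.Error("The regions are not identical: " + path);
}
```

Pointing SNAPSHOT_FOLDER at a different directory gives each initial condition its own repository without touching the tests.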