May 24, 2008

What is a Darkstar test?

As a new member of the Project Darkstar team, one of my first tasks is to come up with a way to automate distributed Darkstar tests. As the codebase begins to stabilize, much of our time is being spent evaluating performance. However, this is a difficult and time-consuming task when we have no mechanism for easily reproducing our test scenarios. For example, we have several sets of tests to run against the Project Wonderland application. In rough terms, executing these tests requires:
  1. Launching the Wonderland server application by hand.
  2. Tapping into the Darkstar server profiling data (if enabled, Darkstar will spit profiling data out via a specific port).
  3. Launching sets of simulated clients by hand.
  4. Visually monitoring the Darkstar profiling data (using a set of human eyes).
  5. Making a decision on whether or not the system is overloaded based on the profiling data.
  6. Adding more clients into the system by hand if the system is not overloaded.
  7. Rinse and repeat until satisfied that you have determined the maximum load on the system.
This might be fun to do once (or not), but if I'm trying to do some performance tuning, the mechanics of setting up the tests, running the tests, and determining the results can quickly become frustrating. Not only that, but running against a multi-node configuration is even more difficult, error-prone, and complicated to monitor. Maybe I would script a few pieces of the process, but it would likely just be an ad-hoc rig, incapable of supporting other tests.

Clearly we have a problem. We need a test harness that can execute tests such as the one described above, and also collect, record, and report results in a way that is easy for consumers to interpret. It should also be abstract enough to plug in any Darkstar application test, such as the Wonderland example above. Before we can build that, though, we need to answer a more basic question: What is a Darkstar test? Here's my stab at it:
A Darkstar test consists of four main components:
  1. A Darkstar server distribution (i.e. the server package downloaded from projectdarkstar.com)
  2. A Darkstar server application
  3. A set of Darkstar client application simulators
  4. A set of system probes
A Darkstar test must be run on a set of resources (presumably pieces of hardware):
  1. A set of systems to run the server application (more than one if multi-node).
  2. A set of systems to run the client application simulators.
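To make the definition so far a little more concrete, here is a minimal sketch of how the four components and the two kinds of resources might be written down as data. Everything in it is an assumption made for illustration: the class names, field names, and the `validate` helper are hypothetical and are not part of Darkstar or of any existing harness.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DarkstarTest:
    """The four components of a Darkstar test (hypothetical field names)."""
    server_distribution: str       # e.g. location of the unpacked server package
    server_application: str        # the application under test
    client_simulators: List[str]   # identifiers of simulated-client programs
    probes: List[str]              # names of the system probes to attach


@dataclass
class Resources:
    """The systems a test runs on."""
    server_hosts: List[str]        # more than one host means multi-node
    client_hosts: List[str]

    @property
    def multi_node(self) -> bool:
        return len(self.server_hosts) > 1


def validate(test: DarkstarTest, res: Resources) -> List[str]:
    """Return a list of problems; an empty list means the test can be scheduled."""
    problems = []
    if not res.server_hosts:
        problems.append("no server hosts assigned")
    if test.client_simulators and not res.client_hosts:
        problems.append("client simulators defined but no client hosts assigned")
    return problems
```

Keeping the test definition separate from the resources it runs on would let the same test be pointed at a single machine or at a multi-node setup without being rewritten.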
A Darkstar test reports results based on what type of test it is:
  1. Functional test: The client application simulators are responsible for running tests and reporting pass/fail decisions as results.
  2. Load test: A preconfigured number of client application simulators is introduced into the system. The system probes are responsible for monitoring conditions of the system. If the condition that a probe is monitoring violates its preconfigured threshold, a fail decision is reported as a result.
  3. Capacity test: The client application simulators are incrementally introduced into the system until one or more of the system probes reports a threshold violation. The number of clients that the system can support without violating the thresholds is reported as a result.
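The load and capacity tests can be sketched in a few lines. This is only an illustration under stated assumptions: the `Probe` class, the `read` callback, the latency metric, and the threshold value are all invented here, and a real probe would read from the running system (for instance from Darkstar's profiling output) rather than from a formula.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Probe:
    """Watches one condition and compares it to a preconfigured threshold."""
    name: str
    read: Callable[[int], float]   # hypothetical: metric value with N clients
    threshold: float

    def violated(self, clients: int) -> bool:
        return self.read(clients) > self.threshold


def load_test(probes: List[Probe], clients: int) -> bool:
    """Pass (True) only if no probe violates its threshold at a fixed load."""
    return not any(p.violated(clients) for p in probes)


def capacity_test(probes: List[Probe], step: int, limit: int) -> int:
    """Add clients in increments; return the largest count with no violation."""
    supported = 0
    clients = step
    while clients <= limit:
        if any(p.violated(clients) for p in probes):
            break
        supported = clients
        clients += step
    return supported


# Stand-in metric: pretend latency grows by 0.5 ms per connected client.
latency = Probe("latency_ms", read=lambda n: 0.5 * n, threshold=100.0)

print(load_test([latency], clients=150))              # True (75 ms is under 100 ms)
print(capacity_test([latency], step=25, limit=1000))  # 200
```

In this sketch the "specific condition" is the made-up latency figure and the "preconfigured threshold" is 100 ms. The capacity test is just the load test repeated at increasing client counts until a probe objects.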
I'm starting to design an automated test harness around the above definition (the details of which I'll leave to a later post). One thing is already apparent about such a system, though: it should be valuable for testing both Darkstar itself and applications built on top of Darkstar. This would include Project Wonderland, any example applications that we have already developed, and also applications developed in the community. With that said, I will be treating this project as though it will be consumable by community members outside of the core Darkstar team, and I hope to engage others in my efforts through this blog.
Stay tuned...

2 comments:

  1. 2) Load test: A preconfigured number of client application simulators are introduced into the system. The system probes are responsible for monitoring conditions of the system. If the specific condition that a probe is monitoring violates a preconfigured threshold, a fail decision is reported as a result.
    3) Capacity test: The client application simulators are incrementally introduced into the system until one or more of the system probes reports a threshold violation. The number of clients that the system can support without violating the thresholds is reported as a result.


    What are these system probes? Please provide an example from the Wonderland application.

  2. Also, an example of a preconfigured threshold and a specific condition would help.
