Managing data during performance testing
We are going to talk about data in this blog post, predominantly the test data required for performance testing.
Test data is something that makes the life of a performance tester extremely difficult, because of the huge quantities required, all in the right state and all matching the criteria your test needs in order to run.
We often have to approach the use of large quantities of data for performance testing in one of a number of ways:
- Use a small subset.
- Ask business users or functional test teams for a set of data. This data may become redundant after a single execution, because your tests may alter its state, so it is short-lived and requests to the business become frequent.
- Restore your database after each test iteration. This may not be possible if the database is shared, and it is certainly not agile.
- Create your own data before the tests start. This can be long-winded and may require interaction with multiple legacy systems, or overnight batch processes, to get the data into the right state.
Fear not, help is at hand from your friends at OctoPerf.
As we have seen above, there are a number of valid ways to get data. The one we are going to explore, and the one that makes our tests reusable and agile, is to query the database of the application under test, or any supporting application databases, and retrieve usable data using SQL. This way we avoid the pitfalls listed above by:
- Taking as much data as we need from our databases,
- Not needing to ask the business because we have the relevant SQL at our disposal,
- Not needing to restore the database, as we just take more of the data we need from our data sources,
- Not needing to create our own data through the user interface.
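To make the idea concrete, here is a minimal sketch of pulling data in the right state with SQL and writing it out as CSV, which is a format JMeter's CSV Data Set Config element can read. Everything specific in it is an assumption made for illustration: the `customers` table, its `status` column, the `ACTIVE` state and the in-memory SQLite database, which stands in for your real application database and its driver.

```python
import csv
import io
import sqlite3


def fetch_test_data(conn, status, limit):
    """Return up to `limit` (id, email) rows whose status matches."""
    # Hypothetical table and columns; adapt the SQL to your own schema.
    cur = conn.execute(
        "SELECT id, email FROM customers WHERE status = ? ORDER BY id LIMIT ?",
        (status, limit),
    )
    return cur.fetchall()


def write_csv(rows, out):
    """Write a header line plus one line per row to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "email"])
    writer.writerows(rows)


def build_demo_db():
    """Create an in-memory database standing in for the application database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (id, email, status) VALUES (?, ?, ?)",
        [
            (1, "a@example.com", "ACTIVE"),
            (2, "b@example.com", "CLOSED"),
            (3, "c@example.com", "ACTIVE"),
            (4, "d@example.com", "ACTIVE"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    # Take only as many rows as this test run needs.
    rows = fetch_test_data(conn, "ACTIVE", 2)
    buffer = io.StringIO()
    write_csv(rows, buffer)
    print(buffer.getvalue())
```

Because the query filters on state and caps the row count, each run can simply ask for a fresh batch of whatever is still in the right state, rather than relying on a restore or a hand-built data set.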
OK, we need to add a caveat here. If your performance test environment does not contain a copy of production data, or artificially constructed data in representative volumes, then this approach will not work. You could argue, though, that performance testing on a system without production quantities of data is not representative anyway, so let's assume you do have such data, because I think most will.
Before moving on, it is worth noting that JMeter can create random data that conforms to any number of rules: dates, random strings or integers, values from arrays, or a value from any of the JMeter native, or 3rd party, function libraries.
But if you need data that already exists in your application under test, whether to use, update or delete it, then you need to know the values before testing starts, and this is what we are looking at in this post.