Performance Monitoring: Choosing a Sampling Period

I was recently visiting the Tate gallery in St Ives (I know, I am so cultured / it was raining) and came across a work called Measuring Niagara with a Teaspoon by Cornelia Parker. The blurb explains that the work is based on her interest in the absurdity of measuring something as enormous as Niagara Falls with something as small as a teaspoon.

So, as a performance engineer, this got me thinking about how we choose the sampling period when collecting performance data. We generally believe the more granular (the shorter the sampling period) the better, as detail is lost when data is sampled over longer periods. For example, the graphic below shows CPU utilisation sampled at different periods, and as you can see the odd behaviour of the CPU only becomes apparent when sampling every second.
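To make that concrete, here is a minimal sketch (with entirely made-up numbers) of how averaging the same 1-second CPU samples over longer and longer periods hides a short burst of 100% utilisation:

```python
# Hypothetical data: one minute of 1-second CPU samples at ~20%,
# with a 10-second burst to 100% in the middle.
samples = [20.0] * 60
for i in range(30, 40):
    samples[i] = 100.0

def average_over(samples, period):
    """Average consecutive samples into buckets of `period` seconds."""
    return [sum(chunk) / len(chunk)
            for chunk in (samples[i:i + period]
                          for i in range(0, len(samples), period))]

print(max(average_over(samples, 1)))    # 100.0 - the burst is obvious
print(max(average_over(samples, 10)))   # 100.0 - still visible at 10 s
print(max(average_over(samples, 60)))   # ~33.3 - at 1-minute sampling it just looks like mild load
```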

However, the shorter the sampling period, the more data we collect and have to analyse. There are often times when we don't have the luxury of collecting data with a really short sampling period. In which case, how do we choose the best sampling period?

For me, it comes down to what I am looking for in the data. For example, suppose users are complaining about an intermittent problem where, for about 15 minutes during the day, response times are really slow across all user transactions. The delay in incident reporting means I can't rely on a circular buffer holding a couple of hours of highly sampled data that could be flushed just after an incident. So I need to collect data across the whole day, and I would live with a sampling period of about 5 to 10 minutes. This gives me enough data to correlate changes in resource usage with the times users reported performance issues.
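Where the reporting delay is short enough, the circular-buffer approach mentioned above is easy to sketch. The two-hour window, function names, and output format here are illustrative assumptions, not a specific tool:

```python
# Minimal sketch of the circular-buffer idea: keep only the most recent
# two hours of 1-second samples and dump them when an incident is flagged.
from collections import deque

BUFFER_SECONDS = 2 * 60 * 60                # two hours of 1-second samples (assumed window)
buffer = deque(maxlen=BUFFER_SECONDS)       # oldest samples are discarded automatically

def record(timestamp, cpu_percent):
    """Append one sample; called once per second by the collector."""
    buffer.append((timestamp, cpu_percent))

def flush_on_incident(path):
    """Write the buffered samples out for analysis when an incident is reported."""
    with open(path, "w") as f:
        for timestamp, cpu_percent in buffer:
            f.write(f"{timestamp},{cpu_percent}\n")
```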

Another example would be looking for the reasons behind slow response times. Say we are trying to work out why response times have increased to over 6 seconds. In that case the coarsest sampling period I would accept would be 3 seconds.

You can see that in most cases the rule I apply is to sample at no more than 1/2 the duration of the period of interest. Ideally I would go to 1/3 of the period of interest. As they say, you need 3 data points to draw a curve!
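That rule of thumb is simple enough to write down. The helper below is just a hypothetical illustration of it, not part of any monitoring tool:

```python
# Rule of thumb from above: sample at no more than 1/2, and ideally 1/3,
# of the shortest event you want to be able to see.
def suggested_sampling_period(period_of_interest_seconds):
    return {
        "maximum": period_of_interest_seconds / 2,   # bare minimum: two points per event
        "ideal": period_of_interest_seconds / 3,     # three points to "draw the curve"
    }

print(suggested_sampling_period(15 * 60))   # 15-minute slowdowns -> sample every 5 to 7.5 minutes
print(suggested_sampling_period(6))         # 6-second response times -> sample every 2 to 3 seconds
```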

There is possibly some science behind this: the Nyquist theorem, which is used in signal processing. It says you need to sample at at least twice the highest frequency present in the analogue signal you are converting. There is a bit of maths to prove the theorem, but intuitively it makes sense that you need at least two data points per cycle to detect a noticeable change in the system you are sampling.
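A toy illustration of why two samples per cycle matters: the synthetic 1 Hz signal below is recognisable when sampled four times per cycle, but sampled more slowly than the Nyquist rate it aliases into a much slower, misleading pattern.

```python
# Purely synthetic demonstration of aliasing when sampling below the Nyquist rate.
import math

def sample(signal_hz, sampling_period_s, duration_s):
    """Sample a sine wave of the given frequency at fixed intervals."""
    n = int(duration_s / sampling_period_s)
    return [math.sin(2 * math.pi * signal_hz * i * sampling_period_s) for i in range(n)]

fast = sample(1.0, 0.25, 4)   # 4 samples per cycle: the oscillation is clearly visible
slow = sample(1.0, 0.75, 4)   # ~1.3 samples per cycle: it appears as a much slower wave
print([round(x, 2) for x in fast])
print([round(x, 2) for x in slow])
```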

Author: loadtestguy

Just a performance tester that wants to record the things I should remember about LoadTesting
