
Model-Based Long Haul Testing

2011-10-31 16:54



“Long-haul” testing is an important technique that aims to simulate product usage over an extended period of time. This kind of testing is necessary because some difficult software bugs can only be caught through long-term usage.
Model-based testing is an advanced testing methodology that promises to increase the productivity of the professional tester. I discussed some aspects of modeling and model-based testing in previous articles.
Model-based testing implies that the test developer has less control over test scenarios in exchange for increased productivity. This may seem at odds with long-haul testing, which requires more control over the test runs.
Can we reconcile the two approaches? This article tries to give an answer.

What exactly is long haul testing?




Informally speaking, long haul testing is a repetition of select functional scenarios. The goal is to emulate long-term usage – days, weeks or months – without having to wait days, weeks or months. Another goal is to detect special software bugs such as memory
leaks that accumulate slowly.
Let us consider Microsoft Word users. People using documents heavily may open dozens of files a day within a single Word session. We may assume that some of these people keep Word open for days or even weeks in a row. This means that Word
must be able to open and then close hundreds or even thousands of documents in a single session.
We can test this via automation: we write a test script that opens and then closes a document. Then we run the script 10000 times. If each round takes less than a second then we are done in less than three hours.
Something more complex, like open, edit a little, and close? No problem. We write a little test script doing just that (open, edit a little, close), we run it 10000 times and voilà: we have long-haul testing for the open/edit/close scenario.
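As a minimal sketch, assuming a hypothetical automation API (OpenDocument, EditALittle and CloseDocument are illustrative stand-ins, not a real Word API), the naive long-haul driver is just a loop:

    using System;

    static class NaiveLongHaul
    {
        // Stand-ins for whatever automation API drives the application
        // under test; these names are hypothetical.
        static object OpenDocument(string path) { Console.WriteLine($"open {path}"); return new object(); }
        static void EditALittle(object doc) { /* scripted edits would go here */ }
        static void CloseDocument(object doc) { Console.WriteLine("close"); }

        static void Main()
        {
            // The whole "long haul": the same scenario, repeated 10000 times.
            for (int round = 0; round < 10000; round++)
            {
                var doc = OpenDocument("sample.docx");
                EditALittle(doc);
                CloseDocument(doc);
            }
        }
    }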
It seems long haul is no big deal: we take a scenario, we run it many times, we check that the system under test is nice and happy. If it is, we’re done. If not, we file a bug.
However, this is not what happens in real life. It is true that people open and close documents, and it is true that they usually edit a little in between. But it is never the same editing twice. The dumb repetition of the same scenario over and over again does just that: it repeats. Quite boring.
The question is: can we diversify the enacted scenarios to resemble as much as possible what happens in the actual usage extended over a long period of time?
Yes, we can, and this is where modeling comes into play.

Models as simplified behavior




A model of a software system may be seen as a simplification of the system’s behavior. Therefore, analyzing the behavior of a model means, in fact, analyzing the behavior of the actual system. Consequently, long-haul testing may be seen as enforcing those model behaviors that last as long as possible, up to a limit imposed by the tester.
How this “long-lasting” behavior is represented depends heavily on the model. For state-based models, it means finding long paths within the state graph; cycles are great for this purpose. For models expressed functionally, it means detecting recursion. For models expressed as iterative executions, it means detecting loops.
This article deals with state-based models in NModel, the framework I have been discussing in my previous posts.
Long paths! How long?
When we talk about long paths within the state graph of a model, we actually mean chains of states that do not end, or that end as late as possible. It is impossible to process the entire state graph of a real-life model, since the number of states is virtually infinite. All we can do is try to avoid the end states as much as possible.

Avoiding the end




Fortunately, NModel lets us know whether a state is final or not. Unfortunately, we have no means to “think in reverse”, i.e. to go backwards from a final state towards the states that lead to it. In the absence of such reverse thinking we have to resort to probabilities.
Theoretically, we can assign to any state S the probability of the event “the automaton reaches a final state when starting from S”. However, computing this probability is virtually impossible, so we have to approximate it; we will see below how.
Given that it is hard to assess the states themselves, since they come in fabulous numbers, we replace them with state transitions, whose number is finite. We used this technique with frequency-based testing.

Putting in buckets



Since we are talking about actions from this point on, we can approximate the above-mentioned probabilities by using a recursively defined set of distinct collections (“buckets” of actions):
1. Each action that leads to a final state at least once goes into Bucket[1].
2. If an action A is followed by an action B from Bucket[N], then:
   2.1. if A does not belong to any bucket yet, it goes into Bucket[N+1];
   2.2. if A already belongs to Bucket[M], it is moved (if necessary) to Bucket[min(M, N+1)].
Any action that remains outside the buckets may be considered as belonging to a bucket with a very high index, like Bucket[Count+1], where Count is the number of buckets. Obviously, the set of buckets must persist between test runs for the whole system to have any meaning.
When the model is executed repeatedly, the transitions from one state to another make the actions “bubble up” from bucket to bucket towards Bucket[1], until the whole system eventually stabilizes.
At that moment the bucket system tells us how probable it is for a certain action to lead to a final state: the lower the index of the bucket, the higher the probability. Naturally, Bucket[1] corresponds to probability 1 for the event “may lead to a final state”.
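A minimal sketch of this bookkeeping, assuming actions are identified by name and that the test harness reports when a step ended in a final state; the class and method names are illustrative, not part of the NModel API:

    using System.Collections.Generic;

    // Approximates, per action, how close the action is to a final state:
    // Bucket[1] = led to a final state at least once; Bucket[n+1] = was
    // followed by an action from Bucket[n]. Persist this between runs.
    sealed class BucketTracker
    {
        readonly Dictionary<string, int> _bucket = new Dictionary<string, int>();

        // Rule 1: the action led directly to a final state.
        public void RecordFinal(string action) => Assign(action, 1);

        // Rule 2: 'action' was followed by 'next' during a run.
        public void RecordFollowedBy(string action, string next)
        {
            if (_bucket.TryGetValue(next, out int n))
                Assign(action, n + 1);
        }

        // Rules 2.1/2.2: enter at N+1, or move to Bucket[min(M, N+1)].
        void Assign(string action, int candidate)
        {
            if (!_bucket.TryGetValue(action, out int current) || candidate < current)
                _bucket[action] = candidate;
        }

        // Actions outside all buckets count as Bucket[Count + 1].
        public int IndexOf(string action) =>
            _bucket.TryGetValue(action, out int n) ? n : _bucket.Count + 1;
    }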

In (Markov) chains




Not all actions are equal within the same bucket. Some actions lead to buckets placed further away from Bucket[1]; others lead closer to Bucket[1] (but no closer than the bucket immediately below their own, otherwise they would themselves be moved further up).
It is important to choose the actions that lead to buckets as far away as possible from Bucket[1]. How can we know upfront which bucket a certain action will lead to?
We cannot know that precisely, since the actual states get constructed on the fly. Yet we can keep an average target index, obtained by averaging the indexes of all the buckets that the action has led to in the past. The greater this average target index, the lower the chance that the action leads to a final state, and the more eligible that action should become. This approach resembles state machines with probabilistic transitions, known as Markov chains (hence the title of this section).
Because the bucket of a certain action changes over time, it is not recommended to keep an average of all the target indexes from the very beginning; it is better to use a formula that gives more weight to the newest occurrences and gradually “forgets” the oldest ones.
An appropriate formula is:

Avg[N] = Avg[N-1] + (Target[N] - Avg[N-1]) * K

where Target[N] is the current index of the target bucket, Avg[N-1] is the previous average index and Avg[N] is the new average index. K is a number greater than 0 and smaller than 1.
By unfolding this recursive formula one can see that it is actually a weighted sum of all the previous target indexes, the weights being powers of (1 - K). Since this is a geometric progression with a ratio smaller than 1, the older indexes get gradually “forgotten”. A value of K closer to 0 produces more stability but also more latency (more past values remain relevant).
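In code, this “forgetting” average can be kept per action. A minimal sketch (the default K and the seeding of the first observation are choices of this sketch, not prescribed by the formula):

    // Exponentially weighted average of the target bucket indexes of one action.
    // K in (0, 1): values closer to 0 are more stable but react more slowly.
    sealed class AverageTargetIndex
    {
        readonly double _k;
        double _avg;
        bool _seeded;

        public AverageTargetIndex(double k = 0.2) => _k = k;

        public void Record(int targetBucketIndex)
        {
            if (!_seeded) { _avg = targetBucketIndex; _seeded = true; } // seed with the first value
            else _avg += (targetBucketIndex - _avg) * _k;               // Avg[N] = Avg[N-1] + (Target[N] - Avg[N-1]) * K
        }

        public double Value => _avg;
    }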

Choosing wisely




The bucket system, adorned with average target indexes, makes choosing the next action an easy task: we walk the buckets from the highest index down towards Bucket[1] until we reach a bucket that contains at least one action that can follow the current action. From that bucket we choose the eligible action with the highest average target index.
Assuming the (bucket index, average target index) pair is a fairly good approximation of the chance of reaching a final state from a given state, the final states get avoided without the cost of exploring the state space in its entirety.
The bucket system is not perfect, of course. The probability approximations given by the (bucket index, average target index) pairs are pretty coarse in the beginning, so the first runs may not be particularly long. Yet, as the runs repeat, the bucket system stabilizes and yields longer and longer sequences, up to detecting and following infinite cycles within the state graph.
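Put together, the selection rule is a two-level ordering. A sketch, assuming bucket indexes and average target indexes are supplied by helpers like the ones sketched earlier:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class ActionChooser
    {
        // Prefer the bucket farthest from Bucket[1]; inside it, prefer the
        // action whose past targets were, on average, farthest away too.
        public static string ChooseNext(
            IEnumerable<string> eligible,        // actions enabled in the current state
            Func<string, int> bucketIndexOf,
            Func<string, double> averageTargetOf)
        {
            return eligible
                .OrderByDescending(bucketIndexOf)
                .ThenByDescending(averageTargetOf)
                .FirstOrDefault();               // null if nothing is enabled
        }
    }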

Avoiding boredom




The bucket system is efficient at avoiding final states – hence producing longer paths – yet it has a major drawback: it is completely ignorant of how often a certain action has been selected. Recall that the average target index of an action evaluates only
how far away from the final states the action will lead and not how many times the action has been executed.
The result of this ignorance is that the system may get stuck within an infinite cycle without ever trying to escape, because choosing the same action or group of actions over and over again does not correlate with the probability of leading towards a final state. Moreover, the chance of such dull repetition grows tremendously if the cycle at fault is in the proximity of the start state.
So we must provide a mechanism to “spice up” the selection of actions, so that the system under test does not get “bored” from being exercised the same way for too long.
The next section suggests some ways to do it.

“Spicing up” test scenarios




There is more than one way to increase the variety of path selection and to produce livelier scenarios. We discuss several of them, from the simplest to the more complex.
Randomizing
The first method is to randomize. If two actions belong to the same “bucket” and have about the same average target index, then choose one randomly.

Pros: the method is simple.

Cons: it doesn’t really avoid infinite loops if all the eligible actions lead to such loops.
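A sketch of this random tie-break; the tolerance that defines “about the same average target index” is an arbitrary choice of this sketch:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class RandomTieBreak
    {
        static readonly Random Rng = new Random();

        // Among the eligible actions of the chosen bucket (assumed non-empty),
        // treat average target indexes within 'tolerance' of the best one as
        // ties and pick one of them at random.
        public static string Choose(
            IReadOnlyList<string> eligible,
            Func<string, double> averageTargetOf,
            double tolerance = 0.5)
        {
            double best = eligible.Max(averageTargetOf);
            var tied = eligible.Where(a => best - averageTargetOf(a) <= tolerance).ToList();
            return tied[Rng.Next(tied.Count)];
        }
    }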
Adding frequencies to the “buckets”
The second method is to combine the average target index with the frequency computed according to the method shown in a previous post. We can do that in at least two ways:

- we choose first by average target index (rounded to integers) and then by frequency;
- we compute a number based on the average target index and the frequency, and we choose based on that number. A simple way to compute that number is to divide the average target index by the frequency. Other operations are not advisable, because both the index and the inverse of the frequency are akin to probability measures, so their combination is akin to intersecting probabilistic events.

The first way better preserves the probability of avoiding a final state. The second way better preserves the chance of avoiding boring infinite loops. Both variants are sketched after the pros and cons below.

Pros: the method preserves the general framework based on “buckets”.

Cons: if all the actions within a certain bucket lead to infinite cycles, there is still a chance of getting stuck in unproductive repetitions.
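Both variants as a sketch; the frequency counter itself is assumed to be maintained elsewhere, as in the frequency-based post:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class IndexAndFrequency
    {
        // Variant 1: order by rounded average target index, then prefer
        // the least frequently executed action.
        public static string ChooseLexicographically(
            IEnumerable<string> eligible,
            Func<string, double> averageTargetOf,
            Func<string, int> frequencyOf)
        {
            return eligible
                .OrderByDescending(a => (int)Math.Round(averageTargetOf(a)))
                .ThenBy(frequencyOf)
                .First();
        }

        // Variant 2: a single combined score, average target index / frequency.
        public static string ChooseByCombinedScore(
            IEnumerable<string> eligible,
            Func<string, double> averageTargetOf,
            Func<string, int> frequencyOf)
        {
            return eligible
                .OrderByDescending(a => averageTargetOf(a) / Math.Max(frequencyOf(a), 1))
                .First();
        }
    }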
Combining “buckets”, target indexes and frequencies together
The third method consists in maintaining a value
f(bucket_index, average_target_index, frequency)
for each action and to choose the eligible action with the highest value for
f
.
That function
f
must have the following properties:

- it must increase as the bucket index increases;
- it must increase as the average target index increases;
- it must decrease as the frequency increases.

Here are some possible forms for f:

- linear: f(Bucket, Avg, Freq) = A*Bucket + B*Avg - C*Freq
- geometric: f(Bucket, Avg, Freq) = (A*Bucket*Avg) / (B*Freq)
- exponential: f(Bucket, Avg, Freq) = (Bucket*Avg)^(A/Freq)
- inverse-exponential: f(Bucket, Avg, Freq) = (Bucket*Avg)^(1 - Freq/M)
- rational: f(Bucket, Avg, Freq) = A*Bucket*Avg*(1 - 1/(M - Freq))

M is a positive number larger than max(Freq) + 1. We can keep it large enough by choosing an arbitrary initial value and then increasing it whenever a frequency surpasses it.
Note that f should decrease smoothly with the frequency, otherwise the chains of states get curtailed too early. Unfortunately, only the last two formulas above satisfy this condition.

Pros: the method ensures that testing does not get stuck in infinite cycles, since each cycle “erodes” over time.

Cons: choosing an appropriate function f for a given state machine may not be easy, or even possible.
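As an illustration, here is a sketch of the rational form, one of the two that decay smoothly; the constants A and M are tuning knobs and the defaults below are arbitrary:

    static class ErodingScore
    {
        // f(Bucket, Avg, Freq) = A*Bucket*Avg*(1 - 1/(M - Freq)).
        // Erodes smoothly as the action's frequency grows, so cycles lose
        // their appeal over time instead of being cut off abruptly.
        public static double Rational(int bucket, double avgTarget, int freq,
                                      double a = 1.0, double m = 1000.0)
        {
            if (freq >= m - 1) m = freq + 2;   // keep M above max(Freq) + 1
            return a * bucket * avgTarget * (1.0 - 1.0 / (m - freq));
        }
    }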

The dangers of getting too high




The previous section shows how we can use action frequencies to diversify the long-haul scenarios by “eroding” cycles that get exercised too much. The erosion can also proceed very smoothly at first, protecting the cycles from being curtailed too early.
Using unlimited action frequencies has a drawback, though: the impact of a single change decreases over time as the value of the frequency gets higher and higher. For this reason it is better to limit the frequencies. The simplest way is to cap the coefficient M and, whenever some frequency reaches M - 1, divide all the frequencies by a value greater than 1.
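A sketch of that rescaling step, assuming the frequencies live in a per-action dictionary:

    using System.Collections.Generic;
    using System.Linq;

    static class FrequencyLimiter
    {
        // Once any frequency reaches M - 1, scale all of them down so a
        // single new execution keeps a visible impact on the scores.
        public static void RescaleIfNeeded(IDictionary<string, double> freq, double m)
        {
            if (!freq.Values.Any(f => f >= m - 1)) return;
            foreach (var key in freq.Keys.ToList())
                freq[key] /= 2.0;   // any divisor greater than 1 works
        }
    }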

Conclusions

Long-haul testing is an important part of quality assurance because it simulates usage over a long period of time and uncovers software errors that are hard to detect by other means.
Long-haul testing and model-based testing seem to be at odds because long-haul testing requires more control over the test runs on the part of the tester, whereas model-based testing implies less control over scenario generation.
This article proposes a method to reconcile long-haul testing with model-based testing: a system of probabilistic classes called “buckets”, combined with frequency considerations that preserve the variety of test scenarios during the long runs.

Posted by
Marius Filip