Twitter recently released a new tool called Twitter Diffy (https://github.com/twitter/diffy) with the goal of comparing APIs by showing diffs of their responses for one or more given requests. As a curious tester I obviously tried it (aka tested it), and here are some of my experiences with it and some thoughts.
First of all, what is Diffy?
In a nutshell, it is a service that proxies a request to three endpoints. Those endpoints run the same service: your stable version (the previously tested production version) and your new version, the candidate for production. So Diffy receives a request, replicates it to both APIs, reads their responses, and provides a very nice report on the differences between the old (production) version and the candidate.
In a nutshell, it reads the responses (including all headers, even Content-Length) and runs a diff on all of them. It is the kind of thing you read about and ask yourself why you never did it before… but then again, Twitter excels at building exactly this kind of thing.
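To make the idea concrete, here is a tiny Ruby sketch of the diff step. This is not Diffy's actual implementation (Diffy itself is a Scala service); the response hashes below are hypothetical stand-ins for real HTTP responses:

```ruby
# Illustration of Diffy's core idea: send the same request to the
# stable (primary) and candidate versions, then diff the responses.
# The response hashes here are stand-ins for real HTTP responses.

def diff_responses(primary, candidate)
  keys = primary.keys | candidate.keys
  keys.each_with_object({}) do |key, diffs|
    if primary[key] != candidate[key]
      diffs[key] = { primary: primary[key], candidate: candidate[key] }
    end
  end
end

primary   = { status: 200, "Content-Length" => "18", body: '{"name":"vanilla"}' }
candidate = { status: 200, "Content-Length" => "26", body: '{"name":"vanilla","new":1}' }

diff_responses(primary, candidate)
# => reports differences in "Content-Length" and :body; :status matches
```

A real implementation would also have to filter out noise such as timestamps and tokens, which is exactly what Diffy's primary/secondary comparison is for.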
If you work with microservices, a bunch of questions and thoughts will run through your mind, such as:
Mehhh… it is not going to work. This kind of static verification has been proven more harmful and costly than helpful
What happens with timestamps, UUIDs, tokens and other data that will surely differ between the two versions?
Humm… so does it mean that I can simply add this Diffy thingy to my cloud/Docker setup, point it at my two versions and voilà, free API-level testing?
Well, several other thoughts come to mind when you talk about it, good and bad. That is exactly why I tried to approach it without bias (almost impossible, since I love the way Twitter open-sources stuff) and ran it locally using Docker. I even shared a small repository with it, so you can enjoy it as well (https://github.com/camiloribeiro/dockdiffy).
For a long time I have been using Factory Girl as a fixtures replacement to run my tests in Ruby on Rails and Sinatra applications, but sometimes I use Ruby with Selenium/Watir/Cello/Rest-Client or other gems to run black-box tests on a web page or REST APIs. In these cases we do not have access to the application's internal implementation (intentionally or not), so it gets a bit harder to use FactoryGirl and most of the Factory libs for Ruby (many dependencies). In those cases, when the only thing you want is a factory that returns the objects you need in your test cases, you could consider using, or getting some inspiration from, IceCream (https://github.com/camiloribeiro/icecream).
Like my other gems, this one is named using my terrible sense of humor and relies on a hard-to-get analogy with ice cream flavors. We will get deeper into it soon, but first let's introduce the concept and get comfortable with the idea.
Factory is a common design pattern to use when automating tests. It saves us a lot of time and reduces the complexity of managing test data. This design pattern gives you a common interface that returns numerous different objects with custom or default values. In other words, you can have, for example, a group of test persons created beforehand, and when needed you just ask your factory to provide the one that best fits your needs. Some libs also allow you to dynamically change attributes of your objects at run time, so you can make changes for specific cases without creating a completely new entity. It reuses test code and test data and reduces complexity, while letting you increase the number of test cases with minimal effort.
To exemplify the analogy, imagine a freezer full of different flavors of ice cream: an endless supply of many different flavors. Every time you need a different flavor, you just go to the freezer and take it. As simple as that. If you need to customize a flavor to create a new one based on an existing one, it is as easy as assigning a value to a variable. Simplicity is power, and IceCream is all about simplicity.
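To make the analogy concrete, here is a minimal factory sketch in plain Ruby. Note that this is not IceCream's actual API (check the gem's README for that); the `Freezer` class below is hypothetical and only illustrates the pattern of default objects with per-test overrides:

```ruby
# A minimal test-data factory: default attributes per "flavor",
# with the option to override any attribute for a specific test.
# This illustrates the pattern only; it is not IceCream's real API.

class Freezer
  @recipes = {}

  # Register a named set of default attributes.
  def self.define(name, defaults)
    @recipes[name] = defaults
  end

  # Take a flavor from the freezer, optionally customizing attributes.
  def self.take(name, overrides = {})
    @recipes.fetch(name).merge(overrides)
  end
end

Freezer.define(:vanilla, flavor: "Vanilla", price: 3.50, color: "White")

Freezer.take(:vanilla)              # the default vanilla
Freezer.take(:vanilla, price: 0.99) # same flavor, custom price for one test
```

Because `merge` returns a new hash, each test gets its own object and the defaults stay untouched.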
But where can I use it in my tests?
Pretend that we have to write tests for the ice cream creation form of an Ice Cream Shop. To do that, we will use the form to the left. We'll write, in a superficial way, some test cases that would apply here, based solely on the form:
Create a flavor with all possible fields
Create a flavor with only the required fields
Create a flavor for each special “Light”, “Diet” and “Full of sugar”;
Try to create a flavor without each one of the required fields “Flavor”, “Price” and “Color”;
Create a flavor with each one of the colors
Create a flavor with cents only
Try to create a flavor with letters in the price
etc.; imagination is the limit for your test cases
All the test cases listed above are very simple ones that any tester would think of when writing tests for this feature. If you think about how to write a script or a test for it, you will notice that the code itself does not differ that much between the many tests we write. If you follow this blog (or http://www.bugbang.com.br), you probably saw how you can use smart ways to automate and reuse code in the blog post “I think, therefore I automate”.
It used to be quite common in Brazil to have big discussions on mailing lists about whether or not to use test automation. Yes, that post is dated 2011, and at that time I was already very comfortable with automation and continuous delivery, but a huge part of the testing community in Brazil was still drowning in the dark of exhaustive, repetitive manual testing of the same feature over and over again. Many readers claimed that the blog post, originally in Portuguese, helped them think outside the box and understand a bit more about why it is so important to automate daily tasks, such as testing APIs, GUIs and features that will live long enough to be changed and released often.
Even almost four years later, it is still common to find discussions where people are completely against test automation, and not only in Brazil, so I think the post is still valid and could help other people think more about how to improve their tests.
The reason many testers stop automating tests and blame automation as a cost rather than an investment is not the automation itself, but the lack of planning and the carelessness many people show with their test codebases. When I say “planning”, I am not talking about hours of meetings or extensive documentation, but about having good testers and developers working together and thinking about the test codebase just like they think about their production codebases. We are talking about designing your test codebase to be easy to use, sustainable, extensible and, above all, easy to change.
For this exercise, we will use a well-known front-end testing framework called Watir.
If you want to run the examples locally, you will need Ruby 1.8.7 (yes, old old old) or higher, and then install the gem. I strongly recommend using a Linux virtual machine, since I am not responsible for any issue you may have when running the examples here, so keep calm and virtualize it!
With a machine running Ruby, you will have to install the gem. You can do it in many ways, but for this example, and assuming you are running a virtual machine, please run the following command:
$ gem install watir-webdriver
To exemplify how it works in practical terms, we are going to imagine a super simple scenario that will get harder and more complex over time, just like any software project you have ever been on:
An ordinary quality engineer was assigned the task of going to Google, searching for “Automação Rocks!” and checking whether the text “The Bug Bang Theory 2.0” shows up on the first page. This engineer could choose among three different approaches to perform the task:
Manual exploratory testing: without saving evidence or documenting the tests (planning != documenting). In “theory”, this is the fastest and simplest way to test. Two of the many disadvantages in this case are the lack of evidence and the uncertainty about test coverage.
Testing with a management tool to keep track of executions and their evidence: documenting every test execution and saving its results with the support of a test management tool, such as TestLink. For some people this would be the second fastest way of testing, since you are not writing any code. It also does not suffer from the problems of the first approach, since the documentation is saved together with the logs and test evidence.
Automating the tests: writing tests as code, supporting a continuous and reproducible testing cycle with known coverage and regression testing for free.
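The automated alternative (option 3) can be sketched with watir-webdriver in a few lines. Google's markup changes over time, so the `q` field name and the exact flow are assumptions; treat this as a starting sketch rather than a maintained script:

```ruby
require 'watir-webdriver'

# Search Google for the term and check for the expected text on the
# results page. The "q" text field name is an assumption about
# Google's markup and may change over time.
browser = Watir::Browser.new :firefox
browser.goto 'http://www.google.com'
browser.text_field(name: 'q').set 'Automação Rocks!'
browser.send_keys :enter
puts browser.text.include?('The Bug Bang Theory 2.0') ? 'PASSED' : 'FAILED'
browser.close
```

Running this requires Firefox installed in your virtual machine; the same flow works with other browsers supported by WebDriver.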
The issue with the previous examples, the ones I based mine on, is that they do indeed keep each feature file in a different thread, but they run the threads sequentially, which in the end makes the tests even slower. This happens not because of Gradle itself, but because of the javaexec implementation, which for some reason cannot run in parallel. I tried many ways to make it work, but it didn't. I could run different features using the previous examples, but they did not run in parallel; instead, they ran sequentially, which in my case was not very helpful.
This will run all the Cucumber features in sequence, and it will take more or less 33 seconds. The output may look like:
Feature: This just contains four scenarios that sleeps for one second each
This feature file just sleeps for a 4 seconds (one second each scenario)
It is not meant to do anything, but sleep
So we can prove that it runs in parallel
Scenario: Example 1 that sleep 1 seconds # ExampleOne-FourScenariosOneSecond.feature:7
Given I have the this
18 Scenarios (18 passed)
54 Steps (54 passed)
After running this, you can also check the logs on the terminal and the JSON report at “build/reports/cucumber/cucumber.json”.
Now you go back to the terminal and run the following command:
This post is the first in the series “Twelve Lessons Learned with Performance Testing Server Side”. The series was adapted from the post “12 Lições Aprendidas em Testes de Performance Server Side”, published in Portuguese on my blog “The Bug Bang Theory” on January 31, 2013. The original post was based mostly on my experience after several months working close to some great performance engineers when I was a consultant at ThoughtWorks. At that time I was leading the performance testing for the biggest magazine company in South America. The projects were developed mostly in Ruby on Rails, based on microservices with high availability and demanding requirements for response times and simultaneous users.
As the main performance testing engineer on the account, it was my responsibility to develop the tests, collect the data, diagnose the main problems and their dependencies, write periodic reports to the client, follow and highlight the improvements, and make sure that the performance tests were part of the continuous delivery pipeline. To do that, we chose Apache JMeter, a tool I was familiar with, which is open source, easy to use, well known and very well tested in the open source world. Another very welcome feature of JMeter was its CLI (command line interface), which made it possible to script the tests and automate them with minimum effort.
Apart from JMeter, another tool that was very useful for evaluating problems and identifying issues faster was New Relic, a real-time monitoring tool that opened our eyes to the internal behaviour and bottlenecks while JMeter was loading and stressing the APIs and collecting information about part of the external behaviour.
Below you will find some observations and lessons learned during those months in the awesome world of performance testing:
Lesson 1 – The Average Response Time is the Fool’s Gold of Performance Testing
It is quite common to see performance acceptance criteria based exclusively on the average response time. I have seen many professionals in forums and blogs, and even in materials such as training and certification booklets, refer to the average response time as the metric that defines whether your performance is acceptable, but that is not true at all.
The average response time is the sum of all response times divided by the number of samples, in this case the number of requests. For that reason, it is commonly taken as the response time a visitor will get when he or she visits a page under a predefined load. The average response time must be seen as one indicator among many others that are much more important, so it should never, ever be used as the primary indicator, and especially not as the only one, to evaluate the performance of a website, page or service.
To exemplify how the average response time can cause more harm than good if not considered alongside other indicators, let's pretend that we just ran a performance test and got the following response times (in seconds):
5, 11, 5, 1, 5, 2, 1, 5 and 1.
The data above can be graphically represented as the following line chart:
To make it easy to understand, we are working with a very small number of samples.
Now let's pretend that our product owner, or whoever makes the business/technical decisions on performance, said that usually, for this kind of system, users give up on a page or service after four and a half seconds without a complete response. In this case, it is clear that we want the system to respond in under four and a half seconds.
If we take the average response time as the ultimate indicator when evaluating the previous data, we will have something like this:
(5+11+5+5+5+2+1+1+1) / 9 = 4 seconds
In this case, using only the average response time, we could say that, based on our test data, under the given load the users are happy, because the average response time is right at the border of the acceptance criteria. But if we look from another perspective, we will realize that of our nine samples, only four are below the point where users get frustrated and abandon the page. The data is the same; the way you look at it is different (we will examine this perspective in detail in the next lesson, ahead in this post).
Don't take me wrong: the average response time has a lot to say, but if not handled carefully it can be very dangerous and guide you towards false positives. It is unquestionable that it can highlight slower services and pages, and it is not a problem to use it to get a quick perception of response times during a first evaluation, but it is one of the poorest ways to interpret your test data and should not be used without other metrics.
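The arithmetic above is easy to check in a few lines of Ruby, which also show how a simple percentile paints a very different picture than the average (the 4.5 s threshold is the example's assumption):

```ruby
# Response times from the example, in seconds.
times = [5, 11, 5, 1, 5, 2, 1, 5, 1]
threshold = 4.5

average = times.sum.to_f / times.size        # 4.0, looks "acceptable"
happy   = times.count { |t| t <= threshold } # only 4 of 9 under 4.5 s

# A simple nearest-rank 90th percentile tells a much harsher story.
sorted = times.sort
p90 = sorted[(0.90 * sorted.size).ceil - 1]  # 11 seconds
```

The same data yields an average of 4 seconds and a 90th percentile of 11 seconds: one number says the users are happy, the other says one in ten waits over eleven seconds.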
When we do something regularly, we become more and more experienced at it. We know this well: it isn't worth paying for an expensive classical guitar course if we don't practice daily, it isn't worth reading a good technical book if we don't apply what we've read in our work, and so on. Test automation is no different.
But how can someone who doesn't actually work with automation introduce it and practice it daily?
To do it you don't need expensive courses, high-tech tools, supercomputers or anything else unusual. Not even a different operating system is required. To be honest, you just need the will to learn, because what I propose here, although simple and easy, can be boring in the beginning.
The Ruby programming language provides a tool for working in the console/terminal and executing small pieces of code, called the Interactive Ruby Shell, better known as irb. irb is used by developers to test small pieces of code, execute simple tasks, debug defects, try out a code change, etc. It provides a way to execute small snippets of code with enough flexibility and support to use all the resources the Ruby language provides.
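For example, after starting irb from your terminal, you can type lines like these one at a time and inspect the result of each expression as you go:

```ruby
# Small snippets of the kind you would type at the irb prompt,
# one line at a time, checking the result of each expression.
greeting = "hello, irb"
greeting.upcase          # => "HELLO, IRB"
[3, 1, 2].sort           # => [1, 2, 3]
(1..5).map { |n| n * n } # => [1, 4, 9, 16, 25]
```

This tight feedback loop is exactly what makes irb a good playground for exploring an automation framework's API before committing anything to a test script.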
With irb we will practice a little with two useful test automation frameworks today. At the end of this short tutorial, you will know how to practice whenever you want, using your favorite framework, not limited to the ones I chose to exemplify the use of irb. The first goal I propose is to comment on this blog post without touching the browser directly, using only the watir-webdriver and/or selenium-webdriver API to do that. Good luck!