QuickCheck module for testing the new string module

Björn Gustavsson-4
I have written a QuickCheck module to test Dan's new string module
(https://github.com/erlang/otp/pull/1330). It found some bugs.

Here is the gist for anyone interested:

https://gist.github.com/bjorng/03a869392d5a969ebf2c40044b664190

Comments are welcome. This is my first major use of QuickCheck. I am
interested in how I could improve the QC specifications and
generators.

/Björn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: QuickCheck module for testing the new string module

Alex S.
Hey Björn,

for those of us who don’t have QC handy, could you provide the sample run output, please?

> On 6 Apr 2017, at 16:26, Björn Gustavsson <[hidden email]> wrote:
>
> I have written a QuickCheck module to test Dan's new string module
> (https://github.com/erlang/otp/pull/1330). It found some bugs.
>
> Here is the gist for anyone interested:
>
> https://gist.github.com/bjorng/03a869392d5a969ebf2c40044b664190
>
> Comments are welcome. This is my first major use of QuickCheck. I am
> interested in how I could improve the QC specifications and
> generators.
>
> /Björn
> --
> Björn Gustavsson, Erlang/OTP, Ericsson AB
> _______________________________________________
> erlang-questions mailing list
> [hidden email]
> http://erlang.org/mailman/listinfo/erlang-questions


Re: QuickCheck module for testing the new string module

Björn Gustavsson-4
Running on the latest string.erl is boring since all tests pass. But I
found this output in one of my xterm windows:

24> c(string_eqc), eqc:quickcheck(eqc:numtests(10000, string_eqc:prop_chomp())).
....................................................................................................(x10)..............................................................................Failed!
After 881 tests.
{{[1060,10,13,13,10,10,13,13],[1060,10,13,13,10,10,13,13]},
 {list,{split,2,{{split,1,{list,binary}},list}}}}
flat: passed
binary: passed
mixed: failed
string:chomp([[[[1060]|<<"\n\r\r">>],10,10,13,13]])
Shrinking .xxxxxxxxx..xxxxxxx.(4 times)
{{"\r\r\n\n\r","\r\r\n\n\r"},{list,{split,2,{{split,1,{list,binary}},list}}}}
mixed: failed
string:chomp([[[[]|<<"\r\r">>],10,10,13]])
false
25> string:chomp([[[[]|<<"\r\r">>],10,10,13]]).
[13,<<"\r">>,10,13]

(Dan has now fixed this bug.)
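
For readers without QuickCheck, the idea behind the "mixed" case above can be sketched with the standard library alone: re-encode a flat string as nested chardata (lists, binaries, improper lists with a binary tail) and check that string:chomp/1 gives the same characters as on the flat list. This is only a rough stand-in for the gist, with no shrinking; the module name chomp_mixed_demo and all function names are hypothetical, and it assumes OTP 20 or later for the new string API.

```erlang
-module(chomp_mixed_demo).
-export([mix/1, random_string/0, check/1]).

%% Re-encode a flat list of code points as nested chardata: keep it as a
%% list, turn it into a UTF-8 binary, or split it and cons the two
%% re-encoded parts (which may give an improper list with a binary tail).
mix([]) ->
    case rand:uniform(2) of
        1 -> [];
        2 -> <<>>
    end;
mix(Str) ->
    case rand:uniform(3) of
        1 -> Str;
        2 -> unicode:characters_to_binary(Str);
        3 ->
            {A, B} = lists:split(rand:uniform(length(Str) + 1) - 1, Str),
            [mix(A) | mix(B)]
    end.

%% Short strings over CR, LF and one non-Latin-1 code point, similar to
%% the inputs in the transcript above.
random_string() ->
    Alphabet = [$\r, $\n, 1060],
    [lists:nth(rand:uniform(3), Alphabet)
     || _ <- lists:seq(1, rand:uniform(9) - 1)].

%% Run N random cases and return the counterexamples as {Flat, Mixed}.
check(N) ->
    [{Flat, Mixed} || _ <- lists:seq(1, N),
                      Flat <- [random_string()],
                      Mixed <- [mix(Flat)],
                      unicode:characters_to_list(string:chomp(Mixed))
                          =/= string:chomp(Flat)].
```

Calling chomp_mixed_demo:check(10000) returns the list of failing {Flat, Mixed} pairs, so an empty list means no disagreement was found in that run.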

/Bjorn

On Thu, Apr 6, 2017 at 3:30 PM, Alex S. <[hidden email]> wrote:

> Hey Björn,
>
> for those of us who don’t have QC handy, could you provide the sample run output, please?
>> On 6 Apr 2017, at 16:26, Björn Gustavsson <[hidden email]> wrote:
>>
>> I have written a QuickCheck module to test Dan's new string module
>> (https://github.com/erlang/otp/pull/1330). It found some bugs.
>>
>> Here is the gist for anyone interested:
>>
>> https://gist.github.com/bjorng/03a869392d5a969ebf2c40044b664190
>>
>> Comments are welcome. This is my first major use of QuickCheck. I am
>> interested in how I could improve the QC specifications and
>> generators.
>>
>> /Björn
>> --
>> Björn Gustavsson, Erlang/OTP, Ericsson AB
>> _______________________________________________
>> erlang-questions mailing list
>> [hidden email]
>> http://erlang.org/mailman/listinfo/erlang-questions
>



--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Re: QuickCheck module for testing the new string module

Kostis Sagonas-2
On 04/06/2017 03:43 PM, Björn Gustavsson wrote:
> Running on the latest string.erl is boring since all tests pass.

Is the *original* (or at least the one you started from) string.erl
available anywhere?   Can it be posted here?

Kostis


Re: QuickCheck module for testing the new string module

Björn Gustavsson-4
On Thu, Apr 6, 2017 at 3:46 PM, Kostis Sagonas <[hidden email]> wrote:
> On 04/06/2017 03:43 PM, Björn Gustavsson wrote:
>>
>> Running on the latest string.erl is boring since all tests pass.
>
>
> Is the *original* (or at least the one you started from) string.erl
> available anywhere?   Can it be posted here?

You will also need a few other modules (unicode, unicode_util).

From Git's reflog, I have recovered the branch as
it looked when I started working on string_eqc. I have
created a new branch and pushed it to github:

https://github.com/bjorng/otp/tree/buggy-new-string-module

/Bjorn


--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Re: QuickCheck module for testing the new string module

Michael Truog
In reply to this post by Björn Gustavsson-4
It would help if you switched to using PropEr instead of QuickCheck, since QuickCheck for Erlang requires a license.  Using PropEr instead would be more similar to using QuickCheck in Haskell due to its availability.  That should also allow tests like these to be included easily in Erlang/OTP source and reused when testing the Erlang/OTP source code.

Best Regards,
Michael

On 04/06/2017 06:26 AM, Björn Gustavsson wrote:

> I have written a QuickCheck module to test Dan's new string module
> (https://github.com/erlang/otp/pull/1330). It found some bugs.
>
> Here is the gist for anyone interested:
>
> https://gist.github.com/bjorng/03a869392d5a969ebf2c40044b664190
>
> Comments are welcome. This is my first major use of QuickCheck. I am
> interested in how I could improve the QC specifications and
> generators.
>
> /Björn


Re: QuickCheck module for testing the new string module

Krzysztof Jurewicz
Michael Truog writes:

> It would help if you switched to using PropEr instead of QuickCheck, since QuickCheck for Erlang requires a license.  Using PropEr instead would be more similar to using QuickCheck in Haskell due to its availability.  That should also allow tests like these to be included easily in Erlang/OTP source and reused when testing the Erlang/OTP source code.

Unfortunately, PropEr is licensed under GNU GPL 3.0, which is not compatible with Apache License 2.0. Probably this will not change anytime soon. See this issue for further reference: https://github.com/manopapad/proper/issues/29

triq may be a better solution, as it is licensed under Apache License 2.0. Here is the most active fork: https://github.com/triqng/triq

Re: QuickCheck module for testing the new string module

Jesper Louis Andersen-2
In reply to this post by Björn Gustavsson-4
On Thu, Apr 6, 2017 at 3:26 PM Björn Gustavsson <[hidden email]> wrote:

> Comments are welcome. This is my first major use of QuickCheck. I am
> interested in how I could improve the QC specifications and
> generators.


I think it's looking rather good. If you have the commercial variant of EQC, here are some things you may want to do:

I'm very fond of using eqc:module({testing_budget, 30}, Mod) because it gives you 30 seconds of tests equally distributed over the properties in the file. Time bounds are usually nicer than number of tests. You can also weight your properties so those which most often fail are tested a bit more. I tend to pick 15 seconds when developing, 2-5 minutes for coffee runs, 30 minutes for lunch and 12 hours for when you leave the office or go to sleep.
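
Without EQC, the time-budget idea can be approximated by hand. The sketch below is not how eqc:module/2 works internally; it just calls a check function repeatedly until a deadline passes. The module name budget_demo and the function run_for/2 are made up for illustration.

```erlang
-module(budget_demo).
-export([run_for/2]).

%% Call Check() repeatedly until Millis milliseconds have elapsed or it
%% returns something other than true.
%% Returns {passed, Count} or {failed, Count, Result}.
run_for(Millis, Check) when is_integer(Millis), is_function(Check, 0) ->
    Deadline = erlang:monotonic_time(millisecond) + Millis,
    loop(Deadline, Check, 0).

loop(Deadline, Check, Count) ->
    case erlang:monotonic_time(millisecond) >= Deadline of
        true ->
            {passed, Count};
        false ->
            case Check() of
                true -> loop(Deadline, Check, Count + 1);
                Other -> {failed, Count + 1, Other}
            end
    end.
```

For example, budget_demo:run_for(15000, Check) would give a 15-second development run for a zero-arity Check fun that generates one random case and returns true when it passes.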

You can use the in_parallel transform on your file to execute your test cases on all cores. The speedup is more or less linear in the number of cores.

Start using classification in your test cases. You want to classify on the structure of your generated strings, so you can see if you actually cover a realistic set of strings or if you are looking at rather small strings only.

Collect information about the length of your strings. I have some tooling in https://github.com/jlouis/eqc_lib/blob/master/eqc_lib.erl for summarizing data in the form of what R does on a data set (and stem+leaf plots). Again, the goal is to verify that your generator is generating a realistic input set.
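
As a rough illustration of what collecting length information means, here is a stdlib-only sketch that builds a histogram and a min/median/max summary over a list of generated strings. It is much cruder than eqc_lib or EQC's own collect and classify; the module name length_stats_demo and its functions are hypothetical.

```erlang
-module(length_stats_demo).
-export([histogram/1, summary/1]).

%% Map from string length (in grapheme clusters) to number of occurrences.
histogram(Strings) ->
    lists:foldl(
      fun(S, Acc) ->
              maps:update_with(string:length(S),
                               fun(C) -> C + 1 end, 1, Acc)
      end, #{}, Strings).

%% {Min, Median, Max} of the lengths; for an even number of samples the
%% upper of the two middle values is used as the median.
summary(Strings) when Strings =/= [] ->
    Sorted = lists:sort([string:length(S) || S <- Strings]),
    {hd(Sorted),
     lists:nth(length(Sorted) div 2 + 1, Sorted),
     lists:last(Sorted)}.
```

If the histogram shows nearly all samples at lengths 0 to 3, the generator is probably not exercising realistic inputs.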

Since we are trying to handle unicode, I would lace the input with a frequency generator which deliberately creates strings which are known to be naughty[0][1]. In principle we should hit them randomly after a while, but it is often simpler to just generate all the nasty strings more often in the code. Normal tests and use are likely to quickly hit the common faults. So go straight for the jugular: hit all the corner cases early and often. You want to hit errors in fewer than 100 test cases if possible. The goal here is to crash the code base. In general, look up what people in the security world are using as fuzzing inputs.
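
A hand-rolled version of that weighting might look like the sketch below. It mimics the idea of a frequency generator with plain funs and the rand module; it is not EQC's frequency/1. The short "nasty" list is only a sample chosen for illustration, not taken from the linked lists, and the module name frequency_demo is hypothetical.

```erlang
-module(frequency_demo).
-export([frequency/1, gen_string/0]).

%% Pick one generator fun from [{Weight, Fun}] with probability
%% proportional to its weight, and call it.
frequency(Weighted) ->
    Total = lists:sum([W || {W, _} <- Weighted]),
    pick(rand:uniform(Total), Weighted).

pick(N, [{W, Gen} | _]) when N =< W -> Gen();
pick(N, [{W, _} | Rest]) -> pick(N - W, Rest).

%% A few inputs that tend to expose Unicode handling bugs.
nasty() ->
    ["",
     "\r\n",                   % CR LF is a single grapheme cluster
     [16#FEFF],                % byte order mark / zero width no-break space
     [$e, 16#0301],            % base letter plus combining acute accent
     [16#1F1F8, 16#1F1EA],     % pair of regional indicators (a flag)
     [16#10FFFF]].             % highest code point

nasty_string() ->
    Nasty = nasty(),
    lists:nth(rand:uniform(length(Nasty)), Nasty).

random_ascii() ->
    [$a + rand:uniform(26) - 1 || _ <- lists:seq(1, rand:uniform(10))].

%% Three ordinary strings for every nasty one, on average.
gen_string() ->
    frequency([{3, fun random_ascii/0},
               {1, fun nasty_string/0}]).
```

The 3:1 weighting is arbitrary; raising the second weight makes the corner cases show up earlier in a run.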

Another point, which you may already cover, is that of negative testing:

* Positive: Valid inputs must succeed with the right value
* Negative: Invalid inputs should return the right error or throw an error

In my maps_eqc tests, which are available at [2], we have the following lines:

https://github.com/jlouis/maps_eqc/blob/3ab960018684785415e7265245889caf083e330c/src/maps_eqc.erl#L320-L379

which verify the properties of the maps module when you pass in values which are not valid maps or inputs. We can, in each case, predict what the error should be, especially in the situation of {badkey, K} errors. This in turn ensures that the error paths are exercised as well.
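
To make the negative-testing idea concrete without EQC, here is a small sketch that checks the exact error raised by a few maps calls on invalid input. It is not taken from maps_eqc; the module name negative_demo and the helper expect_error/2 are hypothetical.

```erlang
-module(negative_demo).
-export([expect_error/2, checks/0, checks/2]).

%% Run Fun and report whether it raised exactly the error Expected.
expect_error(Expected, Fun) ->
    try Fun() of
        Value -> {unexpected_success, Value}
    catch
        error:Expected -> ok;
        Class:Reason -> {unexpected_failure, Class, Reason}
    end.

checks() ->
    checks(#{a => 1}, not_a_map).

%% Map must not contain the key b; NotMap must not be a map.
checks(Map, NotMap) ->
    [expect_error({badkey, b}, fun() -> maps:get(b, Map) end),
     expect_error({badmap, NotMap}, fun() -> maps:get(a, NotMap) end),
     expect_error({badmap, NotMap}, fun() -> maps:put(a, 1, NotMap) end)].
```

In a property, the key and the non-map term would come from generators, and the predicted error term would be computed from them in the same way.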

A typical strategy here is either to use the fault/2 generator with a parameter to alter the fault injection rate, or to have separate properties which always generate faulty input. Lace the generator with a 10% fault injection rate at each part of your tree, say, so the chance of generating a fault is fairly high when multiple such are taken together. Then guard it with a ?SUCHTHAT on actually having a fault. But beware of having to search too much in the ?SUCHTHAT, as that slows down test case generation. Classification of the types of faults becomes paramount here. You can find some of these strategies used in my enacl test cases[3].
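
The arithmetic behind the injection rate is easy to check. Assuming each injection point faults independently with rate P, the chance that a generated value contains at least one fault among N points is 1 - (1 - P)^N. The module name fault_rate_demo is hypothetical.

```erlang
-module(fault_rate_demo).
-export([p_any_fault/2]).

%% Probability of at least one fault among N independent injection
%% points, each faulting with probability P.
p_any_fault(P, N) when P >= 0, P =< 1, is_integer(N), N >= 0 ->
    1 - math:pow(1 - P, N).
```

With P = 0.1 and N = 10 this comes to about 0.65, so a ?SUCHTHAT that demands a fault would discard roughly a third of the candidates. With only two or three injection points the discard rate is much higher, which is where the search cost starts to matter.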


Re: QuickCheck module for testing the new string module

Benoit Chesneau-2
In reply to this post by Krzysztof Jurewicz

What is the issue with using a tool under the GPLv3 license? It's not like you will include it in your code.

Anyway, how does triq compare? Does it provide the same level of features? It would indeed be a lot better to provide tests in a form that anyone can launch without having to buy a really expensive license...

- benoit

On 6 April 2017 at 20:34:58, Krzysztof Jurewicz ([hidden email]) wrote:

> Michael Truog writes:
>
>> It would help if you switched to using PropEr instead of QuickCheck, since QuickCheck for Erlang requires a license. Using PropEr instead would be more similar to using QuickCheck in Haskell due to its availability. That should also allow tests like these to be included easily in Erlang/OTP source and reused when testing the Erlang/OTP source code.
>
> Unfortunately, PropEr is licensed under GNU GPL 3.0, which is not compatible with Apache License 2.0. Probably this will not change anytime soon. See this issue for further reference: https://github.com/manopapad/proper/issues/29
>
> triq may be a better solution, as it is licensed under Apache License 2.0. Here is the most active fork: https://github.com/triqng/triq

Re: QuickCheck module for testing the new string module

Björn Gustavsson-4
Note that string_eqc.erl is in a Gist under my GitHub account and
not in the otp repository.

The tests that we "provide" are in a common_test test suite:

https://github.com/dgud/otp/blob/1ac2425d7c4791fbce55cb662154e3646c5f9af0/lib/stdlib/test/string_SUITE.erl

Those tests don't require QuickCheck or any similar tool.
Tests for all bugs that we found using QuickCheck have
been incorporated into string_SUITE.erl.

We have no plans to incorporate QuickCheck into our
test suites or daily builds.

Using QuickCheck was just a way to find more bugs in
a (more or less) new module before we released it.
Instead of doing a traditional code review, we thought
that we would find more bugs if we wrote
QuickCheck properties based on the documentation.

And, yes, we could have used PropEr. I was not
personally aware of triq. I chose QuickCheck
because Ericsson has a license and because I
have previously used eqc_erlang_program to
generate random Erlang programs to test the
compiler.

/Björn


On Fri, Apr 7, 2017 at 7:14 PM, Benoit Chesneau <[hidden email]> wrote:

>
> what is the issue in using a tool using the gpl3 license? it's not like you
> will include in your code.
>
> Anyway how does triq compare? does it provides the same level of features.
> It would be indeed a lot better to provides tests in a form that anyone can
> launch without having to buy a really expensive license...
>
> - benoit
>
> On 6 April 2017 at 20:34:58, Krzysztof Jurewicz
> ([hidden email]) wrote:
>>
>> Michael Truog writes:
>>
>> It would help if you switched to using PropEr instead of QuickCheck, since
>> QuickCheck for Erlang requires a license. Using PropEr instead would be more
>> similar to using QuickCheck in Haskell due to its availability. That should
>> also allow tests like these to be included easily in Erlang/OTP source and
>> reused when testing the Erlang/OTP source code.
>>
>>
>> Unfortunately, PropEr is licensed under GNU GPL 3.0, which is not
>> compatible with Apache License 2.0. Probably this will not change anytime
>> soon. See this issue for further reference:
>> https://github.com/manopapad/proper/issues/29
>>
>> triq may be a better solution, as it is licensed under Apache License 2.0.
>> Here is the most active fork: https://github.com/triqng/triq
>> _______________________________________________
>> erlang-questions mailing list
>> [hidden email]
>> http://erlang.org/mailman/listinfo/erlang-questions



--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Re: QuickCheck module for testing the new string module

Björn Gustavsson-4
In reply to this post by Jesper Louis Andersen-2
Thanks! I will certainly save your email to
use as a reference.

I have not done any negative tests, mostly
because I ran out of time. It is also somewhat
tricky, because many functions don't always
traverse all of their input string (e.g. string:equal/2
returns false as soon as two characters are not
equal; the rest of the input strings are not validated).

/Björn

On Fri, Apr 7, 2017 at 5:10 PM, Jesper Louis Andersen
<[hidden email]> wrote:

> On Thu, Apr 6, 2017 at 3:26 PM Björn Gustavsson <[hidden email]> wrote:
>>
>>
>> Comments are welcome. This is my first major use of QuickCheck. I am
>> interested in how I could improve the QC specifications and
>> generators.
>>
>
> I think it's looking rather good. If you have the commercial variant of EQC,
> here are some things you may want to do:
>
> I'm very fond of using eqc:module({testing_budget, 30}, Mod) because it
> gives you 30 seconds of tests equally distributed over the properties in the
> file. Time bounds are usually nicer than number of tests. You can also
> weight your properties so those which most often fail are tested a bit more.
> I tend to pick 15 seconds when developing, 2-5 minutes for coffee runs, 30
> minutes for lunch and 12 hours for when you leave the office or go to sleep.
>
> You can use the in_parallel transform on your file to execute your test
> cases on all cores. The speedup is more or less linear in the number of
> cores.
>
> Start using classification in your test cases. You want to classify on the
> structure of your generated strings, so you can see if you actually cover a
> realistic set of strings or if you are looking at rather small strings only.
>
> Collect information about the length of your strings. I have some tooling in
> https://github.com/jlouis/eqc_lib/blob/master/eqc_lib.erl for summarizing
> data in the form of what R does on a data set (and stem+leaf plots). Again,
> the goal is to verify that your generator is generating a realistic input
> set.
>
> Since we are trying to handle unicode, I would lace the input with a
> frequency generator which deliberately creates strings which are known to be
> naughty[0][1]. In principle we should hit them randomly after a while, but
> it is often simpler to just generate all the nasty strings more often in the
> code. Normal tests and use are likely to quickly hit the common faults. So
> go straight for the jugular: hit all the corner cases early and often. you
> want to hit errors in less than 100 test cases if possible. The goal here is
> to crash the code base. In general, look up what people in the security
> world are using as fuzzing inputs.
>
> Another point, which you may already cover, is that of negative testing:
>
> * Positive: Valid inputs must succeed with the right value
> * Negative: Invalid inputs should return the right error or throw an error
>
> In my maps_eqc tests, which are available at [2], we have the following
> lines:
>
> https://github.com/jlouis/maps_eqc/blob/3ab960018684785415e7265245889caf083e330c/src/maps_eqc.erl#L320-L379
>
> which verifies the property of the maps module if you input values which are
> not valid maps or inputs. We can, in each case, predict what the error
> should be, especially in the situation of {badkey, K} errors. This in turn
> ensures that the error cases are hit in all cases.
>
> Typical strategy here is either to use the fault/2 generator and then use a
> parameter to alter the fault injection rate. Or to have separate properties
> which always generate faulty input. Lace the generator with a 10% fault
> injection rate at each part of your tree, say, so the chances of generating
> a fault is fairly high when multiple such are taken together. Then guard it
> with a ?SUCHTHAT on actually having a fault. But beware having to search too
> much in the ?SUCHTHAT as that slows down test case generation.
> Classification of the types of faults become paramount here. You can find
> some of these strategies used in my enacl test cases[3]
>
> Feel free to question me with stuff if needed!
>
> [0] http://www.lookout.net/2011/06/special-unicode-characters-for-error.html
> [1] https://github.com/minimaxir/big-list-of-naughty-strings
> [2] https://github.com/jlouis/maps_eqc
> [3] https://github.com/jlouis/enacl/blob/master/eqc_test/enacl_eqc.erl



--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Re: QuickCheck module for testing the new string module

Krzysztof Jurewicz
In reply to this post by Benoit Chesneau-2
Benoit Chesneau writes:

> what is the issue in using a tool using the gpl3 license? it's not like you
> will include in your code.

You have to include the PropEr header file, which is licensed under GPL v3. (I’m not saying that there would be no licensing problems if this file was not included.)

> Anyway how does triq compare? does it provides the same level of features.

I would guess that PropEr is technically superior to triq, but I have no systematic comparison to prove it.

Re: QuickCheck module for testing the new string module

Michael Truog
On 04/09/2017 02:48 AM, Krzysztof Jurewicz wrote:
> Benoit Chesneau writes:
>
>> what is the issue in using a tool using the gpl3 license? it's not like you
>> will include in your code.
> You have to include PropEr header file which is licensed under GPL v3. (I’m not saying that there would be no licensing problems if this file was not included).
>
>> Anyway how does triq compare? does it provides the same level of features.
> I would guess that PropEr is technically superior to triq, but I have no systematic comparison to prove it.
>
Using PropEr as a build/test dependency is similar to relying on autoconf (which is under the GPL), but it would be better to have triq usable within the Erlang/OTP installation, to be used like QuickCheck is used in Haskell, i.e., for any testing without restriction.  Having repeatable tests is an important part of science, so this is a very basic, fundamental concern.

Best Regards,
Michael