ETS: binary matching in match specifications

Aaron Bawcom

There doesn’t seem to be any kind of matching on binary data in a match specification besides ‘==’ and ‘/=’. To compare apples to apples and to exclude OS/hardware variance, the following test compares a 4-segment binary key (128 bits total) with a 4-tuple of integers as the key. The binary segments and the tuple values used for the lookups are randomly generated, so as to eliminate any bias there. The two tests use exactly the same method; only the form of the key differs. Based on this data, binary-based lookups are somewhat faster than tuple-based lookups. The problem is that we need to do certain kinds of ets matching on the binary keys (three segments are _ and one segment is an int()). The question is this:

 

Why are there no other ets match conditions than ‘==’ and ‘/=’?

 

Is it just a development time/complexity issue, or is there a more fundamental reason, such as reducing the use of match specifications in favor of another approach? Or is it simply convention that this type of problem is best solved using the tuple approach? Our issue is that 99.999% of the time we need the speed of the binary ‘==’ match, and the rest of the time we need an incredibly simple match on a portion of the binary. The memory cost of storing an alternate structure for that small use case is not acceptable. It would be great if even ‘band’ could operate on a binary in a match specification.
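
To make the partial match described above concrete, here is a minimal sketch of one possible workaround: scanning the table with ets:foldl/3 and using ordinary bit-syntax pattern matching in Erlang code rather than a match specification. The module name, the function names, the choice of the third segment, and the one-element {Key} object layout in the demo are all assumptions for illustration; none of them come from the original test code. This is a full table scan, so it only suits the rare case, not the hot ‘==’ lookup path.

```erlang
-module(binkey_scan).
-export([keys_with_third_segment/2, demo/0]).

%% Hypothetical helper: return every 128-bit binary key whose third
%% 32-bit segment equals Seg (an integer in 0..16#FFFFFFFF).
%% The key is assumed to be in position 1 of each stored tuple.
%% This walks the whole table; it is not a match specification.
keys_with_third_segment(Tab, Seg) ->
    ets:foldl(
      fun(Obj, Acc) ->
              case element(1, Obj) of
                  %% Seg is already bound here, so this compares values.
                  <<_:64, Seg:32, _:32>> = Key -> [Key | Acc];
                  _ -> Acc
              end
      end, [], Tab).

%% Small illustration with made-up keys.
demo() ->
    Tab = ets:new(demo, [set]),
    true = ets:insert(Tab, [{<<1:32, 2:32, 3:32, 4:32>>},
                            {<<5:32, 6:32, 3:32, 8:32>>},
                            {<<9:32, 9:32, 9:32, 9:32>>}]),
    Found = lists:sort(keys_with_third_segment(Tab, 3)),
    ets:delete(Tab),
    Found.
```

Calling binkey_scan:demo() returns the two keys whose third segment is 3. The cost is linear in the table size, which is the trade-off against keeping a second index.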

 

Thoughts?

 

% Col1 == Number of elements in table
% Col2 == Memory size of table
% Col3 == Average ets:member/sec for 10M lookups

1> test:do_ets_test(). % Using 4 segment 128bit Binary for Key lookup
4,    362,   7369196
19,   557,   6476683
79,   1337,  6476683
319,  4457,  6345177
1279, 16937, 5882352

3> test:do_ets_test(). % Using 4 element tuple of integers for Key lookup
4,    358,   6345177
19,   538,   5773672
79,   1258,  5777007
319,  4138,  5724098
1279, 15658, 5479452
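
The test module behind these numbers is not shown in the thread. As a rough sketch of how a comparable measurement could be set up, assuming random 32-bit segments, one-element {Key} objects, and ets:member/2 timed with timer:tc/1, something like the following would produce the same three columns. The module name, function names, and the lookup count are hypothetical and are not the author's code.

```erlang
-module(ets_key_bench).
-export([run/2]).

%% run(binary | tuple, NumKeys) -> {Size, MemoryWords, LookupsPerSec}
%% Hypothetical benchmark: fills a set table with random keys of the
%% given form, then times repeated ets:member/2 calls on those keys.
run(KeyType, NumKeys) ->
    Tab = ets:new(bench, [set]),
    Keys = [make_key(KeyType) || _ <- lists:seq(1, NumKeys)],
    true = ets:insert(Tab, [{K} || K <- Keys]),
    Lookups = 1000000,                 % assumed count, smaller than 10M
    Probe = list_to_tuple(Keys),       % constant-time access to each key
    {Micros, ok} =
        timer:tc(fun() -> loop(Tab, Probe, NumKeys, Lookups) end),
    Size = ets:info(Tab, size),
    Mem = ets:info(Tab, memory),       % reported in words
    ets:delete(Tab),
    {Size, Mem, Lookups * 1000000 div max(Micros, 1)}.

%% Look up existing keys round-robin so every lookup is a hit.
loop(_Tab, _Probe, _N, 0) -> ok;
loop(Tab, Probe, N, I) ->
    true = ets:member(Tab, element((I rem N) + 1, Probe)),
    loop(Tab, Probe, N, I - 1).

%% The same four random segments, packed as a binary or as a tuple.
make_key(binary) ->
    <<(seg()):32, (seg()):32, (seg()):32, (seg()):32>>;
make_key(tuple) ->
    {seg(), seg(), seg(), seg()}.

%% Random integer in 0..16#FFFFFFFF.
seg() -> rand:uniform(16#100000000) - 1.
```

For example, ets_key_bench:run(binary, 1279) and ets_key_bench:run(tuple, 1279) give one row each. Absolute rates depend on the machine and the OTP release, so only the relative difference between the two key forms is meaningful.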

 


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: ETS: binary matching in match specifications

Jeff Schultz
On Sun, Aug 07, 2011 at 08:08:43PM -0400, Aaron Bawcom wrote:
> % Col1 == Number of elements in table

> % Col2 == Memory size of table

> % Col3 == Average ets:member/sec for 10M lookups

> 1> test:do_ets_test(). % Using 4 segment 128bit Binary for Key lookup

> 1279, 16937, 5882352

> 3> test:do_ets_test(). % Using 4 element tuple of integers for Key lookup

> 1279, 15658, 5479452

There's something missing in these memory figures.  128 bits is 16 bytes.
16937/1279 = 13.24, 15658/1279 = 12.24.  Doesn't look like these
figures include the keys.  (Hmm, 16937-15658 = 1279.  Looks like the
reported memory use is exactly 1 byte / element more for the binary
key version.)


In any case, the performance difference you show between the two
versions isn't large.  Does it matter that much?


    Jeff Schultz

Re: ETS: binary matching in match specifications

Aaron Bawcom
1) For a lot of the binary keys the first 32 bits are zero, which I believe may be optimized out from a memory perspective. I've seen this in other tests. But again, any memory savings there are visible in the numbers for both tests.
2) The memory sizes are what ets:info(tid()) returns.
3) Yes, an approximate 10% drop in speed matters.
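
On point 2, the ets documentation describes the memory item of ets:info as the number of words allocated to the table, not bytes, which may account for the per-element figures looking smaller than a 16-byte key. A minimal sketch of converting that figure to bytes follows; the module and function names are hypothetical.

```erlang
-module(ets_mem).
-export([memory_bytes/1, demo/0]).

%% ets:info(Tab, memory) is a count of machine words, so multiply by
%% the emulator's word size (in bytes) to get a byte figure.
memory_bytes(Tab) ->
    ets:info(Tab, memory) * erlang:system_info(wordsize).

%% Illustration with a single made-up 128-bit key.
demo() ->
    Tab = ets:new(mem_demo, [set]),
    true = ets:insert(Tab, {<<1:32, 2:32, 3:32, 4:32>>}),
    Words = ets:info(Tab, memory),
    Bytes = memory_bytes(Tab),
    ets:delete(Tab),
    {Words, Bytes}.
```

The word size is typically 8 on a 64-bit emulator, so the byte figure is a multiple of the reported number rather than equal to it.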

Do you know the answer to the question? Why aren't there any binary pattern match conditions available in match specs? Is the answer simply "no one has needed them"?

-----Original Message-----
From: Jeff Schultz [mailto:[hidden email]]
Sent: Sunday, August 07, 2011 8:38 PM
To: Aaron Bawcom
Cc: [hidden email]
Subject: Re: [erlang-questions] ETS: binary matching in match specifications

On Sun, Aug 07, 2011 at 08:08:43PM -0400, Aaron Bawcom wrote:
> % Col1 == Number of elements in table

> % Col2 == Memory size of table

> % Col3 == Average ets:member/sec for 10M lookups

> 1> test:do_ets_test(). % Using 4 segment 128bit Binary for Key lookup

> 1279, 16937, 5882352

> 3> test:do_ets_test(). % Using 4 element tuple of integers for Key lookup

> 1279, 15658, 5479452

There's something missing in these memory figures.  128b is 16 byte.
16937/1279 = 13.24, 15658/1279 = 12.24.  Doesn't look like these figures include the keys.  (Hmm, 16937-15658 = 1279.  Looks like the reported memory use is exactly 1 byte / element more for the binary key version.)


In any case, the performance difference you show between the two versions isn't large.  Does it matter that much?


    Jeff Schultz