|
Hi guys,
I wanted your take on how to implement a simple small key-value cache, maybe holding around a few hundred tuples. The thing is, I don't want the cache to be consuming all my memory so that entries in the cache expires if it isn't read for some time or some maximum size is met. Jan Vincent Liwanag [hidden email] ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
Hello,
# Jan Vincent 2009-11-10: > I wanted your take on how to implement a simple small key-value > cache, maybe holding around a few hundred tuples. The thing is, I > don't want the cache to be consuming all my memory so that entries > in the cache expires if it isn't read for some time or some maximum > size is met. An ETS table owned by a gen_server that runs periodic cleanup on it? -- Jachym ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
On Tue, Nov 10, 2009 at 11:25 AM, Jachym Holecek <[hidden email]> wrote:
> Hello, > > # Jan Vincent 2009-11-10: >> I wanted your take on how to implement a simple small key-value >> cache, maybe holding around a few hundred tuples. The thing is, I >> don't want the cache to be consuming all my memory so that entries >> in the cache expires if it isn't read for some time or some maximum >> size is met. > > An ETS table owned by a gen_server that runs periodic cleanup on it? > I find myself writing purpose built gen_servers that maintain the cache their state for this sort of thing. To 'expire' items in the cache, you could run another process as a timer that calls an expire method on the gen_server. There are more moving parts here, but they're decoupled and avoid using ETS for what's a pretty simple caching requirement. Garrett ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
Why wouldn't an ETS be good for this?
On Nov 11, 2009, at 1:45 AM, Garrett Smith wrote: > On Tue, Nov 10, 2009 at 11:25 AM, Jachym Holecek <[hidden email]> wrote: >> Hello, >> >> # Jan Vincent 2009-11-10: >>> I wanted your take on how to implement a simple small key-value >>> cache, maybe holding around a few hundred tuples. The thing is, I >>> don't want the cache to be consuming all my memory so that entries >>> in the cache expires if it isn't read for some time or some maximum >>> size is met. >> >> An ETS table owned by a gen_server that runs periodic cleanup on it? >> > > I find myself writing purpose built gen_servers that maintain the > cache their state for this sort of thing. To 'expire' items in the > cache, you could run another process as a timer that calls an expire > method on the gen_server. There are more moving parts here, but > they're decoupled and avoid using ETS for what's a pretty simple > caching requirement. > > Garrett Jan Vincent Liwanag [hidden email] ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
> On Nov 11, 2009, at 1:45 AM, Garrett Smith wrote:
> >> On Tue, Nov 10, 2009 at 11:25 AM, Jachym Holecek <[hidden email]> wrote: >>> Hello, >>> >>> # Jan Vincent 2009-11-10: >>>> I wanted your take on how to implement a simple small key-value >>>> cache, maybe holding around a few hundred tuples. The thing is, I >>>> don't want the cache to be consuming all my memory so that entries >>>> in the cache expires if it isn't read for some time or some maximum >>>> size is met. >>> >>> An ETS table owned by a gen_server that runs periodic cleanup on it? >>> >> >> I find myself writing purpose built gen_servers that maintain the >> cache their state for this sort of thing. To 'expire' items in the >> cache, you could run another process as a timer that calls an expire >> method on the gen_server. There are more moving parts here, but >> they're decoupled and avoid using ETS for what's a pretty simple >> caching requirement. On Wed, Nov 11, 2009 at 9:41 AM, Jan Vincent <[hidden email]> wrote: > Why wouldn't an ETS be good for this? I'm sure it'd be fine for it :) I personally like to avoid using the "global" structures for data management whenever I can get by with a process. I like being able to look at a function header and know that I'm seeing everything that could effect the result. That said, I'm just not experienced with the various patterns in Erlang to have strong opinions either way. Just my inclination. Garrett ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
On Wed, Nov 11, 2009 at 10:23:01AM -0600, Garrett Smith wrote:
> > On Nov 11, 2009, at 1:45 AM, Garrett Smith wrote: > > > >> On Tue, Nov 10, 2009 at 11:25 AM, Jachym Holecek <[hidden email]> wrote: > >>> Hello, > >>> > >>> # Jan Vincent 2009-11-10: > >>>> I wanted your take on how to implement a simple small key-value > >>>> cache, maybe holding around a few hundred tuples. The thing is, I > >>>> don't want the cache to be consuming all my memory so that entries > >>>> in the cache expires if it isn't read for some time or some maximum > >>>> size is met. > >>> > >>> An ETS table owned by a gen_server that runs periodic cleanup on it? > >>> > >> > >> I find myself writing purpose built gen_servers that maintain the > >> cache their state for this sort of thing. To 'expire' items in the > >> cache, you could run another process as a timer that calls an expire > >> method on the gen_server. There are more moving parts here, but > >> they're decoupled and avoid using ETS for what's a pretty simple > >> caching requirement. > > On Wed, Nov 11, 2009 at 9:41 AM, Jan Vincent <[hidden email]> wrote: > > Why wouldn't an ETS be good for this? > > I'm sure it'd be fine for it :) > > I personally like to avoid using the "global" structures for data > management whenever I can get by with a process. I like being able to > look at a function header and know that I'm seeing everything that > could effect the result. > > That said, I'm just not experienced with the various patterns in > Erlang to have strong opinions either way. Just my inclination. > > Garrett > ets can be private to a particular process ; -module(session). -export( [start/2, loop/2] ). start( Name, Timeout ) -> spawn( fun() -> init( Name, Timeout ) end) . init( Name, Timeout ) -> register( Name, self() ) , Tbl = Name , ets:new( Tbl, [named_table, private] ) , ets:insert( Tbl, {start, httpd_util:rfc1123_date()} ) , ?MODULE:loop( Name, Timeout ) .%init/1 loop( Name, Timeout ) -> Tbl = Name , [{start, Start}] = ets:lookup( Tbl, start) , receive {From, _Msg} -> io:fwrite("hello ~p, I started at: ~p~n", [From, Start]) , loop( Name, Timeout ) after Timeout -> io:fwrite("tired now, started at: ~p~n", [Start]) end .%loop/2 1> c(session). 2> session:start(fu, 10000). <0.42.0> 3> fu ! {self(), hello}. hello <0.35.0>, I started at: "Wed, 11 Nov 2009 17:11:11 GMT" {<0.35.0>,hello} 4> ets:lookup(fu, start). ** exception error: bad argument in function ets:lookup/2 called as ets:lookup(fu,start) 5> tired now, started at: "Wed, 11 Nov 2009 17:11:11 GMT" $ erl -man ets search for private -- Michael McDaniel Portland, Oregon, USA http://trip.autosys.us ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
In reply to this post by Jan Vincent Liwanag
See "Why not use ETS to do this?" part of Cherly:
http://github.com/cliffmoon/cherly On Thu, Nov 12, 2009 at 12:41 AM, Jan Vincent <[hidden email]> wrote: > Why wouldn't an ETS be good for this? ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
> > Why wouldn't an ETS be good for this?
I think ETS is better for 'get' operations than going through a managing process: A process-approach suffers from sequential access. This might not be so important since lookups can be very fast, but worth keeping in mind. A public/protected ETS table with one owning manager-process doesn't suffer from this problem. If you want to do LRU-eviction strategy or something like that, you could let your 'set' operations still go through the gen_server (and evict if required) and 'get' operations go directly to the ETS table. Or use Cherly as noted above :) ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
Thijs <[hidden email]> wrote:
> A process-approach suffers from sequential access. This might not be > so important since lookups can be very fast, but worth keeping in > mind. A public/protected ETS table with one owning manager-process > doesn't suffer from this problem. ... except that public/protected ETS table access isn't free, either. The VM has to play games with controlling multi-scheduler access to ETS's underlying hash table/binary tree. I haven't measured with an R13B release, but with R12B-5 the cost was definitely non-zero and significant enough for us to change implementation in one case. -Scott ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
On Nov 13, 8:36 am, Scott Lystig Fritchie <[hidden email]>
wrote: > ... except that public/protected ETS table access isn't free, either. > The VM has to play games with controlling multi-scheduler access to > ETS's underlying hash table/binary tree. I haven't measured with an > R13B release, but with R12B-5 the cost was definitely non-zero and > significant enough for us to change implementation in one case. You are right, but in my experience a public ETS table significantly outperforms process-mediated access. I'm curious how did you optimize this further? I found that accessing the ETS table is quite cheap (not zero indeed), but the cost mainly depends on the size of the data you are accessing. I have small values for each key and then access is really very fast, without the complexities of the process (and queueing messages in it's mailbox). ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
On Nov 12, 2009, at 5:17 PM, Thijs wrote: > On Nov 13, 8:36 am, Scott Lystig Fritchie <[hidden email]> > wrote: >> ... except that public/protected ETS table access isn't free, either. >> The VM has to play games with controlling multi-scheduler access to >> ETS's underlying hash table/binary tree. I haven't measured with an > > You are right, but in my experience a public ETS table significantly > outperforms process-mediated access. I'm curious how did you optimize > this further? Let's solve this with numbers! Test || Multiplier || Percent Slowdown || Reads Per Second = = = = = = = ======================================================================== ets plain 1.00x 0% 553,384.3 ets public 1.01x 1% 550,329.3 ets public named table 1.10x 10% 502,504.7 mnesia ram copies 1.87x 87% 296,011.2 process based cache 9.86x 886% 56,141.6 BDB driver 56.37x 5,537% 9,816.3 While I have my benchmarking hat on, let's check function calls too: Call Method || Multiplier || Percent Slowdown || Calls Per Second = = = = = = = ======================================================================== apply(?MODULE, foo, []) 1.00x 0% 96,089,747.8 ?MODULE:foo() 1.01x 1% 94,809,477.4 foo() 1.09x 9% 88,485,052.9 fun() -> ok end() 3.15x 215% 30,489,431.6 apply(foo, []) 3.50x 250% 27,481,044.5 Module:Fun() 5.22x 422% 18,392,358.4 The slowest method of function invocation still yields 18+ million calls per second. Notes: • Tests were run on my ancient (almost four year old) iMac (Core Duo 2GHz): Erlang R13B02 (erts-5.7.3) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false] • No scheduler binding to CPUs. • For the function call benchmark: foo() -> ok. Module = ?MODULE, Fun = foo. • The "process based cache" is my fleshed out production version of the duomark process cache. Disclaimer: The results vary in absolute numbers for multiple runs of the benchmark, but results retain their relative positions in the rankings each time. I'll get around to making a statistically valid erlang benchmark suite one day. -Matt -- Matt Stancliff San Jose, CA AIM: seijimr iPhone: 678-591-9337 "The best way to predict the future is to invent it." --Alan Kay ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
On Nov 13, 1:10 pm, Matt Stancliff <[hidden email]> wrote:
> Let's solve this with numbers! Is this a sequential or parallel benchmark? I suspect it is sequential access? ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
In reply to this post by Matt Stancliff
Matt Stancliff wrote:
> While I have my benchmarking hat on, let's check function calls too: > Call Method || Multiplier || Percent Slowdown || Calls > Per Second > = > = > = > = > = > = > = > ======================================================================== > apply(?MODULE, foo, []) 1.00x 0% > 96,089,747.8 > ?MODULE:foo() 1.01x 1% > 94,809,477.4 > foo() 1.09x 9% > 88,485,052.9 > fun() -> ok end() 3.15x 215% > 30,489,431.6 > apply(foo, []) 3.50x 250% > 27,481,044.5 > Module:Fun() 5.22x 422% > 18,392,358.4 Is this a surprise to anyone else? I myself would have expected foo() to be the fastest, since it knows exactly which function to call, whereas the two that rank above it have to look for the most recent version of foo/0. Anyone have an explanation for this? ________________________________________________________________ erlang-questions mailing list. See http://www.erlang.org/faq.html erlang-questions (at) erlang.org |
|
This post has NOT been accepted by the mailing list yet.
In reply to this post by Jan Vincent Liwanag
I decided to implement a very simple cache mechanism using ETS:
https://github.com/ricardobcl/ETScache For now, it only controls the number of elements in cache, because it's faster and it did what I wanted, but can be easily modified to prune it terms of bytes.
|
| Powered by Nabble | Edit this page |
