Inserting without overwriting with mnesia or ets

Ulf Wiger (AL/EAB)

The solution below is perhaps not fast, but it's generic.
Iff you know that the table exists as a ram_copy locally
(+ you know that the object cannot have been created earlier
in the same transaction, in which case it would exist
temporarily in another ets table - and that the object could
not be in the process of being created in another
simultaneous transaction, etc.),
you can "cheat" and use ets:member/2, but you need to make
sure you understand the implications of this, and determine
whether it is safe to do so in _your_ case. Then you should
highlight the code with comments reminding yourself that you've
done something dirty.

Given that some of the above conditions hold, you could check
with mnesia:dirty_read/1, which works on all types of tables
and even if the table is remote, but still has the same
problems re. transaction safety.
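
A minimal sketch of the dirty check described above. The module name, the table name `kv`, and its record layout are assumptions made for illustration, not something taken from the original post. The read and the write are two separate operations, so this is only safe when no other process can create the same key at the same time.

```erlang
-module(dirty_insert).
-export([setup/0, insert_new/2]).

%% Hypothetical table for illustration: records are {kv, Key, Value}.
%% With no storage options, mnesia creates a local ram_copies table.
setup() ->
    ok = mnesia:start(),
    case mnesia:create_table(kv, [{attributes, [key, value]}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, kv}} -> ok
    end.

%% DIRTY: no locks and no transaction safety. Another process could
%% write the same key between the read and the write below.
insert_new(Key, Value) ->
    case mnesia:dirty_read({kv, Key}) of
        [] -> mnesia:dirty_write({kv, Key, Value});
        [Existing] -> {error, {exists, Existing}}
    end.
```

Calling `dirty_insert:insert_new(a, 1)` twice should give `ok` the first time and an `{error, {exists, ...}}` tuple the second time.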

/Uffe

> -----Original Message-----
> From: owner-erlang-questions
> [mailto:owner-erlang-questions]On Behalf Of Einar Karttunen
> Sent: den 7 maj 2004 14:29
> To: erlang-questions
> Subject: Inserting without overwriting with mnesia or ets
>
>
> Hello
>
> I need a way to insert into ets/mnesia and fail if the
> key already exists. These failures should be very rare
> and exceptional, but not checking for them could corrupt
> data... Is there an efficient solution to this?
>
> The naive approach is not very good:
>
> insert(K,V) ->
>     fun() ->
>         case mnesia:wread(K) of
>             [Exist] -> {abort, Exist};
>             _ -> mnesia:write(V)
>         end
>     end.
>
> - Einar Karttunen
>


Hal Snyder-2
"Ulf Wiger (AL/EAB)" <ulf.wiger> writes:

> The solution below is perhaps not fast, but it's generic.

Just to make sure I'm on the same page with you guys, let me rephrase
things a little. Feel free to correct if I missed something.

I guess the thread is about distributed atomic test-and-set. The
"generic approach" below is needlessly slow if

a. updates always happen on one node
b. replicas are kept on other nodes for reliability only

in which case you don't want the overhead of locking the replicas when
doing test-and-set.

A faster approach would be to take a process somewhere, give it a
table (mnesia or ets or hash) updated by no other process and put it
in a receive loop to provide ACID test-and-set to the rest of the
system. Or does some class of mnesia calls already provide the
aforesaid serialization?

Next comes a question about mnesia and dirty ops, which I think is
mostly answered by Ulf's previous reply: under what conditions is a
sequence of dirty reads and writes from a single process to a mnesia
table consistent, i.e. each read or write sees the table as if all
previous updates have completed in order, if no other process writes
to the table?

The answer seems to be:

a. table exists as ram_copy locally (+ .. ) as in next paragraph

b. remote tables too? Doesn't that get into questions of message ordering
   as discussed recently on this list?


> Iff you know that the table exists as a ram_copy locally (+ you know
> that the object cannot have been created earlier in the same
> transaction, in which case it would exist temporarily in another ets
> table - and that the object could not be in the process of being
> created in another simultaneous transaction, etc.), you can "cheat"
> and use ets:member/2, but you need to make sure you understand the
> implications of this, and determine whether it is safe to do so in
> _your_ case. Then you should highlight the code with comments
> reminding yourself that you've done something dirty.
>
> Given that some of the above conditions hold, you could check with
> mnesia:dirty_read/1, which works on all types of tables and even if
> the table is remote, but still has the same problems re. transaction
> safety.

...

[generic solution]

>> [mailto:owner-erlang-questions]On Behalf Of Einar Karttunen
>> Sent: den 7 maj 2004 14:29
>> To: erlang-questions
>> Subject: Inserting without overwriting with mnesia or ets

>> The naive approach is not very good:
>>
>> insert(K,V) ->
>>     fun() ->
>>         case mnesia:wread(K) of
>>             [Exist] -> {abort, Exist};
>>             _ -> mnesia:write(V)
>>         end
>>     end.


Ulf Wiger (AL/EAB)
On Sat, 08 May 2004 18:29:49 -0500, Hal Snyder <hal> wrote:


> Just to make sure I'm on the same page with you guys, let me rephrase
> things a little. Feel free to correct if I missed something.
>
> I guess the thread is about distributed atomic test-and-set. The
> "generic approach" below is needlessly slow if
>
> a. updates always happen on one node
> b. replicas are kept on other nodes for reliability only

Actually, the "generic approach" is needlessly slow unless one
has to account for the possibility of multiple processes
simultaneously performing the test-and-set operation, and there
is no practical way to serialize the operations using a server
(see below.)
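
For reference, the transactional "generic approach" can be sketched roughly as follows. This is one reading of the code quoted earlier in the thread, not the poster's exact code: the module name and the `{Tab, Key, Value}` record layout are assumptions, and the existing-key case calls `mnesia:abort/1` so that the transaction actually aborts.

```erlang
-module(tx_insert).
-export([insert_new/3]).

%% Assumes Tab is an existing mnesia table whose records have the
%% shape {Tab, Key, Value} (hypothetical layout for illustration).
insert_new(Tab, Key, Value) ->
    F = fun() ->
            %% wread takes a write lock on the key, so a concurrent
            %% transaction cannot create the same key in between.
            case mnesia:wread({Tab, Key}) of
                [] -> mnesia:write({Tab, Key, Value});
                [Existing] -> mnesia:abort({exists, Existing})
            end
        end,
    %% Returns {atomic, ok} or {aborted, {exists, Existing}}.
    mnesia:transaction(F).
```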

A very simple and useful pattern in Erlang is to create a
test-and-set server, which serializes all accesses to the
resource in question. The server can then opt to use dirty operations
or ets, depending on storage criteria. In some cases, one may
still want to use transactions for their rollback semantics
(another aspect of atomicity), but in this particular case, that
doesn't add much benefit. This solution is very easy to understand,
and it is also quite fast.
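
A minimal sketch of such a test-and-set server, assuming a `gen_server` that owns a private ets table. The module name `tas_server`, the table name, and the reply values are hypothetical choices for illustration.

```erlang
-module(tas_server).
-behaviour(gen_server).

-export([start_link/0, insert_new/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% All callers go through this one process, so the member check and
%% the insert below are never interleaved with another insert.
insert_new(Key, Value) ->
    gen_server:call(?MODULE, {insert_new, Key, Value}).

init([]) ->
    %% Private table: only the server process can read or write it.
    {ok, ets:new(tas_table, [set, private])}.

handle_call({insert_new, Key, Value}, _From, Tab) ->
    Reply = case ets:member(Tab, Key) of
                true ->
                    {error, exists};
                false ->
                    true = ets:insert(Tab, {Key, Value}),
                    ok
            end,
    {reply, Reply, Tab}.

handle_cast(_Msg, Tab) ->
    {noreply, Tab}.
```

After `tas_server:start_link()`, a first `tas_server:insert_new(a, 1)` should return `ok` and a second insert with the same key should return `{error, exists}`. Current OTP releases also provide `ets:insert_new/2`, which refuses to overwrite an existing key in a single ets table; the thread does not say whether it was available to the posters.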


> Next comes a question about mnesia and dirty ops, which I think is
> mostly answered by Ulf's previous reply: under what conditions is a
> sequence of dirty reads and writes from a single process to a mnesia
> table consistent, i.e. each read or write sees the table as if all
> previous updates have completed in order, if no other process writes
> to the table?

The sequence is consistent if no other process writes to the table,
and provided that the sequence runs to completion. (:

> The answer seems to be:
>
> a. table exists as ram_copy locally (+ .. ) as in next paragraph

No, whether it is ram, disc, or disc_only doesn't matter for dirty ops.
They will also honour indices. Storage and location independence, as
well as replication support are basically what you gain from using
dirty ops instead of ets.


> b. remote tables too? Doesn't that get into questions of message ordering
>    as discussed recently on this list?

Dirty operations support remote tables; ets operations don't.
Message ordering is not really a problem, as long as the criterion
that only one process is updating the resource still holds.

/Uffe
--
Ulf Wiger