On Thu, 24 Apr 2003, Hakan Mattsson wrote:

>On Thu, 24 Apr 2003, Ulf Wiger wrote:

>

>Uffe> Given the current implementation of mnesia:match_object(), I

>Uffe> don't think preserving the order would have to impact

>Uffe> performance, esp. not on large sets (and on small sets, I

>Uffe> don't think it matters.)

>

>For fragmented tables it would have a tremendous impact, as

>the order only is preserved within each fragment. In order

>to return an ordered result, it would require a quite

>substantial sort operation.

True, unless one writes a custom mnesia_frag_hash callback

which manages fragments with some global order. This is

what I had in mind to do with XML Query for Mnesia.

It does bring up the question how far one should try to

preserve the semantics. It would of course be nice if one

could guarantee the same semantics for fragmented tables as

for basic tables, but certainly not at any price (allowing

only the lowest common denominator is also a price to pay,

of course.)

>Of course Mnesia could ensure that the result of a

>match_object on an ordered_set always should be ordered,

>but is it worth the performance penalty?

Given the current implementation, I'd say yes. ;-)

This is the function in mnesia.erl that adds objects from

the transaction store to the found set:

add_match([], Objs, Type) ->

Objs;

add_match([{Oid, _, delete}|R], Objs, Type) ->

add_match(R, deloid(Oid, Objs), Type);

add_match([{Oid, Val, delete_object}|R], Objs, Type) ->

add_match(R, lists:delete(Val, Objs), Type);

add_match([{Oid, Val, write}|R], Objs, Type) ->

case Type of

bag ->

add_match(R, [Val | lists:delete(Val, Objs)], Type);

_ ->

add_match(R, [Val | deloid(Oid,Objs)],Type)

end.

deloid(Oid, []) ->

[];

deloid({Tab, Key}, [H | T]) when element(2, H) == Key ->

deloid({Tab, Key}, T);

deloid(Oid, [H | T]) ->

[H | deloid(Oid, T)].

Given that Type == ordered_set, Objs would be an ordered

list to start with, and sort_insert(Key, Val, Objs) would

not be slower than [Val | deloid(Oid,Objs)]. In fact, the

following sort_insert(Key,Val,Objs) is actually more than

twice as fast as the current implementation in mnesia.erl

(on a given sample with 1000 Objs, 10 written objs in

the lower key range, and 10 in the upper key range.)

sort_insert(Key, Val, [H|T]) when element(2,H) == Key ->

[Val | T];

sort_insert(Key, Val, [H|_]=L) when element(2,H) > Key ->

[Val | L];

sort_insert(Key, Val, [H|T]) ->

[H | sort_insert(Key, Val, T)];

sort_insert(_, _, []) ->

[].

The reason for this is mostly that deloid/2 is too generic.

It traverses the whole list each time, even after it's found

a matching key. Assuming set semantics, this is not

necessary. I fail to see why lists:keydelete/3 wouldn't do

the job.

The modified mnesia:add_match/3 would then become:

add_match([{{Tab,Key}, Val, write}|R], Objs, Type) ->

case Type of

bag ->

add_match(R, [Val | lists:delete(Val, Objs)], Type);

ordered_set ->

add_match(R, sort_insert(Key,Val,Objs), Type);

_ ->

add_match(R, [Val |

lists:keydelete(Key,2,Objs)],Type)

end.

Even with this fix, my sort_insert/3 is some 20% faster. (:

At the moment, I fail to see why on this particular test.

On a non-contiguous range of objects, where R contains new

records as well as updated ones, my sort_insert version is

much faster, as deloid/keydelete must traverse the entire

set for each new object.

/Uffe

--

Ulf Wiger, Senior Specialist,

/ / / Architecture & Design of Carrier-Class Software

/ / / Strategic Product & System Management

/ / / Ericsson AB, Connectivity and Control Nodes