map elements in defined order

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
39 messages Options
12
Reply | Threaded
Open this post in threaded view
|

map elements in defined order

Ulf Wiger-2
Looking at the maps API, I see no function that can present the contents of a map in the defined order.

Because there IS a defined order, described in the Erlang Reference Manual:

«Maps are ordered by size, two maps with the same size are compared by keys in ascending term order and then by values in key order. In maps key order integers types are considered less than floats types.»

Note that the key ordering described above doesn't follow normal Erlang term ordering, so lists:sort(maps:to_list(Map)) will produce something will a well-defined order, but comparing the sorted lists of two maps is not guaranteed to have the same outcome as comparing the maps themselves.

I can solve this by using lists:sort/2, but this seems contrived.

To clarify, e.g. msgpack:pack(Map) will perform a maps:to_list(Map). This is not guaranteed to produce the same result across Erlang versions, since maps:to_list/1 says pairs "are returned in arbitrary order".

Yet there is no alternative API function that will return pairs in the defined sort order, although presumably there is an efficient internal implementation for producing it.

Wouldn't it be reasonable to have such a function?

BR,
Ulf W

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Roger Lipscombe-2
On 26 October 2017 at 16:30, Ulf Wiger <[hidden email]> wrote:
> Wouldn't it be reasonable to have such a function?

First thought: No, because people would start abusing it. Maps *aren't* ordered.
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Sverker Eriksson-4
In reply to this post by Ulf Wiger-2

On 10/26/2017 05:30 PM, Ulf Wiger wrote:

> Looking at the maps API, I see no function that can present the
> contents of a map in the defined order.
>
> Because there IS a defined order, described in the Erlang Reference
> Manual:
>
> «Maps are ordered by size, two maps with the same size are compared by
> keys in ascending term order and then by values in key order. In maps
> key order integers types are considered less than floats types.»
>
> Note that the key ordering described above doesn't follow normal
> Erlang term ordering, so lists:sort(maps:to_list(Map)) will produce
> something will a well-defined order, but comparing the sorted lists of
> two maps is not guaranteed to have the same outcome as comparing the
> maps themselves.
>
> I can solve this by using lists:sort/2, but this seems contrived.
>
> To clarify, e.g. msgpack:pack(Map) will perform a maps:to_list(Map).
> This is not guaranteed to produce the same result across Erlang
> versions, since maps:to_list/1 says pairs "are returned in arbitrary
> order".
>
> Yet there is no alternative API function that will return pairs in the
> defined sort order, although presumably there is an efficient internal
> implementation for producing it.
>

There is an efficient internal implementation to compare two terms in
map-key order (undocumented erts_internal:cmp_term/2).

But there is no efficient implementation to produce a map-key-value
ordered list. That would require N*log(N) sorting.

'==' for hashmaps (> 32 keys) does a one pass iteration in key hash
order over both trees (HAMTs) and keeps track of the minimal key or
value seen so far.



/Sverker


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Ulf Wiger-2
In reply to this post by Roger Lipscombe-2
But they *are* ordered. Otherwise, comparison of two maps would be undefined. 

BR, 
Ulf W

Den 26 okt. 2017 18:20 skrev "Roger Lipscombe" <[hidden email]>:
On 26 October 2017 at 16:30, Ulf Wiger <[hidden email]> wrote:
> Wouldn't it be reasonable to have such a function?

First thought: No, because people would start abusing it. Maps *aren't* ordered.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Karl Nilsson-2
Doesn’t it just mean that for any two maps with the same keyvalues the internal order is the same for a given erlang version? That should be sufficient for comparison but unlikely be useful for anything else.

On Thu, 26 Oct 2017 at 19:07, Ulf Wiger <[hidden email]> wrote:
But they *are* ordered. Otherwise, comparison of two maps would be undefined. 

BR, 
Ulf W


Den 26 okt. 2017 18:20 skrev "Roger Lipscombe" <[hidden email]>:
On 26 October 2017 at 16:30, Ulf Wiger <[hidden email]> wrote:
> Wouldn't it be reasonable to have such a function?

First thought: No, because people would start abusing it. Maps *aren't* ordered.

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Jesper Louis Andersen-2
In reply to this post by Ulf Wiger-2
It is generally a bad idea to rely on order in anything which uses a hash function. The hash function is often subject to change---rather quickly I might add if it proves to be a security bug. Picking a family of hashes and seeding it randomly is usually a good trick.

Our "sister language" Go *randomizes* iteration order on its maps. This is to force programmers into not relying on the map order at all, even if it happens to be ordered right now. This opens up implementations in the future.

If you wanted order in a map, it would be *far* better if you could create a map based on RB-trees or the like. Those are naturally ordered by structure. OCaml, for instance, defines Hash Tables as well as Maps. The latter is the ordered variant.


On Thu, Oct 26, 2017 at 8:07 PM Ulf Wiger <[hidden email]> wrote:
But they *are* ordered. Otherwise, comparison of two maps would be undefined. 

BR, 
Ulf W


Den 26 okt. 2017 18:20 skrev "Roger Lipscombe" <[hidden email]>:
On 26 October 2017 at 16:30, Ulf Wiger <[hidden email]> wrote:
> Wouldn't it be reasonable to have such a function?

First thought: No, because people would start abusing it. Maps *aren't* ordered.

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Ulf Wiger-2
But again, Jesper, just about everyone relies on the fact that maps follow the general principle that there is a well-defined term comparison order. Otherwise, maps would be highly unsuitable to use in keys, and generally treacherous to use as a replacement for records. Following the Principle of Least Surprise, it's a darned good thing that Erlang doesn't randomize the key order in its maps.

I doubt that anyone would abuse an extra function that produces the map element pairs in the internally defined sort order, given that the documentation would clearly state that it's more expensive than maps:to_list/1 (though likely faster than lists:sort(maps:to_list(M), fun custom_sort_fun/2) - not to mention less error-prone.)

But it's not a feature I'm willing to go to war over. If no one else sees a use for it, I'm willing to concede that it has low priority. :)

BR,
Ulf W

2017-10-26 20:31 GMT+02:00 Jesper Louis Andersen <[hidden email]>:
It is generally a bad idea to rely on order in anything which uses a hash function. The hash function is often subject to change---rather quickly I might add if it proves to be a security bug. Picking a family of hashes and seeding it randomly is usually a good trick.

Our "sister language" Go *randomizes* iteration order on its maps. This is to force programmers into not relying on the map order at all, even if it happens to be ordered right now. This opens up implementations in the future.

If you wanted order in a map, it would be *far* better if you could create a map based on RB-trees or the like. Those are naturally ordered by structure. OCaml, for instance, defines Hash Tables as well as Maps. The latter is the ordered variant.


On Thu, Oct 26, 2017 at 8:07 PM Ulf Wiger <[hidden email]> wrote:
But they *are* ordered. Otherwise, comparison of two maps would be undefined. 

BR, 
Ulf W


Den 26 okt. 2017 18:20 skrev "Roger Lipscombe" <[hidden email]>:
On 26 October 2017 at 16:30, Ulf Wiger <[hidden email]> wrote:
> Wouldn't it be reasonable to have such a function?

First thought: No, because people would start abusing it. Maps *aren't* ordered.

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

José Valim-3
In reply to this post by Jesper Louis Andersen-2
 
Our "sister language" Go *randomizes* iteration order on its maps. This is to force programmers into not relying on the map order at all, even if it happens to be ordered right now. This opens up implementations in the future.

Randomizing how elements are stored/hashed is also useful to avoid hash collision attacks.


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Michael Truog
In reply to this post by Ulf Wiger-2
On 10/26/2017 12:13 PM, Ulf Wiger wrote:
> But again, Jesper, just about everyone relies on the fact that maps follow the general principle that there is a well-defined term comparison order. Otherwise, maps would be highly unsuitable to use in keys, and generally treacherous to use as a replacement for records. Following the Principle of Least Surprise, it's a darned good thing that Erlang doesn't randomize the key order in its maps.
>
> I doubt that anyone would abuse an extra function that produces the map element pairs in the internally defined sort order, given that the documentation would clearly state that it's more expensive than maps:to_list/1 (though likely faster than lists:sort(maps:to_list(M), fun custom_sort_fun/2) - not to mention less error-prone.)
>
> But it's not a feature I'm willing to go to war over. If no one else sees a use for it, I'm willing to concede that it has low priority. :)
>

At a low-level, a hash array mapped trie is unordered because it relies on a hash function, so using the Erlang map in an ordered way might be possible but would be inefficient.  If you need an ordered mapping, I think my Erlang trie is a better option at https://github.com/okeuday/trie , though it requires Erlang string keys (binary keys are supported by the btrie module, but that is slow).  That doesn't necessarily help msgpack source code though, unless it would be a separate output option that was added.

Best Regards,
Michael
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Ulf Wiger-2
There are some things in Erlang that people have come to rely on, but for which there's a price. Having a globally defined sort order is such a thing. And maps must be implemented in such a way that the following is not only supported, but efficient:

comp(#{} = A, #{} = B) when A > B -> greater;
...

That is, maps must behave, efficiently, as an ordered mapping on request.

I've had reason to ponder the Erlang term comparison semantics while maintaining the sext library. The ordering semantics of maps threw a wrench into the works, with its non-standard ordering of keys (I should have left some room between the type tags for such unexpected developments.) The jury (i.e. QuickCheck) is still out on whether I'll be able to support it with the existing tag scheme.

Now, for sext, maps are at least better than refs, pids and ports, since there is no documentation on how they are sorted internally. As it turns out (for now), using term_to_binary() and some bit syntax seems to work in practice for those.

For the external term format for maps, actually, the documentation says:

«Key and value pairs (Ki => Vi) are encoded in section Pairs in the following order: K1, V1, K2, V2,..., Kn, Vn. »

... which does seem to imply that the ordering in the external term format is defined. I assume this is not the intended meaning, but rather that whichever key K1 is, the corresponding value V1 is the item that follows it, and so on.

Anyway, I've come across code that uses maps as a serialization format for a cryptographic hash. Based on what is known about maps, this would seem like a Bad Idea, since in that case, presumably, the result of the hash operation might change between implementations.

BR,
Ulf W

Den 26 okt. 2017 21:43 skrev "Michael Truog" <[hidden email]>:
On 10/26/2017 12:13 PM, Ulf Wiger wrote:
But again, Jesper, just about everyone relies on the fact that maps follow the general principle that there is a well-defined term comparison order. Otherwise, maps would be highly unsuitable to use in keys, and generally treacherous to use as a replacement for records. Following the Principle of Least Surprise, it's a darned good thing that Erlang doesn't randomize the key order in its maps.

I doubt that anyone would abuse an extra function that produces the map element pairs in the internally defined sort order, given that the documentation would clearly state that it's more expensive than maps:to_list/1 (though likely faster than lists:sort(maps:to_list(M), fun custom_sort_fun/2) - not to mention less error-prone.)

But it's not a feature I'm willing to go to war over. If no one else sees a use for it, I'm willing to concede that it has low priority. :)


At a low-level, a hash array mapped trie is unordered because it relies on a hash function, so using the Erlang map in an ordered way might be possible but would be inefficient.  If you need an ordered mapping, I think my Erlang trie is a better option at https://github.com/okeuday/trie , though it requires Erlang string keys (binary keys are supported by the btrie module, but that is slow).  That doesn't necessarily help msgpack source code though, unless it would be a separate output option that was added.

Best Regards,
Michael


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Benoit Chesneau-2
In reply to this post by Jesper Louis Andersen-2


On 26 Oct 2017, at 20:31, Jesper Louis Andersen <[hidden email]> wrote:

It is generally a bad idea to rely on order in anything which uses a hash function. The hash function is often subject to change---rather quickly I might add if it proves to be a security bug. Picking a family of hashes and seeding it randomly is usually a good trick.

Our "sister language" Go *randomizes* iteration order on its maps. This is to force programmers into not relying on the map order at all, even if it happens to be ordered right now. This opens up implementations in the future.

If you wanted order in a map, it would be *far* better if you could create a map based on RB-trees or the like. Those are naturally ordered by structure. OCaml, for instance, defines Hash Tables as well as Maps. The latter is the ordered variant.


On Thu, Oct 26, 2017 at 8:07 PM Ulf Wiger <[hidden email]> wrote:
But they *are* ordered. Otherwise, comparison of two maps would be undefined. 

BR, 
Ulf W


Den 26 okt. 2017 18:20 skrev "Roger Lipscombe" <[hidden email]>:
On 26 October 2017 at 16:30, Ulf Wiger <[hidden email]> wrote:
> Wouldn't it be reasonable to have such a function?

First thought: No, because people would start abusing it. Maps *aren't* ordered.

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


I guess if at least `maps:to_list/1` would return the keys in order it would  be already a benefit. 

Another usage of such order is when you want to sign the object to compare  with others across the network. I have such usage when I’m using maps as a representation for JSON.

- benoît

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Richard A. O'Keefe-2
In reply to this post by Sverker Eriksson-4
The current thread has left me somewhat confused.

Suppose E1 and E2 are different computations,
delivering maps M1 and M2 respectively, such
that M1 =:= M2 is true.

Are we guaranteed that term_to_binary(M1) and
term_to_binary(M2) are equal binaries?

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

zxq9-2
In reply to this post by Ulf Wiger-2
On 2017年10月26日 木曜日 21:13:07 Ulf Wiger wrote:
> ... to use as a replacement for records.

But they are IN NO WAY a replacement for records.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Ulf Wiger-2
Actually, they ARE, in some ways.

I should have been clearer. I was, of course, referring to the uses of records where they were never optimal in the first place, but were tolerated e.g. because of the great convenience of referencing elements by name in pattern-matching.

Quoted from EEP-0043:

«The idea was a data-type, a syntax-aware mapping of key-value associations with pattern matching. A syntax similar to records but without the hassle of compile-time dependency and with arbitrary terms as keys. Order was not important and it could be implemented with a Hash-Array-Mapped-Trie with good performance and memory trade-offs. This was a different approach than to replace records. It was meant to replace records where suitable and in other regards not be a replacement but its own _thing_.»

Further, relevant to this discussion:

«A restriction set on the implementation by the Erlang specification is that order is total, i.e. satisfies _antisymmetry, transitivity and totality_.
...
- Ordered maps impose restrictions on the underlying implementation and a hashing approach will be nearly impossible.
- The underlying structure does not need to be sorted, an order could be produced when needed,»

The thing I tried to highlight was that "an order could be produced when needed" is actually a bit more involved than you'd think, given that no API function does it for you. At least if you want to reflect the internal sort order - given that the actual implementation is NOT as described in the EEP: "If a key or value differs, the order of the respective terms, in Erlang term order".

You'd have to write your own sort function, in which you visit all elements of complex structures, finding all instances of floats and ensuring that they get the right priority.

Or... you simply do this:

lists:sort(fun(A, B) -> #{A => 0} =< #{B => 0} end, maps:to_list(Map))


As evidenced e.g. by Benoit's comment above, I believe lots of people already rely on the order of map elements being IN SOME WAY stable. This is likely a brittle assumption, but one that may - in time - trap the OTP team into having to preserve the current APPARENT order.

And the apparent order does appear to be sorted, as long as only maps of size =< 32 ("flatmaps") are inspected. For larger maps ("hashmaps"), the order of elements returned by maps:to_list/1 indeed appears arbitrary.

BR,
Ulf W

2017-10-27 6:01 GMT+02:00 zxq9 <[hidden email]>:
On 2017年10月26日 木曜日 21:13:07 Ulf Wiger wrote:
> ... to use as a replacement for records.

But they are IN NO WAY a replacement for records.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Attila Rajmund Nohl
2017-10-27 9:25 GMT+02:00 Ulf Wiger <[hidden email]>:
[...]
> As evidenced e.g. by Benoit's comment above, I believe lots of people
> already rely on the order of map elements being IN SOME WAY stable. This is
> likely a brittle assumption, but one that may - in time - trap the OTP team
> into having to preserve the current APPARENT order.
>
> And the apparent order does appear to be sorted, as long as only maps of
> size =< 32 ("flatmaps") are inspected. For larger maps ("hashmaps"), the
> order of elements returned by maps:to_list/1 indeed appears arbitrary.

I already run into a bug where the code expected the map to be sorted.
When the 33rd element was added, it started to show pathological
behavior.
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Richard Carlsson-3
In reply to this post by Ulf Wiger-2
My interpretation of the reference manual is that maps compare to each other like the tuples created by this transformation:

t(Map) ->
    {Ks,Vs} = lists:unzip(lists:keysort(1, [{{if is_integer(K) -> 0; is_float(K) -> 1; true -> 2 end, K}, V} || {K,V} <- maps:to_list(M)])),
    list_to_tuple(Ks ++ Vs).

For example:

    t(#{65 => "A", 2.71 => e, 1.0e308 => alot, pi => 3.14, [] => 0})

yields

    {{0,65}, {1,2.71}, {1,1.0e308}, {2,pi}, {2,[]},
        "A",    e,        alot,        3.14,   0}

Which:
1) should be independent of the underlying implementation and choice of hash function, and thus stable across future versions of Erlang;
2) is cumbersome to express using the normal term order over the basic data types, since float keys always sort higher than integers (not really following the rule of least surprise).





        /Richard

2017-10-27 9:25 GMT+02:00 Ulf Wiger <[hidden email]>:
Actually, they ARE, in some ways.

I should have been clearer. I was, of course, referring to the uses of records where they were never optimal in the first place, but were tolerated e.g. because of the great convenience of referencing elements by name in pattern-matching.

Quoted from EEP-0043:

«The idea was a data-type, a syntax-aware mapping of key-value associations with pattern matching. A syntax similar to records but without the hassle of compile-time dependency and with arbitrary terms as keys. Order was not important and it could be implemented with a Hash-Array-Mapped-Trie with good performance and memory trade-offs. This was a different approach than to replace records. It was meant to replace records where suitable and in other regards not be a replacement but its own _thing_.»

Further, relevant to this discussion:

«A restriction set on the implementation by the Erlang specification is that order is total, i.e. satisfies _antisymmetry, transitivity and totality_.
...
- Ordered maps impose restrictions on the underlying implementation and a hashing approach will be nearly impossible.
- The underlying structure does not need to be sorted, an order could be produced when needed,»

The thing I tried to highlight was that "an order could be produced when needed" is actually a bit more involved than you'd think, given that no API function does it for you. At least if you want to reflect the internal sort order - given that the actual implementation is NOT as described in the EEP: "If a key or value differs, the order of the respective terms, in Erlang term order".

You'd have to write your own sort function, in which you visit all elements of complex structures, finding all instances of floats and ensuring that they get the right priority.

Or... you simply do this:

lists:sort(fun(A, B) -> #{A => 0} =< #{B => 0} end, maps:to_list(Map))


As evidenced e.g. by Benoit's comment above, I believe lots of people already rely on the order of map elements being IN SOME WAY stable. This is likely a brittle assumption, but one that may - in time - trap the OTP team into having to preserve the current APPARENT order.

And the apparent order does appear to be sorted, as long as only maps of size =< 32 ("flatmaps") are inspected. For larger maps ("hashmaps"), the order of elements returned by maps:to_list/1 indeed appears arbitrary.

BR,
Ulf W

2017-10-27 6:01 GMT+02:00 zxq9 <[hidden email]>:
On 2017年10月26日 木曜日 21:13:07 Ulf Wiger wrote:
> ... to use as a replacement for records.

But they are IN NO WAY a replacement for records.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions



_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

zxq9-2
In reply to this post by Attila Rajmund Nohl
On 2017年10月27日 金曜日 10:38:46 Attila Rajmund Nohl wrote:

> 2017-10-27 9:25 GMT+02:00 Ulf Wiger <[hidden email]>:
> [...]
> > As evidenced e.g. by Benoit's comment above, I believe lots of people
> > already rely on the order of map elements being IN SOME WAY stable. This is
> > likely a brittle assumption, but one that may - in time - trap the OTP team
> > into having to preserve the current APPARENT order.
> >
> > And the apparent order does appear to be sorted, as long as only maps of
> > size =< 32 ("flatmaps") are inspected. For larger maps ("hashmaps"), the
> > order of elements returned by maps:to_list/1 indeed appears arbitrary.
>
> I already run into a bug where the code expected the map to be sorted.
> When the 33rd element was added, it started to show pathological
> behavior.

...and this means there is a bug in the calling code, NOT in maps (which I'm pretty sure is the point Attila is pointing out).

I can't believe we are having this discussion. Again.

Having a discussion about internal ordering in the context of efficient matches and comparisons *in implementation*: totally logical

Relying on that implementation detail to leak out: ridiculous


If we really want ordered maps we should make ordered maps. And highlight in big bold letters all the baggage that brings with it. Alternately, we already have other data structures that work this way, perhaps they could be utilized. I think people just really really wish they could add wazoo syntax to various tree structures... ugh.

This discussion comes up a little more frequently than once a year and every time it reminds me of watching distributed systems engineers (try desperately to) explain CAP theorem tradeoffs to a VP of marketing.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Ulf Wiger-2
In reply to this post by Richard Carlsson-3
Actually, that's roughly what I started out assuming too, but it's not sufficient. Consider:

1> lists:sort([#{[-1.0] => 0}, #{[1] => 0}]).
[#{[1] => 0},#{[-1.0] => 0}]

You must dig into each structure and find occurrences of floats to determine whether they affect the ordering.

But yes, the documented sort order should be stable across versions. This means, in practice, that it's safe to rely on the sorting properties of maps.

BR,
Ulf

2017-10-27 10:49 GMT+02:00 Richard Carlsson <[hidden email]>:
My interpretation of the reference manual is that maps compare to each other like the tuples created by this transformation:

t(Map) ->
    {Ks,Vs} = lists:unzip(lists:keysort(1, [{{if is_integer(K) -> 0; is_float(K) -> 1; true -> 2 end, K}, V} || {K,V} <- maps:to_list(M)])),
    list_to_tuple(Ks ++ Vs).

For example:

    t(#{65 => "A", 2.71 => e, 1.0e308 => alot, pi => 3.14, [] => 0})

yields

    {{0,65}, {1,2.71}, {1,1.0e308}, {2,pi}, {2,[]},
        "A",    e,        alot,        3.14,   0}

Which:
1) should be independent of the underlying implementation and choice of hash function, and thus stable across future versions of Erlang;
2) is cumbersome to express using the normal term order over the basic data types, since float keys always sort higher than integers (not really following the rule of least surprise).





        /Richard

2017-10-27 9:25 GMT+02:00 Ulf Wiger <[hidden email]>:
Actually, they ARE, in some ways.

I should have been clearer. I was, of course, referring to the uses of records where they were never optimal in the first place, but were tolerated e.g. because of the great convenience of referencing elements by name in pattern-matching.

Quoted from EEP-0043:

«The idea was a data-type, a syntax-aware mapping of key-value associations with pattern matching. A syntax similar to records but without the hassle of compile-time dependency and with arbitrary terms as keys. Order was not important and it could be implemented with a Hash-Array-Mapped-Trie with good performance and memory trade-offs. This was a different approach than to replace records. It was meant to replace records where suitable and in other regards not be a replacement but its own _thing_.»

Further, relevant to this discussion:

«A restriction set on the implementation by the Erlang specification is that order is total, i.e. satisfies _antisymmetry, transitivity and totality_.
...
- Ordered maps impose restrictions on the underlying implementation and a hashing approach will be nearly impossible.
- The underlying structure does not need to be sorted, an order could be produced when needed,»

The thing I tried to highlight was that "an order could be produced when needed" is actually a bit more involved than you'd think, given that no API function does it for you. At least if you want to reflect the internal sort order - given that the actual implementation is NOT as described in the EEP: "If a key or value differs, the order of the respective terms, in Erlang term order".

You'd have to write your own sort function, in which you visit all elements of complex structures, finding all instances of floats and ensuring that they get the right priority.

Or... you simply do this:

lists:sort(fun(A, B) -> #{A => 0} =< #{B => 0} end, maps:to_list(Map))


As evidenced e.g. by Benoit's comment above, I believe lots of people already rely on the order of map elements being IN SOME WAY stable. This is likely a brittle assumption, but one that may - in time - trap the OTP team into having to preserve the current APPARENT order.

And the apparent order does appear to be sorted, as long as only maps of size =< 32 ("flatmaps") are inspected. For larger maps ("hashmaps"), the order of elements returned by maps:to_list/1 indeed appears arbitrary.

BR,
Ulf W

2017-10-27 6:01 GMT+02:00 zxq9 <[hidden email]>:
On 2017年10月26日 木曜日 21:13:07 Ulf Wiger wrote:
> ... to use as a replacement for records.

But they are IN NO WAY a replacement for records.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions




_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Sverker Eriksson-4

Yes. All Erlang terms, including maps, have a globally defined, implementation independent, total order.

The reason, for this "surprising" order of maps, is the alternative of using the normal arithmetic order for keys is much worse.

If we want 1 and 1.0 to be different keys (which we do as we use matching (=:=) to lookup keys)
then we need to order 1 and 1.0, and normal arithmetic term ordering does not (1 == 1.0).


It has been discussed (more and less serious) to expose this "map-key-order" with operators like :<, :>, =:<, >:=.


/Sverker


On 10/27/2017 03:13 PM, Ulf Wiger wrote:
Actually, that's roughly what I started out assuming too, but it's not sufficient. Consider:

1> lists:sort([#{[-1.0] => 0}, #{[1] => 0}]).
[#{[1] => 0},#{[-1.0] => 0}]

You must dig into each structure and find occurrences of floats to determine whether they affect the ordering.

But yes, the documented sort order should be stable across versions. This means, in practice, that it's safe to rely on the sorting properties of maps.

BR,
Ulf

2017-10-27 10:49 GMT+02:00 Richard Carlsson <[hidden email]>:
My interpretation of the reference manual is that maps compare to each other like the tuples created by this transformation:

t(Map) ->
    {Ks,Vs} = lists:unzip(lists:keysort(1, [{{if is_integer(K) -> 0; is_float(K) -> 1; true -> 2 end, K}, V} || {K,V} <- maps:to_list(M)])),
    list_to_tuple(Ks ++ Vs).

For example:

    t(#{65 => "A", 2.71 => e, 1.0e308 => alot, pi => 3.14, [] => 0})

yields

    {{0,65}, {1,2.71}, {1,1.0e308}, {2,pi}, {2,[]},
        "A",    e,        alot,        3.14,   0}

Which:
1) should be independent of the underlying implementation and choice of hash function, and thus stable across future versions of Erlang;
2) is cumbersome to express using the normal term order over the basic data types, since float keys always sort higher than integers (not really following the rule of least surprise).





        /Richard

2017-10-27 9:25 GMT+02:00 Ulf Wiger <[hidden email]>:
Actually, they ARE, in some ways.

I should have been clearer. I was, of course, referring to the uses of records where they were never optimal in the first place, but were tolerated e.g. because of the great convenience of referencing elements by name in pattern-matching.

Quoted from EEP-0043:

«The idea was a data-type, a syntax-aware mapping of key-value associations with pattern matching. A syntax similar to records but without the hassle of compile-time dependency and with arbitrary terms as keys. Order was not important and it could be implemented with a Hash-Array-Mapped-Trie with good performance and memory trade-offs. This was a different approach than to replace records. It was meant to replace records where suitable and in other regards not be a replacement but its own _thing_.»

Further, relevant to this discussion:

«A restriction set on the implementation by the Erlang specification is that order is total, i.e. satisfies _antisymmetry, transitivity and totality_.
...
- Ordered maps impose restrictions on the underlying implementation and a hashing approach will be nearly impossible.
- The underlying structure does not need to be sorted, an order could be produced when needed,»

The thing I tried to highlight was that "an order could be produced when needed" is actually a bit more involved than you'd think, given that no API function does it for you. At least if you want to reflect the internal sort order - given that the actual implementation is NOT as described in the EEP: "If a key or value differs, the order of the respective terms, in Erlang term order".

You'd have to write your own sort function, in which you visit all elements of complex structures, finding all instances of floats and ensuring that they get the right priority.

Or... you simply do this:

lists:sort(fun(A, B) -> #{A => 0} =< #{B => 0} end, maps:to_list(Map))


As evidenced e.g. by Benoit's comment above, I believe lots of people already rely on the order of map elements being IN SOME WAY stable. This is likely a brittle assumption, but one that may - in time - trap the OTP team into having to preserve the current APPARENT order.

And the apparent order does appear to be sorted, as long as only maps of size =< 32 ("flatmaps") are inspected. For larger maps ("hashmaps"), the order of elements returned by maps:to_list/1 indeed appears arbitrary.

BR,
Ulf W

2017-10-27 6:01 GMT+02:00 zxq9 <[hidden email]>:
On 2017年10月26日 木曜日 21:13:07 Ulf Wiger wrote:
> ... to use as a replacement for records.

But they are IN NO WAY a replacement for records.

-Craig
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions





_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions


_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
Reply | Threaded
Open this post in threaded view
|

Re: map elements in defined order

Ulf Wiger-2
In reply to this post by zxq9-2

2017-10-27 12:56 GMT+02:00 zxq9 <[hidden email]>:

I can't believe we are having this discussion. Again.

I will admit that I haven't been following the list that closely recently, but I wasn't aware that this discussion has been had before.

 
Having a discussion about internal ordering in the context of efficient matches and comparisons *in implementation*: totally logical

Relying on that implementation detail to leak out: ridiculous

Again, quoting from the EEP: 

«The underlying structure does not need to be sorted, an order could be produced when needed»

I think that among those who fully accepted the maps API returning elements in arbitrary order, most would have assumed that lists:sort(maps:to_list(M)) would do the trick (and, according to the EEP, it would), and were perfectly content with that.

What I brought up was the (admittedly subtle) point that even if you WANTED to settle for that, it actually doesn't produce the sort order that applies to maps internally, and there is no function, anywhere, that will give you that result - even one where the docs are riddled with warnings about its inefficiency.
 
This discussion comes up a little more frequently than once a year and every time it reminds me of watching distributed systems engineers (try desperately to) explain CAP theorem tradeoffs to a VP of marketing.

Am I the VP of marketing in this story?

BR,
Ulf 

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
12