Spatial indexing in mnesia

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Spatial indexing in mnesia

Shawn Pearce
Has anyone ever considered storing spatial data in mnesia?  It requires
creating a different style of indexing than dets must use for lookup of
records....

Oracle implements their spatial indexes as an auxiliary table that
stores each node of the spatial index tree as a row in the table.  This
table is then indexed using traditional b*-tree indexes for performance.

We'd like to stay with mnesia, but we're running into the trouble of
overloading mnesia's log writer, and being faced with the need to do
spatial data.

I'm afraid performance will suck if we have to implement a spatial index
as a standard dets table, as we'd be doing 20 or 30 read calls to mnesia
just to do the lookup in the spatial index, then turn around and perform
another read call to load the actual record we wanted.  Doing this for
a few hundred lookups at a time will choke mnesia as far as reading
goes, right?

Is this like ``far out there research'' type stuff to be doing with such
a high-level language as Erlang?  Mnesia is great, but it strikes me
that its performance is limited by the language's own constraints of
lists and tuples, none of which can be fixed-length or blocked together
like one can do in C.

--
Shawn.

  ``If this had been a real
    life, you would have
    received instructions
    on where to go and what
    to do.''


Reply | Threaded
Open this post in threaded view
|

Spatial indexing in mnesia

David Gould
On Wed, Jan 03, 2001 at 06:44:40PM -0500, Shawn Pearce wrote:
> Has anyone ever considered storing spatial data in mnesia?  It requires
> creating a different style of indexing than dets must use for lookup of
> records....
>
> Oracle implements their spatial indexes as an auxiliary table that
> stores each node of the spatial index tree as a row in the table.  This
> table is then indexed using traditional b*-tree indexes for performance.

Informix and Illustra before it, and Postgresql before that (and I think
maybe also DB2) do this using multidimensional indexes, usually based on
R-trees.

> We'd like to stay with mnesia, but we're running into the trouble of
> overloading mnesia's log writer, and being faced with the need to do
> spatial data.
>
> I'm afraid performance will suck if we have to implement a spatial index
> as a standard dets table, as we'd be doing 20 or 30 read calls to mnesia
> just to do the lookup in the spatial index, then turn around and perform
> another read call to load the actual record we wanted.  Doing this for
> a few hundred lookups at a time will choke mnesia as far as reading
> goes, right?
>
> Is this like ``far out there research'' type stuff to be doing with such
> a high-level language as Erlang?  Mnesia is great, but it strikes me
> that its performance is limited by the language's own constraints of
> lists and tuples, none of which can be fixed-length or blocked together
> like one can do in C.

I think the thing to do might be to either implement R-trees in Erlang and
add it to Mnesia, or to write a port driver that used an external R-Tree
library or server. Might be able to use postgresql directly, or borrow the
R-tree code from it (assuming they haven't done anything foolish like fail
to maintain it...).

-dg

--
David Gould                                                 dg
SuSE, Inc.,  580 2cd St. #210,  Oakland, CA 94607          510.628.3380
why would you want to own /dev/null? "ooo! ooo! look! i stole nothing!
i'm the thief of nihilism! i'm the new god of zen monks."


Reply | Threaded
Open this post in threaded view
|

Spatial indexing in mnesia

Ulf Wiger-4
On Wed, 3 Jan 2001, David Gould wrote:

>On Wed, Jan 03, 2001 at 06:44:40PM -0500, Shawn Pearce wrote:
>> Has anyone ever considered storing spatial data in mnesia?  It requires
>> creating a different style of indexing than dets must use for lookup of
>> records....
>>
>> Oracle implements their spatial indexes as an auxiliary table that
>> stores each node of the spatial index tree as a row in the table.  This
>> table is then indexed using traditional b*-tree indexes for performance.
>
>Informix and Illustra before it, and Postgresql before that (and I
>think maybe also DB2) do this using multidimensional indexes,
>usually based on R-trees.

I wrote a user contrib called gridfile-1.0 which could handle spatial
data. I'm pretty sure that noone's been using it, because I've found
out that it doesn't work. /-: I'm working on a new version, which
hopefully will both be correct and have better scalability.

The idea was taken from the article "The Grid File: An Adaptable,
Symmetric Multikey File Structure", by L. Nievergelt, H. Hinterberger,
published in "Readings in Database Systems", 2nd Ed, pp 108-24.

The really nice property was that you could do range matching in time
proportional to the number of objects found -- not to the size of the
table.

The main reason why I failed to make it work, of course, was hybris.
Being warped on Erlang, I naturally wanted it to be able to operate on
any type of key (not just integers, as in the article). This makes
splitting and merging more of a challenge. I also wanted to support
walking the grid in any dimension, with perfectly well defined
semantics (user-configurable, naturally). This proved rather
complicated.

If somebody else is really interested, I could finish my new version.
I could also try to make rdbms support derived indeces (difficult, due
to problems with atomicity in trigger actions, but I have at least an
approach to tackle that - still, hacking mnesia might be the better
option.)

/Uffe
--
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB



Reply | Threaded
Open this post in threaded view
|

Spatial indexing in mnesia

David Gould
On Thu, Jan 04, 2001 at 10:18:42AM +0100, Ulf Wiger wrote:
> On Wed, 3 Jan 2001, David Gould wrote:
>
> I wrote a user contrib called gridfile-1.0 which could handle spatial
> data. I'm pretty sure that noone's been using it, because I've found
> out that it doesn't work. /-: I'm working on a new version, which
> hopefully will both be correct and have better scalability.

What doesn't work? Can it be fixed just to tide us by?
 
> The idea was taken from the article "The Grid File: An Adaptable,
> Symmetric Multikey File Structure", by L. Nievergelt, H. Hinterberger,
> published in "Readings in Database Systems", 2nd Ed, pp 108-24.

Right, is that the silver cover edition, or an earlier one? Mine is at work
and I am not. Anyway, about 20 pages + or - from there is the paper by
Antonin Guttman defining R-Trees which as far as I last knew are the prefered
spatial index type.

Funny thing:  at Illustra (and later Informix), I worked with a guy
named Tony on some memory leaks and other integration bugs (we had a
difficult runtime environment) in the spatial index code, and as sometimes
happens when one is fixing someone elses bugs, I was kinda grumpy about it.
"Who wrote all this cruft, why do we let this kinda slop get checked in" and
so forth although it wasn't really that bad. It must have been about a year
later when I tumbled to the fact that "Tony" was short for "Antonin".
Sheesh. I wish I could say I was smarter now and wouldn't do that again...

 
> The really nice property was that you could do range matching in time
> proportional to the number of objects found -- not to the size of the
> table.

I will have a look. Thanks for the pointer.
 
> If somebody else is really interested, I could finish my new version.
> I could also try to make rdbms support derived indeces (difficult, due
> to problems with atomicity in trigger actions, but I have at least an
> approach to tackle that ...
> option.)

That might be useful though, there are an awful lot of non-mnesia systems
that are holding data hostage...
 
-dg

--
David Gould                                                 dg
SuSE, Inc.,  580 2cd St. #210,  Oakland, CA 94607          510.628.3380
why would you want to own /dev/null? "ooo! ooo! look! i stole nothing!
i'm the thief of nihilism! i'm the new god of zen monks."


Reply | Threaded
Open this post in threaded view
|

Spatial indexing in mnesia

Hal Snyder-2
David Gould <dg> writes:

> Right, is that the silver cover edition, or an earlier one? Mine is
> at work and I am not. Anyway, about 20 pages + or - from there is
> the paper by Antonin Guttman defining R-Trees which as far as I last
> knew are the prefered spatial index type.
>
> Funny thing: at Illustra (and later Informix), I worked with a guy
> named Tony on some memory leaks and other integration bugs (we had a
> difficult runtime environment) in the spatial index code, and as
> sometimes happens when one is fixing someone elses bugs, I was kinda
> grumpy about it. "Who wrote all this cruft, why do we let this kinda
> slop get checked in" and so forth although it wasn't really that
> bad. It must have been about a year later when I tumbled to the fact
> that "Tony" was short for "Antonin". Sheesh. I wish I could say I
> was smarter now and wouldn't do that again...

A little OT maybe, but BTW we have been using PostgreSQL r-trees to
locate nearest service provider - customer dials 800 number for a
retail chain, we get long/lat from his number, look up nearest store
in (r-tree indexed) database, and connect. We even optimized the
search by preloading spatial data sorted in Hilbert curve order.

The algorithms are subtle but not extremely lengthy - I bet the whole
thing could be done nicely in Erlang/mnesia. Heck, maybe you could
even use Erlang for some of the telephony...



Reply | Threaded
Open this post in threaded view
|

Spatial indexing in mnesia

David Gould
On Thu, Jan 04, 2001 at 11:29:08AM -0600, Hal Snyder wrote:
>
> A little OT maybe, but BTW we have been using PostgreSQL r-trees to
> locate nearest service provider - customer dials 800 number for a
> retail chain, we get long/lat from his number, look up nearest store
> in (r-tree indexed) database, and connect. We even optimized the
> search by preloading spatial data sorted in Hilbert curve order.

Nice. It is good to see at least some people are using PostgreSQL
for some of its more advanced features, instead of just another SQL system.
Presorting is a big win too, when you can get away with it.

> The algorithms are subtle but not extremely lengthy - I bet the whole
> thing could be done nicely in Erlang/mnesia. Heck, maybe you could
> even use Erlang for some of the telephony...

Thats what I was thinking. If someone had the time or the need of course.

-dg

--
David Gould                                                 dg
SuSE, Inc.,  580 2cd St. #210,  Oakland, CA 94607          510.628.3380
why would you want to own /dev/null? "ooo! ooo! look! i stole nothing!
i'm the thief of nihilism! i'm the new god of zen monks."