[ANN] locus: Geolocation and ASN lookup of IP addresses

Guilherme Andrade
Hello list,

I'm pleased to announce the release of locus 1.0.0, a library that allows you to pinpoint the country, city or ASN of IPv4 and IPv6 addresses, using MaxMind's GeoLite2 databases.

The MaxMind databases[1] you choose are downloaded on demand, cached on the filesystem and updated automatically.

* API reference: https://github.com/g-andrade/locus/blob/master/doc/locus.md
* Source code: https://github.com/g-andrade/locus/

The databases are loaded into memory (mostly) as is; reference-counted binaries are shared with the application callers using ETS tables, and the original binary search tree is used to look up addresses. The data for each entry is decoded on the fly upon successful lookups.
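
To illustrate the kind of lookup described above, here is a minimal Python sketch of walking a binary search tree one address bit at a time. It is a simplification and not locus's code: the in-memory `nodes` list, the `TOY_TREE` data and the choice of returning `record - node_count` as an entry id are assumptions made for the example. The record convention is modelled loosely on the MaxMind DB format, where the real nodes live in a packed binary layout and the entry data is decoded separately.

```python
import ipaddress


def address_bits(address):
    """Yield the bits of an IPv4 or IPv6 address, most significant first."""
    ip = ipaddress.ip_address(address)
    width = ip.max_prefixlen  # 32 for IPv4, 128 for IPv6
    value = int(ip)
    for shift in range(width - 1, -1, -1):
        yield (value >> shift) & 1


def lookup(nodes, address):
    """Follow one record per address bit until a data entry or a dead end.

    `nodes` is a list of (left, right) records. A record smaller than
    len(nodes) is the index of the next node, a record equal to len(nodes)
    means "no data", and a larger record identifies a data entry.
    """
    node_count = len(nodes)
    node = 0
    for bit in address_bits(address):
        record = nodes[node][bit]
        if record == node_count:
            return None  # no entry covers this address
        if record > node_count:
            return record - node_count  # hypothetical entry id
        node = record
    return None


# Toy tree (made up): prefix 0 -> entry 1, prefix 11 -> entry 2, prefix 10 -> no data.
TOY_TREE = [
    (3, 1),  # node 0: bit 0 -> entry 1, bit 1 -> node 1
    (2, 4),  # node 1: bit 0 -> no data, bit 1 -> entry 2
]

print(lookup(TOY_TREE, "10.0.0.1"))
print(lookup(TOY_TREE, "192.168.0.1"))
print(lookup(TOY_TREE, "128.0.0.1"))
```

Only the path from the root to the matching record is visited, so nothing else needs to be decoded until a lookup succeeds.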

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Max Lapshin-2
What is the difference from https://github.com/mochi/egeoip ?

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Guilherme Andrade
Hi Max,


On 1 January 2018 at 21:16, Max Lapshin <[hidden email]> wrote:
> What is the difference from https://github.com/mochi/egeoip ?

My original motivation in writing this library was indeed to create a modern replacement for egeoip, which I've depended upon for the past few years.

Some of the things I desired:

1) Not having snapshots of the database committed into the repository
    - This makes the repository grow over time, and cloning it takes ever longer.

2) Making database updates seamless
    - Which was solved by loading them from the network (closely related to 1.)
    - Caching on the local file system then mitigates the reduced reliability
      that depending on the network could otherwise cause.
    - In order to not consume too much bandwidth, conditional HTTP requests
      are used whenever possible.

3) Support for the GeoLite2 databases
    - GeoLite Legacy databases are being discontinued and will no longer receive updates after April 2018.

4) Less rigidly structured entries
    - Which was solved by returning entries as maps instead of records and having the code
      be particularly unopinionated about what data those entries contain.

5) Not having a limited number of workers as a potential bottleneck
    - This was solved by sharing the different database sections as reference-counted binaries
      through ETS
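
The conditional requests mentioned in point 2 can be sketched as follows, in Python for brevity. This illustrates the general HTTP mechanism and is not locus's implementation: the helper names `conditional_headers` and `choose_source` are hypothetical, and validating the cache with `If-Modified-Since` and the cached file's modification time is an assumption about how such a cache could work.

```python
from email.utils import formatdate


def conditional_headers(cached_mtime=None):
    """Build request headers for revalidating a cached database.

    `cached_mtime` is the cached file's modification time in seconds since
    the epoch, or None when nothing is cached yet.
    """
    headers = {}
    if cached_mtime is not None:
        # HTTP dates are always expressed in GMT, never in local time.
        headers["If-Modified-Since"] = formatdate(cached_mtime, usegmt=True)
    return headers


def choose_source(status, have_cache):
    """Decide what to load given the HTTP status of the conditional request."""
    if status == 200:
        return "download"  # a newer database was sent: store and load it
    if status == 304:
        return "cache"  # not modified: no body was transferred
    # Any other outcome: fall back on the cached copy if there is one.
    return "cache" if have_cache else "error"


print(conditional_headers(0))
print(choose_source(304, have_cache=True))
```

A 304 response carries no body, which is where the bandwidth saving comes from, and the cached copy keeps lookups working when the remote source is unreachable.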



--
Guilherme

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Сергей Прохоров-2
Hi, Guilherme
 
> 5) Not having a limited number of workers as a potential bottleneck
>     - This was solved by sharing the different database sections as
>       reference-counted binaries through ETS

Are you sure it works like this?
Because, as far as I know, ETS data resides in a separate chunk of memory. Is the idea that ETS will only contain references to the original shared binary, but not the binary itself?
Have you checked / measured that it really works that way and does not copy the whole database to / from ETS for each lookup?

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Guilherme Andrade
Hi Sergei,

On 2 January 2018 at 20:51, Сергей Прохоров <[hidden email]> wrote:
> Are you sure it works like this?
> Because, as far as I know, ETS data resides in a separate chunk of memory. Is the idea that ETS will only contain references to the original shared binary, but not the binary itself?
> Have you checked / measured that it really works that way and does not copy the whole database to / from ETS for each lookup?

I didn't benchmark it. I looked around on the web for past discussions on the matter and found a previous thread[1] discussing this particular problem, in which it was stated that a 6-word overhead would be incurred for every lookup.

As we're talking about blobs amounting to potentially dozens of megabytes, I felt very comfortable with such an overhead. I took those statements as likely being the truth, as the people involved seemed to know what they were talking about, but I'd be the first to restructure the current architecture if shown a better way.

Cheers,

[1]: http://erlang.org/pipermail/erlang-questions/2016-October/090712.html

--
Guilherme

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Max Lapshin-2
Got it.

About:
> 5) Not having a limited number of workers as a potential bottleneck
>     - This was solved by sharing the different database sections as reference-counted binaries
>       through ETS


At what request rates do you get performance problems with egeoip? We have never run into them.

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Guilherme Andrade
Hello,

On 5 January 2018 at 07:02, Max Lapshin <[hidden email]> wrote:
> At what request rates do you get performance problems with egeoip? We have never run into them.

I have never reached the limit either, but it's a pattern that has very often led me to performance issues in other code.
At this point it has become ingrained in me that, if I can avoid it at negligible cost to complexity / maintainability / extensibility, then that's what I'll do.

Lest someone consider it premature optimization, I'd rather see it as being considerate of my future self 6 months from now :-). But that's all very subjective, of course.


--
Guilherme

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Max Lapshin-2
Understood.

All other points are good, will try to look at it, thank you!

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Guilherme Andrade
Hi list,

Locus 1.1.1 was released today. For those who don't have the existing thread handy, it's a library for looking up the geolocation / ASN of IP addresses, using MaxMind GeoLite2.

Added:
- OTP 18, 19.0, 19.1 and 19.2 support (version 1.0.x required 19.3 or higher)
- ability to consult database metadata, source and version through `:get_info`
- ability to subscribe to database loader events
- ability to specify connect, download start and idle download timeouts
- ability to turn off caching

Documentation was moved to HexDocs and test coverage was substantially increased.

* Documentation: https://hexdocs.pm/locus/

--
Guilherme

Re: [ANN] locus: Geolocation and ASN lookup of IP addresses

Guilherme Andrade
Hi list,

Locus 1.3.0 was released today. For those who don't have the existing thread handy, it's a library for looking up the geolocation / ASN of IP addresses, using MaxMind GeoLite2.

Added:
- ability to load databases from the local file system
- type spec of database entries

Fixed:
- wrong handling of timezones on cached tarballs
- wrong handling of daylight saving time on conditional HTTP requests

The timezone / DST fixes mentioned above were also backported to earlier versions and tagged under:
- 1.0.1
- 1.1.4
- 1.2.2


--
Guilherme
