DB for Full Text Search

DB for Full Text Search

Theepan
Team,

We are evaluating databases that natively support "full text search". Do you have any suggestions? The following requirements are key:

* Erlang integration
* Document orientation
* Weights for different keys or keys at different depths (in a JSON document)
* Performance -- In-memory should be fine.
* Scalability

Thanks,
Theepan

_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions
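
To make the "weights for different keys or keys at different depths" requirement concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the idea, not how any particular database scores documents: the function names, the `weights` mapping and the `depth_decay` default are all assumptions made for this example, and only string leaves of nested objects are considered (lists are ignored).

```python
import re


def flatten(doc, prefix=""):
    """Yield (dotted_path, text) for every string leaf of a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        elif isinstance(value, str):
            yield path, value


def score(doc, query, weights, depth_decay=0.5):
    """Sum weighted term hits over all string fields of a JSON-like dict.

    A path listed in `weights` uses that weight; any other path falls back
    to depth_decay ** depth, so deeper keys count for less (hypothetical rule).
    """
    terms = set(re.findall(r"\w+", query.lower()))
    total = 0.0
    for path, text in flatten(doc):
        depth = path.count(".")
        weight = weights.get(path, depth_decay ** depth)
        tokens = re.findall(r"\w+", text.lower())
        hits = sum(1 for tok in tokens if tok in terms)
        total += weight * hits
    return total


doc = {
    "title": "Erlang search",
    "body": {"text": "search engines in erlang", "note": "misc"},
}
print(score(doc, "erlang", {"title": 3.0}))
```

Here the match in `title` counts with its explicit weight, while the match in `body.text` sits one level down and gets the decayed default.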

Re: DB for Full Text Search

dmkolesnikov
Hello,

It looks like ElasticSearch is your answer here.

Best Regards,
Dmitry
>-|-|-(*>


Re: DB for Full Text Search

Paul Oliver-2
If your primary data store is elsewhere, then Elasticsearch will probably fulfil those requirements. See https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0 for details about data loss with Elasticsearch (at least as of 1.5.0). It might be worth testing 1.7.0 with Jepsen to see if anything has changed.
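
On the weighting requirement specifically: Elasticsearch's `multi_match` query accepts per-field boosts written as `field^boost`, and fields inside nested JSON objects can be addressed with dotted paths. The sketch below only builds such a query body in Python; the field names (`title`, `profile.summary`, `body`) and the helper name are made up for illustration, and sending the body to a cluster's `_search` endpoint is left out.

```python
import json


def boosted_query(text, field_boosts):
    """Build a multi_match query body; field_boosts maps field name -> boost."""
    fields = [f"{name}^{boost}" for name, boost in field_boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# Hypothetical field names; a real mapping would define its own.
body = boosted_query("erlang search", {"title": 3, "profile.summary": 2, "body": 1})
print(json.dumps(body, indent=2))
```

Whether boosts chosen this way give useful rankings depends on the data, so they would need tuning against real queries.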


Re: DB for Full Text Search

Brujo Benavides-2
And, while you’re at it, in terms of Erlang integration, it’s worth checking tirerl :)

Re: DB for Full Text Search

Joe Armstrong-2
Sorry but I have to ask:

How much memory have you got?
(in-memory might be fine, but it's vague: are we talking MBytes,
GBytes, TeraBytes?)
How many documents/second do you want to index/search?
How many words per document?
How big is the corpus?
What do you want to retrieve (name of file(s) where words occur)?
Is the index write append-only or must it be updatable?
Is the index replicated?
Security?
What are the input documents (text, html, pdf, ...)?
What languages are the input documents in?

Without stating your requirements it is impossible to give a good answer;
there is an incredible spectrum of answers.

/Joe

Re: DB for Full Text Search

Theepan
Thanks to all for your valuable input. I am looking forward to some more suggestions.

Hi Joe, I hope the story below answers all your questions. Hardware is not a limiting factor.

We are in the process of building a global business-customer discovery/social/engagement platform, delivered from the cloud. If we succeed, it will be a big platform.

The documents will be plain text, and CRUD operations will be performed on their contents. Size: on average, 1 MB each. The whole content of each document must be searchable. Static indexes will be replicated. Security is to be applied on the wire, at the ingress, and on sensitive data. Sensitive data will be kept separately, encrypted.

Regards,
Theepan
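
Since CRUD operations are performed on the documents, the index has to be updatable rather than append-only. A minimal in-memory sketch of what that implies is shown below. It is a toy under stated assumptions, not a recommendation: the class and method names are invented for this example, tokenization is a plain word split, and there is no ranking, persistence or replication.

```python
import re
from collections import defaultdict


class UpdatableIndex:
    """Toy inverted index: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of docs containing it
        self.doc_terms = {}               # doc id -> terms currently indexed

    @staticmethod
    def tokenize(text):
        return set(re.findall(r"\w+", text.lower()))

    def upsert(self, doc_id, text):
        """Create or update: drop the old postings, then index the new text."""
        self.delete(doc_id)
        terms = self.tokenize(text)
        for term in terms:
            self.postings[term].add(doc_id)
        self.doc_terms[doc_id] = terms

    def delete(self, doc_id):
        """Remove a document and any postings lists that become empty."""
        for term in self.doc_terms.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        """Return ids of documents containing every query term."""
        terms = self.tokenize(query)
        if not terms:
            return set()
        # .get avoids creating empty entries in the defaultdict
        return set.intersection(*(self.postings.get(t, set()) for t in terms))


idx = UpdatableIndex()
idx.upsert(1, "Erlang full text search")
idx.upsert(2, "full text")
idx.upsert(2, "something else")  # update replaces the old content of doc 2
print(idx.search("full text"))
```

Keeping `doc_terms` alongside the postings is what makes updates and deletes cheap here; with documents averaging 1 MB, a real system would have to weigh that extra bookkeeping against memory use.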