TLS Distribution Certificate Strategy


Raimo Niskanen-2
Hello list!

I am working on the task to get Erlang Distribution over TLS to be easy to
use and flexible. :-/

So far I have tried to read up on the subject and messed up by misusing
server_name_indication in inet_tls_dist.erl...

The distribution protocol in itself has got two roles - the client and the
server role.  When node A connects to node B then A is the client
and B is the server.  All nodes are servers and any node can also be a
client.

When a node is client we can use the options [{verify,verify_peer},
{verify_fun,{VerifyFun,Data}}].  Then VerifyFun can verify that the
certificate is valid for the Node we connect to.  Remains to get the Node
variable into the VerifyFun arguments - getting to that.

When a node is server it is not possible to know which node is connecting
until after the TLS connection is up and the distribution handshake is
being performed.  Then there is a step to check that the connecting node is
on the allowed list.  This step takes a list of allowed nodes that currently
is a net_kernel configuration parameter that can be modified by e.g.
inet_tls_dist.  We can use that.
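
As a rough sketch of that server-side step (in Python, for illustration only, not Erlang): once the TLS connection is up, the server holds a set of names taken from the peer certificate, and the node name announced in the distribution handshake is checked against it. The function name `check_connecting_node` and the example node names are hypothetical, and treating an empty allowed list as "no restriction" is an assumption.

```python
def check_connecting_node(announced_node, allowed_nodes):
    """Return True if the node announced in the handshake may connect.

    Assumption: an empty allowed list means no restriction is configured.
    """
    if not allowed_nodes:
        return True
    return announced_node in allowed_nodes


# Names that a (hypothetical) certificate parser extracted after the TLS
# connection came up; they become the allowed list for this connection.
allowed = {"a@host1.example.org", "b@host1.example.org"}

print(check_connecting_node("a@host1.example.org", allowed))
print(check_connecting_node("mallory@host1.example.org", allowed))
```

With certificate-derived names, an empty result from the parser would probably have to be treated as a rejection rather than as "allow everyone"; the sketch keeps the permissive default only to make the assumption visible.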

So this boils down to parsing the certificate for a node name or a list
thereof.  My goal is that there should be a plugin function that does
exactly that, that one can customize.
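
One way to picture such a plugin function is sketched below, in Python rather than Erlang and purely as an illustration. The certificate is assumed to be already decoded into the dictionary layout that Python's `ssl.SSLSocket.getpeercert()` returns. The function name `names_from_cert` and the policy of preferring subjectAltName over CommonName are assumptions, not something inet_tls_dist does.

```python
def names_from_cert(cert):
    """Return candidate names found in a decoded certificate.

    `cert` is assumed to use the dict layout of ssl.SSLSocket.getpeercert():
    'subjectAltName' is a tuple of (type, value) pairs and 'subject' is a
    tuple of RDNs, each RDN a tuple of (attribute, value) pairs.
    """
    # Prefer dNSName entries from the subjectAltName extension.
    san = [value for kind, value in cert.get("subjectAltName", ())
           if kind == "DNS"]
    if san:
        return san
    # Fall back to every CommonName in the Subject (hypothetical policy).
    return [value
            for rdn in cert.get("subject", ())
            for attr, value in rdn
            if attr == "commonName"]


# Hand-written example certificate, not taken from a real deployment.
example_cert = {
    "subject": ((("commonName", "node.example.org"),),),
    "subjectAltName": (("DNS", "node.example.org"), ("DNS", "alt.example.org")),
}
print(names_from_cert(example_cert))
```

A customized plugin would replace the body of this function, for instance to read a different field or to translate host names into node names before returning them.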

An additional question about that certificate parsing function is whether
it should return node name(s) or host name(s).  Should a certificate be per
node or per host or should both be possible?

If a node certificate should contain host name(s) there seems to be
agreement that one could put this in the certificate's Subject field as
a DistinguishedName containing a CommonName being the host name, or in an
Extension subjectAltName containing a dNSName.

If a node certificate should contain node name(s) I only have some
half-good suggestions:
* Store a node name [hidden email] as a host name node.example.org in
  either of the above ways, i.e. replace the `@` with `.`.
* Store the node name as a CommonName as it is, either in DistinguishedName
  or in an Extension subjectAltName containing a directoryName that is
  another CommonName.
* Use the DomainComponent field in DistinguishedName, multiple times, and
  CommonName for the node name as in "/DC=com/DC=example/CN=node".  I think
  this may be according to the standard's intent, but very odd.
* There are lots of esoteric fields to use in a certificate, for example
  the Extension subjectAltName EmailAddress that syntactically fits a node
  name, but would that not be misuse?
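
Two of the encodings in the list above can be sketched as plain string transformations (Python, for illustration only). The function names and the example node names are hypothetical, and the DomainComponent ordering (most significant component first) is an assumption. The sketch also shows a possible weakness of the `@` to `.` mapping: if the name part may itself contain dots, two different node names collapse to the same host name.

```python
def node_to_dns_name(node):
    """Map 'name@host' to 'name.host' by replacing the '@' with '.'."""
    name, sep, host = node.partition("@")
    if not sep or not name or not host:
        raise ValueError("expected a node name of the form name@host")
    return name + "." + host


def dn_to_node(dn):
    """Map '/DC=com/DC=example/CN=node' to 'node@example.com'.

    Assumes the most significant DomainComponent comes first.
    """
    parts = [p.split("=", 1) for p in dn.strip("/").split("/")]
    domain = [v for k, v in parts if k == "DC"]
    names = [v for k, v in parts if k == "CN"]
    if len(names) != 1 or not domain:
        raise ValueError("expected exactly one CN and at least one DC")
    return names[0] + "@" + ".".join(reversed(domain))


print(node_to_dns_name("node@example.org"))
# Two different node names collapse to the same DNS name:
print(node_to_dns_name("a.b@example.org") == node_to_dns_name("a@b.example.org"))
print(dn_to_node("/DC=com/DC=example/CN=node"))
```

The DomainComponent form does not have this ambiguity, since the node name and the domain live in separate attributes.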

Related questions are: what kind of certificates can you get issued and do
you want to use regular web certificates for this purpose?  Should it be
possible to use Let's Encrypt certificates for nodes/hosts in a cluster
(you would need a node whitelist for this use case I guess)?  For a cluster
we can create tools that create local certificates, with any of the above
content, but how general should the distribution protocol validation be?

So what do people want?
And what is appropriate for Cloud services?

Anyone with an opinion and insight, please speak up!

Best Regards
--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB
_______________________________________________
erlang-questions mailing list
[hidden email]
http://erlang.org/mailman/listinfo/erlang-questions

Re: TLS Distribution Certificate Strategy

Fred Dushin

> On Apr 20, 2018, at 9:57 AM, Raimo Niskanen <[hidden email]> wrote:
>
> Hello list!
>
> I am working on the task to get Erlang Distribution over TLS to be easy to
> use and flexible. :-/

Lucky you!

>
> So far I have tried to read up on the subject and messed up by misusing
> server_name_indication in inet_tls_dist.erl...
>
> The distribution protocol in itself has got two roles - the client and the
> server role.  When node A connects to node B then A is the client
> and B is the server.  All nodes are servers and any node can also be a
> client.
>
> When a node is client we can use the options [{verify,verify_peer},
> {verify_fun,{VerifyFun,Data}}].  Then VerifyFun can verify that the
> certificate is valid for the Node we connect to.  Remains to get the Node
> variable into the VerifyFun arguments - getting to that.
>
> When a node is server it is not possible to know which node is connecting
> until after the TLS connection is up and the distribution handshake is
> being performed.  Then there is a step to check that the connecting node is
> on the allowed list.  This step takes a list of allowed nodes that currently
> is a net_kernel configuration parameter that can be modified by e.g.
> inet_tls_dist.  We can use that.

How does the connecting (client) node tell the server its own name (say, in the non-TLS use case)?  Is this part of the disterl protocol?  I would think it has to be, and not some magic conjured by the underlying TCP or TLS protocol.

(Sorry, I don't currently have time to dig into the code and research this myself)

>
> So this boils down to parsing the certificate for a node name or a list
> thereof.  My goal is that there should be a plugin function that does
> exactly that, that one can customize.
>
> An additional question about that certificate parsing function is whether
> it should return node name(s) or host name(s).  Should a certificate be per
> node or per host or should both be possible?

If we are talking about the disterl protocol, I would keep it to node names, and keep host names out of it.  Two nodes may well be running on the same host (Riak devrels!), each with different security requirements (to your flexibility point).  I also think host names are impossible to manage, much less manage securely, whereas presumably node names are under some semblance of control by the application, and therefore hopefully the application operator.  Host names might be managed by other teams in distant lands.

>
> If a node certificate should contain host name(s) there seems to be
> agreement that one could put this in the certificate's Subject field as
> a DistinguishedName containing a CommonName being the host name, or in an
> Extension subjectAltName containing a dNSName.

"there seems to be agreement"

Well, not universally.  I can get on my soap box and say why hostname validation is a hoax, if you like :)

>
> If a node certificate should contain node name(s) I only have some
> half-good suggestions:
> * Store a node name [hidden email] as a host name node.example.org in
>  either of the above ways, i.e. replace the `@` with `.`.
> * Store the node name as a CommonName as it is, either in DistinguishedName
>  or in an Extension subjectAltName containing a directoryName that is
>  another CommonName.

^^ common in most TLS deployments, but only for server certs, i.e., when clients authenticate servers.  I think authenticating clients via DNs or SubjectAltNames is somewhat non-standard, but useful.  I would NOT use a hostname in that case, however (to the same point above).

I would say that if you trust the client, via whatever ordinary PKIX certificate chain validation is at your disposal, the additional requirement of matching the CN with the node name could be an onerous burden, when the node name is already published over the disterl protocol, and is therefore shipped over an authenticated/trustworthy channel.  What sort of attack would you be thwarting by adding that requirement?


> * Use the DomainComponent field in DistinguishedName, multiple times, and
>  CommonName for the node name as in "/DC=com/DC=example/CN=node".  I think
>  this may be according to the standard's intent, but very odd.

yes, odd.

> * There are lots of esoteric fields to use in a certificate, for example
>  the Extension subjectAltName EmailAddress that syntactically fits a node
>  name, but would that not be misuse?

I believe there is just a URI type for SubjectAltName

https://tools.ietf.org/html/rfc5280#section-4.2.1.6

>
> Related questions are: what kind of certificates can you get issued and do
> you want to use regular web certificates for this purpose?  Should it be
> possible to use Let's Encrypt certificates for nodes/hosts in a cluster
> (you would need a node whitelist for this use case I guess)?  For a cluster
> we can create tools that create local certificates, with any of the above
> content, but how general should the distribution protocol validation be?
>
> So what do people want?

I think a reference architecture, that includes TLS-enabling epmd, and the disterl communication paths between nodes, would be good. I mean, I have a sense of what that is, but it is likely incomplete and has holes.  So having a document that states or illustrates use cases would be helpful.  Is there a platform for collaboration on this?

> And what is appropriate for Cloud services?

I think cloud services are going to be a special case of enterprise (when cloud == IaaS).  I don't think we are talking about PaaS, are we?  (I wish!)  As such, I think most PKI deployments will resemble the kind of thing you see (or should see) in an enterprise deployment, though perhaps at a slightly larger scale, depending on the growth trajectory of the folks paying the cloud bill.

>
> Anyone with an opinion and insight, please speak up!
>


Re: TLS Distribution Certificate Strategy

Jesper Louis Andersen-2
On Fri, Apr 20, 2018 at 3:57 PM Raimo Niskanen <[hidden email]> wrote:
> Hello list!
>
> I am working on the task to get Erlang Distribution over TLS to be easy to
> use and flexible. :-/


You know what I am going to say:

"What is NOT part of this task?"

I have a general toxic reaction to words such as "easy to use" and "flexible" because the former can hide important details in the name of easy, but then make more intricate setups impossible; and the latter often risks making the core of the system bad as a sacrifice for being able to do anything.

For cloud services, the general rules are:

* Machines are brought up and down at a whim; they usually change IP addresses, sometimes also networks.
* Machines can have stable DNS names where the underlying IP changes, so be prepared for that setting.
* Some systems don't have stable DNS either.
* The network, cluster size, etc. are all dynamic and will scale up and down depending on load.
* The network is highly unreliable. Weekly disconnects are commonplace for any point-to-point connection. In larger clusters, assume daily TCP disconnects.
* The network is likely to deliberately fault-inject to verify the system is robust under noise (Chaos-monkey strategies).

In this setting, the lure of having TLS would be that you don't have to build a virtual network which also encrypts. Rather, you can just have the Erlang nodes connect by TLS. It also simplifies the notion of connecting "into" the cluster from the outside.

The Erlang distribution protocol is quite the contrary to the typical cloud network though:

* Assumes a mostly stable static network
* Assumes a few static machines
* Assigns names to everything, in a somewhat static way
* Assumes you know every node "beforehand" in many situations

I feel this is the impedance mismatch which is present. Hence my original pet-peeve: define the scope :)

My own solution would definitely be "screw you, TLS, here is my own public key registry, vault, and libsodium/enacl :)"



Re: TLS Distribution Certificate Strategy

Fred Dushin

On Apr 24, 2018, at 6:02 AM, Jesper Louis Andersen <[hidden email]> wrote:

> On Fri, Apr 20, 2018 at 3:57 PM Raimo Niskanen <[hidden email]> wrote:
> > Hello list!
> >
> > I am working on the task to get Erlang Distribution over TLS to be easy to
> > use and flexible. :-/
>
> You know what I am going to say:
>
> "What is NOT part of this task?"
>
> I have a general toxic reaction to words such as "easy to use" and "flexible" because the former can hide important details in the name of easy, but then make more intricate setups impossible; and the latter often risks making the core of the system bad as a sacrifice for being able to do anything.
>
> For cloud services, the general rules are:
>
> * Machines are brought up and down at a whim; they usually change IP addresses, sometimes also networks.
> * Machines can have stable DNS names where the underlying IP changes, so be prepared for that setting.
> * Some systems don't have stable DNS either.
> * The network, cluster size, etc. are all dynamic and will scale up and down depending on load.
> * The network is highly unreliable. Weekly disconnects are commonplace for any point-to-point connection. In larger clusters, assume daily TCP disconnects.
> * The network is likely to deliberately fault-inject to verify the system is robust under noise (Chaos-monkey strategies).

So I agree with all of those, and I think it adds to the argument that any additional authentication over and above PKIX certificate validation should NOT use hostnames or IP addresses, or should not _require_ hostnames or IPs.  I am still happy to rant about TLS hostname validation [sic], as it is complete and utter BS outside of the context of e-commerce, for which it was designed.  Yet for some reason even intelligent people think it should make its way into RFCs.  :head-bang:


> In this setting, the lure of having TLS would be that you don't have to build a virtual network which also encrypts. Rather, you can just have the Erlang nodes connect by TLS. It also simplifies the notion of connecting "into" the cluster from the outside.
>
> The Erlang distribution protocol is quite the contrary to the typical cloud network though:
>
> * Assumes a mostly stable static network
> * Assumes a few static machines
> * Assigns names to everything, in a somewhat static way
> * Assumes you know every node "beforehand" in many situations
>
> I feel this is the impedance mismatch which is present. Hence my original pet-peeve: define the scope :)
>
> My own solution would definitely be "screw you, TLS, here is my own public key registry, vault, and libsodium/enacl :)"

Um, no.  Not that TLS is a panacea, but...


I think the larger question here is: why aren't you using TLS?

I will warn you in advance that "because we're using ZeroMQ" is a silly answer. This is at least the third vulnerability that has been found in your homebrew transport encryption, after the lack of a MAC and a timing attack. I hope you now realize that homebrewing your own transport encryption is a bad idea and you should seriously consider switching to TLS at this point to avoid future attacks.

-Fred

