how to read from a TCP socket line by line

Daniel Solaz-2
Hello.

How does one read from a TCP socket line by line?  I've been looking
through the docs but found nothing.

Something like this would be nice:

{ok, Socket} = gen_tcp:connect(...),
Line = io:get_line(Socket, ...)

Maybe I should look at how the inets webserver parses HTTP headers
(which BTW is exactly what I want to do, except on the client side).

--
d s o l a z @ L E P I D O P T E R O . C O M



Luke Gorrie-3
Daniel Solaz <dsolaz> writes:

> Hello.
>
> How does one read from a TCP socket line by line?  I've been looking
> through the docs but found nothing.
>
> Something like this would be nice:
>
> {ok, Socket} = gen_tcp:connect(...),
> Line = io:get_line(Socket, ...)
>
> Maybe I should look at how the inets webserver parses HTTP headers
> (which BTW is exactly what I want to do, except on the client side).

You can use the [{packet, line}] option in gen_tcp:connect/3.

For example:

  1> gen_tcp:connect("mail", 25, [{packet, line}, {active, true}]).
  {ok,#Port<0.8>}
  2> flush().
  Shell got {tcp,#Port<0.8>,
                 "220 sharks.alteon.com ESMTP Sendmail 8.11.3/8.11.3; Tue, 14 Aug 2001 06:30:31 -0700 (PDT)\r\n"}
  ok

I think this was added somewhat recently, but it looks like it's there
in R7B.
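
In a program rather than the shell, the same message can be picked up with
a receive. The following is only a minimal sketch: the module name
line_active, the function read_line/3 and its Timeout argument are made up
for illustration, and the socket is assumed to be in the default list mode.

```erlang
-module(line_active).
-export([read_line/3]).

%% Connect with {packet, line} in active mode and wait for one line.
%% Host, Port and Timeout (in milliseconds) are supplied by the caller.
read_line(Host, Port, Timeout) ->
    {ok, Socket} = gen_tcp:connect(Host, Port,
                                   [{packet, line}, {active, true}]),
    Result = receive
                 %% In active mode each line arrives as a message.
                 {tcp, Socket, Line} -> {ok, Line};
                 {tcp_closed, Socket} -> {error, closed};
                 {tcp_error, Socket, Reason} -> {error, Reason}
             after Timeout ->
                 {error, timeout}
             end,
    gen_tcp:close(Socket),
    Result.
```

The calling process must be the one that opened the socket, since active
mode delivers the messages to the socket's owner.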

Cheers,
Luke



snickl
In reply to this post by Daniel Solaz-2
On Tue, 14 Aug 2001, Daniel Solaz wrote:

> Maybe I should look at how the inets webserver parses HTTP headers
> (which BTW is exactly what I want to do, except on the client side).

In this case, you should check out
http://www.erlang.org/contrib/www_tools-1.0.tgz

CU, SN
------
The UNIX equivalent of the "Blue Screen of Death" would be called
"kernel panic". It obviously exists, since I have heard and read about it,
but I've never been witness to it in my professional career.
        John Kirch,
        Networking Consultant and Microsoft Certified Professional (Windows NT)



Daniel Solaz-2
In reply to this post by Luke Gorrie-3
On Tuesday 14 August 2001 15:33 Luke Gorrie wrote:
> You can use the [{packet, line}] option in gen_tcp:connect/3.
> ...
> I think this was added somewhat recently, but it looks like it's
> there in R7B.

Ah yes, now I see it in the docs, but they say:
"line mode, a packet is a line terminated with newline, lines longer
than the receive buffer are truncated"

What does the last part mean?  If "truncated" means what I think it
means, how do I ensure my buffer is large enough when reading binary
data?

Note that after reading the HTTP response header line by line I should
switch to buffer mode (using gen_tcp:recv/3), since the response body
may be binary data.

In Modula-3 I've solved this using "read one line" and "read one
buffer" methods on the same reader.  But in Erlang it seems I either
open the socket in active mode and get lines as messages, or open the
socket in passive mode and get buffers.

--
d s o l a z @ L E P I D O P T E R O . C O M



Luke Gorrie-3
Daniel Solaz <dsolaz> writes:

I got tips and tricks for these, courtesy of Tony Rogvall:

> Ah yes, now I see it in the docs, but they say:
> "line mode, a packet is a line terminated with newline, lines longer
> than the receive buffer are truncated"
>
> What does the last part mean?  If "truncated" means what I think it
> means, how do I ensure my buffer is large enough when reading binary
> data?

You can use the {buffer, ByteSize} option when opening the socket to
say how long the buffer should be. If a line is longer than that, then
you'll get it in several parts (only the last containing a
newline). So for long lines you may need to concatenate the parts
yourself. I'm not sure whether there's a practical limit on the socket
buffer size.

> Note that after reading the HTTP response header line by line I should
> switch to buffer mode (using gen_tcp:recv/3), since the response body
> may be binary data.
>
> In Modula-3 I've solved this using "read one line" and "read one
> buffer" methods on the same reader.  But in Erlang it seems I either
> open the socket in active mode and get lines as messages, or open the
> socket in passive mode and get buffers.

You can do packet-based reads both as messages and with
gen_tcp:recv. The trick with recv is to have the socket in packet
mode, and pass a length of 0.

Here's an example, using a buffer size too small to get the whole
line:

  13> {ok, S} = gen_tcp:connect("mail.bluetail.com", 25, [{active, false}, {packet, line}, {buffer, 40}]).
  {ok,#Port<0.10>}
  14> gen_tcp:recv(S, 0).
  {ok,"220 mail.bluetail.com ESMTP BLUETAIL Mai"}
  15> gen_tcp:recv(S, 0).
  {ok,"l Robustifier (2.2.2/3.1.2); Wed, 15 Aug"}
  16> gen_tcp:recv(S, 0).
  {ok," 2001 17:16:59 +0200\r\n"}
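
Gluing such parts back together could look roughly like this. It is a
sketch, not a tested library routine: the module line_recv and the function
recv_line/1 are hypothetical names, and it assumes a passive socket opened
with {packet, line} in list mode that behaves as in the session above.

```erlang
-module(line_recv).
-export([recv_line/1]).

%% Read one complete line from a passive {packet, line} socket.
%% Parts are accumulated until one of them ends in a newline.
recv_line(Socket) ->
    recv_line(Socket, []).

recv_line(Socket, Acc) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Part} ->
            case lists:suffix("\n", Part) of
                %% Last part of the line: join everything in order.
                true  -> {ok, lists:append(lists:reverse([Part | Acc]))};
                %% Line was longer than the buffer: keep reading.
                false -> recv_line(Socket, [Part | Acc])
            end;
        {error, Reason} ->
            %% Parts read so far are dropped if the peer closes mid-line.
            {error, Reason}
    end.
```

Note that nothing here bounds the length of a line, so a peer that never
sends a newline makes the accumulator grow without limit.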

When you've finished reading the headers, you can take the socket out
of line mode with:

  inet:setopts(S, [{packet, raw}])

Then you can read the body.
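
Put together, reading headers and then a body might be sketched as below.
The module http_read and both function names are invented for the example.
It assumes a passive socket opened with {packet, line} in list mode, header
lines that fit in the receive buffer, and a body Length greater than zero
that the caller has already worked out (for instance from a Content-Length
header).

```erlang
-module(http_read).
-export([read_headers/1, read_body/2]).

%% Collect header lines until the empty line that ends the header block.
read_headers(Socket) ->
    read_headers(Socket, []).

read_headers(Socket, Acc) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, "\r\n"}    -> {ok, lists:reverse(Acc)};
        {ok, "\n"}      -> {ok, lists:reverse(Acc)};
        {ok, Line}      -> read_headers(Socket, [Line | Acc]);
        {error, Reason} -> {error, Reason}
    end.

%% Leave line mode, then read exactly Length bytes of body.
%% Length must be greater than zero: a length of 0 means
%% "whatever is available" rather than "nothing".
read_body(Socket, Length) ->
    ok = inet:setopts(Socket, [{packet, raw}]),
    gen_tcp:recv(Socket, Length).
```

If header lines can exceed the buffer, the line-joining has to happen
before the comparison with "\r\n", otherwise a trailing fragment could be
mistaken for the empty line.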

There are also some nifty tricks in the inet module:

You can find out the default/current settings for a socket, e.g.:

  32> {ok, S} = gen_tcp:connect("mail.bluetail.com", 25, []).
  {ok,#Port<0.14>}        
  33> inet:getopts(S, [buffer, active, packet]).            
  {ok,[{buffer,1024},{active,true},{packet,0}]}

If you want to see all options, then you can call inet:options() to
get the complete list and pass that in for the second argument of
getopts/2.

Another nice thing is inet:i(), which is like a "netstat" for Erlang,
showing all open sockets:

  35> inet:i().
  Port Module   Recv Sent Owner    Local Address       Foreign Address      State    
  13   inet_tcp 0    0    <0.54.0> 192.168.128.43:4371 192.168.128.251:smtp CONNECTED
  14   inet_tcp 102  0    <0.54.0> 192.168.128.43:4373 192.168.128.251:smtp CONNECTED
  Port Module Recv Sent Owner Local Address Foreign Address State
  ok