File versioning

File versioning

Bengt Kleberg-3

> Date: Wed, 7 May 2003 09:58:13 +0200 (MEST)
> From: Bengt Kleberg <eleberg>

...deleted

> see http://plan9.bell-labs.com/sys/doc/venti/venti.html for a
> write-only file server.

it has been pointed out to me that write-only is wrong terminology.
it should be called a write-once file server. (it is possible to write
more than once to it, so possibly this name too is wrong. however, i
have now copied the name from the original paper, not using my faulty
memory of it)
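
A minimal sketch of what "write-once" can mean in a Venti-style store, assuming (as the Venti paper describes) that each block is addressed by the SHA-1 hash of its contents. The class and method names are made up for illustration and are not Venti's API.

```python
import hashlib


class WriteOnceStore:
    """Toy in-memory block store: blocks are keyed by the SHA-1 of their contents."""

    def __init__(self):
        self._blocks = {}

    def write(self, data: bytes) -> str:
        # The address is derived from the data, so a block can never be
        # overwritten with different contents; writing it again is a no-op.
        score = hashlib.sha1(data).hexdigest()
        self._blocks.setdefault(score, data)
        return score

    def read(self, score: str) -> bytes:
        return self._blocks[score]
```

In this toy model you can call write() as often as you like, but no call can change what an existing address returns.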


bengt




Erlang filesystem (was: Re: File versioning)

Chris Pressey
On Wed, 7 May 2003 10:18:00 +0200 (MEST)
Bengt Kleberg <eleberg> wrote:

> > see http://plan9.bell-labs.com/sys/doc/venti/venti.html for a
> > write-only file server.
>
> it has been pointed out to me that write-only is wrong terminology.

http://www.ganssle.com/misc/wom.html

:)

Seriously, it's a good idea.

Vlad wrote:
> In an Erlang environment, we could modify the file server to access
> some kind of repository for some parts of the namespace, which could
> be a database or a CVS repository or something like that. This way
> everything looks like normal files, but they aren't. Maybe Luke's nfs
> server could also be a part of this too...

Everything looks like normal files - in an Erlang environment.  That
strikes me as the biggest drawback, at least for the sake of argument.
I'd like to be able to use it from other programs with a minimum of
fuss.  I can think of a few ways around it:

1- make Erlang's filesystem mountable as an OS-level filesystem
2- make Erlang an OS
3- have some way to painlessly communicate between external, non-Erlang
   programs, and a running Erlang node

I get the impression that #1, writing a mountable filesystem 'driver',
is tricky even in C.  I could be wrong though.

I have no problem with #2 since Erlang already has many of the
characteristics of a good OS and since I've abandoned the adage of "the
right language for the task" after realizing that Java - no, Haskell -
no, *Erlang* is clearly God's Own Language.  Also, re-writing in Erlang
all the applications I use daily would give me something interesting to
do.  :)

But #3 is probably the most feasible, and useful for a number of other
things too.  I've tried using a named pipe as a port, and I've tried
ad-hoc connections to an Erlang distribution, but neither seems reliable
enough for real use.  That pretty much leaves sockets - probably inet
sockets since Unix-domain sockets aren't as well tested, and inet
sockets would let you connect across the network & can be secured if
need be.

So I envision doing stuff like the following from the command line:

  ls | grep foo | sort -r | store-in-erlfs-as bar.txt

where 'store-in-erlfs-as' is a little program which opens a socket and
talks to the running Erlang distribution that is handling the Erlang
filesystem.  It could maybe even be a shell script that reads a
temporary file to find out which machine and port to connect to, then
uses 'tcpcat' or similar to connect to that socket and send Erlang its
input.

Naturally there'd also be a corresponding 'fetch-from-erlfs' program.
With shorter names probably, like 'erlfsstore' & 'erlfsfetch', but you
get the idea.
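
A rough sketch of what such a "store" client might look like, written in Python rather than shell. Everything specific here is an assumption for illustration: the location file format (one line, `host port`), the wire format (a `store <name> <length>` header line followed by the raw bytes), and the function names. The Erlang side would have to speak the same made-up protocol.

```python
import socket


def read_location(path):
    """Parse a 'host port' line from the (hypothetical) location file."""
    with open(path) as f:
        host, port = f.read().split()
    return host, int(port)


def store(host, port, name, data):
    """Send one 'store' request: a header line, then the raw bytes."""
    header = "store {} {}\n".format(name, len(data)).encode("utf-8")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(header + data)
```

To act as the end of a pipeline, a wrapper script would call `store(host, port, "bar.txt", sys.stdin.buffer.read())` with the host and port taken from `read_location`.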

On a vaguely related note... someone recently asked on one of the
FreeBSD lists if there was any realtime file replication software
available.  rsync and so forth were mentioned, but apparently there
isn't any that is really 'realtime', and what's more it sounds like a
'hard' problem.  Maybe an Erlang filesystem could fit the bill here?

-Chris



Erlang filesystem (was: Re: File versioning)

Scott Lystig Fritchie-3
>>>>> "cp" == Chris Pressey <cpressey> writes:

cp> 3- have some way to painlessly communicate
cp> between external, non-Erlang programs, and a running Erlang node

<reflection mode="greybeard">

Back in the days when I worked at Sendmail...

... we did such a thing.  It used TCP (and UNIX domain sockets, later)
and Sun RPC.  Client applications spoke RPC to an Erlang I/O daemon
(running on the same box).  The Erlang I/O daemon "knew" where files
were really stored: their "real" path(s) on an NFS server(s)
somewhere.  It knew if the file was replicated or not.  Though we
never finished it, we had an icky draft of code that could
transparently migrate files to new paths and/or new NFS servers in the
event that the mapping of files -> {replication-mode, [NFS servers]}
changed.

We stole open(2), read(2), write(2), close(2), chdir(2), unlink(2)
... everything file I/O related.  We could tell the difference between
I/O-daemon'ified file descriptors (e.g., referring to files with
paths in the namespace below "/bams", "/var/spool/mqueue", or
whatever) and local file system descriptors.
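
A small sketch of the path test that kind of interception needs, assuming the decision is made purely on path prefix. The two prefixes are the examples from the post; the function name and the normalization step are assumptions, not the BAMS code.

```python
import posixpath

MANAGED_PREFIXES = ("/bams", "/var/spool/mqueue")  # examples taken from the post


def is_managed(path, prefixes=MANAGED_PREFIXES):
    """True if the normalized absolute path is at or below a managed prefix."""
    norm = posixpath.normpath(path)
    return any(norm == p or norm.startswith(p + "/") for p in prefixes)
```

A wrapped open() would consult this once and remember the answer per file descriptor, so that read(), write() and close() can be routed without looking at the path again.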

We even stole flock(2) and implemented a leasing scheme that was
*reliable* when talking to an NFSv3 file server.  Heh.
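
The post gives no details of that leasing scheme, so the following is only the generic idea of a lease: a lock that expires unless its holder renews it, so a crashed client cannot hold it forever. The class, its methods and the injectable clock are all hypothetical.

```python
import time


class LeaseTable:
    """Toy lease manager: a lock is held only until its lease expires."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._leases = {}  # path -> (owner, expiry time)

    def acquire(self, path, owner):
        """Grant or renew the lease if the path is free, expired, or already ours."""
        now = self.clock()
        holder = self._leases.get(path)
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False
        self._leases[path] = (owner, now + self.ttl)
        return True

    def release(self, path, owner):
        """Drop the lease, but only if this owner still holds it."""
        holder = self._leases.get(path)
        if holder is not None and holder[0] == owner:
            del self._leases[path]
```

In a design like this the table would live in the single I/O daemon, which is what lets it arbitrate between clients regardless of what the file server's own locking does.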

The only thing that we couldn't easily do is mmap(2).  We stole it,
but the implementation was easy: int bams_mmap(...) { return EINVAL; }

BAMS'izing an application:

1. Check its use of mmap() first.
2. Link with BAMS library.
3. Run.

Worked for "sendmail", GNU "tar" and "ls".  I think I did "qmail" or
"postfix", but details are foggy now.  Oh, yeah, "beam" too.  :-)

Then I created an LD_PRELOAD'able shared library, making step #2
optional.

Ah, those were the days when we wrote code in log cabins with
wood-fired generators....

</reflection>

There are several hacks these days that do file system type stuff in
user space.  Gnome VFS and later versions of Tcl come immediately to
mind.  But they and other efforts like them don't have all that much
in common.  Except for perhaps Plastic File System.  It'd be great to
glue them together.  Mebbe I should do that.  {sigh}

-Scott



Erlang filesystem (was: Re: File versioning)

Luke Gorrie-3
In reply to this post by Chris Pressey
Chris Pressey <cpressey> writes:

> Everything looks like normal files - in an Erlang environment.  That
> strikes me as the biggest drawback, at least for the sake of argument.
> I'd like to be able to use it from other programs with a minimum of
> fuss.  I can think of a few ways around it:
>
> 1- make Erlang's filesystem mountable as an OS-level filesystem

You can do this with my Erlang NFS server. It's a quick hack, very
non-robust and unfeatureful etc. It's in the Jungerl if you want to
play around - lib/enfs/.

I have a "/proc" filesystem included. With that you can grep process
dictionaries :-)

Cheers,
Luke




Erlang filesystem (a bit long) (was: Re: File versioning)

Chris Pressey
What, no takers for #2?  I'm disappointed in you guys :)

On 08 May 2003 06:05:05 +0200
Luke Gorrie <luke> wrote:
> You can do this with my Erlang NFS server. It's a quick hack, very
> non-robust and unfeatureful etc. It's in the Jungerl if you want to
> play around - lib/enfs/.

And Scott wrote:
> ... we did such a thing.  It used TCP (and UNIX domain sockets, later)
> and Sun RPC.  Client applications spoke RPC to an Erlang I/O daemon
> (running on the same box).  The Erlang I/O daemon "knew" where files
> were really stored: their "real" path(s) on an NFS server(s)
> somewhere. [...] We even stole flock(2) and implemented a leasing
> scheme that was *reliable* when talking to an NFSv3 file server.  Heh.

As cool as both of these sound, well... to paraphrase Bengt -

I could be blinded by the thousands of horror stories and the
unfortunate nickname "Network Failure System" given to it by a guru I
highly respect, but I believe that sticking your data on 8" floppy disks
and walking it from one computer to the other is superior to NFS in
every way :)

OK, I'm a *little* biased.  I haven't used it in ages, and apparently
NFSv4 has reliable locking semantics, etc.  But still I have the vague
feeling it's not quite what I'm looking for - I'd rather have a
replicated file system, like an "auto-mirror" perhaps, that propagates
updates exactly when it needs to.

That doesn't mean, of course, that I won't look into enfs; producing a
mountable filesystem from Erlang doesn't have anything inherently to do
with NFS, and it would be the most seamless way to go (and just plain
cool besides.)  But it too has the drawback that porting it outside of
Unix would probably be a royal pain.

(For this reason, for #3 to be portable, shell scripts, tcpcat, and
unix-domain sockets are out; Perl + inet sockets looks like a better
choice - maybe C if startup-time is critical (The Erlang runtime's
relatively long startup-time is what started me thinking about this
approach in the first place.))

#3 also doesn't have anything inherently to do with NFS or any other
filesystem; what would be most useful would be a 'tellerl' command that
simply sends a message (+ input data) to the Erlang node and waits for
a response (+ output data.)  Sort of an RPC-like mechanism, except of
course it's probably more productive to think of it in terms of
messages rather than procedure calls.  'erlfsstore' et al could be built
on top of 'tellerl'.
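
A sketch of the message-plus-data exchange described above, seen from the client side. The framing is an assumption: each frame is a 4-byte big-endian length followed by that many bytes, which is the same shape Erlang's `{packet, 4}` socket option uses. The function names and the two-frame request layout are made up for illustration and say nothing about how any real 'tellerl' works.

```python
import socket
import struct


def send_frame(sock, payload):
    """Write one frame: 4-byte big-endian length, then the payload bytes."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf += chunk
    return buf


def recv_frame(sock):
    """Read one length-prefixed frame and return its payload."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def tell(host, port, message, data=b""):
    """Send a message frame plus a data frame, then wait for one reply frame."""
    with socket.create_connection((host, port)) as sock:
        send_frame(sock, message)
        send_frame(sock, data)
        return recv_frame(sock)
```

Length-prefixed frames keep the input data opaque, so nothing piped in can be mistaken for protocol.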

There is also at least one other option I missed:

#4 - have the Erlang node watch the OS's filesystem and react to changes
     in it.

Different operating systems do this differently, though - FreeBSD has
kqueues, WinNT has a notification service, and I don't know what Linux
has - so while this approach has a certain elegance, it too would be
difficult to make portable, unless it resorted to polling on unknown
OS'es, which would seriously detract from the elegance.

This too would be a nifty thing to have outside of the context of having
it implement a filesystem.
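
For the polling fallback on "unknown" operating systems, a sketch of the least elegant version: take a snapshot of the tree, take another one later, and compare. The function names are hypothetical, and a real watcher would use kqueue or the platform's notification service where available.

```python
import os


def snapshot(root):
    """Map each file path under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff(old, new):
    """Return (created, deleted, modified) path lists between two snapshots."""
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, deleted, modified
```

A driver loop would sleep between snapshots and forward each non-empty diff to the Erlang node. The cost grows with the size of the tree on every pass, which is exactly the inelegance complained about above.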

Speaking of kqueues - I searched the archives and about a year ago, Per
announced a 'devpoll' patch which employed them - but it seems like it
was more for the purposes of getting better performance from sockets &
other files, than for the purposes of watching and reacting to the
filesystem.

Which brings me to ask if anyone has interfaced Erlang with kqueue in a
general fashion, for example as a port from which you can receive
messages when kevents happen.  And/or the same concept for other
operating systems.

This is definitely beyond my fu, but in a pinch I could probably put
together something that uses os:cmd("wait_on " ++ ...).  (Shelling out
to an executable is not as efficient, certainly, but still far more
efficient than polling :)

-Chris



Erlang filesystem (a bit long) (was: Re: File versioning)

Scott Lystig Fritchie-3
Too off-topic to continue on the list....

>>>>> "cp" == Chris Pressey <cpressey> writes:

cp> I could be blinded by the thousands of horror stories and the
cp> unfortunate nickname "Network Failure System" given to it by a
cp> guru I highly respect, but I believe that sticking your data on 8"
cp> floppy disks and walking it from one computer to the other is
cp> superior to NFS in every way :)

Ah, you're confusing NFS client implementation with NFS server
implementation.

If you control the NFS client implementation, you control how it
reacts in the event of NFS server or network failure.  There's nothing
that says that NFS must be implemented by the OS kernel.  :-)

-Scott



Erlang filesystem (a bit long) (was: Re: File versioning)

Tony Rogvall-3
In reply to this post by Luke Gorrie-3
Chris Pressey wrote:
> What, no takers for #2?  I'm disappointed in you guys :)
>

Well Erlux is nearly #2 I guess, still some work to do.

The first erlux will be a kernel module running a kernel daemon (kerl),
this actually works! I have the ring0 executing in the emulator,
loading code etc. Things remaining:

- rewrite some code to fit the 2.4.18xx (redhat kernels). Now it's a
  2.4.20 only kernel module.
- fix floating point issues.
- make sure the emulator runs on a bounded stack (SMALL)
  This was done in the multi-threaded erlang! (someone preserved the
  code????? I guess not :-( )
- Think of nice ways of handling input from console (kernel module)

All basic drivers are working, even the dll driver (uses kernel modules
to implement loadable drivers :-)

Near enough ?

/Tony



Erlang filesystem (a bit long) (was: Re: File versioning)

Scott Lystig Fritchie-3
Trying to send a message *to* the list this time.  {sigh}

>>>>> "tr" == Tony Rogvall <tony> writes:

tr> Well Erlux is nearly #2 I guess, still some work to do.

Out of curiosity, how much does "Kernel Mode Linux" get you toward
your goals?  http://www.yl.is.s.u-tokyo.ac.jp/~tosh/kml/

"Kernel Mode Linux is a technology which enables us to execute user
programs in a kernel mode. In Kernel Mode Linux, user programs can be
executed as user processes that have the privilege level of a kernel
mode. The benefit of executing user programs in a kernel mode is that
the user programs can access a kernel address space directly. So, for
example, user programs can invoke system calls very fast because it is
unnecessary to switch between a kernel mode and a user mode by using
costly software interruptions or context switches. Unlike kernel
modules, user programs are executed as ordinary processes (except for
their privilege level), so scheduling and paging are performed as
usual."

This sort of thing is just sick and wrong and against the laws of
Kentucky and Tennessee(*) and ... darnit ... cool.

-Scott

(*) Disclaimer: I am not a lawyer.



Erlang filesystem (a bit long) (was: Re: File versioning)

Chris Pressey
In reply to this post by Tony Rogvall-3
On Thu, 08 May 2003 22:18:06 +0200
Tony Rogvall <tony> wrote:

> Chris Pressey wrote:
> > What, no takers for #2?  I'm disappointed in you guys :)
> >
>
> Well Erlux is nearly #2 I guess, still some work to do.
>
> The first erlux will be a kernel module running a kernel daemon (kerl),
> this actually works! I have the ring0 executing in the emulator,
> loading code etc. Things remaining:
>
> - rewrite some code to fit the 2.4.18xx (redhat kernels). Now it's a
>   2.4.20 only kernel module.
> - fix floating point issues.
> - make sure the emulator runs on a bounded stack (SMALL)
>   This was done in the multi-threaded erlang! (someone preserved the
>   code????? I guess not :-( )
> - Think of nice ways of handling input from console (kernel module)
>
> All basic drivers are working, even the dll driver (uses kernel modules
> to implement loadable drivers :-)
>
> Near enough ?
>
> /Tony

!!!!!  Dangit, the Erlang community never ceases to amaze me.

Downright inspired me to prove that I wasn't just all talk, too -
I sat myself down and implemented #3.  The README should explain it all:

  http://www.catseye.mb.ca/projects/tellerl-2003.0508/

-Chris



Erlang filesystem (a bit long) (was: Re: File versioning)

Luke Gorrie-3
Chris Pressey <cpressey> writes:

> Downright inspired me to prove that I wasn't just all talk, too -
> I sat myself down and implemented #3.  The README should explain it all:
>
>   http://www.catseye.mb.ca/projects/tellerl-2003.0508/

Do you know about the erl_call example program included with
erl_interface? It is actually incredibly cool, but I never remember to
use it..

Of course, a real man would have no hesitation in doing it like this:

  #!/bin/sh
  # Call <mod>:<func>(<args>) on <node> via Distel, from a batch-mode Emacs.
  ERLANG_MODE_DIR="/home/luke/elisp"
  DISTEL_DIR="/home/luke/hacking/distel/elisp"
  if [ $# -ne 4 ]; then
      echo "Usage: $0 <node> <mod> <func> <argslist (in emacs lisp syntax)>"
      exit 1
  fi
  emacs --batch \
        -L "${ERLANG_MODE_DIR}" \
        -L "${DISTEL_DIR}" \
        --eval "
  (progn
    (defun message (&rest ignore) nil)   ; silence messages
    (require 'distel)
    (setq worker-pid
          (erl-rpc (lambda (x)
                     (princ (format \"%s\\n\" x) t))
                   '()
                   '$1 '$2 '$3 '$4))
    (while (erl-local-pid-alive-p worker-pid)
      (accept-process-output)))
  "

And then:

  $ ./tellerl x lists reverse "((x y z))"
  (z y x)

Cheers,
Luke




Erlang filesystem (a bit long) (was: Re: File versioning)

Luke Gorrie-3
Luke Gorrie <luke> writes:

> Of course, a real man would have no hesitation in doing it like this:

Sorry, bad joke.

Of course what I meant was:

  #!/bin/sh
  # Evaluate an Erlang expression on <node> via Distel, from a batch-mode Emacs.
  ERLANG_MODE_DIR="/home/luke/elisp"
  DISTEL_DIR="/home/luke/hacking/distel/elisp"
  if [ $# -lt 2 ]; then
      echo "Usage: $0 <node> <expr>..."
      exit 1
  fi
  node=$1; shift
  expr="$*"   # join the remaining arguments into one expression string
  emacs --batch \
        -L "${ERLANG_MODE_DIR}" \
        -L "${DISTEL_DIR}" \
        --eval "
  (progn
    (require 'distel)
    (setq erl-nodeup-hook nil)  ; suppress 'nodeup' message
    (let ((pid (erl-eval-expression '${node} \"${expr}\")))
      (while (erl-local-pid-alive-p pid)
        (accept-process-output))))"

So you can just say:

  $ ./tellerl x "{code_path_size, length(code:get_path())}."
  {code_path_size,44}

But then, "C-c C-d :" is marginally fewer keystrokes than "tellerl" ;-)

-Luke




erl_call (was Re: Erlang filesystem)

Chris Pressey
In reply to this post by Luke Gorrie-3
On 09 May 2003 05:59:15 +0200
Luke Gorrie <luke> wrote:

> Chris Pressey <cpressey> writes:
>
> > Downright inspired me to prove that I wasn't just all talk, too -
> > I sat myself down and implemented #3.  The README should explain it
> > all:
> >
> >   http://www.catseye.mb.ca/projects/tellerl-2003.0508/
>
> Do you know about the erl_call example program included with
> erl_interface? It is actually incredibly cool, but I never remember to
> use it..

Had heard of it, hadn't played with it until now.  No mention of it in
the erl_interface docs that I could see.

It doesn't seem to work for me with short node names, only long ones.

In contrast, tellerl doesn't require any distribution mechanism, just
that tellerl.beam is loaded and started.

tellerl also doesn't allow arbitrary functions to be evaluated.
(This is a *good* thing in my book, much like avoiding export_all is a
good thing... I have enough ComplexityShockHorror in my life as it is)

tellerl.pl also acts as a data pipe destination, which is pretty much a
requirement for things launched from /etc/aliases or a .qmail-* file.
Getting erl_call to do this would be, well, possible, but pretty bodgy
(first pipe the data to something that wraps it in an Erlang expression
then pipe it to erl_call -e ...)
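
For what that wrapping step might look like, here is a small sketch that turns arbitrary piped bytes into the text of an Erlang call with a binary literal argument. The function names are made up, and so are the module and function in the usage line; nothing here is part of erl_call or tellerl.

```python
def erlang_binary_literal(data):
    """Render bytes as an Erlang binary literal such as <<104,105>>."""
    return "<<" + ",".join(str(b) for b in data) + ">>"


def wrap_as_call(module, function, data):
    """Build 'Module:Function(<<...>>).' as text for an expression evaluator."""
    return "{}:{}({}).".format(module, function, erlang_binary_literal(data))
```

For example, `wrap_as_call("my_store", "put", b"hi")` yields `my_store:put(<<104,105>>).`, which is the sort of text one would then pipe on. Writing each byte as a decimal number sidesteps quoting problems at the price of roughly tripling the size of the input.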

So, yes, I'm marginally aware that I'm reinventing the wheel, but
dangit, not all wheels are the same.

-Chris