Dead Simple Process Pooling

Posted: February 27th, 2008 | Author: kevin | Filed under: Erlang

The pool module provides a really simple, but useful, process pooling and load balancing mechanism. pool:start/1 is used to start worker Erlang nodes, and pool:pspawn/3 or pool:pspawn_link/3 is used to hand work items to the pool, which spawns each one on the node it expects to be least loaded.

The thing I really like about pool is that it handles code loading transparently: the worker nodes can access all the code the master node has available on its code path. This makes setting up a process pool even easier since there’s no per-worker configuration needed.

At the bottom of the post is the source for an Erlang module which uses pool to calculate Fibonacci numbers. Highly useful, I know. But sometimes the classic examples are the best :) A quick word of thanks to fellow Erlang Studio attendee Noah Thorp for his efficient fib implementation.

There are four easy steps to using the fibber module:

  1. Configure the master node for distributed Erlang and set up the required code paths like so: erl -sname foo -cookie erlang -pa .
  2. Call fibber:start/1 with the desired number of worker nodes.
    Example:

    (foo@perdido)1> fibber:start(3).
    Started worker3
    Started worker2
    Started worker1
    ok
    (foo@perdido)2>
    

    If you’re running a Unix-like operating system, you can use the following command to verify that the expected number of Erlang processes (the master plus one per worker) is running: ps -A | grep beam | grep -v "grep beam" | wc -l

  3. Start calculating Fibonacci numbers by calling fibber:calc/1:
    (foo@perdido)2> {Node, Result} = fibber:calc(15).
    {worker1@perdido,610}
    (foo@perdido)3> Node.
    worker1@perdido
    (foo@perdido)4> Result.
    610
    
  4. When you’re all done using fibber, call fibber:stop() to shut down the process pool.
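
One detail the steps above leave implicit: pool:start/1 finds the hosts on which to start workers in a .hosts.erlang file, which is looked up in the current directory, the home directory, or the Erlang root directory. The file lists one quoted host name per line, each terminated by a period. A minimal sketch for the single-machine session shown above, assuming the host is named perdido as in the shell prompts:

```erlang
'perdido'.
```

Pooling across several machines should then only be a matter of adding further lines in the same format, one per host.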

No other language I’ve ever used, including Java, Python, Ruby, or C++, has made thread/process pooling so easy or flexible.

-module(fibber).

-export([calc/1, calc/2, server_calc/2, start/1, stop/0]).

start(0) ->
	ok;

start(Workers) ->
	Name = list_to_atom("worker" ++ integer_to_list(Workers)),
	pool:start(Name),
	io:format("Started ~p~n", [Name]),
	start(Workers - 1).

stop() ->
	pool:stop().

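calc(N) ->
	%% Default to a 10 second timeout.
	fibber:calc(N, 10000).

calc(N, Timeout) ->
	%% Run server_calc/2 on a node chosen by pool; it replies to self().
	pool:pspawn(fibber, server_calc, [self(), N]),
	receive
		{reply, Result} ->
			Result;
		%% Catch-all left over from prototyping: reports any other message.
		Wtf ->
			io:format("Didn't expect ~p~n", [Wtf])
	after Timeout ->
		timeout
	end.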

server_calc(Caller, N) ->
	Result = fib(N),
	Caller ! {reply, {node(), Result}}.

fib(1) -> 1;
fib(2) -> 1;
fib(N) when N > 2 -> fib1(N,1,1).

fib1(3,P1,P2) -> P1 + P2;
fib1(N,P1,P2) ->
    fib1(N-1,P2, P1 + P2).
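
Because calc/2 above accepts any {reply, Result} tuple and also consumes unrelated messages, one possible variation is to tag each request with a unique reference and receive selectively. The following is only a sketch under stated assumptions, not part of the original module: the module name fibber_ref and the three-argument server_calc/3 are hypothetical, and it needs a running pool just like fibber does.

```erlang
-module(fibber_ref).

-export([calc/1, calc/2, server_calc/3]).

%% Same default timeout as fibber:calc/1.
calc(N) ->
    calc(N, 10000).

calc(N, Timeout) ->
    %% A fresh reference ties the reply to this particular request.
    Ref = make_ref(),
    pool:pspawn(?MODULE, server_calc, [self(), Ref, N]),
    receive
        %% Ref is already bound, so only the matching reply is taken;
        %% unrelated messages stay in the caller's mailbox.
        {reply, Ref, Result} ->
            Result
    after Timeout ->
        timeout
    end.

%% Runs on whichever pool node pspawn picked and replies to the caller.
server_calc(Caller, Ref, N) ->
    Caller ! {reply, Ref, {node(), fib(N)}}.

%% Same iterative Fibonacci as in fibber.
fib(1) -> 1;
fib(2) -> 1;
fib(N) when N > 2 -> fib1(N, 1, 1).

fib1(3, P1, P2) -> P1 + P2;
fib1(N, P1, P2) -> fib1(N - 1, P2, P1 + P2).
```

With this shape, a reply that arrives after the timeout simply stays in the mailbox and is never mistaken for the answer to a later request, although a longer-lived caller would probably want to flush such stale replies.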

  • @Dmitry - It's like you read my mind. I'm working on a post right now that illustrates how to write a few gen_servers and deploy them onto several Erlang nodes. I'm hoping to have it done in another day or two.
  • @zvi - What I was trying to do with this example was illustrate how to pool processes. To make the example easier to understand I kept it to a single machine. If you wanted to pool nodes on multiple machines all you need to do is edit your ~/.hosts.erlang file and enter additional host names. That said, I can see how pool on a single machine might be desirable, too. For example, if the work I'm pooling, in this case server_calc/2, is heavyweight then pool's load balancing could be helpful.
  • Zvi
    Kevin,

    Sorry, I didn't get the purpose of pooling server_calc processes.
    Unlike C++ and Java, where native heavyweight threads are used, Erlang uses user-level microthreads. My understanding is that the BEAM VM pools them automatically. I.e., in Java I would create a thread pool and fill it once with worker threads, ask the pool for a thread when I need one, and return it to the pool after use. In Erlang you just spawn a process and don't care about process lifecycle management.
    In my opinion your example only makes sense in 2 cases (apart from the obvious educational and demonstration purposes):
    1. Emulating SMP Erlang on platforms where it's not available (since each pooled server runs on a different node).
    2. Spawning more than max_processes will crash BEAM; using pool you can limit the overall number of processes per VM.

    I'm a newbie like yourself and am learning a lot from your blog.
    Thanks!
    Zvi
  • Dmitry
    @Kevin,

    Could you give an example of how to run an Erlang application on a cluster? It would be especially interesting to see how to run a gen_server application with pool. I can't find any working examples for this.
  • @zvi - I'm afraid my reply came out as too negative. What I was trying to say was that even in single-machine mode, the library is useful, especially for programmers coming from other languages. But your observations about the pool and slave modules are spot on. Hmmm, I smell a project coming on...
  • @zvi - it might not be that interesting to you but it sure is to me. Writing this kind of multithreaded/multiprocess producer/consumer in other languages is distinctly more painful and less fun.
  • @Justin - That's debugging/prototyping code left over from when I was writing this. I write this code on reflex when I'm prototyping. Normally I go back and rip it out after I'm happy with the code, but in this case, I didn't. I guess that's what happens when I write code at 5:00 AM :-)
  • Zvi
    Hi Kevin,
    your example is not so interesting.
    The real fun is when you are running pooled nodes on different hosts!
    By default Erlang supports only UNIX's rsh for starting remote slave nodes.
    I modified Erlang's slave module to launch remote nodes on Windows. And now I have a Single Image Cluster running on 32 CPUs (8 hosts).
    By replacing each erlang:spawn with pool:pspawn and optionally each lists:map with plists:map, you get the easiest single image distributed operating system in the world! It's a no-brainer!

    The problem with the pool module is that it's not easily customizable. I had to patch it and recompile, because the Windows command line is radically different from rsh.
    Also, the load balancing criterion is the length of the message queue. It should be pluggable. For example, I need to spawn processes close to the data or load balance them by the real CPU load.

    Maybe it's possible to extract pool.erl and slave.erl and convert them into the real thing?

    Zvi
  • Justin
    Why catch the Wtf message in calc/2? It seems antithetical to Erlang's "Fail hard" policy. Potentially your process has spawned some other computation in parallel before calculating the Fibonacci numbers, and these processes might be reporting back to the parent. Catching Wtf just eats up those other messages and potentially throws away useful information.

    I understand the Erlang-style to be: If you get a message you don't know what to do with, don't do anything with it. If it crashes, whoever called the function (or spawned the process) should be the one to take care of it.