Erlang Pattern: The Router

Posted: October 19th, 2009 | Author: kevin | Filed under: Erlang

There are three components to this pattern: the client, the router, and the target. The client begins the process by calling the router with a message destined for a target. The router uses the message, and possibly other metadata, to locate a desirable target. Then the router hands off the message to the selected target. The target replies directly to the client, which completes the process.

I’ve written this code a number of times in several different languages. What makes the Erlang implementation so nice is its brevity and simplicity. Using the gen_server behavior I can implement the core logic in just a few lines:

  1. The client calls the router with a message destined for a target:

    gen_server:call(?ROUTER, {invoke, Target, Msg})

  2. The router looks up the target and forwards the message on:

    handle_call({invoke, Target, Msg}, From, State) ->
        %% Target selection code goes here; it must bind TargetPidOrName
        gen_server:cast(TargetPidOrName, {From, Msg}),
        %% The router does not reply itself; the target answers the client
        {noreply, State};

  3. The target replies directly to the waiting client:

    handle_cast({Originator, Msg}, State) ->
        %% Server logic goes here; it must bind Reply
        gen_server:reply(Originator, Reply),
        {noreply, State};
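
To make step 3 concrete, here is a minimal sketch of a complete target. The module name `echo_target` and the `{echo, Msg}` reply are assumptions made up for illustration; the only part taken from the steps above is the `{Originator, Msg}` cast and the `gen_server:reply/2` call.

```erlang
-module(echo_target).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical target: started unregistered, addressed by pid.
start_link() ->
    gen_server:start_link(?MODULE, [], []).

init([]) ->
    {ok, no_state}.

%% In this sketch the target is only driven by casts from the router.
handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

%% Originator is the From value the router received in its handle_call/3.
handle_cast({Originator, Msg}, State) ->
    Reply = {echo, Msg},                  %% placeholder for real server logic
    gen_server:reply(Originator, Reply),  %% answers the blocked client directly
    {noreply, State};
handle_cast(_Other, State) ->
    {noreply, State}.
```

Because the reply goes straight to `Originator`, the router is free to handle the next request as soon as it has forwarded the cast.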

The pretty bit is that all this plumbing can be hidden away inside a function. So the original call to the router in the client winds up looking like this: router:invoke_target(Target, Msg). The rest happens behind the scenes and appears as just another gen_server call.
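
As a rough sketch of what that wrapper could look like, the module below puts steps 1 and 2 behind `router:invoke_target/2`. Everything other than the `{invoke, Target, Msg}` call and the `{From, Msg}` cast is an assumption: the map-based lookup, the hypothetical `register_target/2` helper, and the `{error, unknown_target}` reply stand in for whatever target selection a real router would use.

```erlang
-module(router).
-behaviour(gen_server).

-export([start_link/0, register_target/2, invoke_target/2]).
-export([init/1, handle_call/3, handle_cast/2]).

-define(ROUTER, ?MODULE).

start_link() ->
    gen_server:start_link({local, ?ROUTER}, ?MODULE, [], []).

%% Hypothetical helper: associate a target name with a pid.
register_target(Target, Pid) when is_pid(Pid) ->
    gen_server:call(?ROUTER, {register, Target, Pid}).

%% Client-facing call. Blocks until the target replies or the default
%% gen_server:call/2 timeout (5000 ms) expires.
invoke_target(Target, Msg) ->
    gen_server:call(?ROUTER, {invoke, Target, Msg}).

%% State is a map of Target => Pid (an assumption for this sketch).
init([]) ->
    {ok, #{}}.

handle_call({register, Target, Pid}, _From, Targets) ->
    {reply, ok, Targets#{Target => Pid}};
handle_call({invoke, Target, Msg}, From, Targets) ->
    case maps:find(Target, Targets) of
        {ok, Pid} ->
            %% Hand off From so the target can reply to the client directly.
            gen_server:cast(Pid, {From, Msg}),
            {noreply, Targets};
        error ->
            {reply, {error, unknown_target}, Targets}
    end.

handle_cast(_Msg, Targets) ->
    {noreply, Targets}.
```

From the client's side, `router:invoke_target(Target, Msg)` returns whatever the target passed to `gen_server:reply/2`, provided the registered process handles the `{Originator, Msg}` cast as in step 3.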

I’m pretty sure there are no bugs in using this approach. I’ve benchmarked a recent implementation of this pattern at over 6000 requests/sec without a hiccup.


  • @Dizzy - I'll have to think about this a bit. I knew going into this that we'd get bogus timeout messages but for what we were doing that was acceptable. We have fairly robust logging so between the data in the message and the log output we can debug timeouts.

    @Hunter - The lookup was a simple ETS table which contained some metadata about the service (interface version, acceptable messages, etc). We used monitors to determine when to remove a service and judicious use of pg2 to sync up information across the cluster.
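
A minimal single-node sketch of that kind of bookkeeping, assuming an ETS table keyed by service name and one monitor per registered process. The module name `target_registry`, the table layout, and the `register/2` and `lookup/1` functions are all hypothetical, and the pg2 cluster sync is left out entirely.

```erlang
-module(target_registry).
-behaviour(gen_server).

-export([start_link/0, register/2, lookup/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(TABLE, target_registry_tab).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Called by a service process to register itself with some metadata.
register(Name, Meta) ->
    gen_server:call(?MODULE, {register, Name, self(), Meta}).

%% Reads go straight to ETS and never block on the registry process.
lookup(Name) ->
    case ets:lookup(?TABLE, Name) of
        [{Name, Pid, _MRef, Meta}] -> {ok, Pid, Meta};
        [] -> error
    end.

init([]) ->
    ?TABLE = ets:new(?TABLE, [named_table, protected, set,
                              {read_concurrency, true}]),
    {ok, no_state}.

handle_call({register, Name, Pid, Meta}, _From, State) ->
    MRef = erlang:monitor(process, Pid),
    true = ets:insert(?TABLE, {Name, Pid, MRef, Meta}),
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% A monitored service died: drop every row that carries its monitor ref.
handle_info({'DOWN', MRef, process, _Pid, _Reason}, State) ->
    true = ets:match_delete(?TABLE, {'_', '_', MRef, '_'}),
    {noreply, State};
handle_info(_Other, State) ->
    {noreply, State}.
```

The target processes never have to announce their own deaths; the `'DOWN'` message from the monitor does that for them.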
  • I like this.

    What was the lookup mechanism in the recent implementation you talked about? I'm interested in how the "target" processes notify the lookup bookkeeping about their deaths.
  • Dizzy
    I'll admit that this pattern bugs me a bit. The root of my discomfort is the fact that you're doing a gen:call on the router process, which causes the caller to set up a monitor on that (router) process. However, what we REALLY want to monitor is the target process. After all, if the target crashes, we'll never know how/why -- we'll just get a timeout from the router, which is misleading to say the least.

    So there are two options, I suppose:
    1. Do a call to get the target PID, followed by a call to that process
    2. Do a cast to the router which in turn does a cast to the target

    The downside with 1 is increased latency (likely). However, you get more meaningful errors, which in my book, is worth a bit of latency. Option 2 makes it impossible to know if the router or target actually recv'd the message.

    A third option would be to split the difference like so:
    1. Set up monitor on router
    2. Cast request to router
    3. Recv status notif from router with target pid
    4. Set up monitor on target (delete router monitor)
    5. Recv completion notice from target

    This way you would know where the message fails, with slightly better concurrency. But you'd have to show me a pretty convincing chunk o' profiling to justify that over option 1. :)

    My $0.02. :)

    D.
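
For comparison, option 1 from the comment above (look up the pid, then call the target directly) might be sketched as follows. The `{lookup, Target}` request, the `{invoke, Msg}` request, the registered name `router`, and the module name `direct_client` are all assumptions made for illustration; neither the post nor the comment specifies them.

```erlang
-module(direct_client).
-export([invoke/2]).

%% Assumes (hypothetically) that a process registered as `router` answers
%% {lookup, Target} with {ok, Pid} | {error, Reason}, and that the target
%% handles {invoke, Msg} in its handle_call/3.
invoke(Target, Msg) ->
    case gen_server:call(router, {lookup, Target}) of
        {ok, Pid} ->
            try
                %% This call monitors the target itself, so a crash shows
                %% up with the target's exit reason, not a router timeout.
                {ok, gen_server:call(Pid, {invoke, Msg})}
            catch
                exit:{Reason, _CallInfo} ->
                    {error, {target_failed, Reason}}
            end;
        {error, _} = Error ->
            Error
    end.
```

The cost is the extra round trip to the router before the real call, which is the latency trade-off described in the comment.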