Melding Monads

2012 July 10

Announcing split-channel

Filed under: Uncategorized — lpsmith @ 11:44 pm

The split-channel package is a new library that is a small variation on Control.Concurrent.Chan. The most obvious change is that it splits the channel into sending and receiving ports. This has at least two advantages: first, it lets the type system constrain program behavior more finely, and second, a SendPort can have zero ReceivePorts associated with it, in which case messages written to the channel can be garbage collected.
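
As a rough illustration of the port split, here is a minimal sketch built only on MVar from base, using the same linked-list-of-MVars idea as Control.Concurrent.Chan. It is not the package's actual source: the names newPorts, Item, and demo are assumptions made for this example, and the real signatures may differ.

```haskell
import Control.Concurrent.MVar
import Control.Exception (mask_)
import Control.Monad (replicateM)

-- A stream is a linked list of MVars; the final MVar is empty (the "hole").
type Stream a = MVar (Item a)
data Item a = Item a (Stream a)

-- Illustrative types: each port holds a pointer into the shared stream.
newtype SendPort a    = SendPort    (MVar (Stream a))  -- points at the hole
newtype ReceivePort a = ReceivePort (MVar (Stream a))  -- points at the next unread cell

-- Create a connected pair of ports (hypothetical name).
newPorts :: IO (SendPort a, ReceivePort a)
newPorts = do
  hole <- newEmptyMVar
  w <- newMVar hole
  r <- newMVar hole
  return (SendPort w, ReceivePort r)

-- Create a SendPort with no ReceivePort at all.
newSendPort :: IO (SendPort a)
newSendPort = do
  hole <- newEmptyMVar
  SendPort <$> newMVar hole

-- Fill the current hole with the message and advance to a fresh hole.
send :: SendPort a -> a -> IO ()
send (SendPort w) x = do
  newHole <- newEmptyMVar
  mask_ $ do
    oldHole <- takeMVar w
    putMVar oldHole (Item x newHole)
    putMVar w newHole

-- Block until the next cell is filled, then advance past it.
receive :: ReceivePort a -> IO a
receive (ReceivePort r) =
  modifyMVar r $ \cell -> do
    Item x rest <- readMVar cell
    return (rest, x)

demo :: IO [Int]
demo = do
  (s, r) <- newPorts
  mapM_ (send s) [1, 2, 3]
  -- A port with no receivers: once the write pointer moves on, nothing
  -- references the filled cells, so they are eligible for collection.
  lonely <- newSendPort
  mapM_ (send lonely) [100, 200 :: Int]
  replicateM 3 (receive r)   -- [1,2,3]
```

In this sketch a function that is handed only a ReceivePort has no way to write to the channel, which is the kind of constraint the type split makes possible.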

This library started life last fall as part of my experiments in adding support for PostgreSQL’s asynchronous notifications to Chris Done’s native pgsql-simple library. The initial motivation was that if a notification arrived and nobody was listening, I wanted to be able to garbage collect it. However, the type advantages are what keep me coming back.

Beyond the primary change, this library has a number of other small improvements over Control.Concurrent.Chan: the deprecated thread-unsafe functions aren’t there, and several operators have been added or improved, most notably listen, sendMany, fold, and split.

  1. listen attaches a new ReceivePort to an existing SendPort. By contrast, Chan only provides the ability to duplicate an existing ReceivePort.

    Edit: I was mistaken: listen is essentially equivalent to dupChan, whereas duplicate is new.

  2. sendMany sends a list of messages atomically. It’s a better name than writeList2Chan, which is not atomic and is only a convenience function written in terms of send. However, writeList2Chan does work on infinite streams, whereas sendMany does not.

  3. fold is a generalization of getChanContents, potentially avoiding some data structures.

  4. split cuts an existing channel into two channels. It gives you back a new ReceivePort associated with the existing SendPort, and a new SendPort associated with the existing ReceivePorts. This is a more general operator than one I’ve used in a few places to transparently swap out backend services.

    Chan does not provide the split operator, though one could be added. However, I am skeptical that this is a good idea: it’s just a little too effect-ful for comfort. I think that putting a SendPort in an MVar tends to be a better idea than using split, even though it does introduce another layer of indirection.
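
To make the operators above more concrete, here is a minimal sketch of listen, sendMany, and split over an MVar-linked-list representation, using only base. The types, signatures, and helper names (newPorts, buildChain, demo) are assumptions for illustration and may differ from the split-channel API; fold is left out.

```haskell
import Control.Concurrent.MVar
import Control.Exception (mask_)
import Control.Monad (replicateM)

type Stream a = MVar (Item a)
data Item a = Item a (Stream a)

newtype SendPort a    = SendPort    (MVar (Stream a))
newtype ReceivePort a = ReceivePort (MVar (Stream a))

newPorts :: IO (SendPort a, ReceivePort a)
newPorts = do
  hole <- newEmptyMVar
  w <- newMVar hole
  r <- newMVar hole
  return (SendPort w, ReceivePort r)

send :: SendPort a -> a -> IO ()
send s x = sendMany s [x]

receive :: ReceivePort a -> IO a
receive (ReceivePort r) =
  modifyMVar r $ \cell -> do
    Item x rest <- readMVar cell
    return (rest, x)

-- Attach a new ReceivePort that sees only messages sent from now on.
listen :: SendPort a -> IO (ReceivePort a)
listen (SendPort w) = do
  hole <- readMVar w
  ReceivePort <$> newMVar hole

-- Build a private chain of already-filled cells ending at the given hole.
buildChain :: [a] -> Stream a -> IO (Stream a)
buildChain []     end = return end
buildChain (y:ys) end = do
  rest <- buildChain ys end
  newMVar (Item y rest)

-- Link the whole list into the stream in one step. The list is fully
-- traversed first, so this cannot work on an infinite list.
sendMany :: SendPort a -> [a] -> IO ()
sendMany _ [] = return ()
sendMany (SendPort w) (x:xs) = do
  newHole <- newEmptyMVar
  rest <- buildChain xs newHole
  mask_ $ do
    oldHole <- takeMVar w
    putMVar oldHole (Item x rest)
    putMVar w newHole

-- Redirect the existing SendPort to a fresh stream. Returns a new
-- ReceivePort for that stream and a new SendPort for the old stream,
-- which the existing ReceivePorts keep reading.
split :: SendPort a -> IO (ReceivePort a, SendPort a)
split (SendPort w) = do
  freshHole <- newEmptyMVar
  oldHole <- modifyMVar w (\h -> return (freshHole, h))
  r' <- newMVar freshHole
  s' <- newMVar oldHole
  return (ReceivePort r', SendPort s')

demo :: IO ([Int], [Int], [Int])
demo = do
  (s, r) <- newPorts
  send s 1              -- seen by r only
  r2 <- listen s        -- r2 starts after message 1
  sendMany s [2, 3]     -- seen by r and r2
  (r3, s') <- split s   -- r and r2 now follow s'; r3 follows s
  send s 10
  send s' 20
  a <- replicateM 4 (receive r)    -- [1,2,3,20]
  b <- replicateM 3 (receive r2)   -- [2,3,20]
  c <- replicateM 1 (receive r3)   -- [10]
  return (a, b, c)
```

After the split in demo, code that still holds s writes to a stream nobody else was reading before, while the original receivers are fed by s'. That hidden rerouting is the effect-ful behavior mentioned above.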

Finally, a few acknowledgements are in order: primarily, Control.Concurrent.Chan and its authors and contributors, and secondarily, Joey Adams for GHC Bug #5870, the fix of which has been incorporated into split-channel.


15 Comments »

  1. Is split’s documentation correct wrt the use of words “old” and “new”?

    Comment by Felipe Lessa — 2012 July 11 @ 7:26 am

    • I just checked it again, and it looks correct to me. It’s one of those things that is a bit more clear if you see a diagram, so here’s a quick attempt at one:

      [Diagram: split]

      Comment by lpsmith — 2012 July 11 @ 10:06 am

      • Ok, I get it now. “…and a new send port associated with the existing receiving ports [of the existing send port]” is what I didn’t understand before.

        Thanks! It would be nice to link this image from the docs, too.

        Comment by Felipe Lessa — 2012 July 11 @ 10:17 am

        • Yes, I was wanting to include such a diagram in the haddocks, but I didn’t take the time to figure out how.

          Comment by lpsmith — 2012 July 11 @ 10:23 am

  2. Nice! This is just the way (Levien’s) Io does Chan; this should fit right in, should I get back to my Ganymede project.

    Comment by bmeph — 2012 July 11 @ 9:33 am

    • I hadn’t heard of Levien’s Io before, thanks! Hope your project goes well.

      Comment by lpsmith — 2012 July 11 @ 10:39 am

  3. I can’t look at your lib right now since hackage appears to be down, but did you see my library ‘chan-split’? I’m interested in seeing what you’ve done (especially curious how you managed to let the GC collect chan receive ends)

    Comment by jberryman — 2012 July 12 @ 8:38 am

    • I did see your library, actually, but only after I’d released mine. The key is that I didn’t build a wrapper around Control.Concurrent.Chan, I reimplemented the idea. (It’s a very simple idea and implementation, by the way. Did you ever look at the source, or read the Concurrent Haskell paper?)

      The problem is that a Chan is basically a pair consisting of a SendPort and ReceivePort, so an active channel necessarily has a reference to at least one ReceivePort. Thus messages will build up in that channel if somebody is talking and nobody is listening. So I got rid of that pair and made a few other small improvements.

      You can take a look at the github repo if you are impatient.

      Comment by lpsmith — 2012 July 12 @ 9:51 am

      • Yes, I remember being surprised that Chans were implemented so simply with MVar. Your lib has quite different semantics and warrants a new implementation, but the naming clash is unfortunate. Do you have any interest in providing an instance for my SplitChan class?

        A couple of questions: doesn’t the existence of ‘listen’ weaken your advantage no. 1, since we can always get a receive port from a SendPort? And won’t it be difficult to use ‘listen’ effectively (i.e. without throwing messages “into the aether” non-deterministically) in real concurrent code?

        Comment by jberryman — 2012 July 12 @ 4:50 pm

        • Well, same semantics, just wrapped up in a slightly different interface. The module name clash between our packages is somewhat unfortunate, I agree, but I doubt that it’ll be a problem, and I don’t really want to add any dependencies other than base.

          Regarding listen and race conditions, I don’t think it’s going to be especially problematic compared to any other concurrent operator. I would reemphasize that listen is the equivalent of dupChan, so you can create the exact same problems as you can with Chan. And I think it enhances the potential advantage of having zero ReceivePorts, otherwise this situation is a lot less useful, in my opinion. I mean, the program that initially motivated this library required the use of listen.

          I would point out that the “zero receive ports” case is something of a special situation, and that the real advantage is the finer-grained types. I’ve only found one use for that situation, which is my fork of the pgsql-simple library, and that code was only in production for a couple of months before being replaced with the postgresql-simple library. And even then, the motivation was providing pgsql-simple with some desirable properties; the client of that library didn’t actually make use of those particular properties.

          If not for the fact that I really like having SendPorts separate from ReceivePorts, I probably wouldn’t still be using this library in other projects, and probably wouldn’t have released it on hackage.

          Comment by lpsmith — 2012 July 13 @ 1:57 pm

          • Ah, I think I was confusing things. I’ll have to take some time to wrap my bony head around the details and might steal your approach for my own library.

            Comment by jberryman — 2012 July 14 @ 3:27 pm

          • Do you have any benchmarks that you’re using to measure GC behavior vs Chan? In all of my simple tests of vanilla Chan (on GHC 7.4.1), unless optimizations are turned completely off, messages are successfully garbage collected as we write, when there are no more readers (i.e. no more references to the fst / read side of the Chan).

            I certainly see the benefit to making it easier for the GC to do its duties, but I’m wondering if there are any cases where you’ve seen your implementation have better behavior than straight Chan with optimizations on.

            Comment by jberryman — 2012 July 18 @ 6:34 pm

  4. [Re: jberryman’s question about GC benchmarks]

    No, I haven’t benchmarked. I find that surprising, but only a little bit. (I’m not surprised by being surprised at GHC optimizations.) It’s probably better to not rely on this particular optimization though, unless it’s a particularly well-understood and reliable optimization. I would expect it to be relatively easy to come up with use cases where the optimization doesn’t kick in, for reasons that may be less than immediately obvious.

    I’m very interested in seeing your benchmark, though.

    Comment by lpsmith — 2012 July 19 @ 12:26 am

    • Right, I think the relevant optimization is just inlining/unfolding where the compiler sees it doesn’t have to rebox the two MVars in the Chan constructor, but I’m not sure.

      The code I used was (sorry if this doesn’t get formatted correctly):

      module Main where

      import Control.Concurrent.Chan
      import Control.Concurrent.Chan.Split

      -- select the variant to run: main0 (Chan) or main1 (SendPort)
      main :: IO ()
      main = main1

      payload :: [Integer]
      payload = [1000000000000000 .. 1000000010000000]

      -- test traditional Chan implementation:
      main0 :: IO ()
      main0 = do
          putStrLn "Starting to write"
          s <- newChan
          mapM_ (writeChan s) payload
          putStrLn "Done"

      -- test split-channel: send with no ReceivePort attached:
      main1 :: IO ()
      main1 = do
          putStrLn "Starting to write"
          s <- newSendPort
          mapM_ (send s) payload
          putStrLn "Done"
      

      Comment by jberryman — 2012 July 19 @ 8:54 am

      • Yeah, I’m not too surprised that GHC manages to optimize Chan’s ReceivePort away in this simple benchmark. And it may well do so in more realistic use cases too. You may find it enlightening to study GHC’s core output, and probe some of the limitations of the set of optimizations that accomplishes this.

        Comment by lpsmith — 2012 July 19 @ 1:34 pm

