Melding Monads

2015 February 12

Announcing blaze-builder-0.4

Filed under: Uncategorized — lpsmith @ 1:42 pm

After a recent chat with Simon Meier, we decided that I would take over the maintenance of the exceedingly popular blaze-builder package.

Of course, this package has been largely superseded by the new builder shipped inside bytestring itself. The point of this new release is to offer a smooth migration path from the old to the new.

If you have a package that only uses the public interface of the old blaze-builder, all you should have to do is compile it against blaze-builder-0.4 and you will in fact be using the new builder. If your program fails to compile against the old public interface, or there’s any change in the semantics of your program, then please file a bug against my blaze-builder repository.

If you are looking for a function to convert Blaze.ByteString.Builder.Builder to Data.ByteString.Builder.Builder or back, it is id. These two types are exactly the same, as the former is just a re-export of the latter. Thus interoperation between code that uses the old interface and code that uses the new should be efficient and painless.
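A minimal sketch of this point, assuming blaze-builder-0.4 and a bytestring that ships Data.ByteString.Builder: since the old Builder is a re-export of the new one, the conversion in either direction is literally id, and values flow freely between the two APIs.

```haskell
-- The two Builder types are identical under blaze-builder-0.4,
-- so these "conversions" are just id at each type.
import qualified Blaze.ByteString.Builder as Blaze
import qualified Data.ByteString.Builder  as BS
import qualified Data.ByteString.Char8    as C
import           Data.ByteString          (ByteString)

toNew :: Blaze.Builder -> BS.Builder
toNew = id

toOld :: BS.Builder -> Blaze.Builder
toOld = id

-- Build with the new interface, run with the old one:
roundTrip :: ByteString -> ByteString
roundTrip = Blaze.toByteString . toOld . BS.byteString
```

Mixing the interfaces like this costs nothing at runtime; both compositions collapse to the underlying builder.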

The one caveat is that the old implementation has all but disappeared, and programs and libraries that touch the old internal modules will need to be updated.

This compatibility shim is especially important for those libraries that have the old blaze-builder as part of their public interface, as now you can move to the new builder without breaking your interface.

There are a few things to consider in order to make this transition as painless as possible, however: libraries that touch the old internals should probably move to the new bytestring builder as soon as possible, while libraries that depend only on the public interface should probably hold off for a bit and continue to use this shim.

For example, blaze-builder is part of the public interface of both the Snap Framework and postgresql-simple. Snap touches the old internals, while postgresql-simple uses only the public interface. Both libraries are commonly used together in the same projects.

There would be some benefit to postgresql-simple to move to the new interface. However, let’s consider the hypothetical situation where postgresql-simple has transitioned, and Snap has not. This would cause problems for any project that 1.) depends on this compatibility shim for interacting with postgresql-simple, and 2.) uses Snap.

Any such project would have to put off upgrading postgresql-simple until Snap is updated, or interact with postgresql-simple through the new bytestring builder interface while continuing to use the old blaze-builder interface for Snap. The latter option could range anywhere from trivial to extremely painful, depending on how entangled the usage of Builders is between postgresql-simple and Snap.

By comparison, as long as postgresql-simple continues to use the public blaze-builder interface, it can easily use either the old or new implementation. If postgresql-simple holds off until after Snap makes the transition, then there’s little opportunity for these sorts of problems to arise.

Announcing snaplet-postgresql-simple-0.6

Filed under: Uncategorized — lpsmith @ 12:16 pm

In the past, I’ve said some negative things[1] about Doug Beardsley’s snaplet-postgresql-simple, and in this long overdue post, I retract my criticism.

The issue was that a connection from the pool wasn’t reserved for the duration of the transaction. This meant that the individual queries of a transaction could be issued on different connections, and that queries from other requests could be issued on the connection that’s in a transaction. Setting the maximum size of the pool to a single connection fixes the first problem, but not the second.

At Hac Phi 2014, Doug and I finally sat down and got serious about fixing this issue. The fix did require breaking the interface in a fairly minimal fashion. Snaplet-postgresql-simple now offers the withPG and liftPG operators that will exclusively reserve a single connection for a duration, and in turn uses withPG to implement withTransaction.
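A hedged sketch of what this looks like in application code; the transfer function, its table, and its SQL are hypothetical, and the exact class contexts in snaplet-postgresql-simple 0.6 may differ slightly from what is written here. The point is that withTransaction now pins the whole block to one pooled connection (via withPG), so both UPDATEs are guaranteed to run on the same connection, inside the same transaction.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Hypothetical handler code against snaplet-postgresql-simple >= 0.6.
-- The `accounts` table is an assumption for illustration only.
import Snap.Snaplet.PostgresqlSimple

transfer :: HasPostgres m => Int -> Int -> Int -> m ()
transfer amount from to = withTransaction $ do
  -- Both statements run on the single connection reserved by
  -- withTransaction; no other request can interleave queries on it.
  _ <- execute "UPDATE accounts SET balance = balance - ? WHERE id = ?"
               (amount, from)
  _ <- execute "UPDATE accounts SET balance = balance + ? WHERE id = ?"
               (amount, to)
  return ()
```

Before 0.6, each execute could have landed on a different pool connection, which is exactly the bug described above.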

We were both amused by the fact that apparently a fair number of people have been using snaplet-postgresql-simple, even using transactions in some cases, without apparently noticing the issue. One could speculate about the reasons why, but Doug did mention that he pretty much never uses transactions. So in response, I came up with a list of five common use cases: the first three involve changing the database, and the last two are useful even in a read-only context.

  1. All-or-nothing changes

    Transactions allow one to make a group of logically connected changes so that either all of them are reflected in the resulting state of the database, or none of them are. So if anything fails before the commit, say due to a coding error or even something outside the software’s control, the database isn’t polluted with partially applied changes.

  2. Bulk inserts

    Databases that provide durability, like PostgreSQL, are limited in the number of transactions per second by the rotational speed of the disk they are writing to. Individual DML statements are therefore rather slow, because each PostgreSQL statement that isn’t run inside an explicit transaction runs in its own implicit transaction. Batching multiple insert statements into a single transaction is much faster.

    This use case is less important when writing to a solid state disk, which is becoming increasingly common. Alternatively, PostgreSQL allows a client program to turn synchronous_commit off for the connection, or even for just a single transaction, if sacrificing a small amount of durability is acceptable for the task at hand.

  3. Avoiding Race Conditions

    Transactional databases, like Software Transactional Memory, do not automatically eliminate all race conditions, they only provide a toolbox for avoiding and managing them. Transactions are the primary tool in both toolboxes, though there are considerable differences around the edges.

  4. Using Cursors

    Cursors are one of several methods to stream data out of PostgreSQL, and you’ll almost always want to use them inside a single transaction.[2] One advantage that cursors have over the other streaming methods is that one can interleave the cursor with other queries, updates, and cursors over the same connection, and within the same transaction.

  5. Running multiple queries against a single snapshot

    If you use the REPEATABLE READ or higher isolation level, then every query in the transaction will be executed on a single snapshot of the database.
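Two of the use cases above can be sketched directly with postgresql-simple. This is a hedged illustration, not runnable without a live database, and the events table is a hypothetical stand-in:

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Sketch against postgresql-simple; assumes a live Connection and
-- a hypothetical table: CREATE TABLE events (id int, note text).
import Database.PostgreSQL.Simple
import Database.PostgreSQL.Simple.Transaction

-- Use case 2: batch many inserts into one transaction, so the whole
-- batch costs a single commit rather than one per row.
bulkInsert :: Connection -> [(Int, String)] -> IO ()
bulkInsert conn rows =
  withTransaction conn $ do
    _ <- executeMany conn
           "INSERT INTO events (id, note) VALUES (?,?)" rows
    return ()

-- Use case 5: at REPEATABLE READ, both counts come from the same
-- snapshot, even if other clients commit between the two queries.
consistentCounts :: Connection -> IO (Int, Int)
consistentCounts conn =
  withTransactionLevel RepeatableRead conn $ do
    [Only a] <- query_ conn "SELECT count(*) FROM events"
    [Only b] <- query_ conn "SELECT count(*) FROM events WHERE id > 0"
    return (a, b)
```

Without the withTransactionLevel wrapper, each query_ would run in its own implicit transaction and could observe different states of the table.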

So I no longer have any reservations about using snaplet-postgresql-simple if it is a good fit for your application, and I do recommend that you learn to use transactions effectively if you are using Postgres. Perhaps in a future post, I’ll write a bit about picking an isolation level for your postgres transactions.


  1. See for example, some of my comments in the github issue thread on this topic, and the reddit thread which is referenced in the issue.

  2. There is the WITH HOLD option for keeping a cursor open after a transaction commits, but this just runs the cursor to completion, storing the data in a temporary table, which might occasionally be acceptable in some contexts, but is definitely not streaming.
