Melding Monads

2015 February 12

Announcing snaplet-postgresql-simple-0.6

Filed under: Uncategorized — lpsmith @ 12:16 pm

In the past, I’ve said some negative things [1] about Doug Beardsley’s snaplet-postgresql-simple, and in this long-overdue post, I retract my criticism.

The issue was that a connection from the pool wasn’t reserved for the duration of the transaction. This meant that the individual queries of a transaction could be issued on different connections, and that queries from other requests could be issued on the connection that’s in a transaction. Setting the maximum size of the pool to a single connection fixes the first problem, but not the second.

At Hac Phi 2014, Doug and I finally sat down and got serious about fixing this issue. The fix did require breaking the interface in a fairly minimal fashion. Snaplet-postgresql-simple now offers the withPG and liftPG operators that will exclusively reserve a single connection for a duration, and in turn uses withPG to implement withTransaction.
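
As a rough illustration of the new interface, here is a minimal sketch of a transfer operation written against the snaplet; the accounts table, the transfer function, and the exact class constraints are my own invention for this post and may not match the released signatures exactly:

    {-# LANGUAGE OverloadedStrings #-}
    module TransferExample where

    import Data.Int (Int64)
    import Snap.Snaplet.PostgresqlSimple
      (HasPostgres, execute, withTransaction)

    -- Move an amount between two hypothetical accounts.  withTransaction
    -- reserves a single connection from the pool for the whole block, so
    -- both UPDATEs are issued on that one connection and either commit
    -- or roll back together.
    transfer :: HasPostgres m => Int -> Int -> Int64 -> m ()
    transfer from to amount = withTransaction $ do
      _ <- execute "UPDATE accounts SET balance = balance - ? WHERE id = ?"
                   (amount, from)
      _ <- execute "UPDATE accounts SET balance = balance + ? WHERE id = ?"
                   (amount, to)
      return ()

Because everything inside the withTransaction block now sees the same connection, a pool larger than one connection is no longer a hazard for code written this way.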

We were both amused by the fact that apparently a fair number of people have been using snaplet-postgresql-simple, even using transactions in some cases, without noticing the issue. One could speculate about the reasons why, but Doug did mention that he pretty much never uses transactions. So in response, I came up with a list of five common use cases: the first three involve changing the database, and the last two are useful even in a read-only context.

  1. All-or-nothing changes

    Transactions allow one to make a group of logically connected changes so that either they are all reflected in the resulting state of the database, or none of them are. So if anything fails before the commit, say due to a coding error or even something outside the software’s control, the database isn’t polluted with partially applied changes.

  2. Bulk inserts

    Databases that provide durability, like PostgreSQL, are limited in the number of transactions per second by the rotational speed of the disk they are writing to. Individual DML statements are therefore rather slow, because each PostgreSQL statement that isn’t run in an explicit transaction runs in its own implicit transaction. Batching multiple insert statements into a single transaction is much faster (see the bulk-insert sketch following this list).

    This use case is less important when writing to a solid-state disk, which is becoming increasingly common. Alternatively, PostgreSQL allows a client program to turn synchronous_commit off for the connection, or even for just a single transaction, if sacrificing a small amount of durability is acceptable for the task at hand.

  3. Avoiding Race Conditions

    Transactional databases, like Software Transactional Memory, do not automatically eliminate all race conditions; they only provide a toolbox for avoiding and managing them. Transactions are the primary tool in both toolboxes, though there are considerable differences around the edges.

  4. Using Cursors

    Cursors are one of several methods to stream data out of PostgreSQL, and you’ll almost always want to use them inside a single transaction. [2] One advantage that cursors have over the other streaming methods is that one can interleave the cursor with other queries, updates, and cursors over the same connection, and within the same transaction.

  5. Running multiple queries against a single snapshot

    If you use the REPEATABLE READ or higher isolation level, then every query in the transaction will be executed against a single snapshot of the database; the cursor-and-snapshot sketch following this list illustrates this.
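
To make the first two use cases concrete, here is a small bulk-insert sketch using postgresql-simple directly; the events table, its columns, and the insertEvents helper are made up for illustration:

    {-# LANGUAGE OverloadedStrings #-}
    module BulkInsertExample where

    import Database.PostgreSQL.Simple

    -- Insert a batch of rows in one transaction: a single commit (and a
    -- single disk sync) covers the whole batch, and if anything throws
    -- before the end of the block, none of the rows are kept.
    insertEvents :: Connection -> [(Int, String)] -> IO ()
    insertEvents conn events =
      withTransaction conn $ do
        _ <- executeMany conn
               "INSERT INTO events (user_id, payload) VALUES (?,?)"
               events
        return ()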
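
And here is a cursor-and-snapshot sketch covering the last two use cases, again with postgresql-simple directly; the accounts table and the sumBalances helper are hypothetical:

    {-# LANGUAGE OverloadedStrings #-}
    module SnapshotExample where

    import Database.PostgreSQL.Simple
    import Database.PostgreSQL.Simple.Transaction
      (IsolationLevel (..), withTransactionLevel)

    -- Sum a column without loading the whole result set into memory.
    -- fold_ declares a cursor and fetches rows in chunks behind the
    -- scenes, which is why it needs to run inside a transaction; the
    -- REPEATABLE READ level means that any other queries issued in the
    -- same block would see the same snapshot of the database.
    sumBalances :: Connection -> IO Int
    sumBalances conn =
      withTransactionLevel RepeatableRead conn $
        fold_ conn "SELECT balance FROM accounts" 0 $ \acc (Only bal) ->
          return (acc + bal)

Inside a Snap handler, the natural home for code like this is under the new withPG or liftPG operators, so that the whole block is pinned to one connection.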

So I no longer have any reservations about using snaplet-postgresql-simple if it is a good fit for your application, and I do recommend that you learn to use transactions effectively if you are using PostgreSQL. Perhaps in a future post, I’ll write a bit about picking an isolation level for your PostgreSQL transactions.


  1. See, for example, some of my comments in the GitHub issue thread on this topic, and the Reddit thread referenced in that issue.

  2. There is the WITH HOLD option for keeping a cursor open after its transaction commits, but this just runs the cursor to completion and stores the data in a temporary table, which might occasionally be acceptable in some contexts but is definitely not streaming.
