PostgreSQL
The world's most advanced open source database

..::Planet PostgreSQL::..

July 02, 2009

Peter Eisentraut

Where have all the translations gone?


If you have downloaded PostgreSQL 8.4.0 and are wondering where so many of the translations have gone: The translation team has decided not to ship translations anymore that are not translated at least about 80%. (See the commit message for the list of victims.) This is so that incidental users of stale translations are not presented with a confusing and distracting mix of translated and untranslated messages all the time. So right now we are only shipping a full or almost full set of translations into German, Spanish, French, Japanese, Portuguese, and Turkish.

To get the translations into other languages back into the release, go to http://babel.postgresql.org/ and start submitting updates. Updates may be included as early as release 8.4.1 in a few months.

I hope in particular that we might get the Chinese, Italian, and Russian translations back into shape.

By the way, if you want to start (or continue) translating, I suggest that you approximately follow this priority order: libpq, psql, pgscripts, pg_dump, initdb, postgres. This or a similar order will make the translations useful to the most users with the least amount of work.

by Peter Eisentraut (peter_e@gmx.net) at July 02, 2009 08:00 PM

Robert Gravsjö

"No to SQL"

Every time I see something or hear something like this I sigh a little bit. Not only when it's related to SQL but in the world of computer professionals in general. "The right tool for the job" seems to be a hard concept to understand sometimes. I wonder ...

July 02, 2009 08:54 AM

July 01, 2009

David Wheeler

Committed: pgTAP Result Set Assertion Functions

Regular readers will know that I've been thinking a lot about testing SQL result sets and how to how to name result testing functions, and various implementation issues. I am very happy to say that I've now committed the first three such test functions to the Git repository. They've been tested on 8.4 and 8.3. Here's what I came up with.

Read More »

by David E. Wheeler at July 01, 2009 09:32 PM

David Fetter

WITH (so much drama in the CTE)

By now, you've probably seen that PostgreSQL 8.4 can produce Mandelbrot sets
like the one below, but what are Common Table Expressions really about?

Continue reading "WITH (so much drama in the CTE)"

by nospam@example.com (David Fetter) at July 01, 2009 05:18 PM

Andreas Scherbaum

PostgreSQL 8.4: Column Permissions

Author
Andreas 'ads' Scherbaum

Up to PostgreSQL 8.3 it was only possible to grant (and revoke) permissions on the entire table. If column level permissions were needed, a workaround like a view solved (more or less) the problem: create the view with the required (allowed) columns, revoke all permissions from the underlaying table, grant permissions to the view.


This - of course - is uneloquent, error prone and does not scale well. For different users requiring access to different columns, a big number of views is needed.


PostgreSQL 8.4 solves the problem with a shiny new feature: column level permissions.




Continue reading "PostgreSQL 8.4: Column Permissions"

by nospam@example.com (Andreas 'ads' Scherbaum) at July 01, 2009 05:00 PM

US PostgreSQL Association

PostgreSQL 8.4 Released!

JD wrote:

For those sleeping in PostgreSQL.org just released PostgreSQL 8.4. This is an exciting release with many new features including:

  • Parallel Database Restore, speeding up recovery from backup up to 8 times
  • Per-Column Permissions, allowing more granular control of sensitive data
  • Per-database Collation Support, making PostgreSQL more useful in multi-lingual environments
  • In-place Upgrades through pg_migrator (beta), enabling upgrades from 8.3 to 8.4 without extensive downtime

    read more

by jd at July 01, 2009 04:35 PM

Robert Gravsjö

PostgreSQL 8.4 released

Spread the word, PostgreSQL 8.4 is out!

July 01, 2009 04:30 PM

Josh Berkus

pg_stat_statements in 8.4

Now that PostgreSQL 8.4 is out, I thought I'd write a little about my favorite 8.4 feature. As Mister Performance Whack-a-Mole, what makes me happy about 8.4 is the ability to whack moles faster ... which is why I'm very fond of pg_stat_statements.

July 01, 2009 12:52 PM

June 30, 2009

Robert Lor

How to add DTrace probes to your application

MC Brown and I co-presented a session at CommunityOne West on how to add probes to applications, using MySQL and PostgreSQL as case study. In the presentation, I used a very simple example to demonstrate how easy it is to add probes. If you want to try out yourself, here is the code. Extract the files, run gmake (or gnumake on OS X) to build, and run the executable in one terminal and the DTrace script in another terminal to see the output from the probes. To see more complicated examples, checkout the MySQL or PostgreSQL source code.

by rnl at June 30, 2009 03:22 PM

Leo Hsu and Regina Obe

Managing disk space using table spaces

We have covered this briefly before, but its an important enough concept to cover again in more detail.

Problem: You are running out of disk space on the drive you keep PostgreSQL data on
Solution:

Create a new tablespace on a separate drive and move existing tables to it, or create a new tablespace and use for future tables.

What is a tablespace and how to create a tablespace

A tablespace in PostgreSQL is similar to a tablespace in Oracle and a filegroup in SQL Server. It segments a piece of physical disk space for use by the PostgreSQL process for holding data. Below are steps to creating a new tablespace. Tablespaces have existed since PostgreSQL 8.0.

More about tablespaces in PostgreSQL is outlined in the manual PostgreSQL 8.3 tablespaces

While it is possible to create a table index on a different tablespace from the table, we won't be covering that.


Continue reading "Managing disk space using table spaces"

by nospam@example.com (Leo Hsu and Regina Obe) at June 30, 2009 04:00 AM

June 29, 2009

Robert Lor

More on the demos from PgCon 2009

At PgCon 2009 in Ottawa, I did a lightning talk on DTrace probes in PostreSQL 8.4. I wanted to show several demos but ran out of time. If you want to try them out, use the scripts below.

Here is the script, query_time.d, used in slide 14. This script is used to identify slow queries by printing out the query execution time.


#!/usr/sbin/dtrace -s
#pragma D option quiet

dtrace:::BEGIN
{
  printf("Tracing... Hit Ctrl-C to end.\n");
}

postgresql*:::query-start
{
    self->query = copyinstr(arg0);
    self->ts = timestamp;
}

postgresql*:::query-done
/self->ts/
{
    @query_time[self->query] = avg(timestamp - self->ts);
    self->query = 0;
    self->ts = 0;
}

dtrace:::END
{
     printf("%10s %s\n", "TIME (ns)", "QUERY");
     printf("==============================================================\n");
     printa("%@10d %s\n", @query_time);
}

Here is the second script, sort.d, used in slide 16. This script tells the type of sort, whether the sort was done in memory or on disk, and the time to perform the sort.


#!/usr/sbin/dtrace -qs

dtrace:::BEGIN
{
        sorttype[0] = "TUPLE";
        sorttype[1] = "INDEX";
        sorttype[2] = "DATUM";

        sortmethod[0] = "INTERNAL";
        sortmethod[1] = "EXTERNAL";
}

postgresql*:::sort-start
{
        self->ts = timestamp;
        printf("\nBegin %s sort, workmem = %d KB \n", sorttype[arg0], arg3);
}

postgresql*:::sort-done
/self->ts && arg0 == 0/
{
        /* Internal Sort */
        printf("%s sort ended, space used = %d KB \n", sortmethod[arg0], arg1);
        printf("Sort time = %d ms\n\n", (timestamp - self->ts) / 1000000);
}

postgresql*:::sort-done
/self->ts && arg0 == 1 /
{        /* External Sort */
        printf("%s sort ended, space used = %d disk blocks\n", sortmethod[arg0], 
arg1);
        printf("Sort time = %d ms\n\n", (timestamp - self->ts) / 1000000);
}

Below is the last script, query_trace.d, used in slide 23. This script provides useful data that will allow you to dig down deeper. In this example, the buffer reads to table 16397 (this is the OID) is huge. This signals a red flag that an index may be needed for this table. To find out the table name from OID, run "SELECT relname FROM pg_class WHERE relfilenode=16397" in psql.


#!/usr/sbin/dtrace -qs

postgresql*:::query-start
{
        self->ts = timestamp;
        self->pid = pid;
}

postgresql*:::buffer-read-start
/self->pid/
{
        self->readts = timestamp;
}

postgresql*:::buffer-read-done
/self->pid && arg7/
{
        /* Buffer cache hit */
        @read_count[arg2, arg3, arg4] = count();
        @read_hit_total["Total buffer cache hits      : "] = count();
        @read_hit_time["Average read time from cache : "] = avg (timestamp - self->readts);
        self->readts = 0;
}

postgresql*:::buffer-read-done
/self->pid && !arg7/
{
        /* Buffer cache miss */
        @read_count[arg2, arg3, arg4] = count();
        @read_miss_total["Total buffer cache misses    : "] = count();
        @read_miss_time["Average read time from disk  : "] = avg (timestamp - self->readts);
        self->readts = 0;
}

postgresql*:::buffer-flush-start
/self->pid/
{
        self->writets = timestamp;
}

postgresql*:::buffer-flush-done
/self->pid/
{
        @write_count[arg2, arg3, arg4] = count();
        @write_time["Average write time to disk   : "] = avg (timestamp - self->writets);
        self->writets = 0;
}

postgresql*:::query-done
/self->ts && self->pid == pid/
{
        printf("\n============ Buffer Read Counts ============\n");
        printf("%10s %10s %10s %10s\n","Tablespace", "Database", "Table", "Count");
        printa("%10d %10d %10d %@10d\n",@read_count);

        printf("\n======= Buffer Write Request Counts ========\n");
        printf("%10s %10s %10s %10s\n","Tablespace", "Database", "Table", "Count");
        printa("%10d %10d %10d %@10d\n",@write_count);

        printf("\n========== Additional Statistics ===========\n");

        printf ("Backend PID    : %d\n", pid);
        printf ("SQL Statement  : %s\n", copyinstr(arg0));
        printf ("Execution time : %d.%03d sec (%d ns)\n", (timestamp - self->ts) / 1000000000, ((timestamp - self->ts) / 1000000) % 1000, timestamp - self->ts);
        printa("\n%19s %@8d\n",@read_hit_total);
        printa("%19s %@8d\n",@read_miss_total);
        printa("%19s %@8d (ns)\n",@read_hit_time);
        printa("%19s %@8d (ns)\n",@read_miss_time);
        printa("%19s %@8d (ns)\n",@write_time);
        printf("\n\n");

        trunc(@read_count);
        trunc(@write_count);
        trunc(@read_hit_total);
        trunc(@read_miss_total);
        trunc(@read_hit_time);
        trunc(@read_miss_time);
        trunc(@write_time);

        self->ts = 0;
        self->pid = 0;
}

To see more sample scripts as well as a GUI tool, check out the PostgreSQL DTrace Toolkit.

by rnl at June 29, 2009 06:17 PM

PostgreSQL DTrace Toolkit

As many of you know, PostgreSQL 8.4 has quite a few more DTrace probes. See my previous blog post for more details . To use the probes, you need to write DTrace scripts, which is quite easy to do, but to make it easier to use the probes (especially for those who are new to DTrace), I have written some scripts that you can just run from the command line. In addition, I've integrated some of those script with Chime to make it even easier to visualize the data. Check out the toolkit on PgFoundry.

by rnl at June 29, 2009 04:19 AM

June 28, 2009

Andrew Dunstan

Parallel pg_restore for PostgreSQL 8.4

I try to complete at least one significant feature item per PostgreSQL release. This time the feature is making pg_restore run in parallel. This is quite important for many users, particularly some large enterprise users.

It's important that people understand what this will do and what it won't do. pg_restores runs a number of steps. In conventional mode it simply runs them all in a single connection to the database, one after the other. In parallel mode it first runs all the quick and easy steps, essentially those that don't involve any data access, such as table and function creation, in a single connection, just like conventional mode. Then it runs the remaining steps each in its own connection. The steps are the same, and there is no parallelism within a given step. For example, a single COPY to a table is not parallelised. Rather, we run it in parallel with other data intensive steps.

The maximum amount of parellelism is controlled by the user. This will involve some experimentation to get to the sweet spot for your setup. A good place to start is the number of physical processors you have available. The idea here is to improve the situation where the CPU is the limiting factor, and allow you to drive the restoration rate up to where IO is in fact the limiting factor. With very high end hardware we believe that you can drive the parallelism quite high.

Like many performance features, this one might well require several releases to tweak it for optimal performance gain. The program works by keeping a pool of slots to be used for the steps that are run in parallel. One possible area for improvement is in the algorithm that selects the item to be used for a slot as it it becomes available. Currently we keep a queue of items that have no remaining unrestored dependencies. An item gets put on the queue as soon as all the items it depends on have been restored. This is likely to be a fairly good approximation of an optimal algorithm, but there might well be a way of tweaking it. Another possible area of optimsation would be to take some notice of the tablespace that each item affects, and try to balance these, so we use as many IO channels as possible.

What is important is that we have now got the basic framework of parallel restore, so that some researchers can easily experiment with various tweaks to improve the performance.

pg_restore is going to be with us for quiter a long time. Even if we manage to get pg_upgrade working pretty well, that will take quite a bit of time, and there is currently no guarantee that it will for for every release. So I expect pg_restore to be the most common method of upgrading for quite some time, making it run as fast as possible is thus still a significant requirement.

I'm proud to have been able to contribute this feature to Postgres, and look forward to other people improving it further as time goes by.

by nospam@example.com (Andrew Dunstan) at June 28, 2009 12:24 PM

PostgreSQL Weekly News

June 27, 2009

Josh Berkus

Anyone Have a Timer for Lightning Talks?

I run a lot of lightning talks, and one tool I haven't been able to find a satisfactory solution for is the timer for the talks ...

June 27, 2009 05:31 PM

June 26, 2009

Andreas Scherbaum

Call for Papers for the PostgreSQL-Devroom at FrOSCon 2009

Author
Andreas 'ads' Scherbaum

The PostgreSQL Project will have a dev-room at FrOSCon on sunday, august 23, 2009. Talks wanted!


The theme should be PostgreSQL-related, please submit the talk(s) by using the FrOSCon Pentabarf:


https://pentabarf.froscon.org/submission/froscon2009/


Procedure:



  1. Create account (if not yet done) and follow the confirmation link in the email

  2. Login in Pentabarf

  3. Create a new event

  4. Choose track "PostgreSQL"


All submitters will receive a confirmation timely, if the talk is accepted. Who wants to submit a talk about databases in general - or a talk about another database - may choose the "OpenSQL Camp" track.

by nospam@example.com (Andreas 'ads' Scherbaum) at June 26, 2009 05:27 PM

Kenny Gorman

pg_reorg 1.0.4

At Hi5, we currently use pg_reorg1.0.3 in order to organize data in a clustered fashion. I posted previously about the strategy. Our version is slightly modified, the modifications I made to the C code essentially allow pg_reorg to spin/wait for locks on the objects to be released before proceeding.

The good news is the folks at NTT have incorporated a similar change in pg_reorg 1.0.4. This is a fantastic improvement, and frankly implemented in a cleaner way than my changes.

The crux of the issue is the situation where a database is being auto-vacuumed, you can’t be guaranteed that pg_reorg and the vacuum will not collide. In theory you should not need to vacuum a table which you are pg_reorg’ing because that is the point of a pg_reorg, it’s essentially a vacuum full w/ extra features because the table is being rebuilt from scratch. However in an environment where auto-vacuum is being utilized to keep tables vacuumed, both will need to co-exist.

The change is simple, use the NOWAIT option of lock table to fail if the lock can not be obtained. This is wrapped in a loop until the lock is granted. The effect is pg_reorg patiently sits and waits while your vacuums complete and then it can finish it’s work. The downside is if any of these operations run for too long, then the journal table may grow very large. So there should be some monitoring wrapped around the code if it’s intended to run in the background. For the future we need a backoff algorithm as well as perhaps a limit to the number of spin/sleep cycles, but hey this is excellent progress.

This tool is essential in my humble opinion for everyone running PostgreSQL in a high transaction/high availability environment. By the way, pg_reorg works seamlessly with Slony-I.

The code addition does the following:

for (;;)                        
        {
                command("BEGIN ISOLATION LEVEL READ COMMITTED", 0, NULL);
                res = execute_nothrow(table->lock_table, 0, NULL); 
                if (PQresultStatus(res) == PGRES_COMMAND_OK)
                {
                        PQclear(res);
                        break;
                }
                else if (sqlstate_equals(res, SQLSTATE_LOCK_NOT_AVAILABLE))
                {
                        /* retry if lock conflicted */ 
                        PQclear(res);
                        command("ROLLBACK", 0, NULL);
                        sleep(1);
                        continue;
                }
                else
                {
                        /* exit otherwise */
                        printf("%s", PQerrorMessage(connection));
                        PQclear(res);
                        exit(1);        
                }
        }

The text below is a snip of the strace on pg_reorg while it’s waiting for the lock:

rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
sendto(3, "P\0\0\0008\0SELECT reorg.reorg_apply($"..., 529, 0, NULL, 0) = 529
rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
poll([{fd=3, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recvfrom(3, "1\0\0\0\0042\0\0\0\4T\0\0\0$\0\1reorg_apply\0\0\0\0"..., 16384, 0, NULL, NULL) = 77
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
sendto(3, "P\0\0\0\177\0SELECT 1 FROM pg_locks WHE"..., 178, 0, NULL, 0) = 178
rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
poll([{fd=3, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recvfrom(3, "1\0\0\0\0042\0\0\0\4T\0\0\0!\0\1?column?\0\0\0\0\0\0\0"..., 16384, 0, NULL, NULL) = 74
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0})

Your Postgresql log file will show the following:

Jun 25 17:09:33 <dbname> postgres[7825]: [37-2] 2009-06-25 17:09:33 PDTSTATEMENT:  LOCK TABLE <tablename> IN ACCESS EXCLUSIVE MODE NOWAIT
Jun 25 17:09:34 <dbname> postgres[7825]: [38-1] 2009-06-25 17:09:34 PDTERROR:  could NOT obtain LOCK ON relation "<tablename>"

by kgorman at June 26, 2009 12:33 AM