The URLs for the individual issues of General Bits will be in flux as I do a little housecleaning. A bookmark to 32.html, for example, may no longer work. For consistent access to older issues, bookmark the Archive page listed below.
The following bookmarks will remain stable.
The client side python interface for 7.4 has been removed from the regular CVS source tree. It can now be found at
http://www.pygresql.orgThe interface for 7.3 and earlier remains in the ordinary distribution.
This change is somewhat out of the ordinary. In the past interfaces have moved to gborg.postgresql.org. On gborg the page for the PyGreSQL project the website link points to pygresql.org. It is unclear whether the CVS, bugs and features links will be maintained on gborg.
Suppose we had a table representing processes tasks like this:
=# \d procs Table "procs" Column | Type | Modifiers -----------+-----------------------+----------- pid | integer | not null task | character varying(10) | not null owner | character varying(10) | not null IsActive | boolean |The tasks are executed in single-threaded mode, one after the other. No two processes can be active at the same moment. What is wanted is the exclusive OR behavior like a radio button set.
The problem is how to keep the table updated with one and only one record in the table with IsActive set as TRUE. When a process is made active active, any other active process must be set to inactive.
The implementation is a trigger that fires when a processes updated to active which updates the row that was previously set to active as inactive. Here is the function and trigger.
CREATE TRIGGER proc_current AFTER UPDATE OR INSERT ON procs FOR EACH ROW EXECUTE PROCEDURE proc_current_trigger(); CREATE FUNCTION proc_current_trigger() RETURNS TRIGGER AS ' BEGIN IF NEW.IsActive THEN UPDATE procs SET IsActive = ''f'' WHERE pid != NEW.pid AND IsActive = ''t''; END IF; RETURN NEW; END' LANGUAGE 'plpgsql';
Both INSERT and UPDATE cases are being handled by the trigger to ensure that integrity remains. Because there is no DELETE trigger and no action when a process is set to inactive, it is possible to have no active processes.
The function checks to see if the new or updated row is set to active, and if so, unset all (should only be one!) other rows to inactive. The other is accomplished by excluding the new row by pid in the where clause.
Minor releases can be upgraded in place. In that case nothing needs to be done to the database for the upgrade. Only the software needs upgrading.
A minor release is signaled by a change in the 3rd number, for example 7.3.4 is a minor release update to 7.3.2. When the second number changes, there are usually database changes to be done for the upgrade. For example, updates were necessary when transitioning from 7.2 to 7.3.
Minor releases can be done quite quickly and easily. Here are a couple examples of how people do minor updates. The first takes a cautious approach and first backs up the database moves the old installation out of the way. Then the new version is built and installed. The new data area has initdb run and then the data from the backup is loaded.
pgdumpall > dbbackup -- backup and stop pg_ctl stop mv %PGHOME% /usr/local/pgsql.old -- move old pgsql program cd /usr/local/src/postgresql-7.3.2 -- installs new pgsql version make install initdb -D %PGHOME%/data -- start and restore db. pg_ctl start psql < dbackupFor the less cautious, this approach barrels through with no backup at all.
tar xvfz postgresql-7.3.4.tar.gz ; cd postgresql-7.3.4 ./configure ; make ; pg_ctl stop ; make install pg_ctl startA more moderate approach might be to just back up the data, just in case. But install the new release over the old one.
pgdumpall > dbbackup tar xvfz postgresql-7.3.4.tar.gz ; cd postgresql-7.3.4 ./configure ; make; pg_ctl stop ; make install
Notice that the database server should be stopped before running make install. And of course, YMMV.
There is a limit of 16 Terrabytes for a PostgreSQL database. This is limited by the BlockNumber being 32 bits, so you cannot have a table larger than 2 or 4 billion blocks.
(The FAQ is conservatively assumes that the limit is 2 billion blocks; 2G blocks * 8KB block size = 16TB. In principle 4 billion should work, but it needs more rigorous testing. Old code which uses signed arithmetic on the BlockNumber may still exist. Is there anyone willing and able to test a database between 16T and 32T in size?)
Suppose you have a table with 50 fields of 20 characters each. If a table can be 16 TB, how many records could it hold? Let's calculate.
There have been references of databases as big as 300GB. People routinely have more than 10GB databases. You can dig in list archives for the cases.
The 4TB database mentioned in the FAQ belongs to the American Chemical Society (something about scanning all their journals back to the mid-1800s...)
Another large database is the 2-micron sky survey: http://pegasus.astro.umass.edu/ which covers a bit under half a billion stars; it is reportedly about 150GB when loaded into Postgres. The UMass people seem to be happy with the performance they get.
Two functions were proposed, one in plpgsql and the other in plperl.
create or replace function hex_to_int(char(2)) returns integer as ' declare v_ret record; begin for v_ret in execute ''select x'''''' || $1 || ''''''::int as f'' loop return v_ret.f; end loop; end; ' language 'plpgsql'; create or replace function hex_to_int_perl(char(2)) returns integer as ' return hex $_; ' language 'plperl';Both functions produced the result correctly.
create table foo(f1 char(2)); insert into foo values ('ff'); insert into foo values ('fe'); insert into foo values ('fd'); select hex_to_int[_perl](f1) from foo; hex_to_int ------------ 255 254 253 (3 rows)
The discussion branched here to the speed of these two functions in 7.4. Experiments were run with EXPLAIN ANALYZE showed that plperl function is fastest with or without preloading the libraries. In the course of the experiments, a problem was found with the preloading function for plperl. The problem was that if the library was not found or could not be loaded, no ERROR or NOTICE was raised. And the initialization function for plperl was a static library and could not be dynamically loaded anyway. But who knew? Both of these will be fixed in 7.4 by Joe Conway for plperl, plpgsql, pltcl and plpython.
In 7.4 you will be able to preload language libraries, using the postgresql.conf variable preload_libraries.
The hexadecimal to decimal functions are good, but they lack the inverse function as well as the ability to process variable length hexadecimal strings. This is left as an exercise for the reader :-)
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009