As a group, we, the PostgreSQL have few religious wars and only minor flame fests. We appear to value good technology over commercial interests. The mailing lists are vibrant and good spirited with the desire to learn and help others.
Current research regarding Open Source projects confirms my experience that Enthusiasm is the best motivation. PostgreSQL's success is tightly coupled with the dedication and focused purpose of its community--both the development team and the PostgreSQL advocates and users.
This paper examines motivations of people in loosely coupled organizations like the PostgreSQL community. I see my own motivations as well as the motivations of many of my colleagues. Do you see yourself?
In this paper we report on the results of a study of the effort and motivations of individuals to contributing to the creation of Free/Open Source software. We used a Web-based survey, administered to 684 software developers in 287 F/OSS projects, to learn what lies behind the effort put into such projects. Academic theorizing on individual motivations for participating in F/OSS projects has posited that external motivational factors in the form of extrinsic benefits (e.g.; better jobs, career advancement) are the main drivers of effort. We find in contrast, that enjoyment-based intrinsic motivation, namely how creative a person feels when working on the project, is the strongest and most pervasive driver. We also find that user need, intellectual stimulation derived from writing code, and improving programming skills are top motivators for project participation. A majority of our respondents are skilled and experienced professionals working in IT-related jobs, with approximately 40 percent being paid to participate in the F/OSS project.
Karim Lakhani's Research Statement
We have two tables, a language table with a message identifier, a language identifier and a message text and a texts table which links a message identifier to a text context. The column msg_id appears as if it should be a viable foreign key, however it is not.
The primary key of the language table is a two part key, the msg_id and the lang_id. When you try to define a foreign key on the contexts table linking the msg_id columns, you will get the error message:
ALTER TABLE texts ADD CONSTRAINT msg_link FOREIGN KEY (msg_id) REFERENCES languages(msg_id) MATCH FULL ON DELETE CASCADE; ERROR: there is no unique constraint matching given keys for referenced table "languages"
Foreign keys must uniquely identify a row in the referenced table. The msg_id is not unique in the language table. There are two options here. The first is to include the language reference in the context table so as to provide a unique reference back to the messages table. However, this will increase the size of the context table and creates an artificial dependence between languages and contexts that does seems to be contrary to the point of storing messages independently of their translations.
The alternative is to rethink the construction of the tables in terms of the objects they represent. This is what we have:
Using the objects as a guideline a better configuration of the tables might be to have four separate tables. The messages and languages would be aggregated in the translation table, freeing the context table to reference the message in a language-independent way.
The foreign keys defined would be as follows:
This solution also adds room for descriptions and comments on the languages available and the meanings or description of the messages. These are optional of course. The point is that the translations are dependent on valid languages and messages and the context is only dependent on messages so that it can be presented in any language.
There were many great responses to my survey in PostgreSQL General Bits issue #56. Many thanks to all who took the time to responded. Your thoughtful and sometimes funny and pointed responses are very insightful.
Below is a partial analysis of the responses to the survey. More analysis will be available in future issues of PostgreSQL General Bits.
The Platforms and Versions
6. What version of PostgreSQL do you currently use? If you do not use 7.4 yet what are the technical and/or business reasons that you have not upgraded yet.
We run PostgreSQL primarily on variations of Linux. This is no surprise--Debian and Red Hat were explicitly mentioned. Other unix variations included FreeBSD, Solaris, OpenBSD and the alternatives, Mac OS X and cygwin also weighed in.
The oldest version mentioned was 7.2. Most people are up to 7.4 in development and poised to move to 7.4 in production as soon as time allows.
Reluctance to convert to 7.4 was attributed to time (of course) and the difficulty of the dump and restore required between major versions. (Advocate an upgrade script and methodology!) It was also noted that the Debian stable release contains 7.3 and people using Debian stable are reluctant to move to 7.4 until that changes. Hosting companies also affected some people's switch to 7.4 as the hosting companies did not necessarily move to 7.4 as quickly as individual shops.
The Tools we use with PostgreSQL
(Questions: 1, 9, 17, 18)
1. What PostgreSQL Client API do you use the most?
In keeping with the flexibility of PostgreSQL tools we use are fairly varied.
PHP and Perl came in strong with Java holding its own. libpq variations were strong. Although the trend the last several years has been toward scripting languages and Java, apparently there are C/C++ programmers doing a lot of work. ECPG made the list as did PgExpress, a Borland Kylix/Delphi 6+/C++Builder Driver. NSHARP, an interactive Skew-T for upperair, model, and aircraft soundings was the most exotic interface mentioned. (Does anyone use dx?) The windows based web/database interface IDE for php and other languages Code Charge Studio also was mentioned for the first time.
psql was mentioned as a client interface of sorts--it can be executed in any scripting programming language capable of forking.
Drivers JDBC and PgExpress were included in these responses, but surprisingly, no explicit mention of ODBC was made.
For administration and visualization tools, the clear leader was psql, and I am counting /dev/hands and vi and emacs in that group :-) Diagramming tools included white boards, scratch paper (my favorite), Dia and text editors. The only GUI administration tools mentioned explicitly were phppgadmin, pgadmin and DbVisualizer (the free version).
Many people use Open Office, however, not as an interface to PostgreSQL. Setting up the connection seems to be problematic and some people did not know that they could connect. If someone has a tutorial or wants to write up how to make these tools connect, that would be very welcome and I'd be pleased to publish it in PostgreSQL General Bits.
More results will be available in future issues.
One expectation of many companies running database services for mission critical, enterprise systems is high availability. With a High Availability PostgreSQL Cluster, "HAPC", most companies can achieve this goal with minimal costs using open source technologies. Examples of high available business systems are: banking applications, point of sale, order entry, logistics and medical systems. With the absence of high available clustering, down time can represent lost sales, lost revenue, and lost customers. I would like to show how we created a HAPC for a point of sale customer.
Our choice was to integrate PostgreSQL 7.4 on HP Proliant Rack servers, sharing scsi disks on an HP CASA storage solution running RedHat Advanced Server 3.0 configured with their clustering solution. For PostgreSQL to work, we placed the PG_DATA directory, the WAL (write ahead log), and the PostgreSQL configuration files on the shared disks. We modified the /etc/init.d/postgresql start script to force mounting and unmounting of the shared disks. This is critical because two systems that have shared disks mounted simultaneously will lose data.
In addition, we modified the /etc/init.d/postgresql script to utilize STONITH, Shoot The Other Node In The Head, technology to make certain the failed system is powered off and does not have the shared disks mounted. The STONITH mechanism relies on a network/serial interfaces to a power switch. We utilized HP's Integrated Lights Out to send a power off message to shutdown the system that has failed.
Implementing the HAPC architecture has created an automated fail over of our PostgreSQL mission critical system. In addition, we have a mechanism to provide maintenance on the servers without impacting the database service. Using cluster commands, we are able to manage the cluster, disabling and enabling the clustered service.
Thus, by using PostgreSQL, high available open source cluster software, hardware and shared disk storage we created a database service that is virtually non-stop. Typically, organizations will spend lots of money implementing high availability. By implementing HAPC we were able to save money. We were quoted $300,000 for Microsoft SQLServer and $500,000 for Oracle database software, without any hardware. But instead we chose to implement PostgreSQL for substantially less. Corporations utilizing this technology can lower their database service costs without sacrificing performance, availability or reliability.
For questions and comments please contact firstname.lastname@example.org
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009