Introduction to Postgres constraints
Database constraints are a way to ensure data integrity. You can use database constraints to uniquely identify records, but also to ensure specific domain requirements.
Primary and foreign keys
Most basic database constraint is a primary key. A big integer that is both unique and not empty is almost always used to identify records. Typically, a primary key has SERIAL
type. It means an ever-growing unique integer.
1 | CREATE TABLE public.posts ( |
A primary identifies a record inside current table, while foreign key identifies and ensures existence of records from another table. To continue with our blog example, a post can have comments, with comments being attached to exactly one post.
1 | CREATE TABLE public.comments ( |
A foreign key constraint is useful if you want to ensure that the record you are attaching to exists in the other table. A comment should always reference an existing post. The reverse is also true. You cannot delete a post if it has comments that reference it. First, you need to delete the comments, and then, you can delete the post.
Not null constraint
Not much to say about them. Your database design should include as many non-null columns as possible.
1 | CREATE TABLE public.users ( |
Unique constraint
You can use a unique constraint if you want to ensure that data does not repeat inside the database. It is also a way to ensure that no two clients can enter the same data into the database For example, you would want to ensure that user emails are unique:
1 | CREATE TABLE public.users ( |
Unique constraints are also composite, you can declare an unique on two or more columns.
1 | CREATE TABLE public.users ( |
Unique constraint automatically creates an unique B-tree index on the columns in question, there is no need to create an index manually. Before PostgreSQL 9.5, an unique index was considered an implementation detail in Postgres, and the preferred way to enforce uniqueness wass to execute an ALTER TABLE
statement. Later on, that information was removed from documentation.
Unique constraints are not free, as they incure a time cost on each INSERT
or UPDATE
statement. If your database is write-heavy, you might find it becoming a performance bottleneck.
Exclusion constraint
An exclusion constraint compares new data with data that already exists in the same table. An example provided by postgres documentation are overlapping circles, but if you are writing business application, it might not be relatable.
I like to use an example of scheduling system for rooms where you do not want to have two events scheduled in the same room at the same time. Note that to use exclusion constraints, you need to enable the btree_gist
extension.
1 | CREATE EXTENSION btree_gist; |
Let’s concentrate on line eight. Innermost function tsrange(starts_at, ends_at, '[]')
creates an inclusive timestamp range from starts_at
to ends_at
, and WITH &&
means compare the range to existing values with &&
from range operators. room_pk WITH =
means compare room_pk with equality.
Behind the scenes, Postgres creates an index to monitor this constraint:
1 | Indexes: |
Check constraints
A check constraint evaluates arbitrary boolean expression and either lets you insert the data or raises an exception. A simple example of check constraint is as follows:
1 | CREATE TABLE public.reservations ( |
Check constraints cannot execute subqueries or reference another table, but it can also enforce domain rules. Be aware that an expensive boolean expression in a check constraint will result in an expensive INSERT
or UPDATE
operation.
Wrap up
Personally, I like to introduce multiple database constraints in the early stages of database design. One reason is that it minimizes the possiblity of having incorrect data in the database. Secondly, it also is easier to relax tighter rules than to introduce a new rule when your database is already used in a production system.