Introduction to Postgres constraints

Maciej Szlosarczyk

2018-11-03

Database constraints are a way to ensure data integrity. You can use database constraints to uniquely identify records, but also to ensure specific domain requirements.

Primary and foreign keys

Most basic database constraint is a primary key. A big integer that is both unique and not empty is almost always used to identify records. Typically, a primary key has SERIAL type. It means an ever-growing unique integer.

CREATE TABLE public.posts (
    pk SERIAL PRIMARY KEY,
    title VARCHAR
)

A primary identifies a record inside current table, while foreign key identifies and ensures existence of records from another table. To continue with our blog example, a post can have comments, with comments being attached to exactly one post.

CREATE TABLE public.comments (
    pk SERIAL PRIMARY KEY,
    post_pk integer NOT NULL,
    body text NOT NULL,
    email character varying(255) NOT NULL
);

ALTER TABLE ONLY public.comments
    ADD CONSTRAINT comments_post_pk_fkey FOREIGN KEY (post_pk) REFERENCES public.posts(pk);

A foreign key constraint is useful if you want to ensure that the record you are attaching to exists in the other table. A comment should always reference an existing post. The reverse is also true. You cannot delete a post if it has comments that reference it. First, you need to delete the comments, and then, you can delete the post.

Not null constraint

Not much to say about them. Your database design should include as many non-null columns as possible.

CREATE TABLE public.users (
    pk SERIAL PRIMARY KEY,
    email character varying(255) NOT NULL
);

Unique constraint

You can use a unique constraint if you want to ensure that data does not repeat inside the database. It is also a way to ensure that no two clients can enter the same data into the database For example, you would want to ensure that user emails are unique:

CREATE TABLE public.users (
    pk SERIAL PRIMARY KEY,
    name character varying(255)
    email character varying(255) NOT NULL
    UNIQUE (email)
);

Unique constraints are also composite, you can declare an unique on two or more columns.

CREATE TABLE public.users (
    pk SERIAL PRIMARY KEY,
    name character varying(255)
    email character varying(255)
    UNIQUE (email, name)
);

Unique constraint automatically creates an unique B-tree index on the columns in question, there is no need to create an index manually. Before PostgreSQL 9.5, an unique index was considered an implementation detail in Postgres, and the preferred way to enforce uniqueness wass to execute an ALTER TABLE statement. Later on, that information was removed from documentation.

Unique constraints are not free, as they incure a time cost on each INSERT or UPDATE statement. If your database is write-heavy, you might find it becoming a performance bottleneck.

Exclusion constraint

An exclusion constraint compares new data with data that already exists in the same table. An example provided by postgres documentation are overlapping circles, but if you are writing business application, it might not be relatable.

I like to use an example of scheduling system for rooms where you do not want to have two events scheduled in the same room at the same time. Note that to use exclusion constraints, you need to enable the btree_gist extension.

CREATE EXTENSION btree_gist;

CREATE TABLE public.reservations (
    pk SERIAL PRIMARY KEY,
    room_pk integer NOT NULL,
    starts_at timestamp without time zone NOT NULL,
    ends_at timestamp without time zone NOT NULL,
    EXCLUDE USING gist (room_pk WITH =, tsrange("starts_at", "ends_at", '[]') WITH &&)
);

ALTER TABLE ONLY public.reservations
    ADD CONSTRAINT comments_room_pk_fkey FOREIGN KEY (room_pk) REFERENCES public.room(pk);

Let’s concentrate on line eight. Innermost function tsrange(starts_at, ends_at, '[]') creates an inclusive timestamp range from starts_at to ends_at, and WITH && means compare the range to existing values with && from range operators. room_pk WITH = means compare room_pk with equality.

Behind the scenes, Postgres creates an index to monitor this constraint:

Indexes:
    "reservations_pkey" PRIMARY KEY, btree (pk)
    "reservations_room_pk_tsrange_excl" EXCLUDE USING gist
        (room_pk WITH =, tsrange(starts_at, ends_at, '[]'::text) WITH &&)

Check constraints

A check constraint evaluates arbitrary boolean expression and either lets you insert the data or raises an exception. A simple example of check constraint is as follows:

CREATE TABLE public.reservations (
    pk SERIAL PRIMARY KEY,
    room_pk integer NOT NULL,
    starts_at timestamp without time zone NOT NULL,
    ends_at timestamp without time zone NOT NULL,
    CONSTRAINT starts_at_earlier_than_ends_at CHECK ((starts_at < ends_at))
);

ALTER TABLE ONLY public.reservations
    ADD CONSTRAINT comments_room_pk_fkey FOREIGN KEY (room_pk) REFERENCES public.room(pk);

Check constraints cannot execute subqueries or reference another table, but it can also enforce domain rules. Be aware that an expensive boolean expression in a check constraint will result in an expensive INSERT or UPDATE operation.

Wrap up

Personally, I like to introduce multiple database constraints in the early stages of database design. One reason is that it minimizes the possiblity of having incorrect data in the database. Secondly, it also is easier to relax tighter rules than to introduce a new rule when your database is already used in a production system.