How To Partitioning on PostgreSQL 9.6

Hello,

I'm gonna share my experience on partitioning and show how affect partition on scripts. I'm gonna show you how to partitioning on existing table.

Let's begin to following steps;
  1. Create new table. It's gonna be partitioned table.
  2. Create partition tables.
  3. Grant privileges to partition tables and partitioned table.
  4. Generate function and trigger on partitioned table.
  5. Insert data from main table to partitioned table.
  6. Create indexes if you had on main table on partition tables.
  7. Check partition tables everything is fine.
  8. Rename main table.
  9. Rename partitioned table name like main table
  10. Drop old main table.
Not that: if you have some indexes or constraint on your main table, not to increase inserting time you should create your new main table and other partition tables without constraints and indexes. After insert whole data to partition tables you can add indexes and constraints. It will be help you to executing insert statement more quickly. 

Before perform this steps, you need to decide which criteria you want to use for separate main table. Usually, id and date is good choice to partitioning. 

I'm going to create a dummy table and going to separate the table by date. 

Create dummy_transactions table. This table is going to be main table for partitioning but not now.

test=# create table public.dummy_transactions (id serial, some_val integer, transaction_date timestamp without time zone);
CREATE TABLE

test=# insert into public.dummy_transactions (some_val,transaction_date) select generate_series(1,100), generate_series('2018-01-01 00:00'::timestamp,'2018-03-04 12:00', '1 day') from generate_series(1,2);
INSERT 0 12600


The main table has some rows. We need to create copy- partitioned table.

test=# create table public.dummy_transactions_prt (id serial, some_val integer, transaction_date timestamp without time zone);
CREATE TABLE


Let's create first partition table as bellow. Note that If your main table has other constraints like foreign key, you should use in your generating partition script. 

CREATE TABLE public.dummy_transactions_p1
(CONSTRAINT dummy_transactions_p1_pk PRIMARY KEY (id),
CONSTRAINT check1 CHECK (transaction_date < '2018-02-01 00:00:00'::timestamp without time zone)
) INHERITS (public.
dummy_transactions_prt);

CREATE TABLE public.dummy_transactions_p2
(CONSTRAINT dummy_transactions_p2_pk PRIMARY KEY (id),
CONSTRAINT check1 CHECK (transaction_date >= '20180201' and transaction_date < '20180301')
) INHERITS (public.
dummy_transactions_prt);


CREATE TABLE public.dummy_transactions_p3
(CONSTRAINT dummy_transactions_p3_pk PRIMARY KEY (id),
CONSTRAINT check1 CHECK (transaction_date >= '20180301')
) INHERITS (public.
dummy_transactions_prt);


Continue with second step like (if you want)


GRANT SELECT, UPDATE, INSERT, DELETE ON TABLE public.dummy_transactions_p1 TO dev;

GRANT SELECT ON TABLE public.dummy_transactions_p1 TO rpt;

GRANT SELECT, UPDATE, INSERT, DELETE ON TABLE public.dummy_transactions_p2 TO dev;

GRANT SELECT ON TABLE public.dummy_transactions_p2 TO rpt;

GRANT SELECT, UPDATE, INSERT, DELETE ON TABLE public.dummy_transactions_p3 TO dev;

GRANT SELECT ON TABLE public.dummy_transactions_p3 TO rpt;

Generate function and trigger

CREATE OR REPLACE FUNCTION transaction_insert() RETURNS TRIGGER AS
$$
BEGIN
 
IF (NEW.transaction_date <'20180201') THEN
INSERT INTO public.dummy_transactions_p1 VALUES (NEW.*);
ELSIF (NEW.transaction_date >='20180201' and NEW.transaction_date <'20180301') THEN
INSERT INTO public.dummy_transactions_p2 VALUES (NEW.*);

ELSIF (NEW.transaction_date >='20180301') THEN
INSERT INTO public.dummy_transactions_p3 VALUES (NEW.*);

ELSE RAISE EXCEPTION 'Date out of range.';
END IF;

RETURN NULL;

END;
$$
LANGUAGE plpgsql;

Than create trigger. In the insert time this trigger looks to above function and all data will insert suitable partition tables. 


CREATE TRIGGER transaction_insert
BEFORE INSERT ON public.dummy_transactions_prt
FOR EACH ROW EXECUTE PROCEDURE transaction_insert();

Check table values

test=# select * from public.dummy_transactions limit 10;
 id | some_val |  transaction_date  
----+----------+---------------------
  1 |        1 | 2018-01-01 00:00:00
  2 |        2 | 2018-01-02 00:00:00
  3 |        3 | 2018-01-03 00:00:00
  4 |        4 | 2018-01-04 00:00:00
  5 |        5 | 2018-01-05 00:00:00
  6 |        6 | 2018-01-06 00:00:00
  7 |        7 | 2018-01-07 00:00:00
  8 |        8 | 2018-01-08 00:00:00
  9 |        9 | 2018-01-09 00:00:00
 10 |       10 | 2018-01-10 00:00:00
(10 rows)

Time: 0.890 ms


Look at bellow result to compere before generating partition table. You see only main table dummy_transactions. 

test=# explain analyze select * from public.dummy_transactions where transaction_date < '20180201';
                                                     QUERY PLAN                
                                    
--------------------------------------------------------------------------------
-------------------------------------
 Seq Scan on dummy_transactions  (cost=0.00..226.50 rows=6200 width=16) (actual
time=0.030..4.214 rows=6200 loops=1)
   Filter: (transaction_date < '2018-02-01 00:00:00'::timestamp without time zon
e)
   Rows Removed by Filter: 6400
 Planning time: 0.107 ms
 Execution time: 4.930 ms
(5 rows)

Time: 5.663 ms



Insert data from old main table to partition tables. There is two choice in that step. If your table is not heavy than insert whole data to partitioned table. 

test=# insert into public.dummy_transactions_prt select * from public.dummy_transactions;
INSERT 0 0


Or the second way, you can separate your data as your partitioned criteria or insert only partitioned tables.

test=# insert into public.dummy_transactions_p1 select * from public.dummy_transactions where transaction_date < '20180201';
INSERT 0 6200

test=# insert into public.dummy_transactions_p2 select * from public.dummy_transactions where transaction_date >= '20180201' and transaction_date < '20180301';
INSERT 0 5600

test=# insert into public.dummy_transactions_p3 select * from public.dummy_transactions where transaction_date >= '20180301';
INSERT 0 800

Check partition tables and main table. Make sure main table doesn't have any row.

test=#  SELECT count(*) FROM ONLY public.dummy_transactions_prt;
 count
-------
     0
(1 row)

Time: 0.583 ms 
 
test=#  SELECT count(*) FROM ONLY public.dummy_transactions_p1;
 count
-------
  6200
(1 row)

What happen after partitioning? Check on emphasized as red in query plan 
test=# explain analyze select sum(some_val) from public.dummy_transactions_prt where transaction_date<'20180201';
                                                            QUERY PLAN         
                                                  
--------------------------------------------------------------------------------
---------------------------------------------------
 Aggregate  (cost=127.00..127.01 rows=1 width=8) (actual time=5.459..5.459 rows=
1 loops=1)
   ->  Append  (cost=0.00..111.50 rows=6201 width=4) (actual time=0.024..4.181 r
ows=6200 loops=1)
         ->  Seq Scan on dummy_transactions_prt  (cost=0.00..0.00 rows=1 width=4
) (actual time=0.004..0.004 rows=0 loops=1)
               Filter: (transaction_date < '2018-02-01 00:00:00'::timestamp with
out time zone)
         ->  Seq Scan on dummy_transactions_p1  (cost=0.00..111.50 rows=6200 wid
th=4) (actual time=0.019..2.904 rows=6200 loops=1)
               Filter: (transaction_date < '2018-02-01 00:00:00'::timestamp with
out time zone)
 Planning time: 0.886 ms
 Execution time: 5.499 ms
(8 rows)

Time: 8.837 ms
test=#

As you see, now queries scanning on partition tables, not need main tables.

If you make sure everything is fine on new main table and sub partition tables, you can rename main table and rename partitioned table as main table than drop old main table.

test=# alter table dummy_transactions rename to dummy_transactions_old;
ALTER TABLE
Time: 2.142 ms
 

test=# alter table dummy_transactions_prt rename to dummy_transactions;
ALTER TABLE
Time: 3.377 ms
 

test=# drop table dummy_transactions_old;
DROP TABLE
Time: 5.470 ms

The last one thing is that, check your partition tables row counts as bellow,

test=# select tableoid::regclass as partition_name, count(*) from dummy_transactions group by 1;
 partition_name                         | count
-----------------------------------------+-------
 
dummy_transactions_p1       |    15
 
dummy_transactions_p2       |    18
 
dummy_transactions_p3       |    60
(3 rows)


Loves,


Comments

Popular posts from this blog

PostgreSQL High Availability - Patroni 2

PostgreSQL Foreign Data Wrappers

PostgreSQL High Availability - Patroni 1