postgres data type denormalized - ghdrako/doc_snipets GitHub Wiki
- Boolean type
- Numeric types
- Character types
- Date/time
- Composite types, arrays, ranges
- NoSQL data types: hstore, xml, json, and jsonb
- point
PostgreSQL supports a Boolean data type. The Boolean type (identified by BOOLEAN or BOOL), like all data types supported by PostgreSQL, can assume the NULL value. Therefore, a Boolean column can take the values NULL, FALSE, and TRUE.
alter table users add user_on_line boolean;
update users set user_on_line = true where pk=1;
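Because a Boolean column can also hold NULL, a test for the unknown value must use is null rather than = null. A minimal sketch against the users table above:
select pk, user_on_line
from users
where user_on_line is null;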
- integer or int4 (4-byte integer number).
- bigint or int8 (8-byte integer number).
- real (4-byte variable precision, inexact with 6-decimal-digit precision).
- double precision (8-byte variable precision, inexact with 15-decimal-digit precision).
- numeric (precision, scale), where the precision of a numeric is the total count of significant digits in the whole number, and the scale of a numeric is the count of decimal digits in the fractional part. For example, 5.827 has a precision of 4 and a scale of 3.
select 1.123456789::integer as my_field;
1
select 1.123456789::int4 as my_field;
1
select 1.123456789::bigint as my_field;
1
select 1.123456789::int8 as my_field;
1
select 1.123456789::real as my_field;
1.1234568
select 1.123456789::double precision as my_field;
1.123456789
Let's perform the sum of the value 0.1 ten times. The correct result would be the number 1.
select sum(0.1::real) from generate_series(1,10);
sum
-----------
1.0000001
We get the value 1.0000001 instead. This happens because of the intrinsic rounding error of the real data type, so real is not recommended for fields that represent money. The correct way to compute this sum is to use the numeric data type.
select 1.123456789::numeric(10,1) as my_field;
1.1
select 1.123456789::numeric(10,5) as my_field;
1.12346
select 1.123456789::numeric(10,9) as my_field;
1.123456789
select 1.123456789::numeric(10,10) as my_field;
ERROR: numeric field overflow
DETAIL: A field with precision 10, scale 10 must round to an absolute
value less than 1.
select 0.123456789::numeric(10,10) as my_field;
0.1234567890
select sum(0.1::numeric(2,2)) from generate_series(1,10);
1.0
- character(n)/char(n) (fixed-length, blank-padded)
- character varying(n)/varchar(n) (variable length with a limit)
- varchar/text (variable unlimited length)
For example, char(10) is a char with a fixed length of 10: even if we insert a shorter string, the difference between 10 and the actual number of characters in the string is filled with blank characters.
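A minimal sketch of the padding behaviour, assuming a throwaway char_demo table (hypothetical name); octet_length() does not strip trailing blanks, so it should report the padded width of the char(10) value:
create table char_demo
(
fixed char(10),
variable varchar(10)
);
insert into char_demo values ('abc', 'abc');
select octet_length(fixed) as fixed_len,      -- expected 10 (padded)
octet_length(variable) as variable_len        -- expected 3
from char_demo;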
- https://www.postgresql.org/docs/devel/arrays.html
- https://www.w3resource.com/PostgreSQL/postgresql-array-operators-and-functions.php
PostgreSQL has built-in support for arrays, with the ability to process array elements directly from SQL. This capability includes indexing facilities thanks to GIN indexing.
Arrays can be used to denormalize data and avoid lookup tables. A good rule of thumb for using them that way is that you mostly use the array as a whole, even if you might at times search for elements in the array. Heavier processing is going to be more complex than a lookup table.
A classic example of a good use case for PostgreSQL arrays is user-defined tags.
create table hashtag
(
id bigint primary key,
date timestamptz,
uname text,
message text,
location point,
hashtags text[]
);
- regexp_matches(message, '(#[^ ,]+)', 'g') as match: regexp_matches() with the 'g' flag returns every match found, not just the first tag in a message; the multiple matches are returned one per row.
- array_agg(match[1] order by match[1]) as hashtags: aggregates the matched tags back into a single array.
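Putting the two together, a minimal sketch that extracts the hashtags of each message, assuming a hypothetical tweet table with pk and message columns:
select t.pk,
array_agg(match[1] order by match[1]) as hashtags
from tweet t,
regexp_matches(t.message, '(#[^ ,]+)', 'g') as match
group by t.pk;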
- unnest(): this function allows processing the array's content as if it were just another relation, as in the following top-ten tags query:
select tag, count(*)
from hashtag, unnest(hashtags) as t(tag)
group by tag
order by count desc
limit 10;
create index on hashtag using gin (hashtags);
The GIN index access method allows PostgreSQL to index the contents of the arrays, the tags themselves, rather than each array as an opaque value. If the array approach turns out to be inefficient, the replacement for the array is a lookup table.
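With the GIN index in place, array containment searches can use it. For example, assuming the tag '#postgresql' appears in the data:
select *
from hashtag
where hashtags @> array['#postgresql'];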
In PL/pgSQL, an array variable is declared with the element type followed by square brackets:
DECLARE my_array INTEGER[];
You can initialize an array with values using curly braces:
DECLARE my_array INTEGER[] := '{1, 2, 3}';
You can access individual elements of an array using square brackets. The following is a simple example:
postgres=# DO
$$
DECLARE
my_array INTEGER[] := '{1, 2, 3}';
second_element INTEGER;
BEGIN
second_element := my_array[2];
RAISE NOTICE 'second element from my array is: %', second_element;
END;
$$;
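Arrays can also be manipulated directly in SQL with the built-in array functions and operators; a minimal sketch:
select array_append(array[1,2,3], 4) as appended,  -- {1,2,3,4}
array_length(array[1,2,3], 1) as len,              -- 3
2 = any (array[1,2,3]) as contains_two;            -- true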
begin;
-- the exclusion constraint below compares a text column with =,
-- which requires the btree_gist extension for GiST support
create extension if not exists btree_gist;
create type rate_t as
(
currency text,
validity daterange,
value numeric
);
create table rate of rate_t
(
exclude using gist (currency with =, validity with &&)
);
insert into rate(currency, validity, value)
select currency, validity, rate
from rates;
commit;
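The range column makes lookups by date straightforward; a minimal sketch, assuming a 'Euro' currency and an arbitrary date present in the data (the exclusion constraint guarantees at most one matching row):
select value
from rate
where currency = 'Euro'
and validity @> date '2017-05-18';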
In advanced cases it is useful to build composite types:
- Management of a stored procedure API
- Advanced use cases of arrays of composite types
- hstore
- xml
- json/jsonb
- https://www.postgresql.org/docs/current/hstore.html This data type is used for storing key-value pairs in a single value. Before working with the hstore data type, we need to enable the hstore extension on our server:
create extension hstore;
create table posts_options as
select p.pk, p.title,
hstore(ARRAY['username', u.username, 'category', c.title]) as options
from posts p
inner join users u on p.author = u.pk
left join categories c on p.category = c.pk
order by 1;
select * from posts_options where options->'category'='Database';
insert into posts_options (pk, title, options) values (7, 'my last post', '"enabled"=>"false"');
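hstore also provides operators to test for a key and to merge key-value pairs; for instance, ? checks whether a key exists and || concatenates two hstore values:
select * from posts_options where options ? 'enabled';
update posts_options set options = options || '"enabled"=>"true"'::hstore where pk = 7;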
- https://www.postgresql.org/docs/current/datatype-xml.html
- https://www.postgresql.org/docs/current/functions-xml.html The best option when you need to process XML documents might be the XSLT transformation language for XML. The PostgreSQL extension PL/XSLT allows writing stored procedures in this language if you have to deal with XML documents.
create extension plxslt;
CREATE OR REPLACE FUNCTION striptags(xml) RETURNS text
LANGUAGE xslt
AS $$<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml" >
<xsl:output method="text" omit-xml-declaration="yes"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
$$;
Usage
create table docs
(
id serial primary key,
content xml
);
insert into docs(content)
values ('<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>hello</body>
</html>');
select id, striptags(content) from docs;
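Without PL/XSLT, the built-in xpath() function can also extract pieces of the stored documents; since the sample document declares the XHTML namespace, a namespace mapping is needed:
select id,
xpath('//x:body/text()', content,
ARRAY[ARRAY['x', 'http://www.w3.org/1999/xhtml']]) as body_text
from docs;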
Find out the ten nearest circuits to Paris, France, which is at longitude 2.349014 and latitude 48.864716. That's a kNN search with k = 10:
select name, location, country
from circuits
order by point(lng,lat) <-> point(2.349014, 48.864716)
limit 10;
begin;
alter table f1db.circuits add column position point;
update f1db.circuits set position = point(lng,lat);
create index on f1db.circuits using gist(position);
commit;
-- with the GiST index in place, the point distance ordering is efficient
explain (costs off, buffers, analyze)
select name, location, country
from circuits
order by position <-> point(2.349014, 48.864716)
limit 10;
By default, the distance operator <-> is defined only for geometric data types in PostgreSQL. Some extensions such as pg_trgm add to that list, so that you may benefit from a kNN index lookup in other situations, such as in queries using the like operator, or even the regular expression operator ~.
A lateral left join can be used to implement a kNN lookup with an order by ... <-> ... limit k clause. The <-> operator computes the distance between its arguments, and by using the limit 1 clause we select the nearest known location.
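A minimal sketch, assuming a hypothetical place table with name and location (point) columns: for each circuit, find the nearest place.
select circuits.name, nearest.place_name
from f1db.circuits
left join lateral
(
select p.name as place_name
from place p
order by p.location <-> circuits.position
limit 1
) as nearest
on true;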