postgres copy - ghdrako/doc_snipets GitHub Wiki

 [ [ WITH ] 
          [ BINARY ]
          [ OIDS ]
          [ DELIMITER [ AS ] 'delimiter' ]
          [ NULL [ AS ] 'null string' ]
          [ CSV [ HEADER ]
                [ QUOTE [ AS ] 'quote' ] 
                [ ESCAPE [ AS ] 'escape' ]
                [ FORCE QUOTE column [, ...] ]

\copy

\copy { table | ( query ) } { from | to } { 'filename' } [ [ with ] ( option [, ...] ) ]

WITH DELIMITER ',' CSV HEADER; – Specifying that this file’s delimiter will be a comma and that it does contain a header.

WITH CSV DELIMITER ',' NULL '' ESCAPE'\' - spacje sa traktowane jako null np gdy ladujemy do pola typu data CSV jest potrzebne bo mozna uzyc ESCAPE aby \ nie byl traktowany jako znak secjalny inaczej ; nie bedzie separatorem

-c option to supply a command to psql

psql -c "\copy sample FROM '/home/MyUser/data/TableName.csv' WITH (FORMAT CSV)"
psql -c "\copy sample FROM /home/MyUser/data/TableName.csv WITH (DELIMITER ',')"

Copy

psql -U postgres -d mydb -c "COPY customers TO 'customers.csv' CSV HEADER;"
psql -U postgres -d postgres -c "COPY employees FROM 'path/to/employees.csv' CSV HEADER;" 
copy (select * from customers limit 5) to stdout with csv header;

COPY command is very efficient in bulk loading of data

COPY has two main operating modes:

  • COPY TO pulls data out of a table and sends it to a file on the filesystem or to another external application or process.
  • COPY FROM loads data from a file on the filesystem or an external application and inserts it into a specified table.

When dealing with external files or programs, the COPY command requires superuser privileges (or at least for the user to belong to the pg_write_server_files group).

COPY forum.categories TO '/tmp/categories.backup.txt';
COPY forum.categories TO '/tmp/categories.csv' WITH ( HEADER on, DELIMITER ';' );
COPY forum.categories TO '/tmp/categories.csv' WITH ( FORMAT csv );
#  COPY TO does not allow for a WHERE clause, but it is possible to copy from a query
COPY
( SELECT * FROM forum.categories
WHERE pk % 2 = 1 )
TO '/tmp/categories.odd.csv'
WITH (FORMAT csv);

# Send data to external program
COPY forum.categories TO PROGRAM $CODE$ awk '{print $2;}' > /tmp/titles.txt $CODE$;
COPY forum.categories_reloaded FROM '/tmp/categories.csv' WITH (FORMAT csv);
COPY forum.categories_reloaded FROM '/tmp/categories.csv' WITH (FORMAT csv) WHERE pk % 2 = 1;
#  Load data generating by bash script
COPY forum.categories_reloaded
FROM PROGRAM $CODE$
/bin/bash -c 'for i in {1..10}; do echo "$i,Title$i,A generated row";
done' $CODE$
WITH (FORMAT csv);

COPY is not usable by unprivileged users. To deal with this, psql provides its own COPY replacement command, named \copy, that streams the content (in either direction) with regard to files accessible to the psql client. This way, the user is able to exploit COPY without needing any particular server privilege like pg_write_server.

psql allows for the COPY command to be called remotely using the psql-specific \copy instruction, which invokes COPY.

\COPY
( SELECT * FROM forum.categories
WHERE pk % 2 = 1 )
TO '/tmp/categories.odd.csv'
WITH (FORMAT csv);
\copy (SELECT * FROM customers LIMIT 5) TO 'my_file.csv' WITH CSV HEADER;
  • \copy is invoking the Postgres COPY ... TO STDOUT... command to output the data.
  • (SELECT * FROM customers LIMIT 5) is the query that we want to copy.
  • TO 'my_file.csv' indicates that psql should save the output from standard into my_file.csv.
  • The WITH CSV HEADER parameters add header to export.

\copy does not allow for commands with new lines. A simple way around this is to create a view containing your data before the \copy command and drop the view after your \copy command has finished.

CREATE TEMP VIEW customers_sample AS (
SELECT *
FROM customers
LIMIT 5
);
\copy customers_sample TO 'my_file.csv' WITH CSV HEADER
DROP VIEW customers_sample;  -- TEMP VIEW is deleted automaticaly in end of session

Configuring COPY and \copy

There are several options to configure the COPY and \copy commands:

  • FORMAT format_name can be used to specify the format. The options for format_name are csv, text, or binary. Alternatively, you can simply specify CSV or BINARY without the FORMAT keyword, or not specify the format at all and let the output default to a text file format.
  • DELIMITER 'delimiter_character' can be used to specify the delimiter character for CSV or text files (for example ',' for CSV files, or '|' for pipe-separated files)
  • NULL 'null_string' can be used to specify how null values should be represented (for example, ' ' if blanks represent null values, or 'NULL' if that's how missing values should be represented in the data).
  • HEADER specifies that the header should be output.
  • QUOTE 'quote_character' can be used to specify how fields with special characters (for example, a comma in a text value within a CSV file) can be wrapped in quotes so that they are ignored by COPY.
  • ESCAPE 'escape_character' specifies the character that can be used to escape the following character.
  • ENCODING 'encoding_name' allows specification of the encoding, which is particularly useful when you are dealing with foreign languages that contain special characters or user input.

Delimiter character as hex

DELIMITER E'\x3A'for : ASCII hex representation as well E'\xB' for vertical tab (VT) character.

Preprocessing file

sed s/~,~/\\t/g inputFile

Loading using temporary table

/* load line to temporary table */
create temporary table t (x text);
copy t from 'foo.csv';
/* split line */
select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') from t;
/* load to destination table */
insert into sent(id, is_alive, body_text)
  select x[1], x[2], x[3] 
  from (
    select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') as x 
      from t) t

Quote strings that contain the delimiter character If the quoted string contains the quote character, double the quote character. Example if you want to represent two values Fred "wiggle" Smith and one, two, you'd do so as: "Fred ""Wiggle"" Smith","one, two"