PostgreSQL binary data - ghdrako/doc_snipets GitHub Wiki

The bytea data type allows storage of binary strings

A binary string is a sequence of octets (or bytes). Binary strings are distinguished from character strings in two ways. First, binary strings specifically allow storing octets of value zero and other "non-printable" octets (usually, octets outside the range 32 to 126). Character strings disallow zero octets, and also disallow any other octet values and sequences of octet values that are invalid according to the database's selected character set encoding. Second, operations on binary strings process the actual bytes, whereas the processing of character strings depends on locale settings. In short, binary strings are appropriate for storing data that the programmer thinks of as "raw bytes", whereas character strings are appropriate for storing text.

The bytea type supports two external formats for input and output:

  • PostgreSQL's historical "escape" format, and
  • "hex" format.

Hex format

encodes binary data as 2 hexadecimal digits per byte, most significant nibble first. The entire string is preceded by the sequence \x (to distinguish it from the escape format).



SELECT e'string with a \0 byte';
SELECT '{"a_json_object": "with_a_\u0000_byte"}'::jsonb;

 SELECT convert_from('foo\000bar'::bytea, 'unicode');
 select 'foo\000bar'::bytea
 
  select position('\0'::bytea in 'foo\000bar'::bytea)
  
  select convert_from('\x03FF'::bytea, 'unicode');
  
  
  
select position('\000om'::bytea in 'Th\000omas'::bytea);
select position('\00'::bytea in 'Th\000omas'::bytea); 
 
  SELECT convert_from(e'foo0\0bar'::bytea, 'unicode');
 
 
 SELECT REPLACE(e'ALA',"A","a");
 
 SELECT
	REPLACE (e'ABC AA \0', 'A', 'Z');
	
	
select convert_from(metadata, 'UTF8');	
	
	
	SELECT convert_from(E'\\xDEADBEEF'::bytea, 'unicode') as value
	
	SELECT E'\\x00000000'::bytea as value
	
SELECT E'\\xDEADBEEF';

escape format

It takes the approach of representing a binary string as a sequence of ASCII characters, while converting those bytes that cannot be represented as an ASCII character into special escape sequences.