
Encoding Functions

char

Returns a string whose length equals the number of arguments passed, where each byte has the value of the corresponding argument. Accepts multiple arguments of numeric types. If the value of an argument is outside the range of the UInt8 data type, it is converted to UInt8 with possible rounding and overflow.
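The byte-wise behaviour described above can be modelled with a minimal Python sketch. This is not ClickHouse code: the helper name `char_bytes` is hypothetical, and truncating floats toward zero before wrapping modulo 256 is an assumption about what "possible rounding and overflow" means in practice.

```python
def char_bytes(*numbers):
    """Model of char(): one output byte per numeric argument."""
    out = bytearray()
    for n in numbers:
        # Assumption: floats are truncated toward zero, then the value
        # wraps around modulo 256 so that it fits into UInt8.
        out.append(int(n) & 0xFF)
    return bytes(out)


print(char_bytes(104, 101, 108, 108, 111))   # the five ASCII bytes of "hello"
print(char_bytes(321))                       # 321 wraps to 65, the byte for "A"
# Three bytes that together form one UTF-8 encoded character (U+4F60).
print(char_bytes(0xE4, 0xBD, 0xA0).decode("utf-8") == "\u4f60")
```

The last line illustrates that the function is encoding-agnostic: it only emits bytes, and it is up to the reader of the string to interpret them as UTF-8 or anything else.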

Syntax

Arguments

  • number_1, number_2, ..., number_n — Numerical arguments interpreted as integers. Types: Int, Float.

Returned value

  • a string of given bytes. String.

Example

Query:

Result:

You can construct a string of arbitrary encoding by passing the corresponding bytes. Here is an example for UTF-8:

Query:

Result:

Query:

Result:

hex

Returns a string containing the argument's hexadecimal representation.

Alias: HEX.

Syntax

The function uses uppercase letters A-F and does not use any prefixes (like 0x) or suffixes (like h).

For integer arguments, it prints hex digits ("nibbles") from the most significant to least significant (big-endian or "human-readable" order). It starts with the most significant non-zero byte (leading zero bytes are omitted) but always prints both digits of every byte even if the leading digit is zero.
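As an illustration of the integer rule, the following Python sketch reproduces the described formatting for non-negative integers. The name `hex_uint` is hypothetical, and printing zero as a single `00` byte is an assumption that the text above does not state.

```python
def hex_uint(value):
    """Model of hex() for a non-negative integer argument."""
    if value < 0:
        raise ValueError("this sketch only covers non-negative integers")
    # Smallest number of whole bytes that holds the value, so leading zero
    # bytes are dropped. Assumption: zero itself still occupies one byte.
    n_bytes = max(1, (value.bit_length() + 7) // 8)
    # Big-endian order, two uppercase digits per byte, no prefix or suffix.
    return value.to_bytes(n_bytes, "big").hex().upper()


print(hex_uint(255))    # one byte, two digits
print(hex_uint(4095))   # two bytes; the leading nibble of the first byte is 0
print(hex_uint(65536))  # three bytes
```

Note how 4095 is printed with four digits: the leading zero byte rule works on whole bytes, not on individual nibbles.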

Values of type Date and DateTime are formatted as corresponding integers (the number of days since Epoch for Date and the value of Unix Timestamp for DateTime).

For String and FixedString, every byte is encoded as two hexadecimal digits. Zero bytes are not omitted.

Values of Float and Decimal types are encoded as their representation in memory. As we support little-endian architecture, they are encoded in little-endian. Zero leading/trailing bytes are not omitted.
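The string and floating-point rules can be sketched in Python with the standard `struct` module. The helper names are hypothetical; the sketch assumes IEEE 754 single and double precision for the Float types and does not cover Decimal.

```python
import struct


def hex_string(data):
    """Every byte becomes two uppercase hex digits; zero bytes are kept."""
    return data.hex().upper()


def hex_float32(x):
    """In-memory bytes of a 32-bit float, little-endian, as hex digits."""
    return struct.pack("<f", x).hex().upper()


def hex_float64(x):
    """In-memory bytes of a 64-bit float, little-endian, as hex digits."""
    return struct.pack("<d", x).hex().upper()


print(hex_string(b"Hello"))
print(hex_string(b"\x00A"))   # the zero byte is not omitted
print(hex_float32(15.0))
print(hex_float64(15.0))
```

Because the bytes are emitted in memory order, the float output reads "backwards" compared with the usual big-endian notation of the same bit pattern.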

Values of UUID type are encoded as big-endian order string.

Arguments

Returned value

  • A string with the hexadecimal representation of the argument. String.

Examples

Query:

Result:

Query:

Result:

Query:

Result:

Query:

Result:

unhex

Performs the opposite operation of hex. It interprets each pair of hexadecimal digits (in the argument) as a number and converts it to the byte represented by the number. The return value is a binary string (BLOB).

If you want to convert the result to a number, you can use the reverse and reinterpretAs<Type> functions.
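A rough Python model of this round trip is shown below. The helper names are hypothetical, the sketch only handles an even number of digits, and it assumes that the reinterpretAs<Type> functions read the bytes in little-endian order, which is why a reversal is needed first.

```python
def unhex_even(digits):
    """Model of unhex() for an even number of hex digits (either case)."""
    if len(digits) % 2:
        raise ValueError("this sketch only covers an even number of digits")
    return bytes.fromhex(digits)


def blob_to_uint(blob):
    """Model of reverse() followed by a little-endian reinterpretation."""
    reversed_blob = blob[::-1]                      # reverse(...)
    return int.from_bytes(reversed_blob, "little")  # reinterpretAs<UInt>(...)


blob = unhex_even("0FFF")
print(blob)                # two raw bytes
print(blob_to_uint(blob))  # the number those hex digits denote
```

Reversing and then reading little-endian is equivalent to reading the original bytes as a big-endian number, which matches the "human-readable" digit order that hex produces for integers.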

Note

If unhex is invoked from within the clickhouse-client, binary strings display using UTF-8.

Alias: UNHEX.

Syntax

Arguments

Supports both uppercase and lowercase letters A-F. The number of hexadecimal digits does not have to be even. If it is odd, the last digit is interpreted as the least significant half of the 00-0F byte. If the argument string contains anything other than hexadecimal digits, some implementation-defined result is returned (an exception isn't thrown). For a numeric argument, unhex() does not perform the inverse of hex(N).

Returned value

  • A binary string (BLOB). String.

Example

Query:

Result:

Query:

Result:

bin

Returns a string containing the argument's binary representation.

Syntax

Alias: BIN.

For integer arguments, it prints binary digits from the most significant to least significant (big-endian or "human-readable" order). It starts with the most significant non-zero byte (leading zero bytes are omitted) but always prints all eight digits of every byte, even if the leading digit is zero.
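The integer rule can be illustrated with a short Python sketch. The name `bin_uint` is hypothetical, and printing zero as a single all-zero byte is an assumption not stated in the text.

```python
def bin_uint(value):
    """Model of bin() for a non-negative integer argument."""
    if value < 0:
        raise ValueError("this sketch only covers non-negative integers")
    # Drop leading zero bytes. Assumption: zero still occupies one byte.
    n_bytes = max(1, (value.bit_length() + 7) // 8)
    # Big-endian byte order, always eight digits per byte.
    return "".join(format(b, "08b") for b in value.to_bytes(n_bytes, "big"))


print(bin_uint(14))    # one byte, padded to eight digits
print(bin_uint(256))   # two bytes, sixteen digits
```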

Values of type Date and DateTime are formatted as corresponding integers (the number of days since Epoch for Date and the value of Unix Timestamp for DateTime).

For String and FixedString, every byte is encoded as eight binary digits. Zero bytes are not omitted.

Values of Float and Decimal types are encoded as their representation in memory. As we support little-endian architecture, they are encoded in little-endian. Zero leading/trailing bytes are not omitted.

Values of UUID type are encoded as big-endian order string.

Arguments

Returned value

  • A string with the binary representation of the argument. String.

Examples

Query:

Result:

Query:

Result:

Query:

Result:

Query:

Result:

unbin

Interprets each group of eight binary digits (in the argument) as a number and converts it to the byte represented by the number. The function performs the opposite operation to bin.

Syntax

Alias: UNBIN.

For a numeric argument unbin() does not return the inverse of bin(). If you want to convert the result to a number, you can use the reverse and reinterpretAs<Type> functions.

Note

If unbin is invoked from within the clickhouse-client, binary strings are displayed using UTF-8.

Supports binary digits 0 and 1. The number of binary digits does not have to be a multiple of eight. If the argument string contains anything other than binary digits, some implementation-defined result is returned (an exception isn't thrown).
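A minimal Python model of the well-defined case is sketched below. It assumes that each group of eight binary digits maps to one byte, and it deliberately rejects inputs whose length is not a multiple of eight or that contain other characters, because the text leaves those results implementation-defined. The helper name is hypothetical.

```python
def unbin_whole_bytes(bits):
    """Model of unbin() for a digit count that is a multiple of eight."""
    if len(bits) % 8 or set(bits) - {"0", "1"}:
        raise ValueError("this sketch only covers whole bytes of 0/1 digits")
    # Each group of eight digits becomes one byte, most significant bit first.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


blob = unbin_whole_bytes("001100000011000100110010")
print(blob)                                  # three ASCII bytes
print(int.from_bytes(unbin_whole_bytes("0000000100000000"), "big"))
```

The second call shows the number conversion mentioned above: reading the bytes as big-endian in Python corresponds to reversing them and then reinterpreting them as a little-endian integer.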

Arguments

  • arg — A string containing any number of binary digits. String.

Returned value

  • A binary string (BLOB). String.

Examples

Query:

Result:

Query:

Result:

bitmaskToList(num)

Accepts an integer. Returns a string containing the list of powers of two that total the source number when summed. They are comma-separated without spaces in text format, in ascending order.

bitmaskToArray(num)

Accepts an integer. Returns an array of UInt64 numbers containing the list of powers of two that total the source number when summed. Numbers in the array are in ascending order.
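Both bitmask functions decompose a number into its set bits. A small Python sketch of that decomposition, with hypothetical helper names and restricted to non-negative inputs, could look like this:

```python
def bitmask_to_array(num):
    """Powers of two that sum to num, in ascending order."""
    if num < 0:
        raise ValueError("this sketch only covers non-negative integers")
    return [1 << i for i in range(num.bit_length()) if (num >> i) & 1]


def bitmask_to_list(num):
    """The same powers of two as comma-separated text without spaces."""
    return ",".join(str(p) for p in bitmask_to_array(num))


print(bitmask_to_array(50))
print(bitmask_to_list(50))
```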

bitPositionsToArray(num)

Accepts an integer and converts it to an unsigned integer. Returns an array of UInt64 numbers containing the list of positions of bits of arg that equal 1, in ascending order.
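The conversion to an unsigned integer matters for negative inputs. The Python sketch below models it with an explicit `width` parameter, which is an assumption of this example (Python integers have no fixed width, so the bit width of the source type has to be supplied by hand).

```python
def bit_positions_to_array(num, width=64):
    """Positions of the 1 bits of num, after viewing it as unsigned."""
    # Two's complement reinterpretation as an unsigned integer of `width` bits.
    unsigned = num & ((1 << width) - 1)
    return [i for i in range(width) if (unsigned >> i) & 1]


print(bit_positions_to_array(50))            # bits 1, 4 and 5 are set
print(bit_positions_to_array(-1, width=8))   # an 8-bit -1 has all bits set
```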

Syntax

Arguments

Returned value

  • An array containing a list of positions of bits that equal 1, in ascending order. Array(UInt64).

Example

Query:

Result:

Query:

Result:

mortonEncode

Calculates the Morton encoding (ZCurve) for a list of unsigned integers.

The function has two modes of operation:

  • Simple
  • Expanded

Simple mode

Accepts up to 8 unsigned integers as arguments and produces a UInt64 code.
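Morton encoding interleaves the bits of its arguments. The Python sketch below shows the general technique; it is not the ClickHouse implementation. The helper name is hypothetical, and placing the first argument in the least significant bit of each group is an assumption, since the text does not state the bit order.

```python
def morton_encode(*coords):
    """Interleave the bits of up to 8 unsigned integers into one 64-bit code."""
    n = len(coords)
    if not 1 <= n <= 8:
        raise ValueError("expected between 1 and 8 arguments")
    bits_per_arg = 64 // n  # each argument gets an equal share of 64 bits
    code = 0
    for bit in range(bits_per_arg):
        for dim, value in enumerate(coords):
            # Assumption: argument `dim` lands at offset `dim` in each group.
            if (value >> bit) & 1:
                code |= 1 << (bit * n + dim)
    return code


print(morton_encode(1, 2, 3))
print(morton_encode(7))   # a single argument is returned unchanged
```

Bits of an argument above its 64 // n share are simply ignored by this sketch.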

Syntax

Parameters

Returned value

Example

Query:

Result:

Expanded mode

Accepts a range mask (tuple) as a first argument and up to 8 unsigned integers as other arguments.

Each number in the mask configures the amount of range expansion:
1 - no expansion
2 - 2x expansion
3 - 3x expansion
...
Up to 8x expansion.

Syntax

Parameters

  • range_mask: 1-8.
  • args: up to 8 unsigned integers or columns of the aforementioned type.

Note: when using columns for args the provided range_mask tuple should still be a constant.

Returned value

Example

Range expansion can be beneficial when you need a similar distribution for arguments with wildly different ranges (or cardinality). For example: 'IP Address' (0...FFFFFFFF) and 'Country code' (0...FF).

Query:

Result:

Note: tuple size must be equal to the number of the other arguments.

Example

Morton encoding for one argument is always the argument itself:

Query:

Result:

Example

It is also possible to expand a single argument:

Query:

Result:

Example

You can also use column names in the function.

Query:

First create the table and insert some data.

Use column names instead of constants as function arguments to mortonEncode

Query:

Result:

Implementation details

Please note that a Morton code can hold only as many bits of information as a UInt64 has. Two arguments will each have a range of at most 2^32 (64/2), three arguments a range of at most 2^21 (64/3) each, and so on. All overflow will be clamped to zero.
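The arithmetic behind these limits is an integer division of the 64 available bits by the number of arguments. A short Python sketch (the helper name is hypothetical) makes the progression explicit:

```python
def bits_per_argument(n_args, code_bits=64):
    """Number of bits each argument can contribute to the code."""
    if not 1 <= n_args <= 8:
        raise ValueError("expected between 1 and 8 arguments")
    return code_bits // n_args


for n in range(1, 9):
    bits = bits_per_argument(n)
    # Each argument can take values in the range 0 .. 2**bits - 1.
    print(n, "arguments:", bits, "bits each, range size", 2 ** bits)
```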

mortonDecode

Decodes a Morton encoding (ZCurve) into the corresponding unsigned integer tuple.

As with the mortonEncode function, this function has two modes of operation:

  • Simple
  • Expanded

Simple mode

Accepts a resulting tuple size as the first argument and the code as the second argument.
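Decoding reverses the bit interleaving. The following Python sketch shows the general technique under the assumption that the first tuple element occupies the least significant bit of each group; the actual bit order is not stated in the text, and the helper name is hypothetical.

```python
def morton_decode(tuple_size, code):
    """Split a Morton code back into `tuple_size` unsigned integers."""
    if not 1 <= tuple_size <= 8:
        raise ValueError("tuple_size must be between 1 and 8")
    coords = [0] * tuple_size
    for pos in range(code.bit_length()):
        if (code >> pos) & 1:
            # Bit `pos` of the code belongs to element pos % tuple_size
            # and is that element's bit number pos // tuple_size.
            coords[pos % tuple_size] |= 1 << (pos // tuple_size)
    return tuple(coords)


print(morton_decode(3, 53))
print(morton_decode(1, 7))   # a one-element tuple holding the code itself
```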

Syntax

Parameters

  • tuple_size: integer value no more than 8.
  • code: UInt64 code.

Returned value

Example

Query:

Result:

Expanded mode

Accepts a range mask (tuple) as a first argument and the code as the second argument. Each number in the mask configures the amount of range shrink:
1 - no shrink
2 - 2x shrink
3 - 3x shrink
...
Up to 8x shrink.

Range expansion can be beneficial when you need a similar distribution for arguments with wildly different ranges (or cardinality). For example: 'IP Address' (0...FFFFFFFF) and 'Country code' (0...FF). As with the encode function, this is limited to 8 numbers at most.

Example

Query:

Result:

Example

It is also possible to shrink one argument:

Query:

Result:

Example

You can also use column names in the function.

First create the table and insert some data.

Query:

Use column names instead of constants as function arguments to mortonDecode

Query:

Result:

hilbertEncode

Calculates code for Hilbert Curve for a list of unsigned integers.

The function has two modes of operation:

  • Simple
  • Expanded

Simple mode

Accepts up to 2 unsigned integers as arguments and produces a UInt64 code.
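For readers unfamiliar with Hilbert curves, the sketch below shows the classic iterative mapping from a 2D point to its index on a 2^order by 2^order grid. It illustrates the general technique only: the helper name and the `order` parameter are hypothetical, and the grid size and curve orientation used by hilbertEncode are not specified in this text, so the values may differ from what the function returns.

```python
def hilbert_xy2d(order, x, y):
    """Index of point (x, y) along a Hilbert curve on a 2**order grid."""
    n = 1 << order
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("coordinates must lie inside the grid")
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Each quadrant contributes s*s cells times its visiting order.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip so the sub-quadrant is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# The four cells of a 2x2 grid in the order the curve visits them.
print([hilbert_xy2d(1, x, y) for (x, y) in [(0, 0), (0, 1), (1, 1), (1, 0)]])
print(hilbert_xy2d(3, 3, 4))
```

Unlike bit interleaving, consecutive indices on a Hilbert curve always belong to neighbouring cells, which is the property that makes the curve attractive for locality-preserving ordering.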

Syntax

Parameters

Returned value

  • A UInt64 code

Type: UInt64

Example

Query:

Result:

Expanded mode

Accepts a range mask (tuple) as a first argument and up to 2 unsigned integers as other arguments.

Each number in the mask configures the number of bits by which the corresponding argument will be shifted left, effectively scaling the argument within its range.
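The shifting step described above can be modelled in a few lines of Python. The sketch covers only the scaling of the arguments by the mask, not the curve computation that follows, and the helper name is hypothetical.

```python
def apply_range_mask(range_mask, args):
    """Shift each argument left by the matching number of bits in the mask."""
    if len(range_mask) != len(args):
        raise ValueError("mask size must equal the number of arguments")
    return tuple(a << shift for a, shift in zip(args, range_mask))


print(apply_range_mask((2,), (128,)))     # 128 shifted left by 2 bits
print(apply_range_mask((1, 2), (3, 4)))   # each argument uses its own shift
```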

Syntax

Parameters

Note: when using columns for args the provided range_mask tuple should still be a constant.

Returned value

  • A UInt64 code

Type: UInt64

Example

Range expansion can be beneficial when you need a similar distribution for arguments with wildly different ranges (or cardinality). For example: 'IP Address' (0...FFFFFFFF) and 'Country code' (0...FF).

Query:

Result:

Note: tuple size must be equal to the number of the other arguments.

Example

For a single argument without a tuple, the function returns the argument itself as the Hilbert index, since no dimensional mapping is needed.

Query:

Result:

Example

If a single argument is provided with a tuple specifying bit shifts, the function shifts the argument left by the specified number of bits.

Query:

Result:

Example

The function also accepts columns as arguments:

Query:

First create the table and insert some data.

Use column names instead of constants as function arguments to hilbertEncode

Query:

Result:

Implementation details

Please note that a Hilbert code can hold only as many bits of information as a UInt64 has. Two arguments will each have a range of at most 2^32 (64/2). All overflow will be clamped to zero.

hilbertDecode

Decodes a Hilbert curve index back into a tuple of unsigned integers, representing coordinates in multi-dimensional space.

As with the hilbertEncode function, this function has two modes of operation:

  • Simple
  • Expanded

Simple mode

Accepts a resulting tuple size as the first argument and the code as the second argument.

Syntax

Parameters

  • tuple_size: integer value no more than 2.
  • code: UInt64 code.
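To illustrate what decoding a Hilbert index involves when the tuple size is 2, the sketch below shows the classic iterative mapping from an index back to a point on a 2^order by 2^order grid. It demonstrates the general technique only: the helper name and the `order` parameter are hypothetical, and the grid size and curve orientation used by hilbertDecode are not specified in this text, so the values may differ from what the function returns.

```python
def hilbert_d2xy(order, d):
    """Point (x, y) at index d along a Hilbert curve on a 2**order grid."""
    n = 1 << order
    if not 0 <= d < n * n:
        raise ValueError("index must lie inside the grid")
    x = y = 0
    t = d
    s = 1
    while s < n:
        # Two bits of the index select the quadrant at this level.
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Undo the rotation or flip applied to this quadrant.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return (x, y)


# The first four indices trace the cells of a 2x2 grid.
print([hilbert_d2xy(1, d) for d in range(4)])
print(hilbert_d2xy(3, 31))
```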

Returned value

  • tuple of the specified size.

Type: UInt64

Example

Query:

Result:

Expanded mode

Accepts a range mask (tuple) as the first argument and the code as the second argument. Each number in the mask configures the number of bits by which the corresponding argument will be shifted right, effectively scaling the argument within its range.

Range expansion can be beneficial when you need a similar distribution for arguments with wildly different ranges (or cardinality). For example: 'IP Address' (0...FFFFFFFF) and 'Country code' (0...FF). As with the encode function, this is limited to 2 numbers at most.

Example

Hilbert code for one argument is always the argument itself (as a tuple).

Query:

Result:

Example

A single argument with a tuple specifying bit shifts will be right-shifted accordingly.

Query:

Result:

Example

The function accepts a column of codes as a second argument:

First create the table and insert some data.

Query:

Use column names instead of constants as function arguments to hilbertDecode

Query:

Result: