Data Types

DataFusion uses Arrow, and thus the Arrow type system, for query execution. The SQL types from sqlparser-rs are mapped to Arrow data types according to the following table. This mapping occurs when defining the schema in a CREATE EXTERNAL TABLE command or when performing a SQL CAST operation.

You can see the corresponding Arrow type for any SQL expression using the arrow_typeof function. For example:

select arrow_typeof(interval '1 month');
+---------------------------------------------------------------------+
| arrow_typeof(IntervalMonthDayNano("79228162514264337593543950336")) |
+---------------------------------------------------------------------+
| Interval(MonthDayNano)                                              |
+---------------------------------------------------------------------+
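
The long integer in the column header above is the interval packed into a single 128-bit value. The following is a minimal sketch of that packing, assuming a layout with months in the top 32 bits, days in the next 32 bits, and nanoseconds in the low 64 bits. The layout is an assumption about the Arrow version that produced this output, and `pack_month_day_nano` is a hypothetical helper, not a DataFusion function.

```python
def pack_month_day_nano(months: int, days: int, nanos: int) -> int:
    """Pack interval fields into one 128-bit integer (assumed layout)."""
    return (
        ((months & 0xFFFFFFFF) << 96)        # months: top 32 bits
        | ((days & 0xFFFFFFFF) << 64)        # days: next 32 bits
        | (nanos & 0xFFFFFFFFFFFFFFFF)       # nanoseconds: low 64 bits
    )


one_month = pack_month_day_nano(1, 0, 0)
print(one_month)  # 79228162514264337593543950336, which is 2**96
```

Under that assumption, `interval '1 month'` has only the months field set, which is why the header shows 2**96.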

You can cast a SQL expression to a specific Arrow type using the arrow_cast function. For example, to cast the output of now() to a Timestamp with second precision:

select arrow_cast(now(), 'Timestamp(Second, None)');
+---------------------+
| now()               |
+---------------------+
| 2023-03-03T17:19:21 |
+---------------------+
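
To illustrate what second precision means, here is a rough sketch in Python. It assumes that a Timestamp(Second, None) value is a count of whole seconds since the Unix epoch with no time zone attached, and that the sub-second part is dropped rather than rounded. The dropping behaviour is an assumption, and `to_epoch_seconds` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_seconds(ts: datetime) -> int:
    """Whole seconds since the Unix epoch, discarding any sub-second part."""
    return (ts - EPOCH) // timedelta(seconds=1)


# A timestamp with a fractional part, similar to what now() might return.
ts = datetime(2023, 3, 3, 17, 19, 21, 987654, tzinfo=timezone.utc)
seconds = to_epoch_seconds(ts)
restored = EPOCH + timedelta(seconds=seconds)

print(seconds)                                    # 1677863961
print(restored.replace(tzinfo=None).isoformat())  # 2023-03-03T17:19:21
```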

Character Types

| SQL DataType | Arrow DataType |
|--------------|----------------|
| CHAR         | Utf8           |
| VARCHAR      | Utf8           |
| TEXT         | Utf8           |
| STRING       | Utf8           |

Numeric Types

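| SQL DataType                     | Arrow DataType                | Notes                                             |
|----------------------------------|-------------------------------|---------------------------------------------------|
| TINYINT                          | Int8                          |                                                   |
| SMALLINT                         | Int16                         |                                                   |
| INT or INTEGER                   | Int32                         |                                                   |
| BIGINT                           | Int64                         |                                                   |
| TINYINT UNSIGNED                 | UInt8                         |                                                   |
| SMALLINT UNSIGNED                | UInt16                        |                                                   |
| INT UNSIGNED or INTEGER UNSIGNED | UInt32                        |                                                   |
| BIGINT UNSIGNED                  | UInt64                        |                                                   |
| FLOAT                            | Float32                       |                                                   |
| REAL                             | Float32                       |                                                   |
| DOUBLE                           | Float64                       |                                                   |
| DECIMAL(precision, scale)        | Decimal128(precision, scale)  | Decimal support is currently experimental (#3523) |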
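
As a worked example of what precision and scale mean, the sketch below computes the unscaled integer that a Decimal128 value stores: the decimal value multiplied by 10 to the power of scale. It is illustrative only. `to_unscaled` is a hypothetical helper, and it rejects values that do not fit, whereas an actual cast may round instead.

```python
from decimal import Decimal

MAX_DECIMAL128_PRECISION = 38  # most digits a 128-bit decimal can hold


def to_unscaled(value: str, precision: int, scale: int) -> int:
    """Return the integer stored for `value` under DECIMAL(precision, scale)."""
    if not 1 <= precision <= MAX_DECIMAL128_PRECISION:
        raise ValueError("precision must be between 1 and 38")
    scaled = Decimal(value) * (10 ** scale)
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more fractional digits than scale allows")
    unscaled = int(scaled)
    if len(str(abs(unscaled))) > precision:
        raise ValueError("value has more digits than precision allows")
    return unscaled


print(to_unscaled("123.45", precision=5, scale=2))  # 12345
print(to_unscaled("-0.5", precision=3, scale=2))    # -50
```

Here DECIMAL(5, 2) can hold 123.45 because the stored integer 12345 has five digits, while 1234.5 would need six digits at scale 2 and is rejected by this sketch.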

Date/Time Types

| SQL DataType | Arrow DataType              |
|--------------|-----------------------------|
| DATE         | Date32                      |
| TIME         | Time64(Nanosecond)          |
| TIMESTAMP    | Timestamp(Nanosecond, None) |
| INTERVAL     | Interval(MonthDayNano)      |
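
The Arrow types in this table are stored as integers: Date32 is a 32-bit count of days since 1970-01-01, and Timestamp(Nanosecond, None) is a 64-bit count of nanoseconds since the same epoch. The sketch below, using only the Python standard library, shows the Date32 value for one date and the latest instant a 64-bit nanosecond count can reach. `date32` is a hypothetical helper name.

```python
from datetime import date, datetime, timedelta, timezone


def date32(d: date) -> int:
    """Days since the Unix epoch, the value a Date32 column stores."""
    return (d - date(1970, 1, 1)).days


print(date32(date(2023, 3, 3)))  # 19419

# Largest signed 64-bit nanosecond count, converted to microseconds because
# Python's datetime does not represent nanoseconds.
INT64_MAX = 2**63 - 1
latest = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
    microseconds=INT64_MAX // 1000
)
print(latest.isoformat())  # 2262-04-11T23:47:16.854775+00:00
```

The second result is why nanosecond timestamps cannot represent dates beyond the year 2262.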

Boolean Types

| SQL DataType | Arrow DataType |
|--------------|----------------|
| BOOLEAN      | Boolean        |

Binary Types

| SQL DataType | Arrow DataType |
|--------------|----------------|
| BYTEA        | Binary         |

You can create binary literals using a hex string literal such as X'1234' to create a Binary value of two bytes, 0x12 and 0x34.
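
The decoding of such a literal can be sketched in Python as follows. This only illustrates how pairs of hex digits become bytes; `decode_hex_literal` is a hypothetical helper and not how DataFusion parses the literal.

```python
def decode_hex_literal(literal: str) -> bytes:
    """Decode a SQL hex string literal such as X'1234' into raw bytes."""
    well_formed = (
        len(literal) >= 3
        and literal[0] in ("x", "X")
        and literal[1] == "'"
        and literal[-1] == "'"
    )
    if not well_formed:
        raise ValueError(f"not a hex string literal: {literal!r}")
    # Each pair of hex digits between the quotes becomes one byte.
    return bytes.fromhex(literal[2:-1])


value = decode_hex_literal("X'1234'")
print(len(value))               # 2
print([hex(b) for b in value])  # ['0x12', '0x34']
```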

Unsupported SQL Types

| SQL DataType | Arrow DataType    |
|--------------|-------------------|
| UUID         | Not yet supported |
| BLOB         | Not yet supported |
| CLOB         | Not yet supported |
| BINARY       | Not yet supported |
| VARBINARY    | Not yet supported |
| REGCLASS     | Not yet supported |
| NVARCHAR     | Not yet supported |
| CUSTOM       | Not yet supported |
| ARRAY        | Not yet supported |
| ENUM         | Not yet supported |
| SET          | Not yet supported |
| DATETIME     | Not yet supported |
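
For readers who want the mapping in programmatic form, the tables above can be transcribed into a small lookup. This is a sketch built only from the tables on this page, not DataFusion's implementation. The names `SQL_TO_ARROW`, `UNSUPPORTED`, and `arrow_type_for` are hypothetical.

```python
import re

# Transcribed from the mapping tables on this page.
SQL_TO_ARROW = {
    "CHAR": "Utf8", "VARCHAR": "Utf8", "TEXT": "Utf8", "STRING": "Utf8",
    "TINYINT": "Int8", "SMALLINT": "Int16",
    "INT": "Int32", "INTEGER": "Int32", "BIGINT": "Int64",
    "TINYINT UNSIGNED": "UInt8", "SMALLINT UNSIGNED": "UInt16",
    "INT UNSIGNED": "UInt32", "INTEGER UNSIGNED": "UInt32",
    "BIGINT UNSIGNED": "UInt64",
    "FLOAT": "Float32", "REAL": "Float32", "DOUBLE": "Float64",
    "DATE": "Date32", "TIME": "Time64(Nanosecond)",
    "TIMESTAMP": "Timestamp(Nanosecond, None)",
    "INTERVAL": "Interval(MonthDayNano)",
    "BOOLEAN": "Boolean", "BYTEA": "Binary",
}

UNSUPPORTED = {
    "UUID", "BLOB", "CLOB", "BINARY", "VARBINARY", "REGCLASS",
    "NVARCHAR", "CUSTOM", "ARRAY", "ENUM", "SET", "DATETIME",
}

_DECIMAL = re.compile(r"^DECIMAL\(\s*(\d+)\s*,\s*(\d+)\s*\)$")


def arrow_type_for(sql_type: str) -> str:
    """Look up the Arrow type name listed for a SQL type name."""
    name = " ".join(sql_type.upper().split())  # normalize case and spacing
    decimal = _DECIMAL.match(name)
    if decimal:
        precision, scale = int(decimal.group(1)), int(decimal.group(2))
        return f"Decimal128({precision}, {scale})"
    if name in UNSUPPORTED:
        raise NotImplementedError(f"{name} is not yet supported")
    if name not in SQL_TO_ARROW:
        raise ValueError(f"unknown SQL type: {sql_type!r}")
    return SQL_TO_ARROW[name]


print(arrow_type_for("int unsigned"))    # UInt32
print(arrow_type_for("DECIMAL(10, 2)"))  # Decimal128(10, 2)
```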

Supported Arrow Types

The following types are supported by the arrow_typeof function:

Arrow Type
Null
Boolean
Int8
Int16
Int32
Int64
UInt8
UInt16
UInt32
UInt64
Float16
Float32
Float64
Utf8
LargeUtf8
Binary
Timestamp(Second, None)
Timestamp(Millisecond, None)
Timestamp(Microsecond, None)
Timestamp(Nanosecond, None)
Time32
Time64
Duration(Second)
Duration(Millisecond)
Duration(Microsecond)
Duration(Nanosecond)
Interval(YearMonth)
Interval(DayTime)
Interval(MonthDayNano)
FixedSizeBinary(<len>) (e.g. FixedSizeBinary(16))
Decimal128(<precision>, <scale>) (e.g. Decimal128(10, 3))
Decimal256(<precision>, <scale>) (e.g. Decimal256(10, 3))