Appendix A. Supported data types in PyTables

All PyTables datasets can handle the complete set of data types supported by the NumPy (see [8]), numarray (see [10]) and Numeric (see [9]) packages in Python. The data types for table fields can be set via instances of the Col class and its descendants (see Section 4.13.2), while the data type of array elements can be set through the use of the Atom class and its descendants (see Section 4.13.1).

PyTables uses ordinary strings to represent its types, with most of them matching the names of NumPy scalar types. Usually, a PyTables type consists of two parts: a kind and a precision in bits. The precision may be omitted in types with just one supported precision (like bool) or with a non-fixed size (like string).
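
As an illustration of the kind-plus-precision naming scheme, the following minimal sketch splits a type string into its two parts using only the Python standard library. The helper name split_kind_precision is hypothetical and is not part of the PyTables API; it simply assumes that a type name is a run of lowercase letters optionally followed by a precision in bits.

```python
import re

# Hypothetical helper for illustration only; not part of PyTables.
# Assumption: a type name is lowercase letters plus an optional bit count.
_TYPE_RE = re.compile(r"([a-z]+)(\d+)?")


def split_kind_precision(type_name):
    """Split a type string such as 'int32' into ('int', 32).

    The precision is None when the type name carries no bit count.
    """
    match = _TYPE_RE.fullmatch(type_name)
    if match is None:
        raise ValueError("not a valid type name: %r" % (type_name,))
    kind, bits = match.groups()
    return kind, (int(bits) if bits is not None else None)


print(split_kind_precision("int32"))       # ('int', 32)
print(split_kind_precision("complex128"))  # ('complex', 128)
print(split_kind_precision("bool"))        # ('bool', None)
```

Types such as bool and string yield no precision here, which mirrors the rule that the precision may be omitted for single-precision or non-fixed-size types.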

There are eight kinds of types supported by PyTables: bool, int, uint, float, complex, string, time and enum.

The time and enum kinds are a little bit special, since they represent HDF5 types which have no direct Python counterpart, though atoms of these kinds have a more-or-less equivalent NumPy data type.

There are two types of time: a 4-byte signed integer (time32) and an 8-byte double-precision floating point number (time64). Both represent the number of seconds since the Unix epoch, i.e. Jan 1 00:00:00 UTC 1970. They are stored in memory as NumPy's int32 and float64, respectively, and in the HDF5 file using the H5T_TIME class. Integer times are stored on disk as such, while floating point times are split into two signed integer values representing seconds and microseconds (beware: any precision finer than one microsecond will be lost!).
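
The precision loss mentioned above can be seen in a small standard-library sketch. The function names split_time64 and join_time64 are hypothetical, and the sketch assumes that the fractional part is truncated (rather than rounded) to whole microseconds; it is not the code PyTables itself uses.

```python
import math


def split_time64(t):
    """Split a float timestamp into (seconds, microseconds) integers.

    Assumption: sub-microsecond digits are truncated, not rounded.
    """
    sec = math.floor(t)
    usec = int((t - sec) * 1000000)
    return sec, usec


def join_time64(sec, usec):
    """Rebuild a float timestamp from seconds and microseconds."""
    return sec + usec / 1000000


print(split_time64(2.75))       # (2, 750000)
print(split_time64(1.0000005))  # (1, 0): the half microsecond is lost
print(join_time64(*split_time64(1.0000005)) == 1.0000005)  # False
```

A value with only microsecond resolution, such as 2.75, survives the round trip unchanged, while anything finer does not.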

PyTables also supports HDF5 H5T_ENUM enumerations (restricted sets of unique name and unique value pairs). The NumPy representation of an enumerated value (an Enum, see Section 4.14.3) depends on the concrete base type used to store the enumeration in the HDF5 file. Currently, only scalar integer values (both signed and unsigned) are supported in enumerations. This restriction may be lifted when HDF5 supports other kinds of enumerated values.
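
The constraints on an enumeration (unique names, unique values, integer values only) can be expressed as a short check. The function validate_enum_pairs below is a hypothetical standard-library sketch of those rules; it is not the PyTables Enum class and makes no claim about how that class is implemented.

```python
def validate_enum_pairs(pairs):
    """Return the (name, value) pairs as a dict if they form a valid
    integer enumeration; raise an exception otherwise."""
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    if len(set(names)) != len(names):
        raise ValueError("enumeration names must be unique")
    if len(set(values)) != len(values):
        raise ValueError("enumeration values must be unique")
    for value in values:
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("enumeration values must be integers")
    return dict(pairs)


colors = validate_enum_pairs([("red", 0), ("green", 1), ("blue", 2)])
print(colors["green"])  # 1
```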

Table A.1 gives a quick reference to the complete set of supported data types:

Type Code   Description               C Type                  Size (in bytes)  Python Counterpart
----------  ------------------------  ----------------------  ---------------  ------------------
bool        boolean                   unsigned char           1                bool
int8        8-bit integer             signed char             1                int
uint8       8-bit unsigned integer    unsigned char           1                int
int16       16-bit integer            short                   2                int
uint16      16-bit unsigned integer   unsigned short          2                int
int32       integer                   int                     4                int
uint32      unsigned integer          unsigned int            4                long
int64       64-bit integer            long long               8                long
uint64      unsigned 64-bit integer   unsigned long long      8                long
float32     single-precision float    float                   4                float
float64     double-precision float    double                  8                float
complex64   single-precision complex  struct {float r, i;}    8                complex
complex128  double-precision complex  struct {double r, i;}   16               complex
string      arbitrary length string   char[]                  *                str
time32      integer time              POSIX's time_t          4                int
time64      floating point time       POSIX's struct timeval  8                float
enum        enumerated value          enum                    -                -

Table A.1. Data types supported for array elements and table columns in PyTables.
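
The sizes listed for the numeric types can be cross-checked with Python's standard struct module. The mapping from type codes to struct format characters below is an assumption made for this sketch (a complex value is modelled as two floats, matching the struct shown in the C Type column); the '<' prefix selects struct's platform-independent standard sizes.

```python
import struct

# Assumed mapping from type codes to struct format characters.
SIZE_FORMATS = {
    "bool": "?",
    "int8": "b", "uint8": "B",
    "int16": "h", "uint16": "H",
    "int32": "i", "uint32": "I",
    "int64": "q", "uint64": "Q",
    "float32": "f", "float64": "d",
    "complex64": "ff",    # real and imaginary parts, 4 bytes each
    "complex128": "dd",   # real and imaginary parts, 8 bytes each
}


def size_in_bytes(type_code):
    """Return the standard size in bytes for a numeric type code."""
    # '<' means little-endian with standard sizes and no padding.
    return struct.calcsize("<" + SIZE_FORMATS[type_code])


for code in SIZE_FORMATS:
    print(code, size_in_bytes(code))
```

The string and enum types are left out because their sizes depend on the declared length and on the base type, respectively.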