Column

class pylibcudf.column.Column(obj=None, *args, **kwargs)

A container of nullable device data as a column of elements.

This class implements the Arrow columnar data specification for data stored on GPUs. It relies on Python memoryview-like semantics to maintain shared ownership of the data it is constructed with, so any input data may also be co-owned by other data structures. The Column is designed to be operated on using algorithms backed by libcudf.

Parameters:
data_type : DataType

The type of data in the column.

size : size_type

The number of rows in the column.

data : gpumemoryview

The data the column will refer to.

mask : gpumemoryview

The null mask for the column.

null_count : int

The number of null rows in the column.

offset : int

The offset into the data buffer where the column’s data begins.

children : list

The children of this column if it is a compound column type.
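
The shared-ownership and offset semantics described above can be illustrated with a host-memory analogy. The sketch below uses Python's built-in memoryview over a bytearray as a stand-in for a device buffer; it does not call pylibcudf, and it assumes the offset is counted in elements (as in libcudf's column views) rather than in bytes. The concrete values are hypothetical.

```python
import struct

# Host-side stand-in for a device buffer holding five native int values.
buffer = bytearray(struct.pack("5i", 10, 20, 30, 40, 50))

# A memoryview shares the underlying buffer instead of copying it.
elements = memoryview(buffer).cast("i")

# Model a "column" of size 3 whose data begins at element offset 1.
offset, size = 1, 3
window = elements[offset : offset + size]
before = window.tolist()
print(before)  # [20, 30, 40]

# Writing through the original buffer is visible through the window,
# because both refer to the same memory (no copy was made).
struct.pack_into("i", buffer, 2 * struct.calcsize("i"), 99)
print(window.tolist())  # [20, 99, 40]
```

The same reasoning is why data passed to a Column may be co-owned: the column refers to the buffer rather than duplicating it.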

Methods

all_null_like(Column like, size_type size)

Create an all null column from a template.

child(self, size_type index)

Get a child column of this column.

children(self)

The children of the column.

copy(self)

Create a copy of the column.

data(self)

The data buffer of the column.

from_array(cls, obj)

Create a Column from any object which supports the NumPy or CUDA array interface.

from_array_interface(cls, obj)

Create a Column from an object implementing the NumPy Array Interface.

from_cuda_array_interface(cls, obj)

Create a Column from an object implementing the CUDA Array Interface.

from_scalar(Scalar slr, size_type size)

Create a Column from a Scalar.

list_view(self)

Accessor for methods of a Column that are specific to lists.

null_count(self)

The number of null elements in the column.

null_mask(self)

The null mask of the column.

num_children(self)

The number of children of this column.

offset(self)

The offset of the column.

size(self)

The number of elements in the column.

type(self)

The type of data in the column.

with_mask(self, gpumemoryview mask, ...)

Augment this column with a new null mask.

static all_null_like(Column like, size_type size)

Create an all null column from a template.

Parameters:
like : Column

Column whose type we should mimic.

size : int

Number of rows in the resulting column.

Returns:
Column

An all-null column of size rows and type matching like.

child(self, size_type index) → Column

Get a child column of this column.

Parameters:
index : size_type

The index of the child column to get.

Returns:
Column

The child column.

children(self) → list

The children of the column.

copy(self) → Column

Create a copy of the column.

data(self) → gpumemoryview

The data buffer of the column.

classmethod from_array(cls, obj)

Create a Column from any object which supports the NumPy or CUDA array interface.

Parameters:
obj : object

The input array to be converted into a pylibcudf.Column.

Returns:
Column

Raises:
TypeError

If the input does not implement a supported array interface.

ImportError

If NumPy is not installed.

Notes

  • 1D and 2D C-contiguous device arrays are supported. The data are not copied.

  • For numpy.ndarray, this is not yet implemented.

Examples

>>> import pylibcudf as plc
>>> import cupy as cp
>>> cp_arr = cp.array([[1,2],[3,4]])
>>> col = plc.Column.from_array(cp_arr)

classmethod from_array_interface(cls, obj)

Create a Column from an object implementing the NumPy Array Interface.

Parameters:
obj : object

Must implement the __array_interface__ protocol.

Raises:
NotImplementedError

This method is not yet implemented.

classmethod from_cuda_array_interface(cls, obj)

Create a Column from an object implementing the CUDA Array Interface.

Parameters:
obj : object

Must implement the __cuda_array_interface__ protocol.

Returns:
Column

A Column containing the data from the CUDA array interface.

Raises:
TypeError

If the object does not support __cuda_array_interface__.

ValueError

If the object is not 1D or 2D, if it is not C-contiguous, or if the number of rows exceeds the size_type limit.

NotImplementedError

If the object has a mask.
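
The documented error conditions can be sketched as a pre-flight check on the interface dictionary. The helper `validate_cuda_array_interface`, the `FakeDeviceArray` stand-in, and the constant `SIZE_TYPE_MAX` below are hypothetical names for illustration; the sketch assumes size_type is a signed 32-bit integer and does not reflect the order or exact messages of the real checks. No GPU memory is involved.

```python
from typing import Any

# Assumption: size_type is a signed 32-bit integer.
SIZE_TYPE_MAX = 2**31 - 1


def validate_cuda_array_interface(obj: Any) -> dict:
    """Hypothetical check mirroring the documented Raises section."""
    iface = getattr(obj, "__cuda_array_interface__", None)
    if iface is None:
        raise TypeError("object does not support __cuda_array_interface__")
    if iface.get("mask") is not None:
        raise NotImplementedError("objects with a mask are not supported")
    shape = tuple(iface["shape"])
    if len(shape) not in (1, 2):
        raise ValueError("only 1D and 2D arrays are supported")
    if shape[0] > SIZE_TYPE_MAX:
        raise ValueError("number of rows exceeds the size_type limit")
    # typestr looks like "<i8": byte order, kind, item size in bytes.
    itemsize = int(iface["typestr"][2:])
    strides = iface.get("strides")
    if strides is not None:
        # Walk from the innermost dimension outwards; each stride must equal
        # the byte size of everything nested inside that dimension.
        expected = itemsize
        for dim, stride in zip(reversed(shape), reversed(strides)):
            if dim > 1 and stride != expected:
                raise ValueError("array is not C-contiguous")
            expected *= dim
    return iface


class FakeDeviceArray:
    """Stand-in that exposes only the interface dictionary."""

    def __init__(self, shape, typestr="<i8", strides=None, mask=None):
        self.__cuda_array_interface__ = {
            "shape": shape,
            "typestr": typestr,
            "data": (0, False),
            "version": 3,
            "strides": strides,
            "mask": mask,
        }


ok = validate_cuda_array_interface(FakeDeviceArray((2, 3)))
print(ok["shape"])  # (2, 3)
```

Passing `FakeDeviceArray((2, 3), strides=(8, 16))` to this sketch raises ValueError, since those strides describe a column-major layout.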

static from_scalar(Scalar slr, size_type size)

Create a Column from a Scalar.

Parameters:
slr : Scalar

The scalar to create a column from.

size : size_type

The number of elements in the column.

Returns:
Column

A Column containing the scalar repeated size times.

list_view(self) → ListColumnView

Accessor for methods of a Column that are specific to lists.

null_count(self) → size_type

The number of null elements in the column.

null_mask(self) → gpumemoryview

The null mask of the column.

num_children(self) → size_type

The number of children of this column.

offset(self) → size_type

The offset of the column.

size(self) → size_type

The number of elements in the column.

type(self) → DataType

The type of data in the column.

with_mask(self, gpumemoryview mask, size_type null_count) → Column

Augment this column with a new null mask.

Parameters:
mask : gpumemoryview

New mask (or None to unset the mask).

null_count : int

New null count. This must match the number of nulls that mask encodes; if it is incorrect, bad things happen.

Returns:
Column

A new Column object sharing data with self (except for the mask, which is new).
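
Because the null count must agree with the mask, it helps to see how the two relate. The sketch below builds an Arrow-style validity bitmap on the host (least-significant bit first, a set bit meaning "valid") and derives the matching null count. The helper `pack_validity` is a hypothetical name; the sketch does not allocate device memory or model any padding or alignment that libcudf may require for a real mask buffer.

```python
from typing import List, Tuple


def pack_validity(valid: List[bool]) -> Tuple[bytes, int]:
    """Pack per-row validity flags into a bitmap and count the nulls."""
    mask = bytearray((len(valid) + 7) // 8)
    for row, is_valid in enumerate(valid):
        if is_valid:
            # Row i lives in byte i // 8, at bit position i % 8.
            mask[row // 8] |= 1 << (row % 8)
    null_count = sum(1 for is_valid in valid if not is_valid)
    return bytes(mask), null_count


# Nine rows, with rows 1 and 4 null.
flags = [True, False, True, True, False, True, True, True, True]
mask, null_count = pack_validity(flags)
print(mask.hex(), null_count)  # ed01 2
```

Deriving both values from the same source, as here, is one way to keep them consistent before handing them to with_mask.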

class pylibcudf.column.ListColumnView(Column col)

Accessor for methods of a Column that are specific to lists.

Methods

child(self)

The data column of the underlying list column.

offsets(self)

The offsets column of the underlying list column.

child(self)

The data column of the underlying list column.

offsets(self)

The offsets column of the underlying list column.
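
How the offsets and child columns fit together can be sketched with plain Python lists, assuming the Arrow list layout that the Column class description refers to: row i of the list column spans the child elements from offsets[i] up to, but not including, offsets[i + 1]. The helper `rows_from_list_layout` is a hypothetical name, and nulls are not modeled.

```python
from typing import List, Sequence


def rows_from_list_layout(offsets: Sequence[int], child: Sequence[int]) -> List[list]:
    """Rebuild the nested rows from an offsets sequence and a flat child."""
    # There is always one more offset than there are rows.
    return [
        list(child[offsets[i] : offsets[i + 1]])
        for i in range(len(offsets) - 1)
    ]


offsets = [0, 2, 2, 5]
child = [1, 2, 3, 4, 5]
rows = rows_from_list_layout(offsets, child)
print(rows)  # [[1, 2], [], [3, 4, 5]]
```

An empty list shows up as two equal consecutive offsets, which is why the second row above has no elements.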

pylibcudf.column.is_c_contiguous(shape: Sequence[int], strides: None | Sequence[int], itemsize: int) → bool

Determine if shape and strides are C-contiguous.

Parameters:
shape : Sequence[int]

Number of elements in each dimension.

strides : None | Sequence[int]

The stride of each dimension in bytes. If None, the memory layout is C-contiguous.

itemsize : int

Size of an element in bytes.

Returns:
bool

True if shape and strides describe a C-contiguous layout, False otherwise.
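
A pure-Python sketch of such a check is shown below to make the rule concrete. It is an illustration, not the pylibcudf implementation: the name `is_c_contiguous_sketch` is hypothetical, and the treatment of empty arrays and of dimensions of length 1 follows NumPy's convention as an assumption.

```python
from typing import Optional, Sequence


def is_c_contiguous_sketch(
    shape: Sequence[int],
    strides: Optional[Sequence[int]],
    itemsize: int,
) -> bool:
    """Return True if shape and byte strides describe a row-major layout."""
    # Missing strides mean the producer declares a C-contiguous layout.
    if strides is None:
        return True
    # Assumption: an array with no elements counts as contiguous.
    if any(dim == 0 for dim in shape):
        return True
    # Walk from the innermost dimension outwards. The innermost stride must
    # equal itemsize, and each outer stride must equal the byte size of
    # everything nested inside it.
    expected = itemsize
    for dim, stride in zip(reversed(shape), reversed(strides)):
        # Assumption: a dimension of length 1 is never stepped over,
        # so its stride is ignored.
        if dim != 1 and stride != expected:
            return False
        expected *= dim
    return True


print(is_c_contiguous_sketch((2, 3), (24, 8), 8))  # True
print(is_c_contiguous_sketch((2, 3), (8, 16), 8))  # False
```

For a 2 x 3 array of 8-byte elements, row-major strides are (24, 8): moving one row skips three elements, moving one column skips one.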