Reindex

tablite.reindex

Classes

Functions

tablite.reindex.reindex(T, index, names=None, tqdm=_tqdm, pbar=None)

Constant-memory helper for reindexing pages.

Memory usage is determined by the column datatype and Config.PAGE_SIZE.
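As a rough illustration of how datatype and page size bound memory, the sketch below estimates the size of one page buffer. The helper `page_buffer_bytes` and the page size of 1,000,000 are assumptions made for this example, not values read from tablite; substitute the actual Config.PAGE_SIZE in use.

```python
import numpy as np


def page_buffer_bytes(dtype, page_size):
    """Approximate bytes needed to hold one page of `page_size` values.

    Hypothetical helper for illustration only; not part of tablite.
    """
    return np.dtype(dtype).itemsize * page_size


# Assumed page size for the example; use Config.PAGE_SIZE in practice.
ASSUMED_PAGE_SIZE = 1_000_000

int64_page = page_buffer_bytes(np.int64, ASSUMED_PAGE_SIZE)  # 8 bytes per value
bool_page = page_buffer_bytes(np.bool_, ASSUMED_PAGE_SIZE)   # 1 byte per value
```

For object columns this estimate covers only the array of references, not the Python objects they point to.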

PARAMETER DESCRIPTION
T

Table (or subclass of Table) to reindex.

TYPE: Table

index

Array of int64 row indices into T. An index of -1 produces None in the output.

TYPE: array

names

List of column names from T to reindex. If None, all columns of T are used.

TYPE: list of str DEFAULT: None

tqdm

Progress bar class used to create a progress bar when pbar is None. Defaults to _tqdm.

TYPE: tqdm DEFAULT: _tqdm

pbar

Existing progress bar to update once per column. If None, a new one is created with tqdm. Defaults to None.

TYPE: tqdm DEFAULT: None

RETURNS DESCRIPTION
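Table

New table of the same class as T, containing the selected columns with rows taken from T in the order given by index.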

Source code in tablite/reindex.py
def reindex(T, index, names=None, tqdm=_tqdm, pbar=None):
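    """Constant-memory helper for reindexing pages.

    Memory usage is determined by the column datatype and Config.PAGE_SIZE.

    Args:
        T (Table): Table (or subclass of Table) to reindex.
        index (np.array): int64 row indices into T; -1 yields None.
        names (list of str, optional): column names from T to reindex.
            Defaults to all columns of T.
        tqdm (tqdm, optional): progress bar class. Defaults to _tqdm.
        pbar (tqdm, optional): existing progress bar to update.
            Defaults to None, in which case a new one is created.

    Returns:
        Table: new table of type(T) holding the selected columns,
            with rows taken from T in the order given by index.
    """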
    if names is None:
        names = list(T.columns.keys())

    if pbar is None:
        total = len(names)
        pbar = tqdm(total=total, desc="join", disable=Config.TQDM_DISABLE)

    sub_cls_check(T, BaseTable)
    cls = type(T)
    result = cls()
    for name in names:
        result.add_column(name)
        col = result[name]

        for start, end in Config.page_steps(len(index)):
            indices = index[start:end]
            values = T[name].get_by_indices(indices)
            # For an index of -1 the fetched value is wrong: -1 marks a
            # missing row, so those positions are replaced with None.
            mask = indices == -1
            if np.any(mask):
                # Size the fill array to this page (not the whole index)
                # so that mask, nones and values have matching shapes.
                nones = np.full(indices.shape, fill_value=None)
                values = np.where(mask, nones, values)
            col.extend(values)
        pbar.update(1)

    return result
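
The paging and -1 handling in the listing can be tried in isolation with plain NumPy. The following is a minimal sketch, not tablite code: `page_steps` is a hypothetical stand-in for Config.page_steps, `reindex_column` is an illustrative name, and a page size of 3 is chosen only to force several pages.

```python
import numpy as np


def page_steps(length, page_size):
    """Yield (start, end) pairs covering range(length) in chunks.

    Hypothetical stand-in for Config.page_steps, for illustration only.
    """
    for start in range(0, length, page_size):
        yield start, min(start + page_size, length)


def reindex_column(values, index, page_size=3):
    """Gather `values` by `index` one page at a time; -1 becomes None."""
    values = np.asarray(values, dtype=object)
    index = np.asarray(index, dtype=np.int64)
    out = []
    for start, end in page_steps(len(index), page_size):
        indices = index[start:end]
        page = values[indices]  # NumPy reads -1 as "last item", fixed below
        mask = indices == -1
        if np.any(mask):
            # The fill array matches the page shape, not the full index.
            nones = np.full(indices.shape, fill_value=None)
            page = np.where(mask, nones, page)
        out.extend(page.tolist())
    return out


result = reindex_column(["a", "b", "c", "d"], [3, -1, 0, 2, -1, 1, 1])
print(result)  # ['d', None, 'a', 'c', None, 'b', 'b']
```

Only one page of gathered values is held at a time, which is why peak memory depends on the page size and datatype and not on the length of the index.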