Unable to register UDAFs using SessionContext's register_udaf #874

@emanueledomingo

Describe the bug
While updating DataFusion from 39 to 41, my script broke because register_udaf crashes with the following error:

in SessionContext.register_udaf(self, udaf)
    829 def register_udaf(self, udaf: AggregateUDF) -> None:
    830     """Register a user-defined aggregation function (UDAF) with the context."""
--> 831     self.ctx.register_udaf(udaf._udaf)

AttributeError: 'AggregateUDF' object has no attribute '_udaf'

To Reproduce
Steps to reproduce the behavior:

import datafusion as df
import pyarrow
import pyarrow.compute as pc
from typing import List

class AverageAccumulator(df.Accumulator):
    def __init__(self):
        self._sum = 0
        self._count = 0

    def update(self, values: pyarrow.Array) -> None:
        self._sum += pc.sum(values).as_py()
        self._count += len(values)

    def merge(self, other) -> None:
        self._sum += other._sum
        self._count += other._count

    def state(self) -> pyarrow.Array:
        return pyarrow.array([self._sum, self._count])

    def evaluate(self) -> pyarrow.Scalar:
        return pyarrow.scalar(self._sum / self._count)

average_udaf = df.udaf(
    AverageAccumulator,
    pyarrow.float64(),
    pyarrow.float64(),
    [pyarrow.float64()],
    'stable'
)

ctx = df.SessionContext()

ctx.register_udaf(average_udaf)

Additional context
The bug was introduced in datafusion 40.1.0.

Labels: bug
