Recursion in OpenCL kernels

To complete our stack we just need to set it up before use. Enough local memory needs to allocated so that each thread in a workgroup can access maxStackSize elements, i.e. maxStackSize * numItemsInWorkGroup * sizeof(int) bytes.

__kernel void useStack( __local int* pStackData, size_t maxStackSize, ... )
{
    size_t localId = get_local_id(0);

    Stack stack;
    stackInit( &stack, &pStackData[localId * maxStackSize], maxStackSize );

    ... etc.
}

A Simple OpenCL Example

Armed with a stack and some knowledge of how functions are called here’s a simple OpenCL function to demonstrate “calling” a function via a stack.

int simple_switch( Stack* pStack, int number )
{
    int finalValue = 0; // the value we will ultimately return from this function

    // the various valid labels we can have in our "program"
    typedef enum
    {
        ENTRY_POINT,
        PRE_FUNCTION_CALL,
        POST_FUNCTION_CALL,
        EXIT_POINT,
        FUNCTION_TO_CALL
    } ProgramLabels;

    ProgramLabels pc = ENTRY_POINT; // the program counter
    bool bDone = false;             // has "program" execution completed

    int returnValue;                // a return value "register"

    while(!bDone)
    {
        switch (pc)
        {
            case ENTRY_POINT:
                // do some stuff

            case PRE_FUNCTION_CALL:
                // get ready for function "call"

                // push variable(s) onto pStack
                stackPush(pStack, number);

                // push "program counter" return label
                stackPush( pStack, (int)POST_FUNCTION_CALL );

                // "jump" to function
                pc = FUNCTION_TO_CALL;
                break;

            case POST_FUNCTION_CALL:
                // clean up pStack
                stackPop(pStack, 1);
                stackPop(pStack, 1);

                // use the returned value
                finalValue = returnValue * 10;

            case EXIT_POINT:
                // mark that we've done
                bDone = true;
                break;

            case FUNCTION_TO_CALL:
            {
                // where to return to
                ProgramLabels returnAddress = (ProgramLabels) stackTop(pStack, 0);
                stackPop(pStack, 1);

                // get variables from pStack
                int localNumber = stackTop(pStack, 0);
                stackPop(pStack, 1);

                // do some work in the "function"
                //
                localNumber++;

                // "return" a value
                returnValue = localNumber;

                // "jump" to return label
                pc = returnAddress;
                break;
            }

            default:
                // should never get here so just abort
                bDone = true;
                break;
        }
    }

    // return the final calculated value
    return finalValue;
}

Call it from a kernel something like

__kernel void recurse( __global int* input, __global int* output, __local int* pStackData, size_t maxStackSize )
{
    size_t localId = get_local_id(0);

    Stack stack;
    stackInit( &stack, &pStackData[localId * maxStackSize], maxStackSize );

    size_t i = get_global_id(0);
    output[i] = simple_switch( &stack, input[i] );
}

The function doesn’t really do much other than demonstrate how to pull an argument from the stack, modify it and return the modified value. The change in program counter are done via the labels within the switch statement.

One thought on “Recursion in OpenCL kernels

Leave a Reply

Your email address will not be published. Required fields are marked *