Running C/C++ Code within Python: Bridging Efficiency and Flexibility

In the realm of software development, efficiency and performance often stand as paramount objectives, especially in computationally intensive tasks. Python, with its simplicity and vast ecosystem, is a go-to language for many developers. However, when it comes to raw performance, languages like C and C++ hold the upper hand due to their low-level operation and minimal abstraction overhead. This article explores how to harness the computational efficiency of C/C++ within the Python environment, providing a detailed guide on integrating C/C++ code into Python applications.

The motivation for integrating C/C++ code into Python is to combine the best of both worlds: Python’s ease of use and C/C++'s execution speed. Such integration is particularly beneficial for tasks that are processor-intensive, like data processing, machine learning, and real-time applications. By offloading the heavy computational parts to C/C++, applications can achieve significant performance improvements while retaining Python's simplicity for higher-level program logic. There are many ways to do this; this article introduces one simple approach based on ctypes.

To illustrate the process, we'll integrate a simple C/C++ function into a Python script. The example focuses on an Exponential Moving Average (EMA) calculation, a common operation in financial analysis and signal processing.

Python Implementation

In Python, the EMA would typically be implemented by a function like this:

def EMA_py(values, beta):
    N = len(values)
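    ema = [0.0] * N
    sofar = 0.0        # running, uncorrected weighted sum
    correction = 0.0   # running sum of the weights, used for bias correction

    for i in range(N):
        sofar = beta * values[i] + (1 - beta) * sofar
        correction = beta + (1 - beta) * correction
        # Dividing by the accumulated weight removes the bias toward the zero start value
        ema[i] = sofar / correction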

    return ema
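
The recurrence above is a compact way of computing a normalized, exponentially weighted mean of all values seen so far. The sketch below checks this by comparing the recurrence against the explicit weighted average. The names ema_recurrence and ema_weighted_average are hypothetical and used only for illustration; the first repeats the logic of EMA_py so that the block runs on its own.

```python
def ema_recurrence(values, beta):
    """Same recurrence as EMA_py, repeated here so the sketch is self-contained."""
    ema, sofar, correction = [], 0.0, 0.0
    for v in values:
        sofar = beta * v + (1 - beta) * sofar
        correction = beta + (1 - beta) * correction
        ema.append(sofar / correction)
    return ema


def ema_weighted_average(values, beta):
    """Explicit form: value j gets weight beta * (1 - beta)**(i - j) at step i."""
    out = []
    for i in range(len(values)):
        weights = [beta * (1 - beta) ** (i - j) for j in range(i + 1)]
        weighted_sum = sum(w * values[j] for j, w in enumerate(weights))
        out.append(weighted_sum / sum(weights))
    return out


values = [1.0, 2.0, 3.0]
print(ema_recurrence(values, 0.5))
print(ema_weighted_average(values, 0.5))
```

For this input and beta = 0.5, both functions give approximately 1.0, 1.667, and 2.429. The explicit form costs quadratic time, which is why the recurrence is the version worth porting to C/C++.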

C/C++ Implementation

An equivalent C/C++ implementation of the EMA_py function above is the following:

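#include <algorithm>  // std::copy
#include <vector>

extern "C" {
    // Computes the bias-corrected EMA of values[0..N-1].
    // The output buffer is allocated here; the caller must release it
    // with free_double_memory.
    void EMA(double* values, int N, double beta, double** result) {
        std::vector<double> ema(N);
        double sofar = 0.0, correction = 0.0;
        for (int i = 0; i < N; ++i) {
            sofar = beta * values[i] + (1 - beta) * sofar;
            correction = beta + (1 - beta) * correction;
            ema[i] = sofar / correction;
        }
        *result = new double[ema.size()];
        std::copy(ema.begin(), ema.end(), *result);
    }

    // Releases a buffer allocated by EMA. The buffer comes from new[],
    // so it must be released with delete[] rather than free().
    void free_double_memory(double* ptr) {
        delete[] ptr;
    }
}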

Python Wrapper to Integrate the C/C++ Implementation

The code below is the Python wrapper. It uses Python's built-in ctypes library to load and interact with the C/C++ shared library. The Python code handles the conversion of data types between Python and C/C++, and it defines the interface to the C/C++ functions.

import ctypes
import os
import numpy as np

# Load the shared library containing C/C++ functions
# Search the directory for a file that starts with "c_modules" and ends with ".so"
lib = ctypes.cdll.LoadLibrary([
    os.path.join(os.path.dirname(__file__), f)
    for f in os.listdir(os.path.dirname(__file__))
    if f.startswith("c_modules") and f.endswith(".so")
][0])


def ctypes_double_array_to_ndarray(ptr, size):
    # Convert the C array (pointed to by ptr) to a NumPy array; copy it so the
    # result owns its data and stays valid after the C buffer is freed
    array = np.ctypeslib.as_array(ptr, shape=(size,)).copy()
    # Declare the argument and return types of the C function for correct type handling
    lib.free_double_memory.argtypes = [ctypes.POINTER(ctypes.c_double)]
    lib.free_double_memory.restype = None
    # Free the memory allocated in C to prevent memory leaks
    lib.free_double_memory(ptr)
    return array


def EMA_c(arr: np.ndarray, beta: float) -> np.ndarray:
    # Ensure input array is of type double for compatibility with C code
    arr = np.array(arr, dtype=np.float64)
    # Prepare a pointer to double to receive the output from C function
    ptr = ctypes.POINTER(ctypes.c_double)()
    # Call the C function, passing the data array, its size, the beta value, and the output pointer
    lib.EMA(
        ctypes.c_void_p(arr.ctypes.data),
        ctypes.c_int(arr.shape[0]),
        ctypes.c_double(beta),
        ctypes.byref(ptr)
    )
    # Convert the returned C array to a NumPy array and return it
    return ctypes_double_array_to_ndarray(ptr, arr.shape[0])
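
The wrapper above converts every argument by hand at the call site. A possible alternative, sketched below, is to declare the signatures of both exported functions once, so that ctypes can check and convert arguments itself. The helper name declare_signatures is hypothetical, and the use of np.ctypeslib.ndpointer for the input array is a choice made for this sketch, not something the wrapper above does.

```python
import ctypes

import numpy as np


def declare_signatures(lib):
    """Attach argument and return types to EMA and free_double_memory."""
    double_ptr = ctypes.POINTER(ctypes.c_double)
    lib.EMA.argtypes = [
        # Only accepts one-dimensional, contiguous float64 arrays
        np.ctypeslib.ndpointer(dtype=np.float64, ndim=1, flags="C_CONTIGUOUS"),
        ctypes.c_int,                # N
        ctypes.c_double,             # beta
        ctypes.POINTER(double_ptr),  # double** result
    ]
    lib.EMA.restype = None           # void function
    lib.free_double_memory.argtypes = [double_ptr]
    lib.free_double_memory.restype = None
    return lib
```

With such declarations in place, lib.EMA would receive the NumPy array itself as its first argument instead of a ctypes.c_void_p wrapper, and a call with an array of the wrong dtype or shape would fail with an error rather than reading the wrong memory.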

Some important remarks:

The compiled .so module is loaded at the top of the Python code and stored in the global variable lib.
Type compatibility matters: the C/C++ code works on double arrays, so the Python code must convert its input to float64 before the call.
lib.EMA is the call into the C/C++ code. It is a void function; instead of returning a value, it allocates the result array and stores its address in ptr.
The function ctypes_double_array_to_ndarray copies the elements that ptr points to and then releases the memory with lib.free_double_memory(ptr). Skipping this step leaks memory, because the block that ptr points to is allocated in the C/C++ code and is not managed by Python's garbage collector.
The C/C++ functions need to be wrapped in an extern "C" block, so that the C++ compiler does not mangle their names and ctypes can find them by name.
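
The remark about copying before freeing can be illustrated without the compiled library. In the sketch below, a buffer owned by ctypes stands in for the memory that the C/C++ side would allocate; this stand-in is an assumption made so that the example runs on its own.

```python
import ctypes

import numpy as np

# A ctypes-owned buffer stands in for memory allocated by the C/C++ code
buf = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_double))

view = np.ctypeslib.as_array(ptr, shape=(3,))  # shares memory with buf
owned = view.copy()                            # independent, NumPy-owned copy

buf[0] = 99.0  # simulate the C buffer changing after the conversion

print(view[0])   # the view follows the buffer
print(owned[0])  # the copy keeps the original value
```

Because the array returned by as_array only views the C memory, it would point at released memory once free_double_memory has run. The copy is what makes it safe to free the buffer right away.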

Running the Integration

To run the C/C++ code above from Python, it must first be compiled:

g++ -shared -fPIC -o c_modules.so c_modules/*.cpp

This assumes that all the .cpp and .h files are in the c_modules directory. The command compiles every .cpp file there (together with any headers they include) and creates a shared object file (.so), which the Python script can load and interact with.
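
If the build should be triggered from Python, for example from a setup script, the same command can be assembled with the standard library. The sketch below is one way to do that; the function names build_command and build are hypothetical, and the flags are exactly those of the command shown above.

```python
import glob
import os
import subprocess


def build_command(src_dir="c_modules", output="c_modules.so"):
    """Return the g++ invocation as an argument list."""
    sources = sorted(glob.glob(os.path.join(src_dir, "*.cpp")))
    if not sources:
        raise FileNotFoundError(f"no .cpp files found in {src_dir!r}")
    return ["g++", "-shared", "-fPIC", "-o", output, *sources]


def build(src_dir="c_modules", output="c_modules.so"):
    """Compile the shared library; raises CalledProcessError if g++ fails."""
    subprocess.run(build_command(src_dir, output), check=True)
```

Note that the wrapper looks for a file whose name starts with "c_modules" and ends with ".so" in the directory of the Python script, so the output file has to be placed there.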

After that, calling the EMA_c function in the Python code runs the compiled C/C++ implementation of EMA.

Efficiency of C/C++ vs. Python Implementations

The table below lists the runtimes (in seconds) of both implementations for different sizes N of the input array.

N	C/C++ runtime (seconds)	Python runtime (seconds)
100	0.00018	0.00015
1,000	0.00020	0.00053
10,000	0.00041	0.00446
100,000	0.00431	0.04830
1,000,000	0.03830	0.45708
10,000,000	0.55090	4.58128
100,000,000	5.75553	49.16737

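These measurements show that the integration works and that the C/C++ implementation is considerably faster for large inputs: from N = 10,000 upward it runs roughly 8 to 12 times faster than the pure Python version. For N = 100 the Python version is slightly faster, most likely because the fixed cost of converting the data and calling into the library outweighs the small amount of computation.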
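
The article does not show how the timings were collected. A minimal sketch of one possible measurement loop is given below, assuming wall-clock timing with time.perf_counter and taking the best of a few repeats. The helper name best_runtime is hypothetical, and the built-in sum serves only as a stand-in workload so that the block runs without the compiled library.

```python
import time


def best_runtime(fn, *args, repeats=3):
    """Return the fastest wall-clock time (seconds) over `repeats` calls of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload; EMA_py or EMA_c would take the place of sum here
for n in (100, 1_000, 10_000):
    data = [float(i) for i in range(n)]
    print(n, best_runtime(sum, data))
```

Taking the minimum over several runs reduces the influence of other processes on the machine, which matters most for the small values of N, where a single call takes only fractions of a millisecond.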

Summary

Integrating C/C++ code with Python combines Python’s developer-friendly nature with the raw speed and efficiency of C/C++. This synergy allows for the development of applications that are both easy to manage and performance-optimized. Following the guidelines outlined in this article, developers can enhance their Python applications, ensuring they are not only functional but also performant where it counts.

This approach can be inferior to alternatives such as Cython or the Python-dev library when it comes to handling complex objects and complex integrations. However, its ease of integration makes it attractive for simple use cases.

Created on 2024-03-19 at 14:00