PyKX allows you to call Python functions from kdb+/q (and vice versa), enabling powerful data analysis using the rich ecosystem of libraries available in Python. In this post, I will show how you can invoke a Lasso Regression function in Python by passing a table from a q script.

**1. Install PyKX**

    pip install pykx

**2. Create a Python (.p) file**

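Create a Python file called `lasso.p` containing a function that takes a Pandas DataFrame, performs Lasso regression (using the scikit-learn machine learning library), and returns a vector of coefficients. The function treats the last column of the DataFrame as the target and every other column as a feature, and the returned vector holds the feature coefficients followed by the intercept.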

    # lasso.p
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import Lasso

    def lasso_regression(df):
        X = df.iloc[:, :-1]
        y = df.iloc[:, -1]
        # Perform Lasso regression
        lasso = Lasso(alpha=0.1)
        lasso.fit(X, y)
        coefficients = np.append(lasso.coef_, lasso.intercept_)
        return coefficients
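To make the returned vector concrete, here is a minimal pure-Python sketch of the single-feature case, which has a closed-form solution. It assumes the objective documented for scikit-learn's `Lasso`, namely `(1 / (2 * n)) * sum((y - w * x - b) ** 2) + alpha * abs(w)` with an unpenalized intercept. The names `soft_threshold` and `lasso_single_feature` are hypothetical helpers for illustration and are not part of `lasso.p`.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_single_feature(x, y, alpha=0.1):
    """Closed-form Lasso fit for one feature plus an unpenalized intercept.

    Assumes x is not constant (otherwise the variance below is zero).
    Returns [w, b]: coefficient first, intercept last, matching the
    layout of the vector built in lasso.p.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Population covariance and variance (divide by n, as in the objective)
    cov_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - x_mean) ** 2 for xi in x) / n
    w = soft_threshold(cov_xy, alpha) / var_x
    b = y_mean - w * x_mean
    return [w, b]


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]  # exactly y = 2x + 1
slope, intercept = lasso_single_feature(x, y, alpha=0.1)
print(slope, intercept)  # roughly 1.95 and 1.1
```

The penalty pulls the slope slightly below the true value of 2, and the intercept rises to compensate. A fit with scikit-learn on the same data should agree up to the solver's tolerance.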

**3. Invoke the Python function from a q script**

Next, write a q script that generates a table of random data and invokes the Python function with it.

    // Load pykx and the python file
    \l /path/to/python/site-packages/pykx/pykx.q
    \l lasso.p

    // Create a sample table with random x and y values
    n:100;
    x:n?10f;
    y:2*x+n?2f;
    data:([]x;y);

    // Call the python function
    qfunc:.pykx.get[`lasso_regression;<];
    coefficients:qfunc data;
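One detail of the sample data is easy to miss: q evaluates expressions right to left with no operator precedence, so `2*x+n?2f` means `2*(x+(n?2f))` and the noise is doubled along with `x`. The following Python sketch mirrors that data generation and fits an ordinary least-squares line as a sanity check. The helper names `make_sample` and `least_squares_line` are hypothetical, and the random values will not match those produced by q.

```python
import random


def make_sample(n=100, seed=0):
    """Mirror the q script: x in [0, 10), y = 2 * (x + noise), noise in [0, 2)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    noise = [rng.uniform(0.0, 2.0) for _ in range(n)]
    # Right-to-left evaluation in q doubles the noise as well as x
    y = [2.0 * (xi + ei) for xi, ei in zip(x, noise)]
    return x, y


def least_squares_line(x, y):
    """Ordinary least-squares slope and intercept for a single feature."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    cov_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    var_x = sum((xi - x_mean) ** 2 for xi in x)
    slope = cov_xy / var_x
    return slope, y_mean - slope * x_mean


x, y = make_sample()
slope, intercept = least_squares_line(x, y)
print(f"slope={slope:.2f} intercept={intercept:.2f}")  # both should land near 2
```

Because the doubled noise averages 2, the data follow roughly `y = 2x + 2`, so the coefficient vector returned to q should contain a slope a little under 2 (Lasso shrinks it slightly) followed by an intercept near 2.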

**Conversion of data types between kdb+/q and Python**

When transferring data between q and Python, PyKX applies "default" type conversions. For instance, tables in q are automatically converted to Pandas DataFrames, and lists are converted to NumPy arrays. You can call `.pykx.setdefault` to change the default conversion type to Pandas, Numpy, Python, or PyArrow. PyKX also provides functions to convert q data types to specific Python types, such as `.pykx.tonp`, which tags a q object to be converted to a NumPy object. The following code illustrates type conversion:

    q) .pykx.util.defaultConv
    "default"

    // lists are converted to NumPy arrays by default
    q) .pykx.print .pykx.eval["lambda x: type(x)"] til 10
    <class 'numpy.ndarray'>

    // tables are converted to Pandas DataFrames by default
    q) .pykx.print .pykx.eval["lambda x: type(x)"] ([] foo:1 2)
    <class 'pandas.core.frame.DataFrame'>

    // change default conversion to NumPy
    q) .pykx.setdefault["Numpy"]

    // tables are NumPy arrays now
    q) .pykx.print .pykx.eval["lambda x: type(x)"] ([] foo:1 2)
    <class 'numpy.recarray'>

    // change default conversion to Python
    q) .pykx.setdefault["Python"]

    // tables are converted to dict when using Python conversion
    q) .pykx.print .pykx.eval["lambda x: type(x)"] ([] foo:1 2)
    <class 'dict'>

    // tag a q object as a Pandas DataFrame
    q) .pykx.print .pykx.eval["lambda x: type(x)"] .pykx.topd ([] foo:1 2)
    <class 'pandas.core.frame.DataFrame'>
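If you find yourself repeating the `lambda x: type(x)` check, a small named helper in a Python file can do the same job. This is only a sketch: `describe_conversion` is a hypothetical name, and the idea of loading it with `\l` and retrieving it with `.pykx.get`, as was done for `lasso_regression`, is an assumption rather than something tested here.

```python
def describe_conversion(obj):
    """Return the fully qualified type name of the object Python received."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


# Stand-in for a q table arriving under the "Python" conversion (a plain dict)
print(describe_conversion({"foo": [1, 2]}))  # builtins.dict
```

Returning the module-qualified name makes it easier to tell similar types apart, for example a `numpy.recarray` from a `numpy.ndarray`.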
