When a feature’s effect on the target is non-linear, fitting a linear model can be a challenge. Think of features like ‘time since X happened’, ‘total time spent on the website’, or ‘bid in an auction’. Features like these are non-negative and unbounded above, which makes them natural candidates for rational transformations.
But here’s the question: where do you place the poles of these rational functions? That’s what I want to explore in this post.
The Goal: A Rational Transformer
I’m not trying to fit a single rational curve to my data. Instead, I want to create a component that can be used in a pipeline to transform features before model fitting, similar to Scikit-Learn’s MinMaxScaler or SplineTransformer.
The idea is to create a RationalTransformer that uses a basis of rational functions to transform my non-linear features into something more linear. But to do that, I need to decide where to place the poles of these rational functions.
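To make the idea concrete, here is a minimal sketch of what such a component could look like. It is not a finished implementation: the basis 1 / (x - p), the `poles` parameter, its default value, and the synthetic data are all assumptions made for illustration. The sketch also assumes a single feature column and requires every pole to lie below the smallest training value, so that no basis function divides by zero.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline


class RationalTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical sketch: expand one feature into 1 / (x - p) terms, one per pole p."""

    def __init__(self, poles=(-1.0,)):
        # The pole locations are an assumption of this sketch;
        # choosing them well is exactly the open question.
        self.poles = poles

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        poles = np.asarray(self.poles, dtype=float)
        if X.ndim != 2 or X.shape[1] != 1:
            raise ValueError("expected X with shape (n_samples, 1)")
        if np.any(poles >= X.min()):
            raise ValueError("every pole must lie below the smallest training value")
        self.poles_ = poles
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # Broadcasting (n_samples, 1) against (n_poles,) gives (n_samples, n_poles).
        return 1.0 / (X - self.poles_)


# Usage on made-up data: a saturating target that is linear in 1 / (x + 2).
rng = np.random.default_rng(0)
X = rng.exponential(scale=5.0, size=(200, 1))
y = X[:, 0] / (X[:, 0] + 2.0)

model = make_pipeline(RationalTransformer(poles=(-1.0, -2.0, -5.0)), LinearRegression())
model.fit(X, y)
```

Because the target here was constructed from one of the basis functions, the linear model can fit it almost exactly. Real features will not be this cooperative, which is why the choice of poles matters.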
The Importance of Pole Placement
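Pole placement is crucial because it determines the shape of each rational function over the range of the data. A pole very close to the origin makes the function change extremely steeply for small feature values, which can make the fit unstable, while a pole too far away leaves the function almost flat over the observed range, so detail is lost in the transformed features.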
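A quick numerical check illustrates this trade-off. The sketch assumes the simple basis function 1 / (x - p) with the pole p on the negative axis, and a made-up feature range of 0 to 100. The helper name `basis` and both pole values are chosen purely for illustration.

```python
import numpy as np

# Made-up grid standing in for a feature's observed range.
x = np.linspace(0.0, 100.0, 5)


def basis(x, pole):
    """Simplest rational function with a single pole: 1 / (x - pole)."""
    return 1.0 / (x - pole)


near = basis(x, -0.1)     # pole just left of the origin
far = basis(x, -1000.0)   # pole far away from the data

# Ratio of the largest to the smallest value over the range:
# a rough measure of how much the function varies across the data.
near_spread = near.max() / near.min()
far_spread = far.max() / far.min()

print(f"pole at -0.1:  spread = {near_spread:.1f}")
print(f"pole at -1000: spread = {far_spread:.2f}")
```

With the near pole, the function varies by roughly three orders of magnitude across the range, and almost all of that change happens at small x. With the far pole, it varies by about 10%, so it is close to a constant and carries little information.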
So, where should we place the poles? Here are a few options to consider:
- At the origin: This is the simplest approach, but it may not always be the best. Placing poles at the origin can lead to unstable behavior, especially if the data has a large range of values.
- At the median or mean: Placing poles at the median or mean of the data can help stabilize the rational functions. This approach is more robust, but it may still not capture the full range of values in the data.
- At quantiles: Placing poles at quantiles (e.g., 25th, 50th, 75th percentile) can help capture more of the data’s distribution. This approach is more flexible, but it requires more poles and may lead to overfitting.
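The options above can be sketched as a small helper that computes candidate locations from the data. The function name `candidate_poles`, its `strategy` labels, and the sample data are hypothetical. One caveat is worth stating: a pole that sits inside the observed range makes a basis function like 1 / (x - p) blow up at that point. The last line therefore shows one possible workaround, which is an assumption of this sketch rather than something prescribed here: mirror each location onto the negative axis, so that the quantile sets the scale of the function while the pole itself stays outside the data.

```python
import numpy as np


def candidate_poles(x, strategy="quantiles", n_poles=3):
    """Hypothetical helper: return candidate pole locations for one feature."""
    x = np.asarray(x, dtype=float)
    if strategy == "origin":
        return np.array([0.0])
    if strategy == "median":
        return np.array([np.median(x)])
    if strategy == "mean":
        return np.array([np.mean(x)])
    if strategy == "quantiles":
        # Evenly spaced interior quantiles; n_poles=3 gives the
        # 25th, 50th and 75th percentiles.
        probs = np.linspace(0.0, 1.0, n_poles + 2)[1:-1]
        return np.quantile(x, probs)
    raise ValueError(f"unknown strategy: {strategy!r}")


x = np.arange(1.0, 101.0)  # made-up feature values 1, 2, ..., 100
locations = candidate_poles(x, "quantiles")

# Assumption of this sketch: mirror the locations so every pole
# lies outside the (non-negative) data range.
mirrored = -locations
```

Mirroring does not help the origin option, since a pole at zero stays at zero. For non-negative data that includes zeros, that option would need a small offset or a different basis.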
Conclusion
Placing poles for non-linear rational features is an art that requires careful consideration. There’s no one-size-fits-all solution, and the best approach will depend on the specific problem and data at hand. By understanding the importance of pole placement, we can create more effective rational transformers that help us better model and understand our data.
Further reading: Rational Functions