Unraveling Non-Euclidean Distance Matrices: A Guide to Dimensionality Reduction

When fitting approximate Gaussian processes (GPs) in Stan, we sometimes have to work with a distance matrix that is non-Euclidean and that can’t be swapped for another one because the theory behind the model dictates it. The cophenetic distance matrix of a tree is a typical example. The question is how to re-project such a matrix into Euclidean space, which many algorithms, including approximate GPs, require.
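Before picking a technique, it helps to confirm that a given matrix really is non-Euclidean. A minimal sketch, assuming NumPy and a symmetric distance matrix with a zero diagonal: double-center the squared distances and look at the eigenvalues of the result. A negative eigenvalue (beyond rounding error) means no exact Euclidean embedding exists. The function name `is_euclidean` and the toy star-tree matrix `D_star` are made up for illustration and are not taken from any library.

```python
import numpy as np

def is_euclidean(D, rel_tol=1e-8):
    """Return (flag, eigenvalues) for a symmetric distance matrix D.

    flag is True when the double-centered matrix B = -1/2 * J * D^2 * J
    has no eigenvalue below -rel_tol * max|eigenvalue|.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # Gram matrix if D is Euclidean
    evals = np.linalg.eigvalsh(B)         # ascending order, B is symmetric
    tol = rel_tol * np.abs(evals).max()
    return bool(evals[0] >= -tol), evals

# Hypothetical toy tree: node 0 is an internal node, nodes 1-3 are leaves,
# every edge has length 1, and distances are path lengths along the tree.
D_star = np.array([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 2.0, 2.0],
    [1.0, 2.0, 0.0, 2.0],
    [1.0, 2.0, 2.0, 0.0],
])
star_ok, star_evals = is_euclidean(D_star)

# Sanity check: distances between actual points in the plane pass the test.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
D_plane = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))
plane_ok, plane_evals = is_euclidean(D_plane)
```

The size of the negative eigenvalues relative to the positive ones gives a rough sense of how much distortion any Euclidean embedding will have to introduce.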

I’ve tried regular MDS, but the distances between the embedded points are quite far from the original ones. Stacked autoencoders didn’t yield meaningful results either. So, what’s the best dimensionality reduction technique to use in this scenario?
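To put a number on how far off an MDS embedding is, we can compare the embedded distances with the original ones directly. The sketch below is a plain NumPy version of classical (Torgerson) MDS, not necessarily the variant used above; the names `classical_mds`, `pairwise_distances` and the toy star-tree matrix `D` are assumptions made for this example. Negative eigenvalues are clipped to zero, which is exactly where the distortion for a non-Euclidean input comes from.

```python
import numpy as np

def classical_mds(D, k):
    """Embed a symmetric distance matrix D into k dimensions.

    Returns the n-by-k coordinates and all eigenvalues (descending).
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered squared distances
    evals, evecs = np.linalg.eigh(B)          # ascending order
    order = np.argsort(evals)[::-1]           # re-sort to descending
    evals, evecs = evals[order], evecs[:, order]
    scale = np.sqrt(np.clip(evals[:k], 0.0, None))  # drop negative eigenvalues
    return evecs[:, :k] * scale, evals

def pairwise_distances(X):
    """Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical toy tree: node 0 is internal, nodes 1-3 are leaves,
# all edges have length 1, distances are path lengths along the tree.
D = np.array([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 2.0, 2.0],
    [1.0, 2.0, 0.0, 2.0],
    [1.0, 2.0, 2.0, 0.0],
])

X, evals = classical_mds(D, k=2)
D_hat = pairwise_distances(X)

# Relative Frobenius error between embedded and original distances.
rel_error = np.linalg.norm(D_hat - D) / np.linalg.norm(D)
print("eigenvalues:", np.round(evals, 4))
print("relative error:", rel_error)
```

In this toy case the leaf-to-leaf distances are reproduced, while the distances to the internal node are stretched, so the error is concentrated in a few entries rather than spread evenly. Looking at `D_hat - D` entry by entry can show whether the same thing is happening with a real tree.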

In this post, we’ll explore some options and discuss the pros and cons of each. Whether you’re working with tree metrics or other non-Euclidean distance matrices, this guide will help you find the least bad (or best) dimensionality reduction technique for your problem.
