How to Use PyMDE for Minimum-Distortion Embedding

Jun 23, 2024 | Data Science

PyMDE is a Python library for computing Minimum-Distortion Embeddings (MDE) efficiently. Whether you're visualizing data, generating feature vectors, or designing custom embeddings, this guide covers the essentials of getting started with PyMDE.

Installation

To start using PyMDE, first install it with either pip or conda.

  • Install with Pip:
    pip install pymde
  • Install with Conda:
    conda install -c pytorch -c conda-forge pymde

Ensure you have the following requirements:

  • Python >= 3.7
  • numpy >= 1.17.5
  • scipy
  • torch >= 1.7.1
  • torchvision >= 0.8.2
  • pynndescent
  • requests
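
To confirm which of these packages are present in your environment, a small check like the one below may help. It is only a sketch: the helper name installed_versions is made up for this illustration, and it relies on importlib.metadata from the standard library, which assumes Python 3.8 or newer. It reports versions and does not compare them against the minimums listed above.

```python
from importlib import metadata

# Distribution names taken from the requirements list above, plus pymde itself.
REQUIRED = ["numpy", "scipy", "torch", "torchvision", "pynndescent", "requests", "pymde"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing.

    Hypothetical helper for this guide; it is not part of PyMDE.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print a short report so missing packages are easy to spot.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Any package reported as "not installed" can be added with the pip or conda commands shown earlier.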

Getting Started with PyMDE

PyMDE offers two high-level functions that cover most embedding tasks:

  • Preserve Neighbors (pymde.preserve_neighbors): preserves the local structure of the original data, so items that are close in the data stay close in the embedding.
  • Preserve Distances (pymde.preserve_distances): preserves pairwise distances between items in the original data.

Both functions accept either a data matrix (one row per item) or a graph encoding pairwise distances. Set the embedding dimension with the embedding_dim keyword argument (default: 2). As the examples below show, each function returns an object whose embed() method computes the embedding.

An Analogy to Understand PyMDE

Imagine your dataset is a large piece of clay and the embedding dimension is the size of the mold you press it into. The mold forces the clay into a smaller shape, and you choose which of its features should survive: the local relationships between neighboring data points (preserve neighbors) or the overall distances between points (preserve distances).

Preserving Neighbors Example

Here’s how you can use PyMDE to create an embedding of the MNIST dataset, which includes images of handwritten digits:

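import pymde
# Load MNIST (downloaded on first use)
mnist = pymde.datasets.MNIST()
# Set up the problem and compute the embedding
embedding = pymde.preserve_neighbors(mnist.data, verbose=True).embed()
# Color each point by its digit label
pymde.plot(embedding, color_by=mnist.attributes['digits'])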

Preserving Distances Example

If you’re interested in maintaining global structures rather than just local ones, take a look at the following example with an academic co-authorship network:

import pymde
# Load the co-authorship network (downloaded on first use)
google_scholar = pymde.datasets.google_scholar()
embedding = pymde.preserve_distances(google_scholar.data, verbose=True).embed()
# Color each author by the number of coauthors
pymde.plot(embedding, color_by=google_scholar.attributes['coauthors'], color_map='viridis', background_color='black')

Example Notebooks

For more detailed examples, check out the example notebooks which showcase PyMDE on various datasets.

Troubleshooting

If you encounter issues while using PyMDE, or if you have feedback, don't hesitate to file a GitHub issue. Keeping your libraries and dependencies up to date also helps avoid problems.

Conclusion

Now that you’re equipped with the knowledge of installing and using PyMDE, dive into your datasets and start creating meaningful embeddings with ease!
