Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I’m familiar with metric learning within the Mahalanobis family for kNN oriented applications . I’m not getting what use cases this framework targets? Is it custom image search type stuff which may benefit from fine tuning?

What is a realistic minimum viable dataset for an approach like this? When is it not advisable? How does it compare to other more basic approaches?



The main idea is to train a deep learning model to encode a high-dimensional sample to a low-dimensional vector in a latent space. Then it can be used in various downstream tasks such as KNN applications, semantic search, multimodal retrieval, recommendation systems, anomaly detection etc. It's not limited to the image domain --it can be also audios, texts, videos, or more specific entities such as authors, soccer players, songs etc. The size of the dataset can be thought of being similar to other deep learning methods, but you can make a choice among various similarity learning methods based on the size of your dataset or according to whether it's labeled or not. A common approach is (1) to train a base model by using a self-supervised method with a bulk amount of unlabeled data and (2) to finetune it on a more specific domain or task with a smaller labeled dataset. If you can start with a pretrained model such as ResNet or BERT, you can skip the first step.


I’d be surprised if it’d be useful for something like cars or soccer players, or really anything that may not have a continuous mapping. I guess more generally whenever the underlying “true” similarity function is not differentiable — categorical data springs to mind (cars, football players…).

I could see it making sense for complex unstructured data — Qdrant seems to point in that direction.


Some loss functions such as ArcFace loss and CosFace loss enforce the encoder model to organize their latent space in such a way that categories are placed with an angular margin from one another. Thus the model implicitly learns a continuous distance function.

Fun fact, one of the examples in Quaterion is for similar cars search.

If you find this topic and want to discover more, we collected a bunch of resources that might be helpful. https://github.com/qdrant/awesome-metric-learning


To answer my own question:

https://qdrant.tech/




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: