
I have a large collection of images, many of them accessible through Google Image Search.

I wonder if there could be a way to "index" those images so I can find them again without storing the whole image, using some kind of clever image histogram or hashing-style function.

Does that already exist? Since there are many images, and most images differ a lot in their data, could there be a function that describes an image compactly, so that looking up such a histogram returns the image it indexed (or the closest one)? I guess I'm lacking the math, but it sounds like some kind of "averaging" hash function.



That's perceptual hashing. Check out https://www.phash.org/


Is there a simpler way to implement it? This is a whole library; aren't there more common ways to do this?
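The core "average hash" idea is simple enough to sketch in plain Python. The version below is a toy illustration, not pHash itself: it assumes the image has already been decoded into a 2D list of grayscale values (a library like Pillow would normally handle decoding and resizing), and the function names are my own.

```python
def average_hash(pixels, hash_size=8):
    """Compute a 64-bit average hash from a grayscale image given as a
    2D list of intensities (0-255). Downscale to hash_size x hash_size
    by block averaging, then set one bit per cell: 1 if the cell is
    brighter than the overall mean, else 0."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for i in range(hash_size):
        for j in range(hash_size):
            block = [pixels[y][x]
                     for y in range(i * h // hash_size, (i + 1) * h // hash_size)
                     for x in range(j * w // hash_size, (j + 1) * w // hash_size)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes.
    Small distance = perceptually similar images."""
    return bin(a ^ b).count("1")
```

Near-duplicate images then get nearby hashes: you store the 64-bit value instead of the image, and compare by Hamming distance. Note this simple scheme breaks down under crops, rotations, and heavy edits, which is why real libraries offer fancier variants (DCT-based hashes, etc.).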


So will this do something like image recognition? I.e., does it work as well as SURF/SIFT?


Perceptual hashing is useful for copy detection. It's not robust to changes/transformations, nor do the hashes encode any semantic information.


This is the current approach for large-scale image retrieval: use some model to extract feature vectors, then perform distance calculations over them. This is usually done with hashing once speed and the size of the dataset become a concern.
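As a toy sketch of that pipeline: assume some embedding model has already turned each stored image into a feature vector (the filenames and vectors below are made up), then retrieval is a nearest-neighbor search under a similarity measure such as cosine similarity. Real systems replace this brute-force loop with approximate nearest-neighbor indexes or hashing once the collection grows.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(query, index):
    """Return the id of the stored vector most similar to the query.
    `index` maps image ids to (hypothetical) embedding vectors."""
    return max(index, key=lambda k: cosine_similarity(query, index[k]))

# Made-up 3-dimensional embeddings; real ones have hundreds of dims.
index = {
    "cat.jpg":   [0.9, 0.1, 0.0],
    "dog.jpg":   [0.1, 0.9, 0.1],
    "beach.jpg": [0.0, 0.1, 0.9],
}
print(nearest([0.85, 0.2, 0.05], index))  # prints "cat.jpg"
```

The point is that semantically similar images land near each other in the embedding space, which is exactly what a perceptual hash alone doesn't give you.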


I am developing a Visual Search engine with pluggable indexing models.

https://www.deepvideoanalytics.com



