
It's not very sexy, but I think you might find it easier and more robust just to use an NLP library.

I built something similar (albeit for a relatively limited database of recipes) for a hackathon a couple of weeks back. I didn't even use a proper NLP library, just some simple hand-rolled pattern-matching, and got pretty good results.

Good luck!



I think you're right. Did you happen to open-source your code from the hackathon? I'd love to take a look at your approach if you don't mind.


Sorry, I normally would but one of the other team members is considering taking the hack forward and wanted to keep it closed for now. (It's hard to see how much competitive advantage he'd have from 48 hours of very-hacked-together code, but so few hackathon projects get taken forward that I didn't want to discourage him!)

The approach was to tokenize the input and then do basic pattern-matching on it, with separate dictionaries of quantity units (e.g. cup, oz, pound), ingredients, processing words (e.g. "chopped") and throw-away words (e.g. "of"). In fact, possibly the most complicated part was parsing "2.5", "2 and a half" and "2½" all to the same thing.
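
For anyone wanting a starting point, here is a rough Python sketch of that kind of tokenize-and-match approach. It is not the hackathon code: the dictionary contents, the function names (tokenize, parse_quantity, parse_line) and the output shape are all made up for illustration, and a real version would need much bigger word lists.

```python
import re
import unicodedata

# Hypothetical dictionaries, invented for this sketch.
UNITS = {"cup", "cups", "oz", "ounce", "ounces", "pound", "pounds", "lb"}
INGREDIENTS = {"onion", "onions", "flour", "butter", "sugar"}
PROCESSING = {"chopped", "diced", "sliced", "minced"}
THROWAWAY = {"of", "the", "a", "an"}

NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def tokenize(text):
    """Split into numbers, lowercase ASCII words, and single other characters."""
    return re.findall(r"\d+(?:\.\d+)?|[a-z]+|\S", text.lower())


def parse_quantity(tokens):
    """Consume a leading quantity; return (value or None, remaining tokens)."""
    value = None
    i = 0
    # Plain number such as "2" or "2.5".
    if i < len(tokens) and NUMBER_RE.fullmatch(tokens[i]):
        value = float(tokens[i])
        i += 1
    # Single-character fraction such as "½" (Unicode category "No"),
    # either on its own or directly after a whole number.
    if (i < len(tokens) and len(tokens[i]) == 1
            and unicodedata.category(tokens[i]) == "No"):
        value = (value or 0.0) + unicodedata.numeric(tokens[i])
        i += 1
    # Spelled-out "and a half" after a whole number.
    elif value is not None and tokens[i:i + 3] == ["and", "a", "half"]:
        value += 0.5
        i += 3
    return value, tokens[i:]


def parse_line(text):
    """Classify each remaining token by dictionary lookup."""
    quantity, rest = parse_quantity(tokenize(text))
    result = {"quantity": quantity, "unit": None,
              "ingredients": [], "processing": []}
    for tok in rest:
        if tok in THROWAWAY:
            continue  # ignore filler words
        if tok in UNITS and result["unit"] is None:
            result["unit"] = tok
        elif tok in PROCESSING:
            result["processing"].append(tok)
        elif tok in INGREDIENTS:
            result["ingredients"].append(tok)
        # Unknown tokens are silently dropped in this sketch.
    return result


if __name__ == "__main__":
    for line in ("2.5 oz butter",
                 "2 and a half cups flour",
                 "2½ cups of chopped onions"):
        print(parse_line(line))
```

With this sketch, "2.5", "2 and a half" and "2½" all come out as a quantity of 2.5. Unknown words are simply ignored, which is a deliberate simplification; logging them would be an easy way to find gaps in the dictionaries.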



