
If you ever get back into it, I bet you can 10x to 20x that speed with augmented labeling.

Once you have >100 frames labeled you can put a classifier in the loop and only have to label the % of frames it gets wrong.

I usually set up a 10x10 grid view containing only samples the classifier assigned to a single class, then I mark the ones it got wrong as unlabeled and move on to the next batch. With an 80% accurate classifier, that gets you about 80 samples labeled every 5 seconds or so.
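The arithmetic behind that estimate can be sketched as below. It assumes a 100-sample grid, 5 seconds of review per grid, and mistakes spread evenly across grids; the function name and parameters are illustrative, not from any tool.

```python
def labels_per_second(grid_size: int, accuracy: float, seconds_per_grid: float) -> float:
    """Expected confirmed labels per second when only the classifier's mistakes are clicked."""
    confirmed_per_grid = grid_size * accuracy  # samples the reviewer leaves untouched
    return confirmed_per_grid / seconds_per_grid


# Assumed numbers from the comment: 10x10 grid, 80% accuracy, ~5 s per grid.
rate = labels_per_second(grid_size=100, accuracy=0.80, seconds_per_grid=5.0)
print(f"{rate:.0f} labels/s, about {rate * 3600:.0f} labels/hour")
```

The rejected 20% are not lost; they go back into the unlabeled pool and show up again in a later batch.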

And if you retrain the classifier regularly on the newly labeled samples you can improve its accuracy and the speed of labeling with it.
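A minimal sketch of that loop, under loud assumptions: a toy nearest-centroid classifier stands in for whatever model you actually use, `is_correct` simulates the human clicking on wrong samples, and every name here (`label_in_the_loop`, `max_views`, and so on) is made up for illustration.

```python
import math
import random
from statistics import mean


def train_centroids(labeled):
    """labeled: list of (feature_tuple, label). Returns {label: centroid}."""
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in by_label.items()}


def predict(centroids, x):
    """Nearest-centroid prediction (placeholder for a real classifier)."""
    return min(centroids, key=lambda y: math.dist(x, centroids[y]))


def label_in_the_loop(seed, pool, is_correct, batch_size=100, max_views=50):
    """Show single-class views, accept what the reviewer leaves alone, retrain each view."""
    labeled, pool = list(seed), list(pool)
    clicks = 0
    for _ in range(max_views):  # cap so a stubbornly wrong model cannot loop forever
        if not pool:
            break
        centroids = train_centroids(labeled)  # retrain on everything labeled so far
        guesses = [predict(centroids, x) for x in pool]
        target = max(set(guesses), key=guesses.count)  # class with the most pending samples
        view = [i for i, g in enumerate(guesses) if g == target][:batch_size]
        rejected = {i for i in view if not is_correct(pool[i], target)}
        clicks += len(rejected)  # one click per sample the reviewer marks as wrong
        accepted = set(view) - rejected
        labeled += [(pool[i], target) for i in sorted(accepted)]
        pool = [x for i, x in enumerate(pool) if i not in accepted]  # rejected stay unlabeled
    return labeled, pool, clicks


# Toy data: two well-separated 2-D clusters; the "human" is simulated by true_label.
random.seed(0)

def true_label(x):
    return "a" if x[0] + x[1] < 5 else "b"

def cluster(cx, cy, n):
    return [(random.gauss(cx, 0.5), random.gauss(cy, 0.5)) for _ in range(n)]

seed = [(x, true_label(x)) for x in cluster(0, 0, 5) + cluster(5, 5, 5)]
pool = cluster(0, 0, 200) + cluster(5, 5, 200)
labeled, remaining, clicks = label_in_the_loop(
    seed, pool, lambda x, guess: true_label(x) == guess
)
print(len(labeled), "labeled,", len(remaining), "left,", clicks, "clicks")
```

On real data the clusters overlap, so some views will need clicks; the point of retraining between views is that each accepted batch should make the next view cleaner.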

PS: congrats on the son!



Are there any tools you recommend for augmented labeling? The ones I looked at seemed a bit hard to get started with.


A data scientist friend of mine had some success with Figure8 but I haven't used it myself.

Honestly I always roll my own. It's dead fast to throw a simple GUI together in tkinter, and it makes it easy to integrate your own models and custom sample rendering/plotting.
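For a sense of how little code that takes, here is a rough sketch of a click-to-reject grid in tkinter. It is not the commenter's tool: `toggle`, `review_grid`, and the text-only cells are assumptions, and a real version would render thumbnails or plots instead of strings.

```python
def toggle(rejected: set, index: int) -> bool:
    """Flip one cell between accepted and rejected; return True if it is now rejected."""
    if index in rejected:
        rejected.remove(index)
        return False
    rejected.add(index)
    return True


def review_grid(samples, predicted_label, columns=10):
    """Show samples in a grid, let the user click the wrong ones, return rejected indices."""
    import tkinter as tk  # imported lazily so toggle() is usable without a display

    rejected = set()
    root = tk.Tk()
    root.title(f"Predicted: {predicted_label} (click the wrong ones)")

    def on_click(i, button):
        # Sunken button = marked wrong; raised = accepted as predicted.
        button.config(relief=tk.SUNKEN if toggle(rejected, i) else tk.RAISED)

    for i, sample in enumerate(samples):
        b = tk.Button(root, text=str(sample), width=8)
        b.config(command=lambda i=i, b=b: on_click(i, b))
        b.grid(row=i // columns, column=i % columns)

    # Closing the window via this button ends mainloop() and returns the result.
    tk.Button(root, text="Next batch", command=root.destroy).grid(
        row=len(samples) // columns + 1, column=0, columnspan=columns, sticky="ew"
    )
    root.mainloop()
    return rejected
```

Calling `review_grid([f"frame {n}" for n in range(100)], "cat")` opens a 10x10 grid and returns the set of indices you clicked; those are the samples to send back to the unlabeled pool.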

That is if you're doing simple discrete class labeling, as opposed to more complex labeling like box-labeling for image segmentation, or text2speech labeling for example.


Thanks for the insight, this is really useful!


Thx for sharing the link to your work. Interesting idea. A few thoughts:

1) Congrats on the birth of your child. I imagine you have zero time now, but as they get older you start getting your time back. I went through this and now I can sneak in personal projects while the kids are in their activities, late at night, or early in the morning. Be aware that your body and stamina decline as you age. I am getting close to my mid 40s and I can feel it.

2) The previous poster presented an interesting idea about putting a basic classifier in the loop. The challenge is how you know when the classifier gets it wrong. Confidence scores you get from the logits are extremely flaky. I think one solution is to use metric learning methods (contrastive loss instead of cross entropy). I have seen some papers that dance around this but have not seen anything fully baked from a scientific perspective.
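To make the two ideas in point 2 concrete, here is a small sketch in plain Python. The first function is the usual max-softmax "confidence"; the second is the classic pairwise contrastive loss on embeddings. The example logits, embeddings, and margin are made-up values, and this only illustrates the definitions, not a calibrated method.

```python
import math


def softmax_confidence(logits):
    """Max softmax probability, the usual 'confidence' read off a classifier's logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)


def contrastive_loss(a, b, same_class, margin=1.0):
    """Pairwise contrastive loss: pull same-class embeddings together,
    push different-class embeddings at least `margin` apart."""
    d = math.dist(a, b)
    return d ** 2 if same_class else max(0.0, margin - d) ** 2


# Scaling the logits keeps the predicted class but changes the reported confidence,
# which is one reason raw softmax scores are hard to trust as error detectors.
logits = [2.0, 1.0, 0.5]
scaled = [3 * z for z in logits]
conf_raw, conf_scaled = softmax_confidence(logits), softmax_confidence(scaled)
print(f"raw: {conf_raw:.3f}, scaled: {conf_scaled:.3f}")

# With a metric-learning embedding, distance to labeled examples is the signal instead.
near = contrastive_loss((0.0, 0.0), (0.3, 0.4), same_class=True)
far = contrastive_loss((0.0, 0.0), (3.0, 4.0), same_class=False)
print(near, far)
```

With embeddings trained this way, a sample that sits far from every labeled example of its predicted class is a natural candidate to show the reviewer first.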

3) Your task is an interesting action recognition task. You should seriously consider putting it on Kaggle or writing a paper on it (and releasing the dataset). An easy off-the-shelf model you could try on this data for a video classification task like this is possibly X3D. But there are a variety of other methods (I'm a researcher in the field).



