Hacker News | the_tli's comments

Is anyone aware of a practical how-to on implementing a data flywheel for fine-tuning (improving the model with user feedback)?


Not seen a great explainer on this yet.

You'd either need access to the model weights or a fine-tuning API.

Then, depending on which fine-tuning approach you want to use, the user data you need to collect will be different: RLHF requires multiple outputs for a single query so they can be compared, whereas instruction fine-tuning needs high-quality input-output pairs to train on. You could ask for the user's feedback after running the LLM to pick out good training data.
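A rough sketch of what that collection step could look like, assuming you log each prompt, the model output, and a thumbs up/down from the user. The `Interaction` schema and both function names are made up for illustration and not tied to any particular fine-tuning API:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Interaction:
    """One logged model call plus the user's feedback (hypothetical schema)."""
    prompt: str
    output: str
    rating: int  # +1 thumbs up, -1 thumbs down, 0 no feedback


def instruction_pairs(log):
    """Instruction fine-tuning data: keep only outputs the user approved."""
    return [{"input": i.prompt, "output": i.output} for i in log if i.rating > 0]


def preference_pairs(log):
    """Preference data: pair an approved and a rejected output for the same prompt."""
    by_prompt = defaultdict(list)
    for i in log:
        by_prompt[i.prompt].append(i)

    pairs = []
    for prompt, items in by_prompt.items():
        chosen = [i.output for i in items if i.rating > 0]
        rejected = [i.output for i in items if i.rating < 0]
        for c in chosen:
            for r in rejected:
                pairs.append({"prompt": prompt, "chosen": c, "rejected": r})
    return pairs
```

Note that preference pairs only appear when the same prompt has both a liked and a disliked output, so for that route you would probably need to show users more than one generation per query.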


Thanks for sharing. Why is the training dataset that contains instructions and output wrapped by another enclosing prompt? (https://github.com/minosvasilias/godot-dodo/blob/f62b90a4622...)

Why does this even work when the wrapping prompt is absent during inference? Wouldn't the model then work best with an inference prompt that follows the wrapping prompt structure, even though the desired outcome is a model that just works without the wrapping prompt?

Edit: see the reply from OP. The wrapping prompt is used for inference as well, so this was a misunderstanding on my part.


The wrapping prompt is also used during inference. (https://github.com/minosvasilias/godot-dodo/blob/f62b90a4622...) Prompting like this is useful for instruct-finetunes, and similar prompts are used by other projects like stanford-alpaca.
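To illustrate the idea, here is a minimal sketch in which one template builds both the training text and the inference prompt. The template wording is modeled on the Alpaca-style no-input prompt and is an assumption; the exact text used by godot-dodo may differ, and the function names are hypothetical:

```python
# Alpaca-style wrapper (assumed wording, for illustration only).
TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the shared template."""
    return TEMPLATE.format(instruction=instruction)


def training_text(instruction: str, output: str) -> str:
    """Training example: wrapped instruction followed by the target output."""
    return build_prompt(instruction) + output


def inference_prompt(instruction: str) -> str:
    """Inference: the same wrapper, leaving the response for the model to fill in."""
    return build_prompt(instruction)
```

Because both paths go through `build_prompt`, the model sees the same structure at inference time that it was fine-tuned on.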


Thanks for the clarification, makes sense now!


It's using Google Slides[0], which then allows exporting to PPTX. There may be other, better options.

[0] https://developers.google.com/apps-script/guides/slides


Obvious miss, fixed.


So you corrected it by making sure genesis is censored as well?



OP here. The purpose of the "health" filter, which in its current state is fairly basic, is to provide minimum protection for sensitive topics. For example, there are a large number of health-related requests, and at the current level of output quality and correctness I found it irresponsible to generate such content.


This may be an indication that it's at least useless, if not irresponsible, in other contexts too.


Can you let me know your prompt so I can investigate the images? It's currently using Unsplash, which has a very permissive license. I need to change the endpoint from Unsplash Source to their API, which then allows proper attribution.

"Thank you" slides are being added depending on the mood/temperature of the model :)

Will adjust the prompt to include a proper agenda and a thank-you note.


You’re welcome to find it from logs — I honestly didn’t keep the prompt, and you seem to not provide the prompt back on the output screen (which would be very helpful for this purpose alone).

https://slidesgpt.com/l/TT58


OP here.

Tried the same with "Create a slide deck on the future of AI - use the iconic style and tone of an Apple Keynote", and it generated https://slidesgpt.com/l/NqZ4

Supporting different tones and styles looks like an area to improve on.


OP here. For now, the sweet spot seems to be getting off the ground quickly. I'm currently prototyping refinements/iterations, so once the initial deck is created you can directly add your feedback to redo parts of the deck and gradually improve/steer the output. Thanks for testing and for the feedback!


Tom from SlidesGPT here.

About 40 days ago I first posted SlidesGPT on HN (https://news.ycombinator.com/item?id=34503970).

Based on feedback, scalability should be improved now, and the content of the slides should be better too (though it still varies):

Content improvement (related to https://news.ycombinator.com/item?id=34505183):

Old: Why use Hacker News: https://slidesgpt.com/l/X3sU

New: Why use Hacker News: https://slidesgpt.com/l/iHPe

Please let me know your feedback, thanks, Tom


Could you explain the censorship on apparently controversial topics? For example, the person who reported a block on Acinetobacter baumannii? Wikipedia tells me it is a hospital infection, and also that IEDs are known to be associated with such infections.


This is one of the few demos like this I have tried that delivered nearly 100% of its promise with little revision effort.

Very impressive, great job!

What was an interesting challenge with this that no one has asked you about yet?

