If you use humans to fine-tune the model and judge the quality of its output, then in some sense, pleasing those humans is pretty much all the AI can possibly learn to do.
Everyone can see the output “I don’t know” and mark it zero (dude). But the meat bags will definitely end up rewarding the model if it instead generates some plausible nonsense.
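To put rough numbers on that incentive, here is a toy sketch in Python. Everything in it is made up for illustration: the function names, the 0-to-1 rating scale, and the assumption that a rater who gets fooled gives full marks while one who catches the bluff gives zero. It is not how any real fine-tuning pipeline scores things, just the arithmetic behind the argument.

```python
# Toy model of the incentive described above. All names and numbers are
# hypothetical; this is not any real training setup.

IDK_RATING = 0.0  # assumption: raters always mark "I don't know" as zero


def expected_rating(p_rater_fooled: float,
                    reward_if_accepted: float = 1.0,
                    reward_if_caught: float = 0.0) -> float:
    """Average rating for plausible nonsense, given the chance a rater buys it."""
    return (p_rater_fooled * reward_if_accepted
            + (1.0 - p_rater_fooled) * reward_if_caught)


def preferred_answer(p_rater_fooled: float) -> str:
    """Which answer earns the higher average rating under this toy model."""
    bluff = expected_rating(p_rater_fooled)
    return "plausible nonsense" if bluff > IDK_RATING else "I don't know"


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.5):
        print(p, preferred_answer(p))
```

Under these assumptions, bluffing wins as soon as raters get fooled even occasionally, since "I don't know" never scores above zero.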
I’m using the bathroom right now, have only gotten 3 hours of sleep at most, and am browsing Lemmy so that I don’t get a panic attack over potentially not having my medication for two weeks…