• SlurpingPus@lemmy.world · 2 days ago

    chosen based on that distribution and fed back in

    Do I understand it correctly that the LLM’s state is changed after execution? That does sorta mean that it’s effectively non-deterministic, though probably not as severely as with an RNG plugged in (depending on the algorithm).

    • SparroHawc@lemmy.zip · 2 days ago

      The only thing that changes is the data that is passed to the LLM, which for each iteration includes the last token that the LLM itself generated. So yes, sort of. The LLM itself doesn’t change state; just the data that is fed into it.
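      A toy sketch of that loop, where `fake_model` is a made-up stand-in for the LLM (a pure function, holding no state between calls), and only the growing token list changes from one iteration to the next:

```python
def fake_model(tokens):
    """Made-up stand-in for the LLM: a pure function from a token
    sequence to the next token. It keeps no internal state."""
    return len(tokens) % 3  # arbitrary deterministic rule, for illustration only

def generate(prompt_tokens, n_steps):
    tokens = list(prompt_tokens)
    for _ in range(n_steps):
        next_token = fake_model(tokens)  # only the input data changes...
        tokens.append(next_token)        # ...because each output is fed back in
    return tokens

print(generate([7, 7], 3))  # → [7, 7, 2, 0, 1]
```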

      It’s also non-deterministic insofar as similar inputs will not necessarily give similar outputs. The only way to actually predict its output is to use the exact same input - and even then you only get identical token probability lists on the other end. Every LLM chatbot, by default, will then make a random selection based on those probabilities. It can be set to always pick the most probable token, but that tends to produce repetitive, looping output.
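      That selection step can be sketched like this (the probability list is a made-up toy, not real model output):

```python
import random

# Hypothetical next-token probabilities, for illustration only.
token_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.2}

def sample_token(probs, rng):
    """Default chatbot behavior: a weighted random pick."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

def greedy_token(probs):
    """Always pick the most probable token - deterministic."""
    return max(probs, key=probs.get)

rng = random.Random(42)                # a fixed seed makes the pick repeatable
print(sample_token(token_probs, rng))  # varies with the seed
print(greedy_token(token_probs))       # always "cat"
```

Seeding the RNG is how you get the "exact same input, exact same output" behavior back even with sampling turned on.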

    • qqq@lemmy.world · 2 days ago

      There must be an RNG to choose the next token from the probability distribution; that is where the non-determinism comes in [edit: unless the temperature is 0, which makes the entire process deterministic]. The neural networks themselves, though, are 100% deterministic.

      I understand that could be seen as an “akschually” nitpick, but I think it’s an important point, as it is at least theoretically possible to understand that underlying determinism.
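      The temperature-0 point can be illustrated with a toy softmax (the logits here are made up; a real model produces a vector over its whole vocabulary):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities; lower temperature sharpens
    the distribution. Temperature 0 collapses it onto the argmax."""
    if temperature == 0:
        # Degenerate case: all probability mass on the highest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                       # made-up logits for three tokens
print(softmax_with_temperature(logits, 1.0))   # spread-out distribution
print(softmax_with_temperature(logits, 0.0))   # [1.0, 0.0, 0.0] - no RNG needed
```

With all the probability on one token, the sampling step has nothing left to randomize, which is why temperature 0 makes the whole pipeline deterministic.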

      • SlurpingPus@lemmy.world · 2 days ago

        Well, technically users’ input could serve as the source of randomness, if it were fed in in a way that modified the internal state. Basically, one redditor is trying to interrogate the LLM as to whether Israel is bad, while someone on line 2 is teaching it “I am Cornholio”. We already know how it goes when a chatbot learns from its users, and the effect could be anything from a nothingburger to a chaos-theory mess.

        • qqq@lemmy.world · 2 days ago

          I don’t think it’s typical to consider user input a source of randomness. Are you talking about in context learning and thinking about what would happen if those contexts get crossed? If so, contexts are unique to a session and do not cross between them for something like ChatGPT/Claude.