I’m not talking about numerical data. The way LLMs work is to find a “most likely response” based on the input text. There is absolutely maths happening inside the model; how else do you think they work? I’m not saying they take numbers and find an average.
LLMs are trained on language-based content. They don’t know how to extract answers from mathematical problems; they only give approximations based on model input. They can also be trained wrong based on user-supplied data.
To a purely mathematical logical operator, 2+2=4.
To an LLM, if it is told that 2+2=9, it will then always respond with 2+2=9.
LLMs don’t count because they can’t count. Without the ability to count, they can never understand the proof behind mathematical formulas.
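The claim above can be illustrated with a toy next-character model (a simple frequency table, nothing like a real transformer — this is just a sketch of the point being argued): if the only training text says 2+2=9, the model’s most likely continuation of “2+2=” is “9”, because it memorizes what follows “=” rather than performing addition.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Build a toy next-character frequency model from training text."""
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def most_likely_next(counts, ch):
    """Return the character most often seen after `ch` in training."""
    return counts[ch].most_common(1)[0][0]

# Train only on "wrong" arithmetic: the model has no notion of addition,
# it just memorizes what tends to follow '='.
model = train_bigram("2+2=9 2+2=9 2+2=9")
print(most_likely_next(model, "="))  # prints '9'
```

Real LLMs are vastly more sophisticated, but the mechanism being debated here — predicting continuations from observed text rather than computing — is the same in kind.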
Yes, I understand that; you are not understanding what I’m describing. I am not talking about taking an average of numerical data. LLMs take something that can be thought of as an “average” of text: given all the text the model has seen, plus this new input, what’s the most likely output? In some numerical contexts the expected value is also an average; LLMs find a similar kind of result, and that is the parallel I am drawing here.
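The “most likely output” idea can be sketched numerically. The scores below are made up for illustration, not taken from any real model: an LLM assigns each candidate token a score, softmax turns the scores into a probability distribution, and the output is chosen from that distribution — a probability-weighted choice, which is where the “average of text” analogy comes from.

```python
import math

def softmax(logits):
    """Convert raw token scores into a probability distribution."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - m) for tok, s in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

# Hypothetical scores for the token following "2+2=" — invented numbers.
logits = {"4": 5.0, "9": 1.0, "5": 0.5}
probs = softmax(logits)

# The model emits the most probable continuation, not a computed sum.
prediction = max(probs, key=probs.get)
print(prediction)  # prints '4'
```

Note that the model gets “4” here only because the (hypothetical) training data made “4” the highest-scoring continuation — which is exactly the distinction both sides of this thread are circling.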
Let me make sure I understand.
You’re saying that LLMs average words, and because they average words, they can consistently return mathematically accurate averages based on the empirical data provided to them. Does that sum it up?