Tokens are what an LLM actually predicts, one after another. If you run a very slow model that produces fewer than about 5 tokens/s, you can follow the output with your own eyes: a token is whatever appears at once. Often it is an entire word, but it can also be part of a word, or a single letter, digit, or special character, which is typical for uncommon words, special formatting, and numbers.
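To make the word / word-part / single-character distinction concrete, here is a minimal sketch of a toy tokenizer. It is not how any particular model tokenizes: the vocabulary `VOCAB` and the function `tokenize` are made up for illustration, and real tokenizers typically learn their vocabulary from data (for example with byte-pair encoding) instead of using a hand-written set with greedy longest-match.

```python
# Hypothetical toy vocabulary, not taken from any real model.
VOCAB = {"the", " cat", " sat", "token", "ization"}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split.

    Anything not covered by the vocabulary falls back to
    single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            # Accept a vocabulary hit, or a single character as a last resort.
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


print(tokenize("the cat"))       # ['the', ' cat']        whole words
print(tokenize("tokenization"))  # ['token', 'ization']   parts of a word
print(tokenize("42"))            # ['4', '2']             individual digits
```

Joining the tokens back together always reproduces the original text, which is why a slow model's output looks like normal text arriving in uneven chunks.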