Tokenizer playground

See how GPT, LLaMA, and other models chop your prompt into tokens.

Pick a model, type a prompt, and visualize the token boundaries, IDs, and vocabulary size.
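The core idea the playground demonstrates can be sketched with a toy greedy longest-match tokenizer. The vocabulary below is invented for illustration; real BPE-style vocabularies (like GPT's) hold tens of thousands of learned subword merges.

```python
# Toy vocabulary: subword pieces mapped to token IDs (invented for this sketch).
VOCAB = {"low": 0, "low ": 1, "er": 2, "new": 3, "est": 4, " ": 5,
         "l": 6, "o": 7, "w": 8, "e": 9, "r": 10, "n": 11, "s": 12, "t": 13}

def tokenize(text):
    tokens = []
    i = 0
    while i < len(text):
        # Greedily take the longest vocabulary entry matching at position i.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                tokens.append((piece, VOCAB[piece]))
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

print(tokenize("lower newest"))
# → [('low', 0), ('er', 2), (' ', 5), ('new', 3), ('est', 4)]
```

Each (piece, id) pair corresponds to one color-coded block in the playground's token stream.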

Model

Choose a tokenizer from the dropdown. The stats panel reports the tokenizer's vocabulary size, the number of tokens produced, and any special tokens in the output.

Visualization

Token stream

Run a prompt to see color-coded tokens appear here. Each row of the token table lists the token's index (#), its text, its ID, and its type.
What happens to tokens after this?

After tokenization, each token ID is mapped to an embedding vector that captures learned semantic and positional information. These vectors form the initial input sequence for the transformer stack.
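The embedding lookup can be sketched in a few lines. The table sizes and random values below are placeholders; in a real model both tables are learned parameters.

```python
import random

random.seed(0)
VOCAB_SIZE, D_MODEL, MAX_LEN = 16, 4, 8  # toy sizes, chosen for illustration

# Hypothetical learned tables: one row per token ID, one row per position.
tok_emb = [[random.gauss(0, 0.02) for _ in range(D_MODEL)] for _ in range(VOCAB_SIZE)]
pos_emb = [[random.gauss(0, 0.02) for _ in range(D_MODEL)] for _ in range(MAX_LEN)]

def embed(token_ids):
    # Each ID selects a row of the token table; adding the row for its
    # position injects order information into the vector.
    return [[t + p for t, p in zip(tok_emb[tid], pos_emb[pos])]
            for pos, tid in enumerate(token_ids)]

x = embed([3, 7, 1])
print(len(x), len(x[0]))  # → 3 4  (sequence length, d_model)
```

This sequence of vectors is what enters the first transformer layer.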

The sequence is processed by repeated transformer layers. Within each layer, self-attention mixes information across positions so each token can borrow context from others, and a position-wise feed-forward network refines every token's representation. Residual connections and normalization help the model stay stable across depth.
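A single transformer block can be sketched as follows, assuming a simplified single-head attention with identity query/key/value projections (real layers use learned projection matrices, and normalization is omitted here for brevity):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(x):
    # Each position attends to every position, weighted by scaled
    # dot-product similarity, and takes a weighted average of the values.
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        out.append([sum(wj * vj[i] for wj, vj in zip(w, x)) for i in range(d)])
    return out

def ffn(v):
    # Position-wise feed-forward, reduced to a bare ReLU here; real blocks
    # use two learned linear layers applied independently at each position.
    return [max(0.0, vi) for vi in v]

def block(x):
    # Residual connections: add each sub-layer's output back to its input.
    a = self_attention(x)
    x = [[xi + ai for xi, ai in zip(xr, ar)] for xr, ar in zip(x, a)]
    f = [ffn(v) for v in x]
    return [[xi + fi for xi, fi in zip(xr, fr)] for xr, fr in zip(x, f)]

h = block([[0.1, -0.2], [0.3, 0.4], [-0.1, 0.2]])
print(len(h), len(h[0]))  # → 3 2  (shape is preserved through the block)
```

Stacking many such blocks, each with its own learned weights, produces the refined final hidden states.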

The final hidden states pass through a projection head, which maps each refined vector to logits over the vocabulary (or to other task-specific outputs); a softmax then converts those logits into probabilities. In autoregressive models, the chosen next token is appended to the sequence and the whole process repeats for each newly generated token.
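The projection-and-repeat loop can be sketched as greedy decoding. Everything here is a stand-in: the projection matrix is random, and `toy_model` fakes the transformer stack with a deterministic function of the token IDs.

```python
import math
import random

random.seed(1)
VOCAB_SIZE, D_MODEL = 8, 4  # toy sizes, chosen for illustration

# Hypothetical projection head: maps a hidden vector to one logit per
# vocabulary entry.
W = [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(VOCAB_SIZE)]

def logits(h):
    return [sum(wi * hi for wi, hi in zip(row, h)) for row in W]

def softmax(xs):
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def toy_model(token_ids):
    # Stand-in for the transformer stack: returns a fake final hidden state.
    return [math.sin(sum(token_ids) + i) for i in range(D_MODEL)]

def generate(prompt_ids, steps=3):
    ids = list(prompt_ids)
    for _ in range(steps):
        probs = softmax(logits(toy_model(ids)))
        # Greedy decoding: append the most probable next token, then rerun
        # the model on the extended sequence.
        ids.append(max(range(VOCAB_SIZE), key=lambda i: probs[i]))
    return ids

print(generate([2, 5]))  # prompt IDs followed by 3 generated IDs
```

Sampling from `probs` instead of taking the argmax gives the familiar temperature-controlled generation behavior.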