LENS — Interactive Token Visualizer
LENS isolates the contribution of each token in a text-to-image model by running the diffusion model once per token: that token's embedding is kept, and every other token's embedding is replaced by padding. The resulting images show which tokens drive which visual concepts.
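The isolation step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the LENS implementation: `isolate_token` and `per_token_variants` are hypothetical names, embeddings are plain lists of floats, and `pad_embedding` stands in for whatever vector the text encoder yields for padding.

```python
from typing import List, Sequence

Embedding = List[float]


def isolate_token(
    embeddings: Sequence[Embedding],
    pad_embedding: Embedding,
    keep: int,
) -> List[Embedding]:
    """Copy the sequence, keeping position `keep` and padding every other slot."""
    if not 0 <= keep < len(embeddings):
        raise IndexError(f"keep={keep} is outside 0..{len(embeddings) - 1}")
    return [
        list(emb) if i == keep else list(pad_embedding)
        for i, emb in enumerate(embeddings)
    ]


def per_token_variants(
    embeddings: Sequence[Embedding],
    pad_embedding: Embedding,
) -> List[List[Embedding]]:
    """Build one isolated sequence per token (hypothetical helper)."""
    return [
        isolate_token(embeddings, pad_embedding, i)
        for i in range(len(embeddings))
    ]
```

Each returned sequence would then be passed to the diffusion model in place of the full prompt embedding, giving one image per token. The sequence length is unchanged, so the model sees the same input shape in every run.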
Running on flux-schnell with the T5 text encoder.
Tokenization
4 tokens | model: flux-schnell | tokenizer: T5 (text encoder)
#0 pe [158] #1 lic [2176] #2 an [152] #3 </s> [1]
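Each entry in the token listing pairs a position, a text piece, and a vocabulary id in brackets. A minimal sketch of a formatter that produces this layout follows; `format_tokens` is a hypothetical helper for illustration, and the pieces and ids are taken from the listing rather than computed by a tokenizer.

```python
from typing import Sequence


def format_tokens(pieces: Sequence[str], ids: Sequence[int]) -> str:
    """Render tokens as '#<position> <piece> [<id>]' entries separated by spaces."""
    if len(pieces) != len(ids):
        raise ValueError("pieces and ids must have the same length")
    return " ".join(
        f"#{pos} {piece} [{token_id}]"
        for pos, (piece, token_id) in enumerate(zip(pieces, ids))
    )


# Values copied from the 'pelican' listing.
line = format_tokens(["pe", "lic", "an", "</s>"], [158, 2176, 152, 1])
```

The final entry, `</s>`, is the end-of-sequence token, which is why a three-piece word is reported as 4 tokens.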
Generation
Example: 'pelican' with Flux Schnell + T5
| Prompt | Images per prompt | Full prompt | Per token |
|---|---|---|---|
First generation loads model weights (~1 min). Subsequent runs are fast.