Volume 14, Issue 4, April 2025
This work is licensed under a Creative Commons Attribution 4.0 International License.
Transformer Visualizer
Akarshan Gupta, Karthikeyen Nair, Yash Rawat, Sumit Sharma, Avinash Sonule
DOI: 10.17148/IJARCCE.2025.14442
Abstract:
This paper focuses on unraveling the inner workings of the transformer architecture, a cornerstone of modern large language models (LLMs) that enables parallel processing and long-range dependency capture. From this seminal work, we adopt the core attention mechanism formula (Q × K^T)/√d_k and the multi-head attention mechanism. While transformers have driven breakthroughs in natural language processing through self-attention mechanisms, their internal operations remain complex and opaque. Using GPT-2 as an illustrative case study, we develop an interactive visualization framework to map information flow, display attention patterns, and illustrate token embeddings and layer interactions. These visualizations aim to deepen comprehension of transformer mechanics, enhance model transparency, and guide future advancements in AI design.
Keywords:
Transformer Architecture (TA): Neural network architecture based on self-attention mechanisms; Large Language Models (LLMs): Advanced AI models trained on vast text datasets; Natural Language Processing (NLP): AI technology for understanding and processing human language; Self-Attention Mechanism (SAM): Method allowing models to weigh the importance of different input elements.
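As a minimal sketch of the scaled dot-product attention formula quoted in the abstract, the following NumPy snippet computes softmax((Q × K^T)/√d_k) × V for a small toy input; the matrix shapes and random values are illustrative assumptions, not data from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # raw attention scores
    # numerically stable softmax over each row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                        # weighted sum of values

# toy example: 4 tokens, head dimension d_k = 8
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per token
```

Each row of the softmax output sums to 1, so every token's output is a convex combination of the value vectors; multi-head attention repeats this computation in parallel over several learned projections of Q, K, and V.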
How to Cite:
[1] Akarshan Gupta, Karthikeyen Nair, Yash Rawat, Sumit Sharma, Avinash Sonule, "Transformer Visualizer," International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2025.14442
