Lumen Language Model

Overview

A 128M-parameter language model built from scratch for education and research purposes.

Category

Artificial Intelligence

A 128M-Parameter Language Model

LumenBase is a 128M-parameter transformer language model built from scratch using PyTorch, featuring a custom tokenizer and GQA-based architecture. It includes a complete training and evaluation pipeline, achieving competitive scores on ARC and HellaSwag reasoning benchmarks.

LumenBase is a 128M-parameter transformer language model developed entirely from scratch using PyTorch. The project includes a custom tokenizer, complete data pipeline, and a transformer architecture featuring Grouped Query Attention (GQA). Training was performed on an NVIDIA H100 GPU for around 10 hours using mixed precision (FP16/BF16), gradient accumulation, and standard optimization techniques.
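
The repository holds the actual layer code, but a minimal PyTorch sketch shows the idea behind GQA: fewer key/value heads than query heads, with each K/V head shared by a group of query heads. All dimensions below are illustrative placeholders, not LumenBase's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedQueryAttention(nn.Module):
    """Minimal GQA sketch: n_kv_heads < n_heads, K/V shared per query group.

    Hyperparameters here are illustrative, not LumenBase's real config.
    """
    def __init__(self, d_model=768, n_heads=12, n_kv_heads=4):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = d_model // n_heads
        self.q_proj = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, d_model, bias=False)

    def forward(self, x):
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Repeat each K/V head so every group of query heads shares one K/V head.
        group = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(group, dim=1)
        v = v.repeat_interleave(group, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        out = out.transpose(1, 2).reshape(B, T, -1)
        return self.o_proj(out)

# Example: attn = GroupedQueryAttention(); y = attn(torch.randn(2, 16, 768))
```

Sharing K/V heads shrinks the K/V projection parameters and the KV-cache at inference time, which is why GQA is a common choice for small, resource-constrained models like this one.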

The model was evaluated on multiple reasoning benchmarks, achieving ARC-Easy 39.48%, ARC-Challenge 23.55%, and HellaSwag 32.62%. The repository provides training scripts, inference utilities, model checkpoints, and evaluation notebooks, enabling reproducibility and experimentation with lightweight transformer architectures.
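
The repository's evaluation notebooks contain the actual harness, but benchmarks like ARC and HellaSwag are conventionally scored by having the model rank each answer choice by log-likelihood. A sketch of that scoring loop, assuming a causal LM that returns `(batch, seq, vocab)` logits and a tokenizer with an `encode()` method (both hypothetical interfaces, not LumenBase's actual API):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def score_choice(model, tokenizer, context, choice, device="cpu"):
    """Total log-prob the model assigns to the choice tokens given the context."""
    ctx_ids = tokenizer.encode(context)
    cho_ids = tokenizer.encode(choice)
    ids = torch.tensor([ctx_ids + cho_ids], device=device)
    logits = model(ids)                      # assumed shape: (1, T, vocab)
    logp = F.log_softmax(logits, dim=-1)
    # Logits at position i predict token i+1, so the choice tokens at
    # positions start..T-1 are scored by logits at start-1..T-2.
    start = len(ctx_ids)
    tok_logp = logp[0, start - 1:-1, :].gather(
        1, ids[0, start:].unsqueeze(-1)).squeeze(-1)
    return tok_logp.sum().item()

def predict(model, tokenizer, question, choices):
    # Leading space so each choice tokenizes as a continuation
    # (whether this is needed depends on the tokenizer).
    scores = [score_choice(model, tokenizer, question, " " + c) for c in choices]
    return scores.index(max(scores))
```

Accuracy on the benchmark is then just the fraction of questions where `predict` returns the gold answer index; length-normalizing each score by the number of choice tokens is a common variant.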

Designing a future I want to see

Hariom Jangra

Think Different, Build Different

Hit me up if you have any questions.

Hariom.profile
