How to Create an LLM (Large Language Model) in Python
Welcome to Codes With Pankaj!
In today’s world of Artificial Intelligence, Large Language Models (LLMs) like ChatGPT have changed everything. Many students and developers want to learn how to create an LLM in Python.
Building a full-scale LLM from scratch requires massive data, GPUs, and time. In this beginner-friendly tutorial, we will cover two practical approaches:
- Easy Way – Use pre-trained models with Hugging Face (Recommended for beginners)
- From Scratch – Build a tiny LLM to understand the core concepts
This guide is perfect for Python developers, data science students, and AI enthusiasts who want to start creating their own language models.
Prerequisites
- Python 3.8 or higher
- Basic knowledge of Python and machine learning
- A computer with a GPU (optional, but recommended for faster training)
Install the required libraries:
pip install transformers torch accelerate
Part 1: Easy Way – Build LLM Using Hugging Face Transformers
Hugging Face makes working with LLMs super simple. You can load powerful models in just a few lines of code.
Step 1: Load and Generate Text with Pipeline
from transformers import pipeline
# Load a pre-trained model (GPT-2 is great for beginners)
generator = pipeline('text-generation', model='gpt2')
# Generate text
prompt = "Machine Learning is an exciting field because"
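# do_sample=True makes sure the temperature setting is actually used
result = generator(prompt, max_length=150, num_return_sequences=1, do_sample=True, temperature=0.7)
print(result[0]['generated_text'])
Example Output (sampling is random, so your text will differ):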
Machine Learning is an exciting field because it allows computers to learn patterns from data and make intelligent decisions…
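The temperature argument used above rescales the model's scores before they are turned into probabilities. Here is a minimal standard-library sketch of that idea. The function name and the three example scores are made up for illustration and are not part of the transformers API.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.1]

for t in (0.5, 0.7, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, giving safer and more repetitive text. Higher temperatures flatten the distribution, giving more varied and more error-prone text.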
Step 2: More Control with Tokenizer and Model
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_name = "gpt2" # Try "distilgpt2" for faster results
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Encode prompt
inputs = tokenizer("Python programming for beginners", return_tensors="pt")
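# Generate response (passing **inputs also supplies the attention mask)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.8,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse EOS
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Pro Tips for Better Results: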
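- Pass device_map="auto" to from_pretrained for automatic GPU placement (this relies on the accelerate package installed earlier)
- Try 4-bit quantization to reduce memory usage
- For Hindi/Indian languages, explore models like ai4bharat/indic-bert (an encoder model, better suited to understanding tasks than to text generation) or multilingual LLMs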
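To get a feel for why quantization helps, here is a rough back-of-the-envelope sketch of the memory needed just to hold a model's weights at different precisions. It assumes roughly 124 million parameters for the gpt2 checkpoint. It ignores activations, optimizer state, and the small overhead quantization adds, so treat the numbers as estimates only. The helper name is made up for this example.

```python
def weight_memory_mb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in megabytes (10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6

GPT2_SMALL_PARAMS = 124_000_000  # approximate size of the "gpt2" checkpoint

for label, bits in [("float32", 32), ("float16", 16), ("4-bit", 4)]:
    print(f"{label:>8}: ~{weight_memory_mb(GPT2_SMALL_PARAMS, bits):.0f} MB")
```

Each halving of the bits per weight halves the weight memory, which is why 4-bit loading can fit models that would not fit in full precision.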
Part 2: Build a Tiny LLM from Scratch (Educational Purpose)
This section helps you understand how LLMs actually work under the hood.
Step 1: Prepare Your Dataset
Download a small text file (e.g., tiny_shakespeare.txt from Andrej Karpathy’s repo) or use any text file, and save it as input.txt in your working folder so the code can find it.
with open('input.txt', 'r', encoding='utf-8') as f:
    text = f.read()
print(f"Total characters: {len(text)}")
print(text[:500])  # Preview
Step 2: Character-Level Tokenization
chars = sorted(list(set(text)))
vocab_size = len(chars)
# Create mapping
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for i, ch in enumerate(chars)}
def encode(s): return [stoi[c] for c in s]
def decode(l): return ''.join([itos[i] for i in l])
print(encode("hello"))
print(decode(encode("hello")))
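To see the mapping on something small enough to check by hand, here is the same idea applied to a short stand-in string instead of input.txt. The sample_ names are made up for this example so they do not clash with the variables above.

```python
sample_text = "hello world"  # stand-in for the contents of input.txt

sample_chars = sorted(set(sample_text))
sample_stoi = {ch: i for i, ch in enumerate(sample_chars)}
sample_itos = {i: ch for ch, i in sample_stoi.items()}

def sample_encode(s):
    return [sample_stoi[c] for c in s]

def sample_decode(ids):
    return "".join(sample_itos[i] for i in ids)

print(sample_chars)            # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
print(sample_encode("hello"))  # [3, 2, 4, 4, 5]
print(sample_decode(sample_encode("hello")))  # hello

# A character that never appears in the text has no id
try:
    sample_encode("hi")
except KeyError as err:
    print("unknown character:", err)
```

The last part shows a limit of character-level tokenization built from one file: any character missing from the training text cannot be encoded.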
Step 3: Create a Simple Bigram Language Model
import torch
import torch.nn as nn
import torch.nn.functional as F
class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        # Row i holds the logits for the character that follows character i
        self.token_embedding_table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx):
        # idx: (batch, time) token ids -> logits: (batch, time, vocab_size)
        logits = self.token_embedding_table(idx)
        return logits

model = BigramLanguageModel(vocab_size)
Step 4: Training the Model
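Before training with PyTorch, it can help to see what the bigram model is trying to learn. The sketch below is an illustration only, using a made-up string rather than the tutorial's dataset. It estimates P(next character | current character) by counting, then computes the average cross-entropy loss. It compares this with the loss of guessing every character equally, which is ln(vocab_size).

```python
import math
from collections import Counter, defaultdict

toy_text = "abababababcabababab"  # made-up training string

vocab = sorted(set(toy_text))

# Count how often each character follows each other character
pair_counts = defaultdict(Counter)
for cur, nxt in zip(toy_text, toy_text[1:]):
    pair_counts[cur][nxt] += 1

def next_char_probs(cur):
    """Estimated P(next | cur) from the bigram counts."""
    counts = pair_counts[cur]
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def average_cross_entropy(text):
    """Mean negative log-probability of each actual next character."""
    losses = [-math.log(next_char_probs(cur)[nxt])
              for cur, nxt in zip(text, text[1:])]
    return sum(losses) / len(losses)

uniform_loss = math.log(len(vocab))   # loss of guessing every character equally
bigram_loss = average_cross_entropy(toy_text)

print(f"uniform baseline: {uniform_loss:.3f}")
print(f"bigram counts:    {bigram_loss:.3f}")
```

The same reasoning gives a sanity check for the training loop: with a vocabulary of 65 characters, the uniform baseline is ln(65), about 4.17. A trained bigram model should therefore report a loss clearly below that.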
data = torch.tensor(encode(text), dtype=torch.long)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
batch_size = 32
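for step in range(10000):
    # Sample random positions; each target is the character right after its input
    ix = torch.randint(0, len(data) - 1, (batch_size,))
    xb = torch.stack([data[i:i+1] for i in ix])    # inputs, shape (batch, 1)
    yb = torch.stack([data[i+1:i+2] for i in ix])  # next-character targets
    logits = model(xb)
    # Compare predictions with the NEXT character, not with the input itself
    loss = F.cross_entropy(logits.view(-1, vocab_size), yb.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 1000 == 0:
        print(f"Step {step}, Loss: {loss.item():.4f}")
Step 5: Generate Text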
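Generation is a loop: sample one token from the predicted distribution, append it, and feed the longer sequence back in. Here is a minimal standard-library sketch of that loop. A hand-written probability table with hypothetical values stands in for a trained model, and the function name is made up for this example.

```python
import random

# Hypothetical next-character probabilities, as a trained bigram model might produce
next_probs = {
    "a": {"b": 0.9, "c": 0.1},
    "b": {"a": 0.8, "c": 0.2},
    "c": {"a": 1.0},
}

def sample_sequence(start, max_new_tokens, rng):
    """Autoregressive sampling: draw one character, append it, repeat."""
    out = [start]
    for _ in range(max_new_tokens):
        probs = next_probs[out[-1]]  # a bigram model conditions on the last character only
        chars = list(probs.keys())
        weights = list(probs.values())
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)

rng = random.Random(0)  # fixed seed so repeated runs give the same text
print(sample_sequence("a", 20, rng))
```

Because each character is drawn at random according to its probability, different seeds give different text, just as repeated calls to a sampling-based generator do.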
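@torch.no_grad()  # no gradients are needed while generating
def generate(model, idx, max_new_tokens=200):
    for _ in range(max_new_tokens):
        logits = model(idx)                      # (batch, time, vocab_size)
        logits = logits[:, -1, :]                # keep only the last time step
        probs = F.softmax(logits, dim=-1)        # logits -> probabilities
        idx_next = torch.multinomial(probs, num_samples=1)  # sample one token
        idx = torch.cat((idx, idx_next), dim=1)  # append it to the sequence
    return idx

context = torch.zeros((1, 1), dtype=torch.long)  # start from token id 0
generated = generate(model, context)
print(decode(generated[0].tolist()))
Conclusion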
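Congratulations! You have learned two ways to work with language models in Python: generating text with a pre-trained GPT-2 model through Hugging Face, and training a tiny character-level bigram model from scratch.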
Start with Hugging Face for quick results, then explore building models from scratch to strengthen your fundamentals.
Bookmark this page and practice daily. AI skills are in high demand!
This blog post is written by Pankaj Chouhan – Codes With Pankaj
Helping students and beginners build strong careers in tech.
