LLMBuilder Documentation

🤖 LLMBuilder

A toolkit for building, training, and deploying language models

Requires Python 3.8+ · License: MIT

What is LLMBuilder?

LLMBuilder is a framework for training and fine-tuning Large Language Models (LLMs). It provides a complete pipeline to go from raw documents to deployable models, with support for both CPU and GPU training.

Key Features

  • Easy to Use: Simple commands to train and deploy models
  • Multi-Format Support: Process HTML, Markdown, EPUB, PDF, TXT files
  • Complete Pipeline: From data processing to model deployment
  • Flexible: Works on both CPU and GPU
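To make the multi-format point concrete, here is a minimal sketch of how raw documents might be sorted by type before ingestion. It uses only the Python standard library. The function name `group_by_format` and the extension table are illustrative assumptions, not part of LLMBuilder's API.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical extension table covering the formats listed above.
SUPPORTED = {
    ".html": "html",
    ".htm": "html",
    ".md": "markdown",
    ".epub": "epub",
    ".pdf": "pdf",
    ".txt": "txt",
}


def group_by_format(paths):
    """Group file names by document format; unsupported files go under 'skipped'."""
    groups = defaultdict(list)
    for p in map(Path, paths):
        # Compare extensions case-insensitively so "report.PDF" counts as a PDF.
        fmt = SUPPORTED.get(p.suffix.lower(), "skipped")
        groups[fmt].append(p.name)
    return dict(groups)
```

Grouping up front like this lets each format be handed to its own extractor and makes unsupported files visible instead of silently ignored.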

Quick Start

# Install LLMBuilder
pip install llmbuilder

# Create a new project
llmbuilder init my_project

# Navigate to your project
cd my_project

# Follow the step-by-step instructions in README.md

Simple Example

import llmbuilder as lb
from llmbuilder.data import TextDataset

# Load the "cpu_small" preset configuration
cfg = lb.load_config(preset="cpu_small")

# Build the model from the model section of the config
model = lb.build_model(cfg.model)

# Prepare the dataset; block_size matches the model's maximum sequence length
dataset = TextDataset("data.txt", block_size=cfg.model.max_seq_length)

# Train model
results = lb.train_model(model, dataset, cfg.training)

# Generate text from a saved checkpoint and tokenizer directory
text = lb.generate_text(
    model_path="./checkpoints/model.pt",
    tokenizer_path="./tokenizers",
    prompt="The future of AI is",
    max_new_tokens=50
)
print(text)
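The `block_size` argument in the example above sets how many tokens each training sample contains. The sketch below shows one common way a causal language-model dataset cuts a token stream into fixed-length input/target pairs. It is an assumption about the general technique, not a description of how `TextDataset` is implemented, and the helper name `split_into_blocks` is hypothetical.

```python
def split_into_blocks(token_ids, block_size):
    """Yield (inputs, targets) pairs of length block_size, targets shifted by one token.

    Hypothetical helper for illustration; not part of LLMBuilder.
    """
    # Each sample needs block_size + 1 tokens: the inputs plus one extra
    # token that serves as the final prediction target.
    for start in range(0, len(token_ids) - block_size, block_size):
        chunk = token_ids[start:start + block_size + 1]
        yield chunk[:-1], chunk[1:]


if __name__ == "__main__":
    for inputs, targets in split_into_blocks(list(range(10)), block_size=4):
        print(inputs, targets)
```

Trailing tokens that cannot fill a complete block are dropped in this sketch, so every sample has the same length.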

Getting Started

  1. Installation - Install LLMBuilder
  2. Quick Start - Train your first model
  3. User Guide - Learn all features

Community & Support

Built by Qub△se