LLM Building

Architecture

emerging-llm-app-stack

Emerging Architectures for LLM Applications | Andreessen Horowitz

Transformers, explained: Understand the model behind GPT, BERT, and T5 - YouTube

Positional encodings
Attention
Self attention
GPT-3 - ~45 TB of raw text data before filtering (about 570 GB after filtering)
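A minimal sketch of the two ideas listed above, sinusoidal positional encodings and scaled dot-product self-attention, in plain Python. The toy token vectors are made up, and the query/key/value projections are left as the identity to keep it short (a real Transformer learns separate weight matrices for each), so treat this as an illustration rather than a faithful layer.

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    pe = []
    for i in range(d_model):
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention.

    Assumption: Q, K and V are all equal to the input (identity projections).
    """
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this position's query to every position's key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

# Hypothetical 4-dimensional embeddings for a 3-token sequence.
tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
inputs = [
    [t + p for t, p in zip(tok, positional_encoding(pos, 4))]
    for pos, tok in enumerate(tokens)
]
outputs = self_attention(inputs)
```

Tokens 0 and 2 start with the same embedding, but the positional encoding makes their inputs (and therefore their attention outputs) differ.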

chat-gpt-working

Let’s Architect! Discovering Generative AI on AWS | AWS Architecture Blog

Building

GitHub - karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.

LLM Working

Decoding Strategies

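Greedy search picks the single highest-probability token at each step. It is fast, but it can miss sequences whose overall probability is higher.
Beam search in Large Language Models (LLMs) is a decoding strategy that explores multiple candidate output sequences in parallel. At each step it keeps only the most promising "beams" (sequences), so it tends to find a high-probability output, though not necessarily the single most likely one.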
Decoding Demystified : How LLMs Generate Text - III - DEV Community
Decoding Strategies in Large Language Models – Maxime Labonne
Decoding Strategies in Large Language Models
Decoding Strategies: How LLMs Choose The Next Word
Understanding greedy search and beam search | by Jessica López Espejel | Medium
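A toy comparison of greedy search and beam search. The "model" here is a hypothetical hand-written lookup table of next-token probabilities (`TOY_MODEL`), not a real LLM, chosen so that the locally best first token leads to a worse overall sequence.

```python
import math

# Hypothetical next-token probabilities, keyed by the tokens generated so far.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.55, "dog": 0.45},
    ("a",): {"bird": 0.9, "fish": 0.1},
}

def next_probs(prefix):
    """Return the next-token distribution for a prefix (end-of-sequence if unknown)."""
    return TOY_MODEL.get(tuple(prefix), {"<eos>": 1.0})

def greedy_search(steps):
    """Pick the most probable token at every step."""
    seq, logp = [], 0.0
    for _ in range(steps):
        probs = next_probs(seq)
        token = max(probs, key=probs.get)
        seq.append(token)
        logp += math.log(probs[token])
    return seq, logp

def beam_search(steps, beam_width):
    """Keep the beam_width best partial sequences by cumulative log-probability."""
    beams = [([], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            for token, p in next_probs(seq).items():
                candidates.append((seq + [token], logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

greedy_seq, greedy_logp = greedy_search(2)
beam_seq, beam_logp = beam_search(2, beam_width=2)
print(greedy_seq, round(math.exp(greedy_logp), 2))  # ['the', 'cat'] 0.33
print(beam_seq, round(math.exp(beam_logp), 2))      # ['a', 'bird'] 0.36
```

Greedy commits to "the" (0.6) and ends at probability 0.33, while a beam of width 2 keeps "a" alive and finds "a bird" at 0.36. With `beam_width=1`, beam search reduces to greedy search.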

How to train your ChatGPT

Stage 1: Pretraining

1. Download ~10TB of text
2. Get a cluster of ~6,000 GPUs
3. Compress the text into a neural network, pay ~$2M, wait ~12 days
4. Obtain base model
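A quick back-of-the-envelope check of the figures above. It assumes all ~6,000 GPUs run for the full ~12 days and that the ~$2M covers GPU time only; both are simplifications, so the result is an order-of-magnitude estimate.

```python
gpus = 6_000          # cluster size from the notes
days = 12             # training duration from the notes
cost_usd = 2_000_000  # total cost from the notes

gpu_hours = gpus * days * 24
cost_per_gpu_hour = cost_usd / gpu_hours

print(f"{gpu_hours:,} GPU-hours")              # 1,728,000 GPU-hours
print(f"${cost_per_gpu_hour:.2f} per GPU-hour")  # $1.16 per GPU-hour
```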

Stage 2: Finetuning

1. Write labeling instructions
2. Hire people (or use scale.ai!), collect 100K high-quality ideal Q&A responses and/or comparisons
3. Finetune base model on this data, wait ~1 day
4. Obtain assistant model
5. Run a lot of evaluations
6. Deploy
7. Monitor, collect misbehaviors, go to step 1

LLM Security

Jailbreaking
Prompt injection
Backdoors & data poisoning
Adversarial inputs
Insecure output handling
Data extraction & privacy
Data reconstruction
Denial of service
Escalation
Watermarking & evasion
Model theft
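One item from the list above, insecure output handling, is easy to illustrate: model output should be treated as untrusted input before it reaches a browser, shell, or database. The sketch below covers only the HTML case using the standard library's `html.escape`; `render_reply` is a hypothetical helper name, and other sinks (SQL, shell commands) need their own defenses.

```python
import html

def render_reply(model_output: str) -> str:
    """Escape model output before embedding it in an HTML page."""
    return "<p>" + html.escape(model_output) + "</p>"

# A reply that a prompt-injected or jailbroken model might produce.
malicious = '<script>alert("x")</script>'
safe = render_reply(malicious)
print(safe)  # <p>&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;</p>
```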

1hr Talk Intro to Large Language Models - YouTube

Awesome ChatGPT Prompts | This repo includes ChatGPT prompt curation to use ChatGPT better.

SynthID - Google DeepMind

Dev Tools

Ollama / LM Studio

The easiest way to get up and running with large language models locally.

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

docker exec -it ollama ollama run llama2

docker exec -it ollama ollama run llama2-uncensored

docker exec -it ollama ollama run mistral

>>> /? # for help
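Besides the interactive prompt, the container started above exposes an HTTP API on port 11434. A minimal sketch of calling it from Python with only the standard library follows. It assumes the server is reachable at `localhost:11434`, that the `llama2` model has already been pulled, and that the `/api/generate` endpoint accepts `model`, `prompt`, and `stream` fields and returns the text under a `response` key; check the Ollama API documentation for your version.

```python
import json
import urllib.request

# Assumption: Ollama is listening on the port published by the docker run command.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="llama2"):
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, model="llama2"):
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]

# Usage, with the container running:
#   print(generate("Why is the sky blue?"))
```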

Docker

LM Studio - SUPER EASY Text AI - Windows, Mac & Linux / How To - YouTube

LM Studio - Discover, download, and run local LLMs

Ollama Course – Build AI Apps Locally - YouTube

Run DeepSeek-R1 on Your Laptop with Ollama - DEV Community

Jan: Open source ChatGPT-alternative that runs 100% offline - Jan

open-webui

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners, such as Ollama and OpenAI-compatible APIs, and includes a built-in inference engine for RAG, making it a powerful AI deployment solution.

GitHub - open-webui/open-webui: User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

oobabooga

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

GitHub - oobabooga/text-generation-webui: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

GitHub - oobabooga/text-generation-webui-extensions

Ludwig

Ludwig is an open-source, declarative machine learning framework that makes it easy to define deep learning pipelines with a simple and flexible data-driven configuration system. Ludwig is suitable for a wide variety of AI tasks, and is hosted by the Linux Foundation AI & Data.

Ludwig enables you to apply state-of-the-art tabular, natural language processing, and computer vision models to your existing data and put them into production with just a few short commands.
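To make "declarative" concrete, here is a sketch of what a Ludwig configuration can look like when written as a Python dictionary. The column names `review_text` and `sentiment` and the file `reviews.csv` are hypothetical; the commented-out lines show how such a config is typically handed to Ludwig's Python API, assuming Ludwig is installed.

```python
# Declarative model definition: name the input and output columns and their types,
# and let the framework choose encoders, decoders, and preprocessing.
config = {
    "input_features": [{"name": "review_text", "type": "text"}],
    "output_features": [{"name": "sentiment", "type": "category"}],
}

def feature_names(cfg, key):
    """List the column names declared under 'input_features' or 'output_features'."""
    return [feature["name"] for feature in cfg[key]]

print(feature_names(config, "input_features"))   # ['review_text']
print(feature_names(config, "output_features"))  # ['sentiment']

# With Ludwig installed (not executed here):
#   from ludwig.api import LudwigModel
#   model = LudwigModel(config)
#   model.train(dataset="reviews.csv")  # hypothetical CSV with the two columns above
```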

GitHub - ludwig-ai/ludwig: Low-code framework for building custom LLMs, neural networks, and other AI models

Ludwig

What is Ludwig? - Ludwig

SaaS

Resources

Development with Large Language Models Tutorial - OpenAI, Langchain, Agents, Chroma - YouTube

document-based-question-answering-system
