Technical Feasibility: Adaptive Learning Platform
- Build Date: 2026-05-05
- Context: Building an AI-native adaptive learning platform for working professionals (ages 25-45) seeking measurable salary increases through personalized skill development
- Reference: Adaptive Learning Platform Concept
Executive Summary
Feasibility Verdict: ✅ HIGHLY FEASIBLE with 2026 technology stack
Key Findings:
- Adaptive Learning: IRT/BKT algorithms proven (open-source libraries available), can be implemented in 2-3 months
- AI Question Generation: GPT-4/Claude 3.5 generate quality coding problems at $0.015-0.02/question (see AI Question Generation Study)
- Content Curation: YouTube Data API + web scraping achievable, ML tagging with existing models
- Salary Tracking: Job APIs available (LinkedIn, Naukri, Indeed), salary data accessible
- Cost: $80K-170K total investment for MVP → Scale → Production (12-18 months)
- Team: 2-3 engineers (full-stack + ML) for MVP, scale to 5-7 for production
Timeline:
- MVP (React skill path, Bangalore): 3 months (1 engineer)
- Beta (5 skills, 3 cities): 6 months (2 engineers)
- Scale (20 skills, pan-India): 12 months (3-5 engineers)
Recommendation: ✅ PROCEED - Technology is mature, cost is manageable, no major technical blockers
1. Technical Architecture (2026 Stack)
System Components Overview
Layer-by-Layer Breakdown
Layer 1: Frontend (User Experience)
// Next.js 15 + React 19 + TypeScript
Technology: Next.js 15.0 (App Router)
UI Framework: shadcn/ui + Tailwind CSS 4.0
State Management: Zustand 4.5 + TanStack Query 5.0
Forms: React Hook Form 7.5 + Zod validation
Charts: Recharts 2.12 (salary tracking visualizations)
Mobile: React Native (same components via Tamagui)
Deployment: Vercel (edge functions, $20/month Pro)
CDN: Vercel Edge Network (built-in)
Why Next.js 15:
- Server Components = faster initial load (critical for mobile users on 3G/4G in India)
- Edge rendering = low latency globally
- Built-in image optimization (course thumbnails, profile pics)
- ISR (Incremental Static Regeneration) for content pages
- tRPC integration = end-to-end type safety
Layer 2: API & Backend Services
// tRPC + Hono for edge
API Framework: tRPC 11.0 (type-safe API)
Alternative: Hono 4.0 (edge-optimized HTTP framework)
Authentication: Clerk (Indian pricing: ₹2K/month for 1K users)
Database ORM: Drizzle ORM 0.30 (type-safe, faster than Prisma)
Backend Runtime: Node.js 22 LTS
Serverless: Vercel Functions (99% uptime)
Background Jobs: Inngest (cron jobs, workflows, free tier: 1M steps/month)
Why tRPC:
- End-to-end type safety (frontend knows exact backend types)
- No OpenAPI spec needed (types = docs)
- React Query integration (automatic caching, optimistic updates)
- Perfect for Next.js + TypeScript projects
Layer 3: Databases & Storage
-- Primary Database: Neon PostgreSQL
Provider: Neon (serverless Postgres, $19/month for 10GB)
Why: Auto-scaling, branching (test env = clone), 99.9% uptime
Alternative: Supabase ($25/month, includes auth + storage)
-- Cache Layer: Upstash Redis
Provider: Upstash ($10/month for 1GB, pay-as-you-go)
Use Cases:
- Session management
- API rate limiting
- Real-time leaderboards
- Knowledge state caching (frequently accessed user data)
-- Search: Meilisearch
Provider: Meilisearch Cloud ($29/month for 100K docs)
Use Cases:
- Content search (find React tutorial videos)
- Skill search (autocomplete)
- Job search
Why: 10x faster than PostgreSQL full-text search, typo-tolerant
-- Vector DB: Pinecone
Provider: Pinecone ($70/month starter, 100K vectors)
Use Cases:
- Content similarity (recommend similar tutorials)
- Question uniqueness check
- Semantic search
Alternative: pgvector extension in Neon (free, less performant)
Layer 4: AI/ML Services
LLM Provider: Anthropic Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
- Pricing: $0.003/1K input tokens, $0.015/1K output tokens
- Use cases: Question generation ($0.015-0.02/question), content tagging ($0.005/video), explanations ($0.01/explanation)
- Alternative: OpenAI GPT-4 Turbo ($0.01/1K in, $0.03/1K out)
- Fallback: Meta Llama 3.3 70B via Together.ai ($0.0006/1K tokens)
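The per-question cost can be sanity-checked from the per-token prices. The sketch below assumes a question-generation call uses roughly 500 input tokens and 1,000 output tokens; those token counts are illustrative assumptions, not measurements from the study.

```python
# Prices per 1K tokens, taken from the Claude 3.5 Sonnet figures listed above.
INPUT_PRICE_PER_1K = 0.003
OUTPUT_PRICE_PER_1K = 0.015


def llm_call_cost(input_tokens: int, output_tokens: int,
                  in_price: float = INPUT_PRICE_PER_1K,
                  out_price: float = OUTPUT_PRICE_PER_1K) -> float:
    """Cost in USD of one LLM call, given token counts and per-1K-token prices."""
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price


# Hypothetical token budget for one generated coding question.
claude_cost = llm_call_cost(500, 1000)
# Same budget priced at the GPT-4 Turbo rates listed above.
gpt4_turbo_cost = llm_call_cost(500, 1000, in_price=0.01, out_price=0.03)
print(f"Claude 3.5 Sonnet: ${claude_cost:.4f}, GPT-4 Turbo: ${gpt4_turbo_cost:.4f}")
```

Under these assumed token counts the Claude figure falls inside the $0.015-0.02/question range quoted above, and output tokens dominate the cost.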
ML Models: Hugging Face Inference API
- Embedding: sentence-transformers/all-MiniLM-L6-v2 for content embeddings and similarity search
- Cost: Free (self-hosted) or $0.0001/request (Inference API)
Layer 5: External Data Sources
# Job & Salary Data APIs
1. RapidAPI - LinkedIn Jobs API
Cost: $100/month for 10K requests
Data: Job postings, required skills, salary ranges
Coverage: Global (India, US, EU)
2. Naukri Job Search API
Cost: Enterprise contact (estimated ₹50K-1L/year)
Data: India-specific jobs, salaries, skill requirements
Coverage: Best for Indian market
3. Adzuna Job Search API
Cost: Free tier (250 calls/month), $500/month for 10K
Data: Salary estimates, skill trends, job volumes
Coverage: 19 countries including India
4. Glassdoor API (unofficial via SerpAPI)
Cost: $50/month for 5K searches via SerpAPI
Data: Company reviews, salaries, interview questions
Alternative: Web scraping (legal gray area)
5. AmbitionBox (India-specific)
Approach: Web scraping with Playwright
Cost: $100/month for proxy rotation (BrightData)
Data: Indian company salaries, reviews, culture
2. Adaptive Learning Engine Implementation
2.1 Knowledge State Model (Bayesian Knowledge Tracing)
Technology: pyBKT library (MIT licensed, maintained by the CAHLR lab at UC Berkeley)
Installation: pip install pyBKT==1.4.3
How It Works:
The system uses a KnowledgeStateTracker class that maintains BKT models for each skill. The tracker uses four key parameters:
- Prior (P(L₀) = 0.1): Initial probability student knows skill before any practice (10%)
- Learn Rate (P(T) = 0.3): Probability of learning from each practice opportunity (30%)
- Guess Rate (P(G) = 0.2): Chance of correct answer without knowing skill (20%)
- Slip Rate (P(S) = 0.1): Chance of mistake despite knowing skill (10%)
Model Training:
The BKT model is trained on historical interaction data with columns: user_id, skill_id, correct, timestamp. The pyBKT library fits the parameters with expectation-maximization, repeats the fit 5 times from different random initializations, and keeps the parameter set with the highest likelihood.
Mastery Prediction:
For any user-skill pair, the system:
- Retrieves the user's interaction history for that skill
- Starts with prior probability (0.1 if no history)
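- Applies a Bayesian update after each interaction:
  - If correct: Posterior = P(correct | mastered) × Prior / P(correct), where P(correct | mastered) = 1 - P(S) and P(correct) = Prior × (1 - P(S)) + (1 - Prior) × P(G)
  - If incorrect: Posterior = P(incorrect | mastered) × Prior / P(incorrect), where P(incorrect | mastered) = P(S) and P(incorrect) = Prior × P(S) + (1 - Prior) × (1 - P(G))
- Adds the learning increment: Final = Posterior + (1 - Posterior) × LearnRate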
- Returns final mastery probability (0.0-1.0)
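The update steps above can be sketched in a few lines of plain Python. This is a minimal illustration using the four default parameters listed earlier, not the pyBKT implementation; the names BKTParams, bkt_update, and predict_mastery are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BKTParams:
    prior: float = 0.1  # P(L0): mastery probability before any practice
    learn: float = 0.3  # P(T): chance of learning at each opportunity
    guess: float = 0.2  # P(G): correct answer without mastery
    slip: float = 0.1   # P(S): incorrect answer despite mastery


def bkt_update(p_mastery: float, correct: bool, params: BKTParams) -> float:
    """One BKT step: Bayesian posterior on the observation, then the learning increment."""
    if correct:
        lik_mastered, lik_unmastered = 1.0 - params.slip, params.guess
    else:
        lik_mastered, lik_unmastered = params.slip, 1.0 - params.guess
    # P(observation) marginalised over mastered / not mastered.
    evidence = p_mastery * lik_mastered + (1.0 - p_mastery) * lik_unmastered
    posterior = p_mastery * lik_mastered / evidence
    return posterior + (1.0 - posterior) * params.learn


def predict_mastery(history: list[bool], params: BKTParams = BKTParams()) -> float:
    """Replay a user's correct/incorrect history for one skill, starting from the prior."""
    p = params.prior
    for correct in history:
        p = bkt_update(p, correct, params)
    return p


print(round(predict_mastery([True, True, False, True]), 3))
```

With a learn rate as high as 0.3, even an incorrect answer raises the estimate from the 0.1 prior, because the learning increment outweighs the small Bayesian penalty. That is a property of these default parameters and one reason fitted values matter.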
Skill Gap Analysis:
The get_skill_gaps() method identifies which skills need learning for a target role. It:
- Compares user's mastery probability against threshold (0.8 = mastered)
- For each skill below threshold, calculates gap size
- Prioritizes skills using formula: Priority = (Gap Size × Salary Impact) / Time to Learn
- Returns sorted list with current mastery, gap, and priority score
This ensures users focus on high-ROI skills first (biggest salary impact for least time investment).
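A minimal sketch of the prioritization formula follows. The SkillRequirement fields, the units for salary impact and learning time, and the fallback to the 0.1 prior for unseen skills are assumptions made for illustration; the actual get_skill_gaps() signature is not specified in this document.

```python
from dataclasses import dataclass

MASTERY_THRESHOLD = 0.8  # mastery probability treated as "mastered"
DEFAULT_PRIOR = 0.1      # assumed mastery for skills with no history


@dataclass(frozen=True)
class SkillRequirement:
    skill_id: str
    salary_impact: float   # hypothetical unit, e.g. expected salary uplift
    hours_to_learn: float  # hypothetical estimate of time to close the gap


def get_skill_gaps(mastery: dict[str, float],
                   requirements: list[SkillRequirement],
                   threshold: float = MASTERY_THRESHOLD) -> list[dict]:
    """Return skills below the threshold, sorted by priority (highest first)."""
    gaps = []
    for req in requirements:
        current = mastery.get(req.skill_id, DEFAULT_PRIOR)
        gap = threshold - current
        if gap <= 0:
            continue  # already mastered
        # Priority = (Gap Size x Salary Impact) / Time to Learn
        priority = gap * req.salary_impact / req.hours_to_learn
        gaps.append({"skill_id": req.skill_id, "current_mastery": current,
                     "gap": gap, "priority": priority})
    return sorted(gaps, key=lambda g: g["priority"], reverse=True)


role = [SkillRequirement("react", 100.0, 40.0),
        SkillRequirement("sql", 80.0, 30.0),
        SkillRequirement("docker", 50.0, 20.0)]
for entry in get_skill_gaps({"react": 0.5, "sql": 0.9, "docker": 0.2}, role):
    print(entry["skill_id"], round(entry["priority"], 2))
```

In this made-up example, Docker outranks React despite its lower salary impact, because the gap is larger and the learning time is shorter.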
Data Storage:
- user_skill_interactions: Raw interaction history (user, skill, correct/incorrect, time spent, difficulty, confidence)
- user_skill_mastery: Current mastery state per user-skill pair (mastery probability 0.00-1.00, interaction count, last activity timestamp)
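The two tables above could be sketched as follows. SQLite is used here only to keep the example self-contained and runnable; the stack described earlier targets PostgreSQL, and every column name and type below is an assumption derived from the field descriptions. The helper record_interaction is hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE user_skill_interactions (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    skill_id TEXT NOT NULL,
    correct INTEGER NOT NULL,          -- 1 = correct, 0 = incorrect
    time_spent_seconds REAL,
    difficulty REAL,
    confidence REAL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE user_skill_mastery (
    user_id TEXT NOT NULL,
    skill_id TEXT NOT NULL,
    mastery_probability REAL NOT NULL
        CHECK (mastery_probability BETWEEN 0 AND 1),
    interaction_count INTEGER NOT NULL DEFAULT 0,
    last_activity_at TEXT,
    PRIMARY KEY (user_id, skill_id)
);
"""


def record_interaction(conn: sqlite3.Connection, user_id: str, skill_id: str,
                       correct: bool, new_mastery: float) -> None:
    """Append the raw interaction and upsert the current mastery in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO user_skill_interactions (user_id, skill_id, correct) "
            "VALUES (?, ?, ?)",
            (user_id, skill_id, int(correct)),
        )
        conn.execute(
            """
            INSERT INTO user_skill_mastery
                (user_id, skill_id, mastery_probability, interaction_count, last_activity_at)
            VALUES (?, ?, ?, 1, CURRENT_TIMESTAMP)
            ON CONFLICT(user_id, skill_id) DO UPDATE SET
                mastery_probability = excluded.mastery_probability,
                interaction_count = user_skill_mastery.interaction_count + 1,
                last_activity_at = CURRENT_TIMESTAMP
            """,
            (user_id, skill_id, new_mastery),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
record_interaction(conn, "u1", "react", True, 0.53)
```

Keeping the raw history separate from the current-state row means mastery can be recomputed from scratch whenever the BKT parameters are refitted.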
2.2 Adaptive Sequencing (IRT - Item Response Theory)
Technology: py-irt library (v0.2.9) + custom implementation
How It Works:
The AdaptiveSequencer class uses a 3-Parameter Logistic (3PL) IRT model to select optimal next questions. Key components:
Item Calibration:
- Fits IRT model on historical response data (user_id, item_id, correct/incorrect)
- Extracts three parameters per question:
  - Difficulty (b): -3.0 to +3.0 scale (higher = harder)
  - Discrimination (a): 0.0 to 3.0 (how well question separates abilities)
  - Guessing (c): 0.0 to 1.0 (probability of lucky correct answer)
Ability Estimation:
- Estimates the user's ability θ (theta) on a -3 to +3 scale using Maximum Likelihood Estimation
- θ = 0 is average, θ = +2 is advanced (roughly top 2%), θ = -2 is struggling (roughly bottom 2%), assuming abilities follow a standard normal distribution
- Updates after each question response
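Ability estimation can be illustrated with a simple grid search over θ, used here in place of a numerical optimizer to keep the sketch dependency-free. The function names, the 0.01 grid step, and the fallback to θ = 0 for users with no responses are assumptions, not the py-irt API.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """3PL probability of a correct response: c + (1 - c) / (1 + e^(-a(theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def estimate_ability(responses, lo: float = -3.0, hi: float = 3.0,
                     step: float = 0.01) -> float:
    """Grid-search maximum likelihood estimate of theta.

    responses: list of ((a, b, c), correct) pairs for one user.
    """
    if not responses:
        return 0.0  # assumption: treat a user with no data as average
    best_theta, best_loglik = lo, -math.inf
    n_steps = round((hi - lo) / step)
    for i in range(n_steps + 1):
        theta = lo + i * step
        loglik = 0.0
        for (a, b, c), correct in responses:
            # Clamp to avoid log(0) at extreme parameter values.
            p = min(max(p_correct_3pl(theta, a, b, c), 1e-9), 1.0 - 1e-9)
            loglik += math.log(p if correct else 1.0 - p)
        if loglik > best_loglik:
            best_theta, best_loglik = theta, loglik
    return best_theta


history = [((1.0, -1.0, 0.2), True), ((1.2, 0.0, 0.2), True), ((1.0, 1.0, 0.2), False)]
print(estimate_ability(history))
```

The pure maximum likelihood estimate is unbounded when every answer is correct or every answer is incorrect. Restricting the search to the -3 to +3 range keeps the estimate finite in those cases.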
Adaptive Question Selection:
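- Calculates the information function for each available question: I(θ) = a² × [(P(θ) - c) / (1 - c)]² × [1 - P(θ)] / P(θ)
- Selects the question with the highest information at the user's current θ
- Information peaks where P(correct) ≈ 0.5 when c = 0 (maximum uncertainty about the outcome); with guessing (c > 0) the peak shifts to a somewhat higher P(correct)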
- Uses 3PL formula: P(θ) = c + (1 - c) / (1 + e^(-a(θ - b)))
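Question selection can be sketched as follows, using the standard 3PL item information function I(θ) = a² × [(P - c) / (1 - c)]² × (1 - P) / P. The item bank and function names are hypothetical.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float, c: float) -> float:
    """Fisher information of a 3PL item at ability theta (requires c < 1)."""
    p = p_correct_3pl(theta, a, b, c)
    return a ** 2 * ((p - c) / (1.0 - c)) ** 2 * (1.0 - p) / p


def select_next_item(theta: float, items: dict[str, tuple[float, float, float]]) -> str:
    """Return the id of the item with the highest information at theta.

    items maps item_id -> (a, b, c).
    """
    return max(items, key=lambda item_id: item_information(theta, *items[item_id]))


# Hypothetical item bank: same discrimination and guessing, different difficulty.
bank = {
    "easy": (1.0, -2.0, 0.2),
    "matched": (1.0, 0.0, 0.2),
    "hard": (1.0, 2.0, 0.2),
}
print(select_next_item(0.0, bank))
```

For a user at θ = 0, the item whose difficulty matches their ability carries the most information. The easy item is nearly always answered correctly, and correct answers on the hard item are mostly guesses, so neither says much about ability.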
Learning Sequence Generation:
- Creates ordered path from current ability to target mastery (0.8)
- Selects items slightly above current ability (θ + 0.5 = Zone of Proximal Development)
- Increments ability estimate (+0.3) after each item (assumes learning occurs)
- Stops when predicted θ reaches mastery threshold
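The sequencing loop described above might look like the following sketch. Choosing the unused item whose difficulty is closest to θ + 0.5 is one plausible reading of "slightly above current ability"; the max_items cap and the function name are assumptions added for illustration.

```python
def build_learning_sequence(theta: float, target_theta: float,
                            items: dict[str, float],
                            zpd_offset: float = 0.5,
                            gain_per_item: float = 0.3,
                            max_items: int = 50) -> list[str]:
    """Order items from current ability toward the target.

    items maps item_id -> difficulty (b). Each step picks the unused item whose
    difficulty is closest to theta + zpd_offset, then assumes the learner gains
    gain_per_item in ability.
    """
    remaining = dict(items)
    sequence: list[str] = []
    while theta < target_theta and remaining and len(sequence) < max_items:
        desired = theta + zpd_offset
        item_id = min(remaining, key=lambda i: abs(remaining[i] - desired))
        sequence.append(item_id)
        del remaining[item_id]
        theta += gain_per_item  # assumed learning gain, not a measured one
    return sequence


bank = {"q1": -0.5, "q2": 0.0, "q3": 0.5, "q4": 1.0, "q5": 1.5}
print(build_learning_sequence(theta=-1.0, target_theta=0.0, items=bank))
```

Because the +0.3 gain is an assumption, a production system would presumably re-estimate θ from actual responses and rebuild the remaining path rather than trust the projected sequence.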
Mastery Conversion:
- Converts BKT mastery probability (0-1) to the IRT theta scale using the inverse of the standard normal CDF (probit)
- Example: 0.5 mastery → θ = 0, 0.8 mastery → θ ≈ +0.84
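A minimal sketch of the conversion using the inverse normal CDF from Python's standard library. Clamping the input away from 0 and 1, and the output to the -3 to +3 scale, is an assumption added so that extreme mastery values stay finite.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def mastery_to_theta(p_mastery: float, eps: float = 1e-6,
                     lo: float = -3.0, hi: float = 3.0) -> float:
    """Map a BKT mastery probability to the IRT theta scale via the probit transform."""
    # inv_cdf is undefined at exactly 0 and 1, so clamp the probability first.
    p = min(max(p_mastery, eps), 1.0 - eps)
    theta = _STANDARD_NORMAL.inv_cdf(p)
    return min(max(theta, lo), hi)


for p in (0.1, 0.5, 0.8, 1.0):
    print(p, round(mastery_to_theta(p), 2))
```

A logit transform, log(p / (1 - p)), is a common alternative mapping; it sends 0.8 to about +1.39 instead, so the choice of transform should match whatever θ value is used as the mastery threshold.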
Data Storage:
- learning_items: All questions/exercises with IRT parameters (difficulty, discrimination, guessing), skill mapping, content URLs, calibration status
- user_item_responses: Historical response data (user, item, correct/incorrect, time, attempts) used for IRT ability estimation
2.3 User Interaction Tracking & Long-term Data Management
Critical Component: The adaptive algorithms (BKT, IRT) are only as good as the interaction data they learn from. This section details how we capture, store, and leverage user interactions over months/years to build increasingly accurate models of what each student knows.