Hacker News | ModelForge's submissions
1.Claude Code's Real Secret Sauce Isn't the Model (sebastianraschka.com)
6 points by ModelForge 5 days ago | past | discuss
2.The State of LLMs 2025: Progress, Problems, and Predictions (sebastianraschka.com)
3 points by ModelForge 3 months ago | past
3.A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com)
2 points by ModelForge 5 months ago | past
4.Explanation of Gated DeltaNet (Qwen3-Next and Kimi Linear) (github.com/rasbt)
3 points by ModelForge 5 months ago | past
5.The Core Components of Modern LLMs and the Models Beyond Transformers [video] (youtube.com)
3 points by ModelForge 5 months ago | past
6.Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com)
4 points by ModelForge 5 months ago | past
7.Multi-Head Latent Attention (sebastianraschka.com)
4 points by ModelForge 5 months ago | past
8.Thinking Machines Lab Co-Founder Departs for Meta (wsj.com)
7 points by ModelForge 5 months ago | past
9.OpenAI's internal Slack messages could cost it billions in copyright suit (sherwood.news)
8 points by ModelForge 5 months ago | past | 1 comment
10.LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com)
4 points by ModelForge 6 months ago | past
11.Gemma 3 270M re-implemented in pure PyTorch for local tinkering (github.com/rasbt)
417 points by ModelForge 7 months ago | past | 57 comments
12.GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2 (sebastianraschka.com)
490 points by ModelForge 7 months ago | past | 97 comments
13.LLM Research Papers: The 2024 List (sebastianraschka.com)
5 points by ModelForge on Dec 18, 2024 | past
14.Scaling Test-Time Compute with Open LLM Models (huggingface.co)
3 points by ModelForge on Dec 18, 2024 | past
