DeepSeekMath introduces a 7B-parameter model that achieves state-of-the-art mathematical reasoning by continuing pre-training on 120B math-related tokens from Common Crawl and introducing Group Relative Policy Optimization (GRPO).
27-01-2025 · mathematical-reasoning · reinforcement-learning · grpo
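The core of GRPO is replacing a learned value-function baseline with a group-relative one: for each prompt, several outputs are sampled and each output's reward is normalized against its own group. A minimal sketch of that advantage computation (the epsilon and the 0/1 correctness rewards are illustrative assumptions, not from the paper's exact setup):

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each sampled output's reward
    against the mean and std of its own group, so no separate
    value network is needed as a baseline."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + 1e-8)  # eps avoids division by zero

# Example: 4 sampled answers to one math prompt, scored 0/1 for correctness.
adv = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Outputs scoring above the group mean get positive advantages and are reinforced; those below are suppressed.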
BERT (Bidirectional Encoder Representations from Transformers) pre-trains deep bidirectional transformer encoders using Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) to learn contextual word representations that can be fine-tuned for various NLP tasks, achieving state-of-the-art results without task-specific architectures.
25-01-2025 · transformer · nlp · language-modeling · bert · pre-training · fine-tuning · masked-language-model · bidirectional
LoRA (Low-Rank Adaptation) enables parameter-efficient fine-tuning of large language models by decomposing weight updates into low-rank matrices, dramatically reducing trainable parameters while preserving pre-trained knowledge and allowing zero-inference-latency deployment through weight merging.
25-01-2025 · fine-tuning · low-rank · transfer-learning
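The low-rank decomposition and the zero-latency merge can both be shown in a few lines. A NumPy sketch (dimensions, rank, and scaling are illustrative; in the paper B is zero-initialized so the adapter starts as a no-op):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 64, 64, 8                 # frozen weight is d x k; adapter rank r << min(d, k)
alpha = 16                          # LoRA scaling hyperparameter

W = rng.normal(size=(d, k))         # frozen pre-trained weight
A = rng.normal(size=(r, k)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection, zero-initialized

def lora_forward(x):
    # h = Wx + (alpha/r) * BAx; only A and B receive gradients.
    # Trainable params: r*(d+k) instead of d*k.
    return W @ x + (alpha / r) * (B @ (A @ x))

def merged_weight():
    # Fold the update into W after training: zero extra inference latency.
    return W + (alpha / r) * (B @ A)

x = rng.normal(size=(k,))
```

Because the adapter is additive, `lora_forward(x)` and `merged_weight() @ x` are exactly equal, which is what makes the merge lossless.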
A deep dive into Rotary Position Embedding (RoPE), an elegant solution for encoding positional information in transformers that enables better length extrapolation and relative position modeling.
25-01-2025 · transformers · position-encoding · attention · nlp
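RoPE's relative-position property is easy to verify numerically: rotate query/key dimension pairs by a position-dependent angle, and dot products depend only on the positional offset. A minimal sketch using the split-half pairing convention (the base frequency follows the common 10000 default; this is one of several equivalent pairing layouts):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, dim).
    Each (x1[i], x2[i]) pair of dimensions is rotated by an angle
    proportional to the token's position."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-2.0 * np.arange(half) / dim)   # per-pair frequencies
    angles = np.outer(np.arange(seq_len), freqs)     # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

# Same query/key vector placed at every position:
rng = np.random.default_rng(0)
Q = rope(np.tile(rng.normal(size=8), (4, 1)))
K = rope(np.tile(rng.normal(size=8), (4, 1)))
```

Since rotation is norm-preserving, `Q[m] @ K[n]` depends only on the offset `m - n`, which is the relative-position behavior the post discusses.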
Adam (Adaptive Moment Estimation) is an optimization algorithm that adaptively adjusts learning rates per parameter by combining exponential moving averages of gradients (first moment) and squared gradients (second moment), with bias correction to achieve efficient stochastic optimization.
22-01-2025 · optimization · gradient-descent · adaptive-learning-rate · momentum · adam · stochastic-optimization
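The update described above fits in one function: exponential moving averages of the gradient and squared gradient, bias-corrected because both start at zero. A sketch with the paper's default hyperparameters, applied here to the toy objective f(x) = x² (the toy problem and the larger learning rate are illustrative choices):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. m: EMA of gradients (first moment),
    v: EMA of squared gradients (second moment), t: step count from 1."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)        # bias correction for zero init
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2, so grad = 2x.
theta, m, v = 3.0, 0.0, 0.0
for t in range(1, 501):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.1)
```

Note the effective step size is roughly bounded by `lr` regardless of gradient scale, since `m_hat / sqrt(v_hat)` is approximately ±1 when gradients keep their sign.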
Generative Adversarial Networks (GANs) train two competing neural networks (a generator that creates synthetic samples and a discriminator that distinguishes real from fake) in a minimax game to learn data distributions and generate realistic samples without explicit density modeling.
22-01-2025 · generative-models · adversarial-training · deep-learning · neural-networks · gan
Deep Q-Network (DQN) combines Q-learning with convolutional neural networks to learn control policies directly from raw pixel inputs in Atari games, using experience replay to stabilize training and achieve human-level performance.
22-01-2025 · reinforcement-learning · deep-learning · q-learning · dqn · cnn
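The experience replay mechanism that stabilizes DQN training is a simple data structure: store transitions as the agent acts, then train on random minibatches drawn from the whole history. A minimal sketch (capacity and the integer dummy transitions are illustrative):

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience replay: sampling random past transitions breaks the
    temporal correlation between consecutive frames that otherwise
    destabilizes Q-learning with a neural network."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        return tuple(zip(*batch))  # columns: states, actions, rewards, ...

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer()
for i in range(100):                      # dummy transitions for illustration
    buf.push(i, 0, 1.0, i + 1, False)
states, actions, rewards, next_states, dones = buf.sample(32)
```

In the full algorithm each sampled minibatch supplies targets `r + γ max_a' Q(s', a')` for the network update.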
YouTube's deep neural network-based recommendation system using a two-stage architecture with candidate generation and ranking, incorporating negative sampling, importance weighting, and features like watch history and search queries to provide personalized video recommendations.
21-01-2025 · recsys · deep-learning · neural-networks · candidate-generation · ranking
Vision Transformer (ViT) replaces convolutions with a pure Transformer encoder over image patches, using a learnable [class] token and minimal inductive bias to achieve strong image recognition performance at scale.
20-01-2025 · transformer · attention
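The "Transformer over image patches" idea reduces to one reshape: split the image into non-overlapping patches and flatten each into a token. A NumPy sketch of that patchify step (ViT-Base's 16×16 patches on a 224×224 input; the linear projection and [class] token that follow are omitted):

```python
import numpy as np

def patchify(image, patch=16):
    """Split an (H, W, C) image into flattened non-overlapping patches,
    producing the (num_patches, patch*patch*C) token sequence a ViT
    encoder consumes before linear projection."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    rows, cols = h // patch, w // patch
    x = image.reshape(rows, patch, cols, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)           # group the two patch-grid axes
    return x.reshape(rows * cols, patch * patch * c)

tokens = patchify(np.zeros((224, 224, 3)))   # 224/16 = 14, so 14*14 patches
```

Each 16×16×3 patch flattens to 768 values, giving a 196-token sequence; the learnable [class] token is then prepended to make 197.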