🔍 What is Word2Vec?
Word2Vec is a neural network-based algorithm that learns vector representations (embeddings) of words from a large corpus of text, capturing their semantic and syntactic meaning.
🎯 Goal
To place similar words (in meaning and context) close together in vector space.
🧠 Key Idea
Words that appear in similar contexts have similar meanings — “You shall know a word by the company it keeps.”
🛠️ How It Works
Word2Vec has two main model architectures:

- CBOW (Continuous Bag of Words):
  Predicts the target word from its context (the surrounding words).
  Input: Context words → Output: Target word
  (Trains faster and works well with frequent words.)
- Skip-Gram:
  Predicts the surrounding context words from the target word.
  Input: Target word → Output: Context words
  (Works better with small datasets and infrequent words.)
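The difference between the two architectures comes down to how training pairs are built from text. A minimal sketch of Skip-Gram pair generation (the function name, toy sentence, and window size are illustrative, not from the post):

```python
def skipgram_pairs(tokens, window=2):
    """Yield (target, context) pairs within a symmetric window around each word."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                      # skip the target word itself
                pairs.append((target, tokens[j]))
    return pairs

sentence = "the quick brown fox".split()
print(skipgram_pairs(sentence, window=1))
# → [('the', 'quick'), ('quick', 'the'), ('quick', 'brown'),
#    ('brown', 'quick'), ('brown', 'fox'), ('fox', 'brown')]
```

CBOW uses the same windows but in the opposite direction: the context words jointly form the input, and the single target word is the prediction.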
🧮 Training
- Uses a shallow neural network with a single hidden layer.
- The model learns to predict probabilities using softmax, or approximations such as negative sampling or hierarchical softmax.
- During training, the weight matrix between the input and hidden layer becomes the word embedding matrix.
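The last point can be seen concretely: with a one-hot input, the input-to-hidden multiplication simply selects one row of the weight matrix, so after training each row of that matrix is a word's embedding. A small NumPy sketch (vocabulary, dimensions, and random seed are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["king", "queen", "man", "woman"]
V, D = len(vocab), 5            # vocabulary size, embedding dimension
W = rng.normal(size=(V, D))     # input→hidden weight matrix (one row per word)

one_hot = np.zeros(V)
one_hot[vocab.index("king")] = 1.0
hidden = one_hot @ W            # hidden-layer activation (linear, no bias)

# The multiplication is just a row lookup: hidden == W's "king" row.
assert np.allclose(hidden, W[vocab.index("king")])
```

This is why implementations store embeddings as a lookup table rather than doing the full matrix multiply.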
📦 Input and Output
- Input: a large corpus of raw text.
- Output: a dense vector for each word (typically 100–300 dimensions) in which semantic relationships are captured (e.g., vector("king") - vector("man") + vector("woman") ≈ vector("queen")).
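The analogy query itself is plain vector arithmetic plus cosine similarity. The 2-D vectors below are hand-made toys chosen purely to illustrate the mechanics; real Word2Vec embeddings are learned and much higher-dimensional:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy, hand-picked vectors — illustrative only, not real embeddings.
vecs = {
    "king":  [0.9, 0.8],
    "man":   [0.9, 0.1],
    "woman": [0.1, 0.1],
    "queen": [0.1, 0.8],
}

# vector("king") - vector("man") + vector("woman")
target = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]

# Nearest word to the result (excluding the query word), by cosine similarity.
best = max((w for w in vecs if w != "king"), key=lambda w: cosine(target, vecs[w]))
print(best)  # → queen
```

Libraries such as gensim wrap exactly this lookup behind a `most_similar(positive=..., negative=...)` style query.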
✅ Why It’s Useful
- Captures word similarity, analogies, and semantic relationships.
- Formed the backbone of many NLP models in the pre-BERT era.
- Fast and scalable to large corpora.