DSWoK — Data Science Well of Knowledge

#unsupervised

7 notes · co-occurs with 7 tags · last updated May 18, 2026

Co-tags: #nlp (4) · #topic-modeling (4) · #concept (3) · #algorithm (3) · #clustering (2) · #tabular-ml (1) · #evaluation (1)
Notes tagged #unsupervised
01
Dimensionality Reduction
Dimensionality reduction is the process of reducing the number of features (dimensions) in a dataset while preserving as much relevant information as possible.
May 18, 2026
General ML
02
K-means clustering
K-means is an unsupervised machine learning algorithm used for partitioning a dataset into K distinct, non-overlapping subgroups (clusters).
May 18, 2026
General ML
03
Clustering metrics
Clustering metrics are quantitative measures used to evaluate the performance and quality of clustering algorithms.
May 18, 2026
Metrics and losses
04
BERTopic
BERTopic is a modular topic modeling pipeline.
May 18, 2026
NLP
05
LDA
Latent Dirichlet Allocation (Blei, Ng, Jordan, 2003) is the canonical probabilistic topic model.
May 18, 2026
NLP
06
Topic Modeling Methods
A survey of the main topic modeling methods, ordered roughly by historical development (matrix factorization → probabilistic generative models → neural → embedding-based).
May 18, 2026
NLP
07
Topic Modeling
Topic modeling is an unsupervised technique for discovering abstract themes in a document collection, where a document is whatever unit of text the project treats as one (article, review, tweet, paragraph, support ticket).
May 18, 2026
NLP
