Probability, Statistics, Stochastics#
Resources#
[ h ] Kenneth Tay’s Statistical Odds & Ends
08-22-2020 Hall, Brayton. “The Reasoning Behind Bessel’s Correction: n-1: And Why it’s Not Always a Correction”. Towards Data Science. https://towardsdatascience.com/the-reasoning-behind-bessels-correction-n-1-eeea25ec9bc9.
A concrete introduction to probability
YouTube#
3Blue1Brown
[ y ] 04-02-2023. “Why π is in the normal distribution (beyond integral tricks)”. YouTube.
[ y ] 12-22-2019. “Bayes theorem, the geometry of changing beliefs”.
Geek’s Lesson
[ y ] 06-21-2019. “Statistic for beginners | Statistics for Data Science”.
My Lesson
[ y ] 07-29-2021. “Combinatorics and Probability (Complete Course) | Discrete Mathematics for Computer Science”.
[ h ][ y ] StatQuest with Josh Starmer
more
[ y ] 06-17-2023. Primer. “A Secret Weapon for Predicting Outcomes: The Binomial Distribution”.
[ y ] StatQuest
[ y ] StatQuest. (07 Nov 2022). “Long Short-Term Memory (LSTM), Clearly Explained”.
[ y ] StatQuest. (19 Sep 2022). “Introduction to Coding Neural Networks with PyTorch and Lightning”.
[ y ] StatQuest. (11 Jul 2022). “Recurrent Neural Networks (RNNs), Clearly Explained!!!”.
[ y ] StatQuest. (25 Apr 2022). “The StatQuest Introduction to PyTorch”.
[ y ] StatQuest. (28 Feb 2022). “Tensors for Neural Networks, Clearly Explained!!!”.
[ y ] StatQuest. (08 Mar 2021). “Neural Networks Part 8: Image Classification with Convolutional Neural Networks (CNNs)”.
[ y ] StatQuest. (01 Mar 2021). “Neural Networks Part 7: Cross Entropy Derivatives and Backpropagation”.
[ y ] StatQuest. (01 Mar 2021). “Neural Networks Part 6: Cross Entropy”.
[ y ] StatQuest. (07 Feb 2021). “Neural Networks Part 5: ArgMax and SoftMax”.
[ y ] StatQuest. (01 Feb 2021). “Neural Networks Pt. 4: Multiple Inputs and Outputs”.
[ y ] StatQuest. (23 Nov 2020). “Neural Networks Pt. 3: ReLU In Action!!!”.
[ y ] StatQuest. (02 Nov 2020). “Backpropagation Details Pt. 2: Going bonkers with The Chain Rule”.
[ y ] StatQuest. (02 Nov 2020). “Backpropagation Details Pt. 1: Optimizing 3 parameters simultaneously”.
[ y ] StatQuest. (19 Oct 2020). “Neural Networks Pt. 2: Backpropagation Main Ideas”.
[ y ] StatQuest. (31 Aug 2020). “Neural Networks Pt. 1: Inside the Black Box”.
[ y ] StatQuest. (01 Aug 2020). “XGBoost in Python from Start to Finish”.
[ y ] StatQuest. (06 Jul 2020). “Alternative Hypotheses: Main Ideas!!!”.
[ y ] StatQuest. (06 Jul 2020). “Hypothesis Testing and The Null Hypothesis, Clearly Explained!!!”.
[ y ] StatQuest. (30 Jun 2020). “Support Vector Machines in Python from Start to Finish”.
[ y ] StatQuest. (06 Jun 2020). “Classification Trees in Python from Start to Finish”.
[ y ] StatQuest. (04 May 2020). “p-hacking: What it is and how to avoid it!”.
[ y ] StatQuest. (04 May 2020). “Power Analysis, Clearly Explained!!!”.
[ y ] StatQuest. (04 May 2020). “Statistical Power, Clearly Explained!!!”.
[ y ] StatQuest. (23 Mar 2020). “How to calculate p-values”.
[ y ] StatQuest. (23 Mar 2020). “p-values: What they are and how to interpret them”.
[ y ] StatQuest. (02 Mar 2020). “XGBoost Part 4 (of 4): Crazy Cool Optimizations”.
[ y ] StatQuest. (10 Feb 2020). “XGBoost Part 3 (of 4): Mathematical Details”.
[ y ] StatQuest. (13 Jan 2020). “XGBoost Part 2 (of 4): Classification”.
[ y ] StatQuest. (16 Dec 2019). “XGBoost Part 1 (of 4): Regression”.
[ y ] StatQuest. (04 Nov 2019). “Support Vector Machines Part 3: The Radial (RBF) Kernel (Part 3 of 3)”.
[ y ] StatQuest. (04 Nov 2019). “Support Vector Machines Part 2: The Polynomial Kernel (Part 2 of 3)”.
[ y ] StatQuest. (30 Sep 2019). “Support Vector Machines Part 1 (of 3): Main Ideas!!!”.
[ y ] StatQuest. (13 Jul 2019). “The Chain Rule”.
[ y ] StatQuest. (11 Jul 2019). “ROC and AUC, Clearly Explained!”.
[ y ] StatQuest. (13 May 2019). “Stochastic Gradient Descent, Clearly Explained!!!”.
[ y ] StatQuest. (22 Apr 2019). “Gradient Boost Part 4 (of 4): Classification Details”.
[ y ] StatQuest. (08 Apr 2019). “Gradient Boost Part 3 (of 4): Classification”.
[ y ] StatQuest. (01 Apr 2019). “Gradient Boost Part 2 (of 4): Regression Details”.
[ y ] StatQuest. (25 Mar 2019). “Gradient Boost Part 1 (of 4): Regression Main Ideas”.
[ y ] StatQuest. (05 Feb 2019). “Gradient Descent, Step-by-Step”.
[ y ] StatQuest. (14 Jan 2019). “AdaBoost, Clearly Explained”.
[ y ] StatQuest. (08 Oct 2018). “Regularization Part 3: Elastic Net Regression”.
[ y ] StatQuest. (01 Oct 2018). “Regularization Part 2: Lasso (L1) Regression”.
[ y ] StatQuest. (24 Sep 2018). “Regularization Part 1: Ridge (L2) Regression”.
[ y ] StatQuest. (17 Sep 2018). “Machine Learning Fundamentals: Bias and Variance”.
[ y ] StatQuest. (03 Sep 2018). “The Central Limit Theorem, Clearly Explained!!!”.
[ y ] StatQuest. (09 Apr 2018). “StatQuest: PCA - Practical Tips”.
[ y ] StatQuest. (02 Apr 2018). “StatQuest: Principal Component Analysis (PCA), Step-by-Step”.
[ y ] StatQuest. (11 Dec 2017). “StatQuest: MDS and PCoA”.
[ y ] StatQuest. (20 Mar 2017). “Standard Deviation vs Standard Error, Clearly Explained!!!”.
[ y ] StatQuest. (23 Feb 2017). “Logs (logarithms), Clearly Explained!!!”.
[ y ] StatQuest. (10 Jan 2017). “False Discovery Rates, FDR, clearly explained”.
[ y ] StatQuest. (11 Oct 2016). “p-hacking and power calculations”.
[ y ] StatQuest. (10 Jul 2016). “StatQuest: Linear Discriminant Analysis (LDA) clearly explained”.
[ y ] StatQuest. (13 Aug 2015). “Principal Component Analysis (PCA) clearly explained (2015)”.
Figures#
[ w ] 1701-1761 Bayes, Thomas
[ w ] 1655-1705 Bernoulli, Jacob
[ w ] 1501-1576 Cardano, Gerolamo
[ w ] 1607-1665 Fermat, Pierre
[ w ] 1890-1962 Fisher, Ronald
[ w ] 1629-1695 Huygens, Christiaan
[ w ] 1915-2008 Ito, Kiyoshi
[ w ] 1903-1987 Kolmogorov, Andrey
[ w ] 1749-1827 Laplace, Pierre-Simon
[ w ] 1856-1922 Markov, Andrey
[ w ] 1623-1662 Pascal, Blaise
[ w ] 1857-1936 Pearson, Karl
[ w ] 1781-1840 Poisson, Simeon
[ w ] 1906-1973 Stevens, Stanley
[ w ] 1915-2000 Tukey, John
[ w ] 1834-1923 Venn, John
[ w ] 1894-1964 Wiener, Norbert
Texts#
2020 Bruce, Peter, Andrew Bruce, & Peter Gedeck. Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python, 2nd Ed. O’Reilly.
2019 Forsyth, David. Probability and Statistics for Computer Science. Springer.
2012 Givens, Geof H. & Jennifer A. Hoeting. Computational Statistics, 2nd Ed. Wiley.
???? Grus, Joel. Data Science from Scratch, 2nd Ed. O’Reilly.
???? Kneusel, Ronald T. Math for Deep Learning. No Starch Press.
2019 Kurt, Will. Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, LEGO, and Rubber Ducks. No Starch Press.
2017 Mitzenmacher, Michael & Eli Upfal. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis, 2nd Ed. Cambridge University Press.
???? Nelson, Hala. Essential Math for AI. O’Reilly.
???? Nield, Thomas. Essential Math for Data Science. O’Reilly.
???? Orland, Paul. Math for Programmers. Manning. GitHub.
2015 Reinhart, Alex. Statistics Done Wrong: The Woefully Complete Guide. No Starch Press.
2007 Rizzo, Maria L. Statistical Computing with R.
2019 Ross, Sheldon M. Introduction to Probability Models, 12th Ed.
2018 Ross, Sheldon M. A First Course in Probability, 10th Ed. Pearson.
2019 Tan, Pang-Ning et al. Introduction to Data Mining, 2nd Ed. Pearson. Home.
2022 Utts, Jessica M. & Robert F. Heckard. Mind on Statistics, 6th Ed. Cengage.
???? Wasserman, Larry. All of Statistics: A Concise Course in Statistical Inference.
Terms#
[ w ] 68-95-99.7 Rule
[ w ] Accuracy
[ w ] Algebra of Random Variables
[ w ] Analysis of Variance (ANOVA)
[ w ] Area Plot
[ w ] Arithmetic Mean
[ w ] Average
[ w ] Average Absolute Deviation (AAD)
[ w ] Bar Plot
[ w ] Bayes’ Theorem
[ w ] Bayesian Probability
[ w ] Bayesian Statistics
[ w ] Bernoulli Distribution
[ w ] Bernoulli Trial
[ w ] Bessel’s Correction
[ w ] Bi Plot
[ w ] Bin
[ w ] Binary Data
[ w ] Binomial Distribution
[ w ] Bootstrapping
[ w ] Box Plot
[ w ] Brownian Motion
[ w ] Bubble Plot
[ w ] Categorical Variable
[ w ] Centering Matrix
[ w ] Central Limit Theorem
[ w ] Central Moment
[ w ] Central Tendency
[ w ] Chi-Squared Distribution
[ w ] Choropleth Plot
[ w ] CI Confidence Interval
[ w ] Classical Probability
[ w ] Codebook
[ w ] Coefficient of Variation
[ w ] Combinatorics
[ w ] Conditional Probability
[ w ] Confidence Interval
[ w ] Continuous-Time Markov Chain (CTMC)
[ w ] Correlation
[ w ] Count Data
[ w ] Counting Process
[ w ] Covariance
[ w ] Cumulative Distribution Function (CDF)
[ w ] Data
[ w ] Data, Semi Structured
[ w ] Data, Structured
[ w ] Data, Unstructured
[ w ] Data Analysis
[ w ] Data Analytics
[ w ] Data Attribute
[ w ] Data Augmentation
[ w ] Data Cleaning
[ w ] Data Consistency
[ w ] Data Engineering
[ w ] Data Exploration
[ w ] Data Governance
[ w ] Data Integration
[ w ] Data Integrity
[ w ] Data Management
[ w ] Data Mining
[ w ] Data Munging
[ w ] Data Object
[ w ] Data Observation
[ w ] Data Point
[ w ] Data Preparation
[ w ] Data Profiling
[ w ] Data Quality
[ w ] Data Redundancy
[ w ] Data Science
[ w ] Data Security
[ w ] Data Set
[ w ] Data Transformation
[ w ] Data Type
[ w ] Data Validation
[ w ] Data Visualization
[ w ] Data Wrangling
[ w ] Decile
[ w ] Degrees of Freedom
[ w ] Deming Regression
[ w ] Descriptive Statistics
[ w ] Deviation
[ w ] Diffusion Process
[ w ] Dimensionality Reduction
[ w ] Discrete-Time Markov Chain (DTMC)
[ w ] Discretization
[ w ] Dispersion
[ w ] Distance Correlation
[ w ] Distance Covariance
[ w ] Dummy Variable
[ w ] Error
[ w ] Estimand
[ w ] Estimation
[ w ] Estimator
[ w ] Event
[ w ] Expectation
[ w ] Expectation Maximization (EM)
[ w ] Expected Value
[ w ] Experiment
[ w ] Experimental Design
[ w ] Exploratory Data Analysis (EDA)
[ w ] Exponential Distribution
[ w ] Fat-Tailed Distribution
[ w ] Feature Engineering
[ w ] Feature Scaling
[ w ] Five-Number Summary
[ w ] Frequency
[ w ] Frequentism
[ w ] Frequentist Inference
[ w ] Gambler’s Fallacy
[ w ] Game of Chance
[ w ] Gaussian Process
[ w ] Geometric Brownian Motion (GBM)
[ w ] Geometric Random Walk
[ w ] Gini Coefficient
[ w ] Goodness of Fit
[ w ] Hardware Random Number Generator
[ w ] Heavy-Tailed Distribution
[ w ] Hidden Markov Model (HMM)
[ w ] Histogram
[ w ] Hypothesis Test
[ w ] Independence
[ w ] Independent and Identical Distribution (IID)
[ w ] Independent and Identically Distributed (IID)
[ w ] Interquartile Mean (IQM)
[ w ] Interquartile Range (IQR)
[ w ] Interval Estimation
[ w ] Interval Scale
[ w ] Ito Calculus
[ w ] Ito’s Calculus
[ w ] Ito’s Lemma
[ w ] Jump Diffusion
[ w ] Jump Process
[ w ] Kolmogorov Axioms
[ w ] Kurtosis
[ w ] L-Moment
[ w ] Law of Large Numbers
[ w ] Law of Total Expectation
[ w ] Law of Total Variance
[ w ] Least Absolute Deviation (LAD)
[ w ] Level of Measurement
[ w ] Level/Scale of Measure(ment)
[ w ] Line Plot
[ w ] Log-Normal Distribution
[ w ] Long-Tailed Distribution
[ w ] Malliavin Calculus
[ w ] Markov Chain Monte Carlo (MCMC)
[ w ] Markov Model
[ w ] Markov Process
[ w ] Markov Property
[ w ] Markov Random Field
[ w ] Mathematical Statistics
[ w ] Maximum
[ w ] Maximum A Posteriori Estimate (MAP)
[ w ] Maximum Likelihood Estimation (MLE)
[ w ] Mean
[ w ] Mean Absolute Difference
[ w ] Mean Absolute Error (MAE)
[ w ] Mean Squared Error
[ w ] Mean, Arithmetic
[ w ] Mean, Geometric
[ w ] Mean, Harmonic
[ w ] Mean, Pythagorean
[ w ] Measurand
[ w ] Measurement Error
[ w ] Measurement Uncertainty
[ w ] Median
[ w ] Median Absolute Deviation
[ w ] Median Absolute Deviation (MAD)
[ w ] Minimum
[ w ] Mode
[ w ] Moment
[ w ] Monte Carlo
[ w ] Monte Carlo Method
[ w ] Monty Hall Problem
[ w ] Nominal Scale
[ w ] Nonparametric Statistics
[ w ] Normal Distribution
[ w ] Normal Distribution or Gaussian Distribution
[ w ] Normality Test
[ w ] Normality Testing
[ w ] Normalization
[ w ] Null Hypothesis
[ w ] Observational Error
[ w ] Odds
[ w ] Odds Ratio
[ w ] On-Line Analytical Processing (OLAP)
[ w ] One-Hot Encoding
[ w ] One-Tailed Test
[ w ] Optimality Criterion
[ w ] Order Statistic
[ w ] Ordinal Data
[ w ] Outcome
[ w ] Outlier
[ w ] P Value
[ w ] Parameter
[ w ] Pattern Recognition
[ w ] Pearson Correlation Coefficient
[ w ] Percentile
[ w ] Percentile or Centile
[ w ] Philosophy of Statistics
[ w ] Pie Plot
[ w ] Plot
[ w ] Point Estimation
[ w ] Poisson Distribution
[ w ] Poisson Process
[ w ] Population
[ w ] Possibility Theory
[ w ] Posterior Probability
[ w ] Precision
[ w ] Predictive Analytics
[ w ] Preferential Attachment
[ w ] Principle of Indifference
[ w ] Prior Probability
[ w ] Probabilistic Cellular Automaton
[ w ] Probability
[ w ] Probability Density Function (PDF)
[ w ] Probability Distribution
[ w ] Probability Mass Function
[ w ] Probability Space
[ w ] Probability Theory
[ w ] Probability, classical definition
[ w ] Pseudorandom Number Generator (PRNG)
[ w ] Pseudorandomness
[ w ] Quantile
[ w ] Quantitative Variable
[ w ] Quartile
[ w ] Quickselect
[ w ] Random Cellular Automaton
[ w ] Random Field
[ w ] Random Function
[ w ] Random Number Generator
[ w ] Random Process
[ w ] Random Seed
[ w ] Random Variable
[ w ] Random Walk
[ w ] Random Walk, Geometric
[ w ] Randomness
[ w ] Range
[ w ] Rank
[ w ] Ratio Scale
[ w ] Regression, Deming
[ w ] Relative Frequency
[ w ] Renewal Theory
[ w ] Resampling
[ w ] Residual
[ w ] Residual Sum of Squares
[ w ] Robust Measure of Scale
[ w ] Robust Statistics
[ w ] Sample
[ w ] Sample Covariance
[ w ] Sample Maximum
[ w ] Sample Mean
[ w ] Sample Minimum
[ w ] Sample Size
[ w ] Sampling
[ w ] Sampling Distribution
[ w ] Scatter Plot
[ w ] Seven-Number Summary
[ w ] Skewness
[ w ] Spaghetti Plot
[ w ] Spread
[ w ] Squared Deviation from the Mean (SDM)
[ w ] Standard Deviation
[ w ] Standard Error
[ w ] Standardized Moment
[ w ] Stationary Process
[ w ] Statistic
[ w ] Statistical Assumptions
[ w ] Statistical Data Type
[ w ] Statistical Graphics
[ w ] Statistical Inference
[ w ] Statistical Learning
[ w ] Statistical Model
[ w ] Statistical Significance
[ w ] Statistical Theory
[ w ] Statistics
[ w ] Stem and Leaf Plot
[ w ] Stochastic
[ w ] Stochastic Calculus
[ w ] Stochastic Cellular Automaton
[ w ] Stochastic Differential Equation
[ w ] Stochastic Differential Equation (SDE)
[ w ] Stochastic Process
[ w ] Stochasticity
[ w ] Summary Statistics
[ w ] Survey
[ w ] Survival Analysis
[ w ] Test Statistic
[ w ] Tidy Data
[ w ] Trimmed Mean
[ w ] Two-Tailed Test
[ w ] Unbiased Estimation of Standard Deviation
[ w ] Uncertainty
[ w ] Unit of Observation
[ w ] Variability
[ w ] Variance
[ w ] White Noise
[ w ] Wiener Process