Category Statistics

115. Jacobian Matrix

The Jacobian matrix is a matrix that stores all the partial derivatives of multiple functions. For example, let's consider ex1 (top left). F(x) is a function of one variable. If you calculate the derivative of F(x), it is 2x. Now,…
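A minimal NumPy sketch of the idea: for a vector-valued function, each row of the Jacobian holds the partial derivatives of one output with respect to every input. The function f and its analytic Jacobian below are hypothetical examples, checked against finite differences.

```python
import numpy as np

# Hypothetical example: f(x, y) = [x^2 * y, 5x + sin(y)]
def f(v):
    x, y = v
    return np.array([x**2 * y, 5 * x + np.sin(y)])

def jacobian_analytic(v):
    # Row i = partial derivatives of output i w.r.t. (x, y)
    x, y = v
    return np.array([[2 * x * y, x**2],
                     [5.0,       np.cos(y)]])

def jacobian_numeric(func, v, eps=1e-6):
    # Central finite differences: one column per input variable
    v = np.asarray(v, dtype=float)
    cols = []
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = eps
        cols.append((func(v + step) - func(v - step)) / (2 * eps))
    return np.stack(cols, axis=1)

point = np.array([1.0, 2.0])
J = jacobian_analytic(point)
```

The numeric Jacobian agrees with the analytic one, which is a handy sanity check whenever you derive partials by hand.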

89. Max-Norm Regularization

Another useful regularization technique is called Max-Norm Regularization. Implementation: layer = keras.layers.Dense(100, activation="selu", kernel_initializer="lecun_normal", kernel_constraint=keras.constraints.max_norm(1.)) By setting the hyperparameter r you can set a max value for the weights to prevent overfitting. References: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow,…
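To see what the constraint does without pulling in Keras, here is a small NumPy sketch of the rescaling that keras.constraints.max_norm applies after each update: any weight column whose L2 norm exceeds r gets scaled back down to r (the function name and matrix here are illustrative, not library code).

```python
import numpy as np

def max_norm_constraint(W, r=1.0, axis=0):
    # Rescale any column whose L2 norm exceeds r; columns already
    # within the limit are left (essentially) unchanged
    norms = np.sqrt((W**2).sum(axis=axis, keepdims=True))
    desired = np.clip(norms, 0, r)
    return W * desired / (1e-7 + norms)

rng = np.random.default_rng(0)
W = rng.normal(scale=3.0, size=(4, 5))   # toy weight matrix
W_clipped = max_norm_constraint(W, r=1.0)
```

After applying the constraint, every column norm is at most r, which is exactly the cap the hyperparameter controls.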

88. Monte Carlo Dropout

Dropout is one of the most popular regularization techniques for deep neural networks. Monte Carlo Dropout may boost a dropout model even further. Full implementation: ys = np.stack([model(X_test, training=True) for sample in range(100)]) y = ys.mean(axis=0) The predict() method returns…
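A self-contained NumPy sketch of the same averaging trick, using a toy linear "model" in place of a real network (everything here is illustrative): dropout stays active at inference time, and many stochastic forward passes are averaged.

```python
import numpy as np

rng = np.random.default_rng(42)

def predict_with_dropout(x, W, drop_rate=0.5):
    # Dropout kept active at prediction time, like training=True in Keras;
    # inverted scaling keeps the expected output unchanged
    mask = rng.random(W.shape) >= drop_rate
    return x @ (W * mask) / (1.0 - drop_rate)

x = rng.normal(size=(3, 8))   # 3 samples, 8 features
W = rng.normal(size=(8, 2))   # toy linear "model"

# Monte Carlo Dropout: stack many stochastic passes, then average
ys = np.stack([predict_with_dropout(x, W) for _ in range(100)])
y = ys.mean(axis=0)
```

The spread across the 100 stacked predictions also gives a rough uncertainty estimate for each output, which is the main appeal of MC Dropout.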

69. Maximum Likelihood

Maximum Likelihood, as the name suggests, means maximizing the likelihood of your prediction. Let's say you have an input x and you want to predict y. In this scenario, you want to maximize the probability of Y given x and the bias…
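A minimal worked example, assuming Gaussian data: the maximum-likelihood estimates of the mean and standard deviation are just the sample mean and sample standard deviation, and any other parameter value gives a lower log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.normal(loc=5.0, scale=2.0, size=10_000)

def gaussian_log_likelihood(mu, sigma, x):
    # Sum of log N(x | mu, sigma^2) over all data points
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu)**2 / (2 * sigma**2))

# For a Gaussian the MLE has a closed form: sample mean and std
mu_hat = data.mean()
sigma_hat = data.std()
```

Shifting mu away from mu_hat strictly decreases the log-likelihood, which is exactly what "maximizing the likelihood" means.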

67. Partition Function

Unnormalized probability distributions are guaranteed to be nonnegative, but not guaranteed to sum or integrate to 1. We need a partition function to obtain a valid probability distribution. But how do you calculate it? Since it is most often intractable,…
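For a small discrete state space the partition function is tractable and easy to see in code; this sketch uses a hypothetical energy vector to build an unnormalized distribution and then normalizes it.

```python
import numpy as np

# Unnormalized distribution over a tiny discrete state space:
# p_tilde(x) = exp(-E(x)) is nonnegative but does not sum to 1
energies = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
p_tilde = np.exp(-energies)

# The partition function Z is the sum over all states;
# dividing by it yields a valid probability distribution
Z = p_tilde.sum()
p = p_tilde / Z
```

With only five states, Z is a trivial sum; the difficulty in practice is that real models have exponentially many states, which is why Z is usually intractable.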

66. Gibbs Sampling

Gibbs Sampling is used when the distribution you are trying to sample from has more than two dimensions, AND when it is difficult to sample from that joint distribution directly. For example, we want to sample from a joint…
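A classic small sketch: sampling from a correlated bivariate Gaussian by alternately drawing each variable from its conditional distribution (the conditionals of a bivariate Gaussian with correlation rho are N(rho*other, 1-rho^2)). The chain length and burn-in below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                       # target correlation of the joint
n_samples, burn_in = 20_000, 1_000

# Gibbs sampling: alternate conditional draws
# x | y ~ N(rho*y, 1 - rho^2),  y | x ~ N(rho*x, 1 - rho^2)
x, y = 0.0, 0.0
samples = []
for i in range(n_samples + burn_in):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))
    if i >= burn_in:            # discard early, non-converged draws
        samples.append((x, y))
samples = np.array(samples)
```

The empirical correlation of the chain's samples matches the target rho, even though we never sampled from the joint distribution directly.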

65. Markov Chains

The central limit theorem states that the distribution of the mean of independent samples approaches a normal distribution as the number of samples increases. Markov chain theory, on the other hand, shows that even dependent samples will converge to a certain state…

64. Energy-Based Models

When training a model, we usually use a cost function to calculate how far the predictions are from the actual results. When that cost function is replaced with a function called an energy function, the model is called an energy-based model. What…
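A toy NumPy sketch of the idea: instead of a loss comparing a prediction to a target, an energy function E(x, y) scores how compatible an input and a candidate output are (low energy = good match), and inference picks the lowest-energy candidate. The linear energy and identity parameters here are purely illustrative.

```python
import numpy as np

def energy(x, y, W):
    # Hypothetical linear energy: negative dot-product compatibility,
    # so well-matched (x, y) pairs get low energy
    return -x @ W @ y

W = np.eye(3)                    # toy parameters: identity pairing
x = np.array([1.0, 0.0, 0.0])    # input
candidates = np.eye(3)           # three one-hot candidate outputs

# Inference in an energy-based model: choose the y with lowest energy
energies = np.array([energy(x, y, W) for y in candidates])
best = int(energies.argmin())
```

Training an EBM then amounts to shaping E so that observed (x, y) pairs get lower energy than everything else, rather than minimizing a prediction error directly.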