PAC-Bayes

Kullback-Leibler (KL) divergence is a natural measure of the distance between two distributions.
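
For reference, the standard definition for discrete distributions $p$ and $q$ over the same support (a reminder added here, not part of the original notes):

$$ D(p \,\|\, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)} $$

It is nonnegative and equals zero iff $p = q$, but it is not symmetric, so it is not a true metric.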

Maximum Likelihood

Likelihood: under the independence assumption, the probability of observing the sample $(x_1, \ldots, x_m)$ under a distribution $p \in \mathcal{P}$ is $\Pr[x_1, \ldots, x_m] = \prod_{i=1}^{m} p(x_i)$

Principle: select the distribution in $\mathcal{P}$ that maximizes the sample probability; equivalently, since $\log$ is monotonically increasing, maximize the log-likelihood $\sum_{i=1}^{m} \log p(x_i)$
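
As a small illustration (my own sketch, not from the notes): the maximum-likelihood estimate of a Bernoulli parameter, found by grid search over the log-likelihood and compared with the closed-form answer, the sample mean.

```python
import numpy as np

# Sketch: maximum likelihood for a Bernoulli(theta) sample.
# Log-likelihood: sum_i log p(x_i) = k log(theta) + (m - k) log(1 - theta),
# where k = sum_i x_i; it is maximized at theta = k / m (the sample mean).
rng = np.random.default_rng(0)
sample = rng.binomial(1, 0.3, size=1000)   # i.i.d. sample, true theta = 0.3

def log_likelihood(theta, x):
    k, m = x.sum(), len(x)
    return k * np.log(theta) + (m - k) * np.log(1 - theta)

grid = np.linspace(0.001, 0.999, 999)
theta_hat = grid[np.argmax(log_likelihood(grid, sample))]

print(theta_hat)       # grid-search MLE
print(sample.mean())   # closed-form MLE: the sample mean
```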

Inequalities

Markov Inequality

Let $Z$ be a nonnegative random variable, so that $\mathbb{E}[Z] = \int_0^\infty \mathbb{P}[Z \ge x] \, dx$.

Since $x \mapsto \mathbb{P}[Z \ge x]$ is monotonically nonincreasing, for any $t > 0$ we have

$$ \mathbb{E}[Z] = \int_0^\infty \mathbb{P}[Z \ge x] \, dx \ge \int_0^t \mathbb{P}[Z \ge x] \, dx \ge t \, \mathbb{P}[Z \ge t] $$

Rearranging this inequality yields Markov's inequality: $\mathbb{P}[Z \ge t] \le \frac{\mathbb{E}[Z]}{t}$ for all $t > 0$.
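
A quick Monte Carlo sanity check of the bound (my own sketch; the exponential distribution is an arbitrary nonnegative example):

```python
import numpy as np

# Sketch: compare the empirical tail P[Z >= t] with Markov's bound E[Z]/t
# for Z ~ Exponential(1), an arbitrary nonnegative random variable.
rng = np.random.default_rng(0)
z = rng.exponential(scale=1.0, size=1_000_000)

for t in [0.5, 1.0, 2.0, 4.0]:
    empirical = (z >= t).mean()   # Monte Carlo estimate of P[Z >= t]
    bound = z.mean() / t          # Markov's bound E[Z] / t
    print(f"t={t}: P[Z >= t] ~ {empirical:.4f} <= {bound:.4f}")
```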

TODO