# PAC-Bayes

The Kullback–Leibler (KL) divergence is a natural measure of the dissimilarity between two probability distributions.
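As an illustration, here is a minimal sketch of the KL divergence for discrete distributions given as probability vectors (the function name `kl_divergence` is ours, not from the source):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    Terms with p_i = 0 contribute 0 by the convention 0 * log 0 = 0.
    Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, p))  # 0.0: a distribution has zero divergence from itself
print(kl_divergence(p, q))  # positive; note KL is not symmetric in p and q
```

Note that KL divergence is not a metric: it is asymmetric and does not satisfy the triangle inequality, which is why "measure of dissimilarity" rather than "distance" is the safer phrasing.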

### Maximum Likelihood

Likelihood: given the independence assumption, the probability of observing the sample $x_1, \ldots, x_m$ under a distribution $p \in \mathcal{P}$ is $\Pr[x_1, \ldots, x_m] = \prod_{i=1}^{m} p(x_i)$

Principle: select the distribution maximizing the sample probability $p^* = \underset{p\in \mathcal{P}}{\arg\max} \prod_{i=1}^{m} p(x_i) \Longleftrightarrow p^* = \underset{p\in \mathcal{P}}{\arg\max} \sum_{i=1}^{m} \log p(x_i)$
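The equivalence above (maximizing the product equals maximizing the sum of logs) can be checked numerically. A minimal sketch, assuming a Bernoulli family $\mathcal{P} = \{\mathrm{Bernoulli}(\theta)\}$ and a grid search over $\theta$ (the sample and grid are illustrative choices, not from the source):

```python
import math

def log_likelihood(theta, sample):
    # sum_{i=1}^{m} log p(x_i) for a Bernoulli(theta) model
    return sum(math.log(theta if x == 1 else 1 - theta) for x in sample)

sample = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # 7 ones out of 10
grid = [i / 100 for i in range(1, 100)]  # theta in {0.01, ..., 0.99}
theta_star = max(grid, key=lambda t: log_likelihood(t, sample))
print(theta_star)  # 0.7, the sample mean
```

The grid maximizer agrees with the closed-form Bernoulli MLE, which is the empirical fraction of ones: the log-likelihood $7\log\theta + 3\log(1-\theta)$ is maximized at $\theta = 0.7$.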

# Inequalities

## Markov Inequality

Let $Z$ be a nonnegative random variable. By the tail-integral identity for expectations,

$\mathbb{E}[Z] = \int_{x=0}^{\infty} \mathbb{P}[Z \ge x] \, dx$

Since $\mathbb{P}[Z \ge x]$ is monotonically nonincreasing, we have

$\forall a \ge 0, \mathbb{E}[Z] \ge \int_{x=0}^a \mathbb{P}[Z \ge x] dx \ge \int_{x=0}^a \mathbb{P}[Z \ge a] dx = a \mathbb{P}[Z \ge a]$

Rearranging the inequality yields Markov's inequality:

$\forall a \ge 0, \mathbb{P}[Z \ge a] \le \frac{\mathbb{E}[Z]}{a}$
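A quick empirical sanity check of the bound, using an Exponential(1) random variable (so $\mathbb{E}[Z] = 1$); the choice of distribution and the sample size are illustrative assumptions:

```python
import random

random.seed(0)
n = 100_000
# Z ~ Exponential(rate 1), a nonnegative random variable with E[Z] = 1
zs = [random.expovariate(1.0) for _ in range(n)]
mean = sum(zs) / n

for a in [0.5, 1.0, 2.0, 4.0]:
    tail = sum(z >= a for z in zs) / n  # empirical P[Z >= a]
    print(f"a={a}: P[Z >= a] ~ {tail:.4f} <= E[Z]/a ~ {mean / a:.4f}")
```

The gap between the two columns shows that Markov's inequality is loose here: the exponential tail decays like $e^{-a}$, far faster than the $1/a$ bound.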