Ali Mirzaei Saghezchi
Author
The Viterbi algorithm is a dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states—called the Viterbi path—that results in a sequence of observed events, especially in the context of Markov information sources and hidden Markov models (HMM).

And we have:

Thus, the algorithm for computing the most likely sequence is similar to filtering: it runs for- ward along the sequence, computing the m message at each time step, using Equation above. The progress of this computation is shown in Figure below. At the end, it will have the probability for the most likely sequence reaching each of the final states. One can thus easily select the most likely sequence overall (the states outlined in bold). In order to identify the actual sequence, as opposed to just computing its probability, the algorithm will also need to record, for each state, the best state that leads to it; these are indicated by the bold arrows in Figure below. The optimal sequence is identified by following these bold arrows backwards from the best final state.

The algorithm we have just described is called the Viterbi algorithm, after its inventor. Like the filtering algorithm, its time complexity is linear in $t$, the length of the sequence. Unlike filtering, which uses constant space, its space requirement is also linear in $t$. This is because the Viterbi algorithm needs to keep the pointers that identify the best sequence leading to each state.
Suppose someone wants to spy on HTTPS connections and infer the sequence of browsing webpages. How he can do this? If he/she measures the sequence of sizes of packets coming in as noisy observations and define the contents of packets as hidden variables he/she can use the HMM Model to reach that goal!
Transition model can be calculate via links on each webpage. In other words probability of choosing next webpage is related to the links in each webpage. That's mean we do random walk between webpages. After considering some tips such as dynamically generated content and user-specific content and ... we can run the HMM Algorithm to estimate probability distribution P(packet size| webpage).
In the following chart we can see the error of this algorithm is around 10% (BoG) and it's so so shocking. nowadays Deep-Learning can decrease error to about 0%! Do you think you have secuirty?!
