Markov Chains - Yoda

 



Natural Language Processing (NLP) is a branch of AI concerned with the interaction between computers and human language. With these techniques, a computer can generate or understand somewhat human-like language based on a training set of words. On this page, we use a Machine Learning (ML) model called Markov Chains. First, we will briefly explain the concept of a Markov chain. Then, we will talk about a specific way to implement NLP using Markov chains. Lastly, we will look at the result and see what we can generate!

1) Markov Chains:

One of the most common examples used to explain Markov chains is weather prediction.

Let's say we have only two states for the weather: 'cold' and 'hot'. Then, say the probability that it's hot today but cold tomorrow is 40%, while the probability that it's cold today and also cold tomorrow is 70%. As the chart above shows, a Markov chain is a sequence of events in which there is a probability of transitioning from one state to another. When a state moves to another state based on such a probability, we can use the Bernoulli distribution. The Bernoulli distribution applies when we run an experiment exactly once and there are only two possible outcomes; flipping a coin once is the typical example.
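The two-state example above can be sketched in a few lines of Python. The exact transition probabilities besides the two stated (40% hot-to-cold, 70% cold-to-cold) are assumptions for illustration; the complementary probabilities follow from each row summing to 1.

```python
import numpy as np

# Transition probabilities from the example:
# P(cold tomorrow | hot today) = 0.4, P(cold tomorrow | cold today) = 0.7
P = {"hot":  {"hot": 0.6, "cold": 0.4},
     "cold": {"hot": 0.3, "cold": 0.7}}

rng = np.random.default_rng(0)

def next_state(today):
    # One Bernoulli trial: stay in today's state with probability P[today][today],
    # otherwise switch to the other state
    if rng.random() < P[today][today]:
        return today
    return "cold" if today == "hot" else "hot"

state = "hot"
days = [state]
for _ in range(5):
    state = next_state(state)
    days.append(state)
print(days)
```

Each step only looks at today's state, which is exactly the Markov property.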

But, as you know, weather is not just hot and cold. It can be mild, burning, or freezing. Likewise, there can be several states rather than just two, so I brought another chart here.
Since we have more than two states, we use a different distribution called the Categorical distribution. Also, notice that we do not have only today and tomorrow; we have the day after tomorrow, or even yesterday. So we have to use the Categorical distribution several times, and there is a distribution for that: the Multinomial distribution. To use these distributions in Python, we will use NumPy. By convention, people import NumPy as np for convenience.


Let's say we have 3 days to guess and we start today at 'freezing'. After running my own Markov chain code, this is the result:

It starts at 'freezing' today, stays 'freezing' tomorrow, jumps to 'hot' the day after, and ends at 'burning'.
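A simulation in this spirit can be sketched as follows. The 5x5 transition matrix here is made up for illustration (each row just has to sum to 1); the author's actual code is linked at the bottom of the page.

```python
import numpy as np

states = ["freezing", "cold", "mild", "hot", "burning"]
# Hypothetical transition matrix; row i gives P(next state | current state i)
P = np.array([
    [0.5, 0.3, 0.2, 0.0, 0.0],
    [0.2, 0.4, 0.3, 0.1, 0.0],
    [0.0, 0.3, 0.4, 0.3, 0.0],
    [0.0, 0.0, 0.3, 0.4, 0.3],
    [0.0, 0.0, 0.2, 0.3, 0.5],
])

rng = np.random.default_rng(2)

def simulate(start, n_days):
    idx = states.index(start)
    path = [start]
    for _ in range(n_days):
        # Draw the next state from the current row (one multinomial trial)
        idx = int(np.argmax(rng.multinomial(1, P[idx])))
        path.append(states[idx])
    return path

path = simulate("freezing", 3)
print(path)
```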

2) Markov Chains + NLP:

Now that we have talked a little bit about Markov chains, let's talk about how they apply to NLP.
Let's say, we have a training set:

I like an apple.
The apple was great.
Do you like an apple?

The first thing we do is collect the unique words: { I, like, an, apple, The, was, great, Do, you }. Then we turn them into a chart as we did above.
Then, instead of 0s, we record the counts of how often each word shows up right after another. For example, 'like' follows 'I' once, and 'an' follows 'like' twice. So we have:
We are almost done; we just have to normalize each row so it sums to 1. So we have:
Then we use the multinomial distribution again, just as we did above for the weather prediction.
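The whole pipeline described above (collect words, count successors, normalize, sample) can be sketched like this. The "$tart"/"$top" sentence markers and the naive whitespace tokenization are assumptions of this sketch, not part of the original description:

```python
import numpy as np
from collections import defaultdict

corpus = ["I like an apple.", "The apple was great.", "Do you like an apple?"]

# Count how often each word is followed by each other word,
# with $tart/$top markers so sentences have a beginning and an end
counts = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = ["$tart"] + sentence.split() + ["$top"]
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1

rng = np.random.default_rng(3)

def generate():
    word, out = "$tart", []
    while True:
        nxt_words = list(counts[word])
        probs = np.array([counts[word][w] for w in nxt_words], dtype=float)
        probs /= probs.sum()          # normalize the row of counts
        word = rng.choice(nxt_words, p=probs)
        if word == "$top":
            return " ".join(out)
        out.append(word)

sentence = generate()
print(sentence)
```

Every generated word is drawn from the normalized row of the word before it, so the output is a random walk through the transition chart built from the training set.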
3) Yoda

Now, let's see what Yoda says based on his training words.
What do you think? It doesn't seem too awkward, even though we made it up! I didn't include the full code here because it isn't short, but if any of you would like to see it, it is linked below!

*GitHub: https://github.com/KwakSukyoung/coding/blob/master/ACME/MarkovChains/markov_chains.py







