Tuesday, November 15, 2011

HMMs and Filters

Yes, I've been AWOL for a week or so, while life got the better of me.   Picking back up with HMMs.  I'll try to get to the sections I missed before the midterm.


Something he hasn't said yet, but HMMs can be thought of as a special case of Bayes nets.  OK, now he's said it.  This special-case-ness took us a while to notice; for years they were studied separately.

In the underground robot example, Sebastian mentions "noise in [the] motors."  That might be an odd phrase to hear, because we think of noise as something applying to a signal, and motors don't have a signal.  He's using the word noise metaphorically, by analogy with the noise in sensors.  Errors in motors occur when wheels slip, or when bumps in the floor nudge the robot off its path.

Markov Chain questions:  note how using the law of total probability at each state is actually really efficient compared to doing the computation by treating the whole sequence as the outcome.  That is, we could say that the probability of rain on day 3 (given rain on day 0) is the sum over all possible ways we could get rain on day 3:
\[P(R_0R_1R_2R_3) +
P(R_0S_1R_2R_3) +
P(R_0R_1S_2R_3) +
P(R_0S_1S_2R_3)
\]
The nice thing about the Markov model is that we only need to look at the previous day and the transitions to compute the current probability.  This reduces the number of multiplications you need to make.
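To make the saving concrete, here is a quick Python sketch of that forward recursion.  The transition numbers are made up for illustration (they are not the lecture's):

# Hypothetical transition probabilities (illustrative only):
P_RAIN_GIVEN_RAIN = 0.6   # P(rain tomorrow | rain today)
P_RAIN_GIVEN_SUN = 0.2    # P(rain tomorrow | sun today)

def p_rain_on_day(n, p_rain_day0=1.0):
    """P(rain on day n), carrying a single probability forward.

    Each step applies the law of total probability over yesterday's
    two states: two multiplications per day, instead of summing over
    all 2**(n-1) possible weather sequences.
    """
    p = p_rain_day0
    for _ in range(n):
        p = p * P_RAIN_GIVEN_RAIN + (1 - p) * P_RAIN_GIVEN_SUN
    return p

print(p_rain_on_day(3))  # P(rain on day 3), given rain on day 0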

Nice discussion of particle filters.  I'm going to need to update what I do in Robotics next semester.  Overall, this lecture was pretty tight.

Tuesday, November 1, 2011

Logic

I usually find that people who like probabilities hate logic, and vice versa.  I was a logic person to start with, and it took years of training to finally learn to love probabilities.  This might be an unpleasant switch for some.

The first quiz is a little misleading because he appeals to what the sentence means in English.  In English, "If 5 is an odd number, then Paris is the capital of France" sounds false, since we know there is no connection between the two.  But he is actually asking about the propositional logic truth value, and since both halves of the implication are true, the whole sentence is true.

It is kind of a cop-out to say that implication \(P \Rightarrow Q\) is just defined that way.  There are really good reasons for it.  To understand them, you first need to accept that the truth value of a sentence has nothing to do with whether it is a coherent idea in the real world, and it says nothing about the truth of its component pieces; it only checks whether the sentence is consistent with everything else we know.  Suppose I tell you that if it is raining, I bring my umbrella; it is not raining; and I brought my umbrella anyway:
\[
\begin{align}
&R \Rightarrow U\\
&\neg R\\
&U
\end{align}
\]

What do we want to say about the truth of \(R \Rightarrow U\)?  Is that sentence suddenly false just because I brought my umbrella on a sunny day?  Does that mean I won't bring it the next time it rains?  Of course not: just because \(R\) is false doesn't make the rule suddenly untrue.  It is still consistent with the other things we know.  The only way we can say for sure that the rule is false is if we see a counter-example: \(R \wedge \neg U\).  If that's true, then we know the rule as stated must be false.

Another way to think about this is to look at the truth tables.  What if we said that the implication \(P \Rightarrow Q\) is false whenever P is false?  In that case, the truth table would be identical to the "and" truth table.  Do we really want to say that \(P \Rightarrow Q\) means the same thing as \(P \wedge Q\), that "if P is true then Q is true" means the same thing as "both P and Q are true"?  What if instead we said that the implication is true when both P and Q are false, but false when P is false and Q is true?  Then the truth table is the same as equivalence: \(P \Leftrightarrow Q\).  Do we really want to say that "if P is true then Q is true" means the same thing as "P and Q are the same"?  The table we have is our only choice.
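Laying all three candidates out side by side makes the point:
\[
\begin{array}{cc|ccc}
P & Q & P \Rightarrow Q & P \wedge Q & P \Leftrightarrow Q\\
\hline
T & T & T & T & T\\
T & F & F & F & F\\
F & T & T & F & F\\
F & F & T & F & T
\end{array}
\]
The \(P \wedge Q\) column is what you would get by making the implication false whenever P is false, and the \(P \Leftrightarrow Q\) column is what you would get from the second alternative.  Neither matches what we mean by "if."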


First-order logic: Logics are deceptive.  They're really easy to describe, but they have lots of thorny implications.  It's easy to watch this and think you get it perfectly, when really you know nothing yet.

He just said \(\forall_x Vowel(x) \Rightarrow NumberOf(x) = 1\) is a valid statement.  He means valid in the loose English sense.  Technically, valid means true in every model, and this sentence isn't: there are models where some vowel is assigned a number other than 1.

That final quiz is a good example of how logic seems simple but gets tricky pretty fast.

Unsupervised learning 18-30

What he didn't say about the eigenvectors is that they are pulled from the covariance matrix that describes the shape of the Gaussian.

He just dropped the phrase "principal component analysis."  That's just the name for finding the eigenvector with the largest eigenvalue.

I notice he isn't telling us how to map the data down to the smaller space once we find it.  Maybe that will come later?  Guess not.
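For completeness, the mapping down is just a projection: center the data, then take dot products with the top eigenvectors.  Here is a minimal numpy sketch, with made-up data rather than the lecture's example:

import numpy as np

# Made-up 2-D points lying roughly along a line.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
data = np.column_stack([x, 2 * x + rng.normal(scale=0.3, size=100)])

mean = data.mean(axis=0)
cov = np.cov(data, rowvar=False)            # the covariance matrix of the Gaussian
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh handles symmetric matrices
principal = eigvecs[:, np.argmax(eigvals)]  # eigenvector with the largest eigenvalue

# The mapping down: project each centered point onto the principal component.
reduced = (data - mean) @ principal         # one coordinate per point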

I honestly had no idea about spectral clustering.  That was pretty cool.