What?

When segmenting spoken language into words, we have to estimate whether each sound continues the current word or starts a new one.

How will we calculate it?

Conditional Probability!! Woooohoooo!! So, for the word “gdog”, we want the probability of the sound “d” given that the sound “g” comes right before it.

Maths Example:

Suppose the phoneme [ð] (pronounced “th”) occurs 200,000 times in a text:

  • 190,000 times are before a vowel (as in the, this);
  • 200 times are before [m].
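
The counts above can be turned into conditional probabilities by dividing each "followed by X" count by the total count of [ð]. Here is a minimal Python sketch of that arithmetic; the function name transitional_probability and the variable names are made up for illustration, and treating "any vowel" as a single outcome is an assumption.

```python
def transitional_probability(pair_count, first_count):
    """Estimate P(next sound | current sound) from raw counts.

    pair_count:  how often the current sound is followed by the next sound
    first_count: how often the current sound occurs at all
    """
    if first_count <= 0:
        raise ValueError("first_count must be positive")
    return pair_count / first_count


# Counts taken from the example above.
total_eth = 200_000      # occurrences of [ð]
before_vowel = 190_000   # [ð] followed by a vowel
before_m = 200           # [ð] followed by [m]

p_vowel_given_eth = transitional_probability(before_vowel, total_eth)
p_m_given_eth = transitional_probability(before_m, total_eth)

print(p_vowel_given_eth)  # 0.95
print(p_m_given_eth)      # 0.001
```

So a vowel after [ð] is very likely (0.95), while [m] after [ð] is very unlikely (0.001). A value that low is the kind of signal that could hint that the two sounds belong to different words.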