Language Modeling
Slide 1: Introduction to N-grams (Language Modeling)
Slide 2: Probabilistic Language Models
- Today's goal: assign a probability to a sentence
- Machine Translation: P(high winds tonite) > P(large winds tonite)
- Spell Correction: "The office is about fifteen minuets from my house"
  P(about fifteen minutes from) > P(about fifteen minuets from)
- Speech Recognition: P(I saw a van) >> P(eyes awe of an)
- Plus summarization, question answering, etc.
- Why?
Slide 3: Probabilistic Language Modeling
- Goal: compute the probability of a sentence or sequence of words:
  P(W) = P(w1, w2, w3, w4, w5 ... wn)
- Related task: probability of an upcoming word:
  P(w5 | w1, w2, w3, w4)
- A model that computes either of these, P(W) or P(wn | w1, w2 ... wn-1), is called a language model.
- "Grammar" would be a better name, but "language model" (LM) is standard.
Slide 4: How to compute P(W)
- How do we compute this joint probability:
  P(its, water, is, so, transparent, that)
- Intuition: let's rely on the Chain Rule of Probability
Slide 5: Reminder: The Chain Rule
- Recall the definition of conditional probability:
  P(B | A) = P(A, B) / P(A)
- Rewriting:
  P(A, B) = P(A) P(B | A)
- More variables:
  P(A, B, C, D) = P(A) P(B | A) P(C | A, B) P(D | A, B, C)
- The Chain Rule in general:
  P(x1, x2, x3, ..., xn) = P(x1) P(x2 | x1) P(x3 | x1, x2) ... P(xn | x1, ..., xn-1)
Slide 6: The Chain Rule applied to compute the joint probability of words in a sentence
P("its water is so transparent") =
  P(its) × P(water | its) × P(is | its water) × P(so | its water is) × P(transparent | its water is so)
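To make the decomposition concrete, here is a minimal Python sketch of the chain-rule product. The name cond_prob is a hypothetical placeholder for any estimator of P(word | history); the slides have not defined one at this point.

```python
def sentence_prob(words, cond_prob):
    """Chain rule: P(w1 .. wn) = product over i of P(wi | w1 .. wi-1).

    `cond_prob(word, history)` is a hypothetical estimator of
    P(word | history), not something the slides specify.
    """
    p = 1.0
    history = []
    for w in words:
        p *= cond_prob(w, tuple(history))  # multiply in P(wi | all preceding words)
        history.append(w)
    return p
```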
Slide 7: How to estimate these probabilities
- Could we just count and divide?
- No! Too many possible sentences!
- We'll never see enough data to estimate these.
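To see why count-and-divide fails, here is a sketch of that naive maximum-likelihood estimate, assuming the corpus is a plain list of tokens (function names are mine, not from the slides). For histories longer than a couple of words, the counts are almost always zero, which is exactly the sparsity problem the slide points out.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def mle_prob(tokens, history, word):
    """Naive estimate: P(word | history) = Count(history + word) / Count(history).

    In any realistic corpus, both counts are usually zero once the
    history is more than a few words long.
    """
    h = tuple(history)
    denom = ngram_counts(tokens, len(h))[h]
    if denom == 0:
        return 0.0
    return ngram_counts(tokens, len(h) + 1)[h + (word,)] / denom
```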
Slide 8: Markov Assumption
- Simplifying assumption:
  P(the | its water is so transparent that) ≈ P(the | that)
- Or maybe:
  P(the | its water is so transparent that) ≈ P(the | transparent that)
- (Andrei Markov)
Slide 9: Markov Assumption
- In other words, we approximate each component in the product:
  P(wi | w1, w2, ..., wi-1) ≈ P(wi | wi-k, ..., wi-1)
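Putting the Markov assumption to work with k = 1 gives a bigram model. The sketch below (again with names of my own choosing) estimates P(wi | wi-1) from adjacent-pair counts and scores a sentence by multiplying those conditionals, instead of conditioning on the full history.

```python
from collections import Counter

def train_bigram(tokens):
    """Bigram model (k = 1): estimate P(w | prev) = Count(prev, w) / Count(prev)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    def prob(word, prev):
        return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

    return prob

def sentence_prob_bigram(words, prob):
    """Markov assumption: each word is conditioned only on the previous word."""
    p = 1.0
    for prev, w in zip(words, words[1:]):
        p *= prob(w, prev)
    return p

# Example usage on a toy corpus (illustrative only):
# prob = train_bigram("its water is so transparent that water is clear".split())
# sentence_prob_bigram("water is so".split(), prob)
```

Conditioning on only the previous word keeps the count tables small enough to estimate from real data, which is the whole point of the approximation.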