bazyabiye_ettelaate_novin_3


Modern Information Retrieval

Slide 1: Modern Information Retrieval
Lecture 1: Introduction

Slide 2: Lecture Overview
  • Introduction to the Course
  • Introduction to Information Retrieval
  • The Information Seeking Process
  • Information Retrieval History and Developments
  • Discussion
  • References
Marjan Ghazvininejad, Sharif University, Spring 2012


Slide 4: Purposes of the Course
  • To impart a basic theoretical understanding of IR models:
      • Boolean
      • Vector Space
      • Probabilistic (including Language Models)
  • To examine major application areas of IR, including:
      • Web search
      • Text categorization and clustering
      • Cross language retrieval
      • Text summarization
      • Digital libraries

Slide 5: Purposes of the Course …
  • To understand how IR performance is measured:
      • Recall/Precision
      • Statistical significance
  • Gain hands-on experience with IR systems
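
Recall and precision, named on this slide as the basic performance measures, can be illustrated with a small sketch. The slide does not define them, so this uses the standard set-based definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The function name `precision_recall` and the document IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set-based definitions."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the system returns 4 documents, 2 of which are
# among the 5 documents judged relevant.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5, 6, 7])
print(p, r)  # 0.5 0.4
```

Retrieving everything drives recall to 1.0 while precision collapses, which is why the two measures are reported together.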


Slide 7: Introduction
  • The goal of IR is to retrieve all and only the “relevant” documents in a collection for a particular user with a particular need for information
  • Relevance is a central concept in IR theory
  • How does an IR system work when the “collection” is all documents available on the Web?
  • Web search engines are stress-testing the traditional IR models

Slide 8: Information Retrieval
  • The goal is to search large document collections (millions of documents) to retrieve small subsets relevant to the user’s information need
  • Examples are:
      • Internet search engines
      • Digital library catalogues

Slide 9: Information Retrieval …
  • Some application areas within IR:
      • Cross language retrieval
      • Speech/broadcast retrieval
      • Text categorization
      • Text summarization
  • Subject to objective testing and evaluation:
      • hundreds of queries
      • millions of documents

Slide 10: Origins
  • Communication theory revisited
  • Problems with transmission of meaning
[Diagram: Source → Encoding → Message → Channel (with Noise) → Message → Decoding → Destination]
[Diagram: Source → Encoding (writing/indexing) → Message → Storage → Message → Decoding (retrieval/reading) → Destination]

Slide 11: Components of an IR System
[Diagram labels: Documents; Authoritative Indexing Rules; Indexing Process; Index Records & Document Surrogates; User’s Information Need; Query Specification Process; Query; Retrieval Rules; Retrieval Process; List of Documents Relevant to User’s Information Need; Severe Information Loss]
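
The indexing and retrieval processes named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design the lecture describes: the function names `build_index` and `search`, the whitespace tokenization, the lowercasing, and the AND-only matching rule are all choices made here for brevity.

```python
from collections import defaultdict


def build_index(documents):
    """Indexing process: map each term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():  # assumed indexing rule: lowercase, split on whitespace
            index[term].add(doc_id)
    return index


def search(index, query):
    """Retrieval process: return IDs of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical three-document collection.
docs = {
    1: "Lunar excursion module",
    2: "Lunar orbit insertion",
    3: "Excursion fares",
}
index = build_index(docs)
print(search(index, "lunar excursion"))  # [1]
```

The index keeps only term-to-document postings, so word order and context are discarded; that is a small instance of the information loss the diagram points to.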


Slide 13: Review: Information Overload
  • “The world’s total yearly production of print, film, optical, and magnetic content would require roughly 1.5 billion gigabytes of storage. This is the equivalent of 250 megabytes per person for each man, woman, and child on earth.” (Varian & Lyman)
  • “The greatest problem of today is how to teach people to ignore the irrelevant, how to refuse to know things, before they are suffocated. For too many facts are as bad as none at all.” (W.H. Auden)

Slide 14: The Standard Retrieval Interaction Model

Slide 15: Standard Model of IR
  • Assumptions:
      • The goal is maximizing precision and recall simultaneously
      • The information need remains static
      • The value is in the resulting document set

Slide 16: Problems with Standard Model
  • Users learn during the search process:
      • Scanning titles of retrieved documents
      • Reading retrieved documents
      • Viewing lists of related topics/thesaurus terms
      • Navigating hyperlinks
  • Some users don’t like long (apparently) disorganized lists of documents

Slide 17: IR is an Iterative Process
[Diagram labels: Repositories; Workspace; Goals]

Slide 18: IR is a Dialog
  • The exchange doesn’t end with the first answer
  • Users can recognize elements of a useful answer, even when incomplete
  • Questions and understanding change as the process continues

Slide 19: Bates’ “Berry-Picking” Model
  • Standard IR model:
      • Assumes the information need remains the same throughout the search process
  • Berry-picking model:
      • Interesting information is scattered like berries among bushes
      • The query is continually shifting

Slide 20: Berry-Picking Model
[Diagram: a search path through successive queries Q0, Q1, Q2, Q3, Q4, Q5]
  • A sketch of a searcher… “moving through many actions towards a general goal of satisfactory completion of research related to an information need.” (after Bates 89)

Slide 21: Berry-Picking Model …
  • The query is continually shifting
  • New information may yield new ideas and new directions
  • The information need:
      • Is not satisfied by a single, final retrieved set
      • Is satisfied by a series of selections and bits of information found along the way

Slide 22: Restricted Form of the IR Problem
  • The system has available only pre-existing, “canned” text passages
  • Its response is limited to selecting from these passages and presenting them to the user
  • It must select, say, 10 or 20 passages out of millions or billions!

Slide 23: Information Retrieval
  • Revised Task Statement:
      • Build a system that retrieves documents that users are likely to find relevant to their queries
  • This set of assumptions underlies the field of Information Retrieval


Slide 25: IR History Overview
  • Information Retrieval History:
      • Early “IR”
      • Non-Computer IR (mid 1950’s)
      • Interest in computer-based IR from mid 1950’s
      • Modern IR: large-scale evaluations, Web-based search and search engines (1990’s)

Slide 26: Origins
  • Very early history of content representation:
      • Sumerian tokens and “envelopes”
      • Alexandria: pinakes

Slide 27: Visions of IR Systems
  • Rev. John Wilkins, 1600’s: The Philosophical Language and tables
  • Wilhelm Ostwald and Paul Otlet, 1910’s: The “monographic principle” and Universal Classification
  • Emanuel Goldberg, 1920’s - 1940’s
  • H.G. Wells, “World Brain: The idea of a permanent World Encyclopedia.” (Introduction to the Encyclopédie Française, 1937)
  • Vannevar Bush, “As we may think.” Atlantic Monthly, 1945
  • Term “Information Retrieval” coined by Calvin Mooers, 1952

Slide 28: Card-Based IR Systems
  • Uniterm (Casey, Perry, Berry, Kent: 1958)
  • Developed and used from mid 1940’s

EXCURSION 43821
   90  241   52   63   34   25   66   17   58   49
  130  281   92   83   44   75   86   57   88  119
  640  122   93  104  115  146   97  158  139
  870  342  157  178  199  207  248  269  298

LUNAR 12457
  110  181   12   73   44   15   46    7   28   39
  430  241   42  113   74   85   76   17   78   79
  820  761  602  233  134   95  136   37  118  109
  901  982  194  165  127  198  179  377  288  407
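
A search with Uniterm cards amounts to comparing the document numbers posted on two or more term cards and keeping the numbers that appear on all of them. The sketch below does this for the EXCURSION and LUNAR cards shown on the slide. The document numbers are read from the slide text as extracted, so the way run-together numbers were split is an assumption, and the function name `coordinate` is hypothetical.

```python
# Document numbers posted on each card, as read from the slide (assumed parse).
EXCURSION = {
    90, 241, 52, 63, 34, 25, 66, 17, 58, 49,
    130, 281, 92, 83, 44, 75, 86, 57, 88, 119,
    640, 122, 93, 104, 115, 146, 97, 158, 139,
    870, 342, 157, 178, 199, 207, 248, 269, 298,
}
LUNAR = {
    110, 181, 12, 73, 44, 15, 46, 7, 28, 39,
    430, 241, 42, 113, 74, 85, 76, 17, 78, 79,
    820, 761, 602, 233, 134, 95, 136, 37, 118, 109,
    901, 982, 194, 165, 127, 198, 179, 377, 288, 407,
}


def coordinate(*cards):
    """Return the document numbers posted on every card, in ascending order."""
    return sorted(set.intersection(*(set(card) for card in cards)))


# Documents indexed under both LUNAR and EXCURSION.
print(coordinate(EXCURSION, LUNAR))  # [17, 44, 241]
```

This is the same operation as a Boolean AND over two posting lists in a computer-based system.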

Slide 29: Card-Based IR Systems …
  • Batten Optical Coincidence Cards (“Peek-a-Boo Cards”), 1948
[Diagram labels: Lunar; Excursion]

Slide 30: Card-Based IR Systems …
  • Zatocode (edge-notched cards), Mooers, 1951

Document 1
  Title: lksd ksdj sjd sjsjfkl
  Author: Smith, J.
  Abstract: lksf uejm jshy ksd jh uyw hhy jha jsyhe

Document 200
  Title: Xksd Lunar sjd sjsjfkl
  Author: Jones, R.
  Abstract: Lunar uejm jshy ksd jh uyw hhy jha jsyhe

Document 34
  Title: lksd ksdj sjd Lunar
  Author: Smith, J.
  Abstract: lksf uejm jshy ksd jh uyw hhy jha jsyhe

Slide 31: Computer-Based IR Systems
  • Bagley’s 1951 MS thesis from MIT suggested that searching 50 million item records, each containing 30 index terms, would take approximately 41,700 hours
      • Due to the need to move and shift the text in core memory while carrying out the comparisons
  • 1957: Desk Set with Katharine Hepburn and Spencer Tracy (EMERAC)
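
To get a feel for Bagley’s figure, the arithmetic can be unpacked as below. This only rearranges the three numbers quoted on the slide; it assumes the 41,700 hours refer to one sequential pass over all records, which the slide does not state explicitly, and the variable names are chosen here for illustration.

```python
RECORDS = 50_000_000       # item records, from the slide
TERMS_PER_RECORD = 30      # index terms per record, from the slide
ESTIMATED_HOURS = 41_700   # Bagley's estimate, from the slide

total_seconds = ESTIMATED_HOURS * 3600
seconds_per_record = total_seconds / RECORDS
seconds_per_term = total_seconds / (RECORDS * TERMS_PER_RECORD)
years = ESTIMATED_HOURS / (24 * 365)

print(f"{seconds_per_record:.2f} s per record")    # 3.00 s per record
print(f"{seconds_per_term:.2f} s per index term")  # 0.10 s per index term
print(f"{years:.1f} years of continuous running")  # 4.8 years of continuous running
```

Under that assumption the estimate works out to about a tenth of a second per index term compared, which adds up to several years of continuous machine time.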

Slide 32: Historical Milestones in IR Research
  • 1958 Statistical Language Properties (Luhn)
  • 1960 Probabilistic Indexing (Maron & Kuhns)
  • 1961 Term association and clustering (Doyle)
  • 1965 Vector Space Model (Salton)
  • 1968 Query expansion (Rocchio, Salton)
  • 1972 Statistical Weighting (Sparck-Jones)
  • 1975 2-Poisson Model (Harter, Bookstein, Swanson)
  • 1976 Relevance Weighting (Robertson, Sparck-Jones)
  • 1980 Fuzzy sets (Bookstein)
  • 1981 Probability without training (Croft)

Slide 33: Historical Milestones in IR Research …
  • 1983 Linear Regression (Fox)
  • 1983 Probabilistic Dependence (Salton, Yu)
  • 1985 Generalized Vector Space Model (Wong, Raghavan)
  • 1987 Fuzzy logic and RUBRIC/TOPIC (Tong, et al.)
  • 1990 Latent Semantic Indexing (Dumais, Deerwester)
  • 1991 Polynomial & Logistic Regression (Cooper, Gey, Fuhr)
  • 1992 TREC (Harman)
  • 1992 Inference networks (Turtle, Croft)
  • 1994 Neural networks (Kwok)
  • 1998 Language Models (Ponte, Croft)

Slide 34: Boolean IR Systems
  • Synthex at SDC, 1960
  • Project MAC at MIT, 1963 (interactive)
  • BOLD at SDC, 1964 (Harold Borko)
  • 1964 New York World’s Fair: Becker and Hayes produced a system to answer questions (based on airline reservation equipment)
  • SDC began production for a commercial service in 1967: ORBIT
  • NASA-RECON (1966) becomes DIALOG
  • 1972 Data Central/Mead introduced LEXIS: full text of legal information
  • Online catalogs: late 1970’s and 1980’s

Slide 35: Experimental IR systems
  • Probabilistic indexing: Maron and Kuhns, 1960
  • SMART: Gerard Salton at Cornell, vector space model, 1970’s
  • SIRE at Syracuse
  • I3R: Croft
  • Cheshire I (1990)
  • TREC: 1992
  • Inquery
  • Cheshire II (1994)
  • MG (1995?)
  • Lemur (2000?)

Slide 36: The Internet and the WWW
  • Gopher, Archie, Veronica, WAIS
  • Tim Berners-Lee, 1991, creates WWW at CERN (originally hypertext only)
  • Web-crawler
  • Lycos
  • Alta Vista
  • Inktomi
  • Google
  • (and many others)

Slide 37: Information Retrieval: Historical View
Research:
  • Boolean model, statistics of language (1950’s)
  • Vector space model, probabilistic indexing, relevance feedback (1960’s)
  • Probabilistic querying (1970’s)
  • Fuzzy set/logic, evidential reasoning (1980’s)
  • Regression, neural nets, inference networks, latent semantic indexing, TREC (1990’s)
Industry:
  • DIALOG, LexisNexis, STAIRS (Boolean based)
  • Information industry (O($B))
  • Verity TOPIC (fuzzy logic)
  • Internet search engines (O($100B?)) (vector space, probabilistic)

Slide 38: Research Sources in Information Retrieval
  • ACM Transactions on Information Systems
  • Am. Society for Information Science Journal
  • Document Analysis and IR Proceedings (Las Vegas)
  • Information Processing and Management (Pergamon)
  • Journal of Documentation
  • SIGIR Conference Proceedings
  • TREC Conference Proceedings
  • Lectures in Computer Science

Slide 39: Research Systems Software
  • INQUERY (Croft)
  • OKAPI (Robertson)
  • PRISE (Harman): http://potomac.ncsl.nist.gov/prise
  • SMART (Buckley)
  • MG (Witten, Moffat)
  • CHESHIRE (Larson): http://cheshire.berkeley.edu
  • LEMUR toolkit
  • Lucene
  • Others


Slide 41: Next Time
  • Basic Concepts in IR
  • Readings:
      • Chapter 1 in IR text (?????????????)
      • Joyce & Needham, “The Thesaurus Approach to Information Retrieval” (in Readings book)
      • Luhn, “The Automatic Derivation of Information Retrieval Encodements from Machine-Readable Texts” (in Readings)
      • Doyle, “Indexing and Abstracting by Association, Pt I” (in Readings)


Slide 43: References
