Data Warehousing

Page 1:
Chapter 11: Data Warehousing
Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden

Page 2:
* Definition of terms
* Reasons for the information gap between information needs and availability
* Reasons for the need for data warehousing
* Describe three levels of data warehouse architectures
* List four steps of data reconciliation
* Describe two components of a star schema
* Estimate fact table size
* Design a data mart

IOWA STATE UNIVERSITY

Page 3:
Definition
* Data Warehouse:
  - A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
  - Subject-oriented: e.g. customers, patients, students, products
  - Integrated: consistent naming conventions, formats, encoding structures; from multiple data sources
  - Time-variant: can study trends and changes
  - Non-updatable: read-only, periodically refreshed
* Data Mart:
  - A data warehouse that is limited in scope

Page 4:
* Integrated, company-wide view of high-quality information (from disparate databases)
* Separation of operational and informational systems and data (for improved performance)

Table 11-1 Comparison of Operational and Informational Systems

| Characteristic | Operational Systems | Informational Systems |
|---|---|---|
| Primary purpose | Run the business on a current basis | Support managerial decision making |
| Type of data | Current representation of state of the business | Historical point-in-time (snapshots) and predictions |
| Primary users | Clerks, salespersons, administrators | Managers, business analysts, customers |
| Scope of usage | Narrow, planned, and simple updates and queries | Broad, ad hoc, complex queries and analysis |
| Design goal | Performance: throughput, availability | Ease of flexible access and use |
| Volume | Many, constant updates and queries on one or a few table rows | Periodic batch updates and queries requiring many or all rows |

Page 5:
Table 11-2 Data Warehouse Versus Data Mart

| | Data Warehouse | Data Mart |
|---|---|---|
| Scope | Application independent; centralized, possibly enterprise-wide; planned | Specific DSS application; decentralized by user area; organic, possibly not planned |
| Data | Historical, detailed, and summarized; lightly denormalized | Some history, detailed, and summarized; highly denormalized |
| Subjects | Multiple subjects | One central subject of concern to users |
| Sources | Many internal and external sources | Few internal and external sources |
| Other characteristics | Flexible; data-oriented; long life; large; single complex structure | Restrictive; project-oriented; short life; start small, becomes large; multi, semi-complex structures, together complex |

Adapted from Strange (1997)

Page 6:
Data Warehouse Architectures
* Generic Two-Level Architecture
* Independent Data Mart
* Dependent Data Mart and Operational Data Store
* Logical Data Mart and @ctive Warehouse
* Three-Layer architecture

All involve some form of extraction, transformation and loading

Page 7:
Figure 11-2: Generic two-level data warehousing architecture
[Figure: Source Data Systems -> Data Staging Area (processing) -> Data and Metadata Storage Area -> End-User Presentation Tools (ad hoc query tools, report writers, end-user applications, modeling/mining tools, visualization tools)]
* Periodic extraction -> data is not completely current

Page 8:
Figure 11-3: Independent data mart data warehousing architecture
[Figure: Source Data Systems -> Data Staging Area (processing) -> separate data marts in the Data and Metadata Storage Area -> End-User Presentation Tools]
* Data marts: mini-warehouses, limited in scope
* Separate ETL for each independent data mart
* Data access complexity due to multiple data marts

Page 9:
Figure 11-4: Dependent data mart with operational data store: a three-level architecture
[Figure: Source Data Systems (internal, external) -> Data Staging Area (Operational Data Store; processing) -> enterprise data warehouse and dependent data marts in the Data and Metadata Storage Area -> End-User Presentation Tools]
* ODS provides option for obtaining current data
* Single ETL for enterprise data warehouse
* Dependent data marts
* Simpler data access

Page 10:
Figure 11-5: Logical data mart and real time warehouse architecture
[Figure: Source Data Systems -> Data Staging Area (Operational Data Store; real-time processing) -> Data and Metadata Storage Area -> End-User Presentation Tools]
* ODS and data warehouse are one and the same
* Near real-time ETL for the data warehouse
* Data marts are NOT separate databases, but logical views of the data warehouse

Page 11:
Data Characteristics: Status vs. Event Data
Figure: Example of DBMS log entry
[Figure: before image (status) -> update event (withdrawal) -> after image (status)]
* Status: the before image and the after image of a record
* Event = a database action (create/update/delete) that results from a transaction
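The relationship between status and event data on the slide above can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names (`acct`, `balance`), and the helper `apply_event` are assumptions and do not come from the slides.

```python
def apply_event(before, event):
    """Return the after image produced by applying one event to a status record."""
    if event["action"] == "delete":
        return None                       # no status remains after a delete
    after = dict(before or {})            # copy, so the before image is preserved
    after.update(event["changes"])        # create/update: apply the changed fields
    return after

# Hypothetical account record and withdrawal event.
before_image = {"acct": "A1", "balance": 750}
event = {"action": "update", "changes": {"balance": 700}, "reason": "withdrawal"}
after_image = apply_event(before_image, event)
```

The before and after images are status data; the dictionary in `event` is the event data that connects them.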

Page 12:
Data Characteristics: Transient vs. Periodic Data
Figure: Transient operational data
[Figure: Table X shown on 10/05, 10/06, and 10/07; each version replaces the one before it]
With transient data, changes to existing records are written over previous records, thus destroying the previous data content

Page 13:
Data Characteristics: Transient vs. Periodic Data
Figure: Periodic warehouse data
[Figure: Table X shown on successive dates; each row carries a key, a date, the attribute values, and the action taken]
Periodic data are never physically altered or deleted once they have been added to the store
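A minimal sketch of the difference between transient and periodic data, assuming a simple in-memory representation. The keys, dates, and values are made up, and the function names are illustrative only.

```python
from datetime import date

# Transient store: one row per key; an update overwrites the old values.
transient = {}

# Periodic store: append-only history; every change adds a dated row.
periodic = []

def apply_change(key, values, on, action="update"):
    """Record one change in both stores."""
    if action == "delete":
        transient.pop(key, None)          # previous content is lost
    else:
        transient[key] = values           # previous content is overwritten
    periodic.append({"key": key, "date": on, "values": values, "action": action})

def as_of(key, when):
    """Return the values a key had on a given date, using the periodic history."""
    rows = [r for r in periodic if r["key"] == key and r["date"] <= when]
    if not rows or rows[-1]["action"] == "delete":
        return None
    return rows[-1]["values"]

apply_change("001", {"A": "a", "B": "b"}, date(2024, 10, 5), "create")
apply_change("001", {"A": "a", "B": "c"}, date(2024, 10, 6), "update")
apply_change("001", None, date(2024, 10, 7), "delete")
```

After the three changes the transient store no longer knows that key 001 ever existed, while `as_of` can still answer what the record looked like on any of the three days.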

Page 14:
Other Data Warehouse Changes
* New descriptive attributes
* New business activity attributes
* New classes of descriptive attributes
* Descriptive attributes become more refined
* Descriptive data are related to one another
* New source of data

Page 15:
The Reconciled Data Layer
* Typical operational data is:
  - Transient: not historical
  - Not normalized (perhaps due to denormalization for performance)
  - Restricted in scope: not comprehensive
  - Sometimes poor quality: inconsistencies and errors
* After ETL, data should be:
  - Detailed: not summarized yet
  - Historical: periodic
  - Normalized: 3rd normal form or higher
  - Comprehensive: enterprise-wide perspective
  - Timely: data should be current enough to assist decision-making
  - Quality controlled: accurate with full integrity

Page 16:
The ETL Process
* Capture/Extract
* Scrub or data cleansing
* Transform
* Load and Index

ETL = Extract, transform, and load

Page 17:
Figure: Steps in data reconciliation
[Figure: Operational systems -> Capture/Extract -> Staging Area (Scrub/Cleanse, Transform) -> Load and Index -> Enterprise data warehouse or operational data store; messages about rejected data are produced along the way]

Capture/Extract: obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
* Static extract = capturing a snapshot of the source data at a point in time
* Incremental extract = capturing changes that have occurred since the last static extract
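The two extract styles can be sketched as follows. The source rows and the `updated_at` column are assumptions made for the example; a real source system may expose changes differently, for instance through its transaction log.

```python
from datetime import datetime

# Hypothetical source rows; `updated_at` is an assumed last-modified column.
source = [
    {"id": 1, "name": "Ann", "updated_at": datetime(2024, 1, 5)},
    {"id": 2, "name": "Bob", "updated_at": datetime(2024, 2, 1)},
    {"id": 3, "name": "Cy",  "updated_at": datetime(2024, 3, 9)},
]

def static_extract(rows):
    """Snapshot of the chosen source data at a point in time."""
    return [dict(r) for r in rows]

def incremental_extract(rows, since):
    """Only the rows that changed after the previous extract."""
    return [dict(r) for r in rows if r["updated_at"] > since]

snapshot = static_extract(source)
last_extract = datetime(2024, 1, 31)
changes = incremental_extract(source, last_extract)
```

A timestamp filter like this one does not see rows deleted from the source, which is one reason a log-based approach may be preferred in practice.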

Page 18:
Figure: Steps in data reconciliation (cont.)

Scrub/Cleanse: uses pattern recognition and AI techniques to upgrade data quality
* Fixing errors: misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data
* Also: decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging
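A small rule-based sketch of a scrub step, covering reformatting, erroneous dates, missing data, duplicate detection, and logging of rejected rows. It does not use pattern recognition or AI; the field names and sample rows are invented for the illustration.

```python
import re
from datetime import datetime

def scrub(rows):
    """Return (clean_rows, rejected) for a list of raw customer dicts."""
    clean, rejected, seen = [], [], set()
    for row in rows:
        name = " ".join(row.get("name", "").split()).title()   # reformat
        phone = re.sub(r"\D", "", row.get("phone", ""))        # keep digits only
        try:
            joined = datetime.strptime(row.get("joined", ""), "%Y-%m-%d").date()
        except ValueError:
            rejected.append((row, "erroneous date"))            # error logging
            continue
        if not name:
            rejected.append((row, "missing data"))
            continue
        key = (name, phone)
        if key in seen:
            rejected.append((row, "duplicate data"))
            continue
        seen.add(key)
        clean.append({"name": name, "phone": phone, "joined": joined})
    return clean, rejected

raw = [
    {"name": "  ann  lee ", "phone": "(515) 555-0100", "joined": "2024-01-15"},
    {"name": "Ann Lee",     "phone": "515-555-0100",   "joined": "2024-01-15"},
    {"name": "Bob Ray",     "phone": "515 555 0101",   "joined": "2024-02-30"},
    {"name": "",            "phone": "515 555 0102",   "joined": "2024-03-01"},
]
clean, rejected = scrub(raw)
```

The `rejected` list plays the role of the "messages about rejected data" in the figure.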

Page 19:
Figure: Steps in data reconciliation (cont.)

Transform = convert data from format of operational system to format of data warehouse
* Record-level:
  - Selection: data partitioning
  - Joining: data combining
  - Aggregation: data summarization
* Field-level:
  - Single-field: from one field to one field
  - Multi-field: from many fields to one, or one field to many
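The three record-level functions can be illustrated on a handful of rows. The order and customer data below are hypothetical.

```python
from collections import defaultdict

orders = [
    {"order_id": 1, "cust_id": "C1", "region": "West", "amount": 120.0},
    {"order_id": 2, "cust_id": "C2", "region": "East", "amount": 80.0},
    {"order_id": 3, "cust_id": "C1", "region": "West", "amount": 50.0},
]
customers = {"C1": "Ann Lee", "C2": "Bob Ray"}

# Selection: partition the data by a criterion.
west = [o for o in orders if o["region"] == "West"]

# Joining: combine data from two sources on a shared key.
joined = [{**o, "cust_name": customers[o["cust_id"]]} for o in orders]

# Aggregation: summarize detail rows into one total per customer.
totals = defaultdict(float)
for o in orders:
    totals[o["cust_id"]] += o["amount"]
```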

Page 20:
Figure: Steps in data reconciliation (cont.)

Load/Index = place transformed data into the warehouse and create indexes
* Refresh mode: bulk rewriting of target data at periodic intervals
* Update mode: only changes in source data are written to data warehouse
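A rough sketch of the two load modes, followed by a simple index build. The warehouse is modeled as a dictionary keyed by `id`, which is an assumption made to keep the example short; a warehouse holding periodic data would append new versions instead of overwriting.

```python
def refresh_load(target, source_rows):
    """Refresh mode: rewrite the whole target from the source."""
    target.clear()
    target.update({r["id"]: r for r in source_rows})
    return len(target)

def update_load(target, changed_rows):
    """Update mode: write only the rows that changed in the source."""
    for r in changed_rows:
        target[r["id"]] = r
    return len(changed_rows)

warehouse = {}
refresh_load(warehouse, [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}])
written = update_load(warehouse, [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}])

# A simple index on `qty`, built after loading.
index_by_qty = {}
for row_id, row in warehouse.items():
    index_by_qty.setdefault(row["qty"], []).append(row_id)
```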

Page 21:
Figure: Single-field transformation
* In general: some transformation function translates data from old form to new form
* Algorithmic transformation: uses a formula or logical expression
* Table lookup: another approach, uses a separate table keyed by source record code
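Both kinds of single-field transformation fit in a few lines. The temperature conversion and the state-code table are examples chosen for this sketch, not taken from the figure.

```python
def fahrenheit_to_celsius(temp_f):
    """Algorithmic transformation: a formula maps the old form to the new form."""
    return round((temp_f - 32) * 5 / 9, 1)

# Table lookup: a separate table keyed by the source code.
STATE_NAMES = {"IA": "Iowa", "TX": "Texas", "OR": "Oregon"}

def state_name(code):
    """Translate a source code to its target value; unknown codes are flagged."""
    return STATE_NAMES.get(code.upper(), "UNKNOWN")
```

For example, `fahrenheit_to_celsius(212)` gives `100.0` and `state_name("ia")` gives `"Iowa"`.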

Page 22:
Figure: Multifield transformation
* M:1: from many source fields to one target field
* 1:M: from one source field to many target fields
[Figure: in the M:1 case several source fields (such as the telephone number) feed one field of the target record; in the 1:M case one source field feeds the Product ID, Brand Name, and Product Name fields of the target record]
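A minimal sketch of the two multifield cases. The `BRAND-PRODUCT` code layout and the way the fields are combined are assumptions for illustration; the figure does not specify them.

```python
def many_to_one(emp_name, address):
    """M:1 - several source fields feed one target field."""
    return f"{emp_name}, {address}"

def one_to_many(product_code):
    """1:M - one source field is split into several target fields.

    Assumes a made-up code layout 'BRAND-PRODUCT'.
    """
    brand, _, product = product_code.partition("-")
    return {"brand_name": brand, "product_name": product}
```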

Page 23:
Derived Data
* Objectives
  - Ease of use for decision support applications
  - Fast response to predefined user queries
  - Customized data for particular target audiences
  - Ad-hoc query support
  - Data mining capabilities
* Characteristics
  - Detailed (mostly periodic) data
  - Aggregate (for summary)
  - Distributed (to departmental servers)

Most common data model = star schema (also called "dimensional model")

Page 24:
Figure: Components of a star schema
[Figure: a central fact table whose primary key is made up of foreign keys to the surrounding dimension tables, plus data columns; each dimension table has its own primary key and a set of attributes]
* Fact tables contain factual or quantitative data
* Dimension tables contain descriptions about the subjects of the business
* Dimension tables are denormalized to maximize performance
* 1:N relationship between dimension tables and fact tables
* Excellent for ad-hoc queries, but bad for online transaction processing

Page 25:
Figure: Star schema example
* PRODUCT (dimension): Product_Code, Description, Color, Size
* PERIOD (dimension): Period_Code, Year, Quarter, Month, Day
* STORE (dimension): Store_Code, Store_Name, City, Telephone, Manager
* SALES (fact): Product_Code, Period_Code, Store_Code, Units_Sold, Dollars_Sold, Dollars_Cost

Fact table provides statistics for sales broken down by product, period and store dimensions
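The star schema on this slide can be tried out with Python's built-in `sqlite3` module. The table and column names follow the figure (the PERIOD table is abbreviated); the rows are made up, and the query is just one example of breaking sales down by dimension attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_code INTEGER PRIMARY KEY, description TEXT, color TEXT, size TEXT);
CREATE TABLE period  (period_code  INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER, month INTEGER);
CREATE TABLE store   (store_code   TEXT PRIMARY KEY, store_name TEXT, city TEXT, telephone TEXT, manager TEXT);
CREATE TABLE sales (
    product_code INTEGER REFERENCES product(product_code),
    period_code  INTEGER REFERENCES period(period_code),
    store_code   TEXT    REFERENCES store(store_code),
    units_sold INTEGER, dollars_sold REAL, dollars_cost REAL,
    PRIMARY KEY (product_code, period_code, store_code)
);
""")

# Made-up rows for illustration only.
conn.executemany("INSERT INTO product VALUES (?,?,?,?)",
                 [(1, "Sweater", "Blue", "M"), (2, "Shoes", "Brown", "10")])
conn.executemany("INSERT INTO period VALUES (?,?,?,?)",
                 [(1, 2024, 1, 1), (2, 2024, 1, 2)])
conn.executemany("INSERT INTO store VALUES (?,?,?,?,?)",
                 [("S1", "North", "Ames", "555-0100", "Lee"),
                  ("S2", "South", "Boone", "555-0101", "Ray")])
conn.executemany("INSERT INTO sales VALUES (?,?,?,?,?,?)",
                 [(1, 1, "S1", 10, 400.0, 250.0),
                  (1, 2, "S1", 5, 200.0, 125.0),
                  (2, 1, "S2", 3, 300.0, 180.0)])

# Units and margin by product and city: facts joined to two dimensions.
rows = conn.execute("""
    SELECT p.description, s.city, SUM(f.units_sold), SUM(f.dollars_sold - f.dollars_cost)
    FROM sales f
    JOIN product p ON p.product_code = f.product_code
    JOIN store   s ON s.store_code   = f.store_code
    GROUP BY p.description, s.city
    ORDER BY p.description
""").fetchall()
```

The fact table's primary key is the combination of its three foreign keys, which is the 1:N relationship between each dimension and the fact table.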

Page 26:
Figure: Star schema with sample data
[Figure: sample rows for the four tables of the star schema]
* Product: Code, Description, Color, Size
* Period: Code, Year, Quarter, Month
* Sales: Product Code, Period Code, Store Code, Units Sold, Dollars Sold, Dollars Cost
* Store: Store Code, Name, City, Telephone, Manager

Page 27:
Issues Regarding Star Schema
* Dimension table keys must be surrogate (non-intelligent and non-business related), because:
  - Keys may change over time
  - Length/format consistency
* Granularity of Fact Table: what level of detail do you want?
  - Transactional grain: finest level
  - Aggregated grain: more summarized
  - Finer grains -> better market basket analysis capability
  - Finer grain -> more dimension tables, more rows in fact table
* Duration of the database: how much history should be kept?
  - Natural duration: 13 months or 5 quarters
  - Financial institutions may need longer duration
  - Older data is more difficult to source and cleanse
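The effect of grain on fact table size can be estimated with simple arithmetic: multiply the number of values in each dimension, scale by the share of combinations that actually occur, and multiply by the row width. The sketch below assumes made-up figures for every input, and it keeps the same density at both grains purely to isolate the effect of the time dimension.

```python
import math

def estimate_fact_table(dimension_sizes, density, fields_per_row, bytes_per_field):
    """Estimate (rows, bytes) for a fact table.

    density: assumed fraction of all key combinations that actually occur.
    """
    rows = round(math.prod(dimension_sizes) * density)
    return rows, rows * fields_per_row * bytes_per_field

# All numbers below are hypothetical: stores x products x time periods.
monthly = estimate_fact_table([1000, 10000, 24], 0.5, 6, 4)    # 24 monthly periods
daily = estimate_fact_table([1000, 10000, 720], 0.5, 6, 4)     # 720 daily periods
```

Under these assumptions the monthly grain gives 120,000,000 rows (about 2.88 GB), and moving to a daily grain multiplies both figures by 30.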

Page 28:
Figure: Modeling dates
* Fact Table: Date key [PK][FK], other PKs (Country PK needed if facts relate to a specific country)
* Date Dimension Table: Date key [PK], Full date, Day of week, Day number in month, Day number overall, Week number in year, Week number overall, Month, Month number overall, Weekday flag, Last day in month flag, Event key [FK]
* Country Calendar Table: Date key [PK][FK], Country [PK], Holiday flag, Religious holiday flag, Civil holiday flag, Holiday name, Season
* Event Table: Event key [PK], Event type, Event name

Fact tables contain time-period data -> Date dimensions are important
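A date dimension is usually generated rather than extracted. The sketch below builds a few of the columns listed on the slide for a range of dates; the sequential `date_key` and the dictionary layout are assumptions made for the example.

```python
import calendar
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def date_dimension(start, end):
    """Build one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        last_day = calendar.monthrange(day.year, day.month)[1]
        rows.append({
            "date_key": len(rows) + 1,                  # surrogate key, no business meaning
            "full_date": day,
            "day_of_week": DAY_NAMES[day.weekday()],
            "day_number_in_month": day.day,
            "week_number_in_year": day.isocalendar()[1],
            "month": day.month,
            "weekday_flag": day.weekday() < 5,
            "last_day_in_month_flag": day.day == last_day,
        })
        day += timedelta(days=1)
    return rows

dim = date_dimension(date(2024, 2, 28), date(2024, 3, 1))
```

Fact rows would then carry only `date_key`, and queries pick up the calendar attributes by joining to this table.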

Page 29:
On-Line Analytical Processing (OLAP)
* The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques
* OLAP Operations
  - Cube slicing: come up with 2-D view of data
  - Drill-down: going from summary to more detailed views
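Slicing and drill-down can both be expressed as grouping the same facts by different sets of dimensions. The sketch below uses a small, invented set of sales facts and a hypothetical helper `summarize`.

```python
from collections import defaultdict

# Hypothetical facts with three dimensions and one measure.
facts = [
    {"product": "Shoes",  "month": "Jan", "store": "S1", "units": 10},
    {"product": "Shoes",  "month": "Jan", "store": "S2", "units": 4},
    {"product": "Shoes",  "month": "Feb", "store": "S1", "units": 6},
    {"product": "Gloves", "month": "Jan", "store": "S1", "units": 3},
]

def summarize(rows, dims):
    """Total units for each combination of the given dimensions."""
    out = defaultdict(int)
    for r in rows:
        out[tuple(r[d] for d in dims)] += r["units"]
    return dict(out)

# Cube slicing: fix one dimension (product = Shoes), view the other two as a 2-D table.
shoes_slice = summarize([r for r in facts if r["product"] == "Shoes"], ["month", "store"])

# Drill-down: start from a summary, then add a dimension for more detail.
summary = summarize(facts, ["product"])
detail = summarize(facts, ["product", "month"])
```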

Page 30:
Figure: Slicing a data cube
[Figure: one slice of the data cube shown as a table with the measures Units, Revenue, and Cost]

Page 31:
Figure: Example of drill-down
[Figure: a summary report of sales, followed by the same report after a drill-down with color added, listing package size and color]

Starting with summary data, users can obtain details for particular cells

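Page 32:
Data Mining and Visualization
* Knowledge discovery using a blend of statistical, AI, and computer graphics techniques
* Goals:
  - Explain observed events or conditions
  - Confirm hypotheses
  - Explore data for new or unexpected relationships
* Techniques
  - Statistical regression
  - Decision tree induction
  - Clustering and signal processing
  - Affinity
  - Sequence association
  - Case-based reasoning
  - Rule discovery
  - Neural nets
  - Fractals
* Data visualization: representing data in graphical/multimedia formats for analysis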
