Page 1:
Chapter 11:
Data Warehousing
Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
Page 2:
Objectives
* Definition of terms
* Reasons for the information gap between information needs and availability
* Reasons for the need for data warehousing
* Describe three levels of data warehouse architectures
* List four steps of data reconciliation
* Describe two components of the star schema
* Estimate fact table size
* Design a data mart
IOWA STATE UNIVERSITY
Page 3:
Definition
* Data Warehouse:
  - A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
  - Subject-oriented: e.g. customers, patients, students, products
  - Integrated: consistent naming conventions, formats, encoding structures; from multiple data sources
  - Time-variant: can study trends and changes
  - Non-updatable: read-only, periodically refreshed
* Data Mart:
  - A data warehouse that is limited in scope
Page 4:
* Integrated, company-wide view of high-quality information (from disparate databases)
* Separation of operational and informational systems and data (for improved performance)
Table 11-1 Comparison of Operational and Informational Systems
Characteristic | Operational Systems | Informational Systems
Primary purpose | Run the business on a current basis | Support managerial decision making
Type of data | Current representation of state of the business | Historical point-in-time (snapshots) and predictions
Primary users | Clerks, salespersons, administrators | Managers, business analysts, customers
Scope of usage | Narrow, planned, and simple updates and queries | Broad, ad hoc, complex queries and analysis
Design goal | Performance: throughput, availability | Ease of flexible access and use
Volume | Many, constant updates and queries on one or a few table rows | Periodic batch updates and queries requiring many or all rows
Page 5:
Table 11-2 Data Warehouse Versus Data Mart
Category | Data Warehouse | Data Mart
Scope | Application independent; centralized, possibly enterprise-wide; planned | Specific DSS application; decentralized by user area; organic, possibly not planned
Data | Historical, detailed, and summarized; lightly denormalized | Some history, detailed, and summarized; highly denormalized
Subjects | Multiple subjects | One central subject of concern to users
Sources | Many internal and external sources | Few internal and external sources
Other characteristics | Flexible; data-oriented; long life; large; single complex structure | Restrictive; project-oriented; short life; start small, becomes large; multi, semi-complex structures, together complex
Adapted from Strange (1997)
Page 6:
Data Warehouse Architectures
* Generic Two-Level Architecture
* Independent Data Mart
* Dependent Data Mart and Operational Data Store
* Logical Data Mart and @ctive Warehouse
* Three-Layer architecture
All involve some form of extraction, transformation, and loading
Page 7:
Figure 11-2: Generic two-level data warehousing architecture
[Figure: Source Data Systems -> Data Staging Area -> Data & Metadata Storage Area -> End-User Presentation Tools (ad hoc query tools, report writers, end-user applications, modeling/mining tools, visualization tools)]
Periodic extraction -> data is not completely current
Page 8:
Figure 11-3 Independent data mart data warehousing architecture
[Figure: Source Data Systems -> Data Staging Area -> Data & Metadata Storage Area (independent data marts) -> End-User Presentation Tools (ad hoc query tools, report writers, end-user applications, modeling/mining tools, visualization tools)]
Data marts: mini-warehouses, limited in scope
Separate ETL for each independent data mart
Data access complexity due to multiple data marts
Page 9:
Figure 11-4 Dependent data mart with operational data store: a three-level architecture
[Figure: Source Data Systems (internal, external) -> Data Staging Area (Operational Data Store) -> Data & Metadata Storage Area (enterprise data warehouse, dependent data marts) -> End-User Presentation Tools (ad hoc query tools, report writers, end-user applications, modeling/mining tools, visualization tools)]
ODS provides option for obtaining current data
Single ETL for enterprise data warehouse
Dependent data marts loaded from the enterprise data warehouse
Simpler data access
Page 10:
Figure 11-5 Logical data mart and real time warehouse architecture
[Figure: Source Data Systems -> Data Staging Area (Operational Data Store) -> Data & Metadata Storage Area (real-time data warehouse) -> End-User Presentation Tools (ad hoc query tools, report writers, end-user applications, modeling/mining tools)]
ODS and data warehouse are one and the same
Near real-time ETL for data warehouse
Data marts are NOT separate databases, but logical views of the data warehouse
Page 11:
Data Characteristics
Status vs. Event Data
Figure: Example of DBMS log entry
[Figure: a before image (status), an update event, and an after image (status)]
Event = a database action (create/update/delete) that results from a transaction
Page 12:
Data Characteristics
Transient vs. Periodic Data
Figure: Transient operational data
[Figure: Table X shown on 10/05, 10/06, and 10/07, with rows overwritten in place]
With transient data, changes to existing records are written over previous records, thus destroying the previous data content
Page 13:
Data Characteristics
Transient vs. Periodic Data
[Figure: Table X shown on successive dates with Key, Date, A, B, and Action columns; new rows are appended rather than overwritten]
Periodic data are never physically altered or deleted once they have been added to the store
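The difference between transient and periodic data can be sketched in a few lines of Python. This is only an illustration: the function names, the row layout, and the 2004 dates are assumptions made for the example, and the action codes "C" and "U" (create, update) are a guess at what the Action column in the figure holds.

```python
from datetime import date

def apply_transient(table, key, values):
    """Transient: overwrite the existing record; the prior content is lost."""
    table[key] = values

def apply_periodic(history, key, values, as_of, action):
    """Periodic: append a time-stamped row; old rows are never altered or deleted."""
    history.append({"key": key, "date": as_of, "values": values, "action": action})

# Hypothetical record 001 with attributes A and B.
transient = {"001": ("a", "b")}
apply_transient(transient, "001", ("a", "h"))   # the old value ("a", "b") is gone

periodic = [{"key": "001", "date": date(2004, 10, 5), "values": ("a", "b"), "action": "C"}]
apply_periodic(periodic, "001", ("a", "h"), date(2004, 10, 6), "U")

print(transient["001"])   # only the latest state survives
print(len(periodic))      # both the create and the update rows are kept
```

The periodic store grows with every change, which is the price paid for being able to reconstruct the state of a record on any past date.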
Page 14:
Other Data Warehouse Changes
* New descriptive attributes
* New business activity attributes
* New classes of descriptive attributes
* Descriptive attributes become more refined
* Descriptive data are related to one another
* New source of data
Page 15:
The Reconciled Data Layer
* Typical operational data is:
  - Transient: not historical
  - Not normalized (perhaps due to denormalization for performance)
  - Restricted in scope: not comprehensive
  - Sometimes poor quality: inconsistencies and errors
* After ETL, data should be:
  - Detailed: not summarized yet
  - Historical: periodic
  - Normalized: 3rd normal form or higher
  - Comprehensive: enterprise-wide perspective
  - Timely: data should be current enough to assist decision-making
  - Quality controlled: accurate with full integrity
Page 16:
The ETL Process
* Capture/Extract
* Scrub or data cleansing
* Transform
* Load and Index
ETL = Extract, transform, and load
Page 17:
Capture/Extract: obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
Figure: Steps in data reconciliation
[Figure: Operational systems -> Capture/Extract -> Staging Area (Scrub/Cleanse, Transform) -> Load -> Enterprise data warehouse or operational data store; messages about rejected data are produced along the way]
Static extract = capturing a snapshot of the source data at a point in time
Incremental extract = capturing changes that have occurred since the last static extract
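The two extract styles can be contrasted with a minimal Python sketch. The source rows, the `updated_at` change-tracking column, and both function names are assumptions made for illustration; a real source system may expose changes through logs or triggers instead.

```python
from datetime import datetime

# Hypothetical source rows; "updated_at" is an assumed change-tracking column.
SOURCE_ROWS = [
    {"id": 1, "name": "Ann", "updated_at": datetime(2004, 1, 5)},
    {"id": 2, "name": "Bob", "updated_at": datetime(2004, 2, 10)},
    {"id": 3, "name": "Cy", "updated_at": datetime(2004, 3, 15)},
]

def static_extract(rows):
    """Static extract: copy every source row as of this point in time."""
    return [dict(r) for r in rows]

def incremental_extract(rows, last_extract_time):
    """Incremental extract: copy only rows changed since the last extract."""
    return [dict(r) for r in rows if r["updated_at"] > last_extract_time]

snapshot = static_extract(SOURCE_ROWS)
changes = incremental_extract(SOURCE_ROWS, datetime(2004, 2, 1))

print(len(snapshot))                # all three rows
print([r["id"] for r in changes])   # only the rows updated after 1 Feb 2004
```

A static extract is typically used for the initial fill of the warehouse, and incremental extracts for ongoing maintenance.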
Page 18:
Scrub/Cleanse: uses pattern recognition and AI techniques to upgrade data quality
Figure: Steps in data reconciliation (cont.)
[Figure: same flow as on the previous page, with the Scrub/Cleanse step highlighted]
Fixing errors: misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data
Also: decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging
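A very small cleansing pass might look like the sketch below. It uses plain rules (trimming, case normalization, duplicate and missing-value checks) rather than the pattern recognition or AI techniques the slide mentions, and the field names and the `cleanse` function are assumptions made for illustration.

```python
def cleanse(records):
    """Return (clean, rejected) after applying simple scrubbing rules."""
    clean, rejected, seen = [], [], set()
    for rec in records:
        # Reformatting: trim whitespace and normalize letter case.
        name = (rec.get("name") or "").strip().title()
        state = (rec.get("state") or "").strip().upper()
        if not name:                       # missing data
            rejected.append((rec, "missing name"))
            continue
        key = (name, state)
        if key in seen:                    # duplicate data
            rejected.append((rec, "duplicate"))
            continue
        seen.add(key)
        clean.append({"name": name, "state": state})
    return clean, rejected

raw = [
    {"name": " ann smith ", "state": "ia"},
    {"name": "Ann Smith", "state": "IA"},   # same person after normalization
    {"name": "", "state": "TX"},            # no name at all
]
clean, rejected = cleanse(raw)

print(clean)
print([reason for _, reason in rejected])
```

The rejected list plays the role of the "messages about rejected data" in the figure: nothing is silently dropped.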
Page 19:
Transform = convert data from format of operational system to format of data warehouse
Figure: Steps in data reconciliation (cont.)
[Figure: same flow as on the previous pages, with the Transform step highlighted]
Record-level:
  - Selection: data partitioning
  - Joining: data combining
  - Aggregation: data summarization
Field-level:
  - Single-field: from one field to one field
  - Multi-field: from many fields to one, or one field to many
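The three record-level functions can be sketched on a tiny in-memory data set. The rows, product names, and variable names below are made up for the example and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical sales rows and a product lookup.
sales = [
    {"store": "S1", "product": 100, "units": 40},
    {"store": "S1", "product": 110, "units": 30},
    {"store": "S2", "product": 100, "units": 30},
]
products = {100: "Sweater", 110: "Shoes"}

# Selection: partition the data by a criterion.
s1_sales = [r for r in sales if r["store"] == "S1"]

# Joining: combine each sales row with its product description.
joined = [{**r, "description": products[r["product"]]} for r in sales]

# Aggregation: summarize units per store.
units_by_store = defaultdict(int)
for r in sales:
    units_by_store[r["store"]] += r["units"]

print(len(s1_sales))
print(joined[1]["description"])
print(dict(units_by_store))
```

In a warehouse these steps would normally be expressed in SQL (WHERE, JOIN, GROUP BY); the Python version only shows what each one does to the rows.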
Page 20:
Load/Index = place transformed data into the warehouse and create indexes
Figure: Steps in data reconciliation (cont.)
[Figure: same flow as on the previous pages, with the Load step highlighted]
Refresh mode: bulk rewriting of target data at periodic intervals
Update mode: only changes in source data are written to data warehouse
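The two load modes can be compared using Python's built-in `sqlite3` module as a stand-in for the warehouse. The `sales` table, its columns, the index name, and both helper functions are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, store TEXT, units INTEGER)")
# Index step: support lookups by store after loading.
conn.execute("CREATE INDEX idx_sales_store ON sales (store)")

def refresh_load(conn, rows):
    """Refresh mode: bulk rewrite of the whole target table."""
    conn.execute("DELETE FROM sales")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def update_load(conn, changed_rows):
    """Update mode: write only the rows that changed in the source."""
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", changed_rows)
    conn.commit()

refresh_load(conn, [(1, "S1", 40), (2, "S2", 30)])
update_load(conn, [(2, "S2", 35), (3, "S3", 50)])   # one changed row, one new row

rows = conn.execute("SELECT id, store, units FROM sales ORDER BY id").fetchall()
print(rows)
```

Refresh mode is simpler but rewrites everything; update mode touches fewer rows but depends on knowing exactly what changed, which ties it to incremental extraction.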
Page 21:
Figure: Single-field transformation
In general: some transformation function translates data from old form to new form
Algorithmic transformation uses a formula or logical expression
Table lookup: another approach, uses a separate table keyed by source record code
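Both single-field approaches fit in a few lines. The temperature conversion and the state-code table below are illustrative choices, not taken from the figure, and the function names are hypothetical.

```python
def celsius_to_fahrenheit(temp_c):
    """Algorithmic transformation: the new value is computed by a formula."""
    return temp_c * 9 / 5 + 32

# Table lookup: a separate table keyed by the source code (hypothetical contents).
STATE_LOOKUP = {"IA": "Iowa", "TX": "Texas"}

def decode_state(code):
    """Translate a source code to its target value; flag unknown codes."""
    return STATE_LOOKUP.get(code, "Unknown")

print(celsius_to_fahrenheit(100))
print(decode_state("IA"))
print(decode_state("ZZ"))
```

The lookup form is easier to maintain when the mapping has no formula behind it, since new codes are added as data rather than as code.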
Page 22:
Figure: Multifield transformation
[Figure: fields of a source record mapped to fields of a target record]
M:1: from many source fields to one target field
1:M: from one source field to many target fields
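A rough sketch of the two multifield cases follows. The address fields, the product code "AC-100", and the lookup table are invented for the example; the actual fields in the figure are only partly legible.

```python
def many_to_one(street, city, state):
    """M:1: several source fields are combined into one target field."""
    return f"{street}, {city}, {state}"

# Hypothetical decoding table for the 1:M case.
PRODUCT_CODES = {"AC-100": ("Acme", "Sweater")}

def one_to_many(product_code):
    """1:M: one source field is expanded into several target fields."""
    brand, product = PRODUCT_CODES[product_code]
    return {"brand_name": brand, "product_name": product}

address = many_to_one("12 Main St", "Ames", "IA")
target = one_to_many("AC-100")

print(address)
print(target)
```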
Page 23:
Derived Data
* Objectives
  - Ease of use for decision support applications
  - Fast response to predefined user queries
  - Customized data for particular target audiences
  - Ad-hoc query support
  - Data mining capabilities
* Characteristics
  - Detailed (mostly periodic) data
  - Aggregate (for summary)
  - Distributed (to departmental servers)
Most common data model = star schema (also called "dimensional model")
Page 24:
Figure: Components of a star schema
[Figure: a central fact table whose primary key is made up of foreign keys (Key 1 to Key 4) plus data columns, joined to four dimension tables, each with its own primary key and attributes]
Fact tables contain factual or quantitative data
Dimension tables contain descriptions about the subjects of the business
Dimension tables are denormalized to maximize performance
1:N relationship between dimension tables and fact tables
Excellent for ad-hoc queries, but bad for online transaction processing
Page 25:
Figure: Star schema example
Fact table provides statistics for sales broken down by product, period and store dimensions
PRODUCT (dimension): Product_Code, Description, Color, Size
PERIOD (dimension): Period_Code, Year, Quarter, Month, Day
STORE (dimension): Store_Code, Store_Name, City, Telephone, Manager
SALES (fact): Product_Code, Period_Code, Store_Code, Units_Sold, Dollars_Sold, Dollars_Cost
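A cut-down version of this star schema can be built and queried with Python's built-in `sqlite3` module. Only a few of the columns are kept, and every data value below is made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_code INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE period  (period_code  INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE store   (store_code   TEXT PRIMARY KEY, city TEXT);
-- Fact table: its primary key is the combination of the dimension keys.
CREATE TABLE sales (
    product_code INTEGER REFERENCES product,
    period_code  INTEGER REFERENCES period,
    store_code   TEXT    REFERENCES store,
    units_sold   INTEGER,
    dollars_sold REAL,
    PRIMARY KEY (product_code, period_code, store_code)
);
-- Hypothetical sample rows.
INSERT INTO product VALUES (1, 'Sweater'), (2, 'Shoes');
INSERT INTO period  VALUES (1, 2004, 1), (2, 2004, 2);
INSERT INTO store   VALUES ('A', 'Ames'), ('B', 'Boone');
INSERT INTO sales VALUES
    (1, 1, 'A', 10, 400.0),
    (1, 2, 'A',  5, 200.0),
    (2, 1, 'B',  8, 640.0);
""")

# Dollars sold broken down by product and store city.
query = """
SELECT p.description, s.city, SUM(f.dollars_sold) AS dollars
FROM sales AS f
JOIN product AS p ON p.product_code = f.product_code
JOIN store   AS s ON s.store_code   = f.store_code
GROUP BY p.description, s.city
ORDER BY p.description
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Every analytical query follows the same shape: start from the fact table, join the dimensions needed for grouping or filtering, and aggregate the measures.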
Page 26:
Figure: Star schema with sample data
[Figure: sample rows for the Product, Period, Sales, and Store tables]
Product: Code, Description, Color, Size
Period: Code, Year, Quarter, Month
Sales: Product Code, Period Code, Store Code, Units Sold, Dollars Sold, Dollars Cost
Store: Code, Name, City, Telephone, Manager
Page 27:
Issues Regarding Star Schema
* Dimension table keys must be surrogate (non-intelligent and non-business related), because:
  - Keys may change over time
  - Length/format consistency
* Granularity of fact table: what level of detail do you want?
  - Transactional grain: finest level
  - Aggregated grain: more summarized
  - Finer grain -> better market basket analysis capability
  - Finer grain -> more dimension tables, more rows in fact table
* Duration of the database: how much history should be kept?
  - Natural duration: 13 months or 5 quarters
  - Financial institutions may need longer duration
  - Older data is more difficult to source and cleanse
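Surrogate key assignment can be sketched as a simple mapping from business keys to meaningless integers. The class name `SurrogateKeyMap` and the sample product codes are hypothetical; production warehouses usually rely on database sequences or identity columns instead.

```python
from itertools import count

class SurrogateKeyMap:
    """Assigns a stable, non-intelligent integer key to each business key."""

    def __init__(self):
        self._next = count(1)   # 1, 2, 3, ...
        self._keys = {}         # business key -> surrogate key

    def key_for(self, business_key):
        # Reuse the existing surrogate if this business key was seen before.
        if business_key not in self._keys:
            self._keys[business_key] = next(self._next)
        return self._keys[business_key]

keys = SurrogateKeyMap()
first = keys.key_for("PROD-110")
second = keys.key_for("PROD-125")
again = keys.key_for("PROD-110")

print(first, second, again)
```

Because fact rows store only the surrogate, a later change to the format or length of the business key does not ripple into the fact table.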
Page 28:
Figure: Modeling dates
Country Calendar Table: Date key [PK][FK], Country [PK], Holiday flag, Religious holiday flag, Civil holiday flag, Holiday name, Season
Date Dimension Table: Date key [PK], Full date, Day of week, Day number in month, Day number overall, Week number in year, Week number overall, Month, Month number overall, Weekday flag, Last day in month flag, Event key [FK]
Event Table: Event key [PK], Event type, Event name
Fact Table: Date key [PK][FK], Other PKs (Country PK needed if facts relate to a specific country)
Fact tables contain time-period data -> Date dimensions are important
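Rows for a date dimension are usually generated rather than typed in. The sketch below fills a handful of the attributes named in the figure; the function name, the choice of ISO week numbers, and the start date are assumptions made for the example.

```python
import calendar
from datetime import date, timedelta

def date_dimension(start, days):
    """Build one date-dimension row per day, starting at `start`."""
    rows = []
    for offset in range(days):
        d = start + timedelta(days=offset)
        last_day = calendar.monthrange(d.year, d.month)[1]
        rows.append({
            "date_key": offset + 1,                  # surrogate key
            "full_date": d,
            "day_number_in_month": d.day,
            "day_number_overall": offset + 1,
            "week_number_in_year": d.isocalendar()[1],
            "month": d.month,
            "weekday_flag": d.weekday() < 5,         # Monday to Friday
            "last_day_in_month_flag": d.day == last_day,
        })
    return rows

dim = date_dimension(date(2004, 1, 30), 3)
for row in dim:
    print(row["full_date"], row["weekday_flag"], row["last_day_in_month_flag"])
```

Precomputing these flags lets analysts filter on "weekdays only" or "month-end" with a plain join instead of date arithmetic in every query.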
Page 29:
On-Line Analytical Processing (OLAP)
* The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques
* OLAP Operations
  - Cube slicing: come up with 2-D view of data
  - Drill-down: going from summary to more detailed views
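The two operations can be sketched on a tiny cube held in a Python dictionary. The dimensions (product, month, store), the unit counts, and the function names are all assumptions made for illustration; OLAP tools do this over far larger data through a graphical interface.

```python
from collections import defaultdict

# Hypothetical cube: units sold keyed by (product, month, store).
cube = {
    ("Shoes", "Jan", "S1"): 30,
    ("Shoes", "Feb", "S1"): 20,
    ("Shoes", "Jan", "S2"): 10,
    ("Gloves", "Jan", "S1"): 50,
}

def slice_cube(cube, product):
    """Cube slicing: fix one dimension and return a 2-D (month, store) view."""
    return {(m, s): v for (p, m, s), v in cube.items() if p == product}

def summarize(cube, level):
    """Total units grouped by the first `level` dimensions of the key."""
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[:level]] += units
    return dict(totals)

shoes_view = slice_cube(cube, "Shoes")
summary = summarize(cube, 1)   # summary view: by product
detail = summarize(cube, 2)    # drill-down: by product and month

print(shoes_view)
print(summary)
print(detail)
```

Drill-down here is just the same aggregation at a finer grouping level, which is why the grain of the fact table limits how far a user can drill.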
Page 30:
Figure: Slicing a data cube
[Figure: a two-dimensional slice of the cube showing the measures Units, Revenue, and Cost]
Page 31:
Figure: Example of drill-down
[Figure: a summary report of sales, followed by a drill-down report with color added (columns: Brand, Package size, Color, Sales)]
Starting with summary data, users can obtain details for particular cells
Page 32:
Data Mining and Visualization
* Knowledge discovery using a blend of statistical, AI, and computer graphics techniques
* Goals:
  - Explain observed events or conditions
  - Confirm hypotheses
  - Explore data for new or unexpected relationships
* Techniques
  - Case-based reasoning
  - Rule discovery
  - Signal processing
  - Neural nets
  - Fractals
* Data visualization: representing data in graphical/multimedia formats for analysis