Page 1:
Statistics
From BSCS: Interaction of experiments and ideas, 2nd Edition. Prentice Hall, 1970 and Statistics for the Utterly Confused by Lloyd Jaisingh, McGraw-Hill, 2000

Page 2:
r² is the fraction of the variation in the values of y that is explained by the least-squares regression line of y on x.
Example: If r² = 0.61 in the graph to the left (grades plotted against class attendance), this means that about 61% of one's grade is accounted for by the linear relationship with attendance. The other 39% could be due to a multitude of factors.
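As a rough sketch of how r² can be computed: for a straight-line fit of y on x, r² equals the square of the Pearson correlation coefficient. The attendance and grade numbers below are hypothetical and only illustrate the calculation; they are not the data behind the slide's graph.

```python
# Hypothetical data (NOT from the slides): days attended and final grade (%).
attendance = [10, 12, 15, 18, 20, 22, 25]
grades = [62, 70, 66, 80, 75, 90, 85]


def r_squared(x, y):
    """Square of the Pearson correlation between x and y.

    For a least-squares line of y on x, this is the fraction of the
    variation in y that the line explains.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


r2 = r_squared(attendance, grades)
print(f"r^2 = {r2:.2f}")
```

A value near 1 means the line accounts for most of the variation in y; a value near 0 means it accounts for very little.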

Page 3:
What is statistics?
* A branch of mathematics that provides techniques to analyze whether or not your data is significant (meaningful)
* Statistical applications are based on probability statements
* Nothing is "proved" with statistics
* Statistics are reported
* Statistics report the probability that similar results would occur if you repeated the experiment

Page 4:
Statistics deals with numbers
* Need to know nature of numbers collected
  - Continuous variables: type of numbers associated with measuring or weighing; any value in a continuous interval of measurement.
    Examples: weight of students, height of plants, time to flowering
  - Discrete variables: type of numbers that are counted or categorical
    Examples: numbers of boys, girls, insects, plants

Page 5:
Can you figure out...
* Which type of numbers (discrete or continuous?)
  - Numbers of persons preferring Brand X in 5 different towns
  - The weights of high school seniors
  - The lengths of oak leaves
  - The number of seeds germinating
  - 35 tall and 12 dwarf pea plants
  - Answers: all are discrete except the 2nd and 3rd examples, which are continuous.

Page 6:
Populations and Samples
* Population includes all members of a group
  - Example: all 9th grade students in America
  - Number of 9th grade students at SR
  - No absolute number
* Sample
  - Used to make inferences about large populations
  - Samples are a selection of the population
  - Example: 6th period Accelerated Biology
* Why the need for statistics?
  - Statistics are used to describe sample populations as estimators of the corresponding population
  - Many times, finding complete information about a population is costly and time consuming. We can use samples to represent a population.

Page 7:
Sample Populations avoiding Bias Individuals in a sample population — Must be a fair representation of the entire pop. —Therefore sample members must be randomly selected (to avoid bias) — Example: if you were looking at strength in students: picking students from the football team would NOT be random

Page 8:
Is there bias? A cage has 1000 rats, you pick the first 20 you can catch for your experiment A public opinion poll is conducted using the telephone directory You are conducting a study of a new diabetes drug; you advertise for participants in the newspaper and TV All are biased: Rats-you grab the slower rats. Telephone-you call only people with a phone (wealth?) and people who are listed (responsible?). Newspaper/TV-you reach only people with newspaper (wealth/educated?) and TV( wealth?).
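One way to avoid the "first 20 rats you can catch" problem is to number every animal and let a random number generator choose. The sketch below assumes each rat carries an ID tag from 1 to 1000 (a hypothetical setup, not described in the slides).

```python
import random

# Hypothetical ID tags for the 1000 rats in the cage.
rat_ids = list(range(1, 1001))

# A fixed seed is used only so the example is repeatable;
# drop it in a real experiment.
rng = random.Random(42)

# Draw 20 distinct rats; every rat has the same chance of selection.
chosen = rng.sample(rat_ids, 20)
print(sorted(chosen))
```

Because the choice does not depend on how easy a rat is to catch, slower rats are no more likely to end up in the sample than faster ones.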

Page 9:
Statistical Computations (the Math)
* If you are using a sample population
  - Arithmetic mean (average): $\bar{x} = \frac{\sum x}{N}$. For example, x = {1, 2, 3, 4, 5} gives $\bar{x} = 3$. The sum of all the scores divided by the total number of scores.
  - For a symmetric distribution, the mean shows that ½ the members of the pop fall on either side of an estimated value: the mean

Page 10:
Looking at profile of data: Distribution
* What is the frequency of distribution, where are the data points?

Distribution Chart of Heights of 100 Control Plants
Class (height of plants, cm)    Number of plants in each class
0.0-0.9                          3
1.0-1.9                         10
2.0-2.9                         21
3.0-3.9                         30
4.0-4.9                         20
5.0-5.9                         14
6.0-6.9                          2

Page 11:
Histogram-Frequency Distribution Charts
[Figure: histogram of the number of plants in each height class]
This is called a "normal" curve or a bell curve
This is an "idealized" curve and is theoretical, based on an infinite number derived from a sample

Page 12:
Mode and Median
* Mode: most frequently seen value (if no numbers repeat then there is no mode)
* Median: the middle number
  - If you have an odd number of data then the median is the value in the middle of the ordered set
  - If you have an even number of data then the median is the average of the two middle values in the ordered set.
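The three measures of center can be checked quickly with Python's standard `statistics` module. The `scores` list is the {1, 2, 3, 4, 5} example from the mean slide; the `heights` list is a made-up data set with an even number of values and one repeated value.

```python
import statistics

scores = [1, 2, 3, 4, 5]        # odd number of values, none repeated
heights = [3, 5, 5, 6, 8, 9]    # hypothetical: even number of values, 5 repeats

mean_scores = statistics.mean(scores)        # (1+2+3+4+5) / 5
median_scores = statistics.median(scores)    # middle value of an odd-sized set
median_heights = statistics.median(heights)  # average of the two middle values
modes_heights = statistics.multimode(heights)  # most frequent value(s)

# When no value repeats, every value ties for "most frequent",
# so no single mode stands out.
modes_scores = statistics.multimode(scores)

print(mean_scores, median_scores, median_heights, modes_heights, modes_scores)
```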

Page 13:
Variance (s²)
* Mathematically expressing the degree of variation of scores (data) from the mean
* A large variance means that the individual scores (data) of the sample deviate a lot from the mean.
* A small variance indicates the scores (data) deviate little from the mean

Page 14:
Calculating the variance for a whole population
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Σ = sum of; X = score, value; µ = mean; N = total of scores or values
OR use the VARP function in Excel (VAR returns the sample variance instead)

Worksheet for Calculating the Variance for 7 scores
X        X - µ    (X - µ)²
5         1        1
3        -1        1
4         0        0
4         0        0
3        -1        1
4         0        0
5         1        1
Σ = 28             Σ = 4
For this problem the population variance is 4/7 = 0.57
http://www.mnstate.edu/wasson/ed602calcvardevs.htm

Page 15:
Calculating the variance for a Biased SAMPLE population
$s^2 = \frac{\sum (X - \bar{x})^2}{n - 1}$
Σ = sum of; X = score, value; n - 1 = total of scores or values - 1
$\bar{x}$ (often read as "x bar") is the mean (average value)

Worksheet for Calculating the Variance for 7 scores: Σ X = 28 and Σ (X - x̄)² = 4, so s² = 4/(7 - 1) = 0.67
For this problem the population variance is 0.57
Note the sample variance is larger…why?
http://www.mnstate.edu/wasson/ed602calcvardevs.htm

Page 16:
Heights in Centimeters of Five Randomly Selected Pea Plants Grown at 8-1
Plant    Height (cm)    Deviations from mean    Squares of deviation from mean
         (xi)           (xi - x̄)                (xi - x̄)²
A        10              2                       4
B         7             -1                       1
C         6             -2                       4
D         8              0                       0
E         9              1                       1
         Σ xi = 40      Σ (xi - x̄) = 0          Σ (xi - x̄)² = 10
xi = score or value; x̄ (x bar) = mean; Σ = sum of

Page 17:
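Finish Calculating the Variance
Σ xi = 40; Σ (xi - x̄) = 0; Σ (xi - x̄)² = 10
There were five plants; n = 5; therefore n - 1 = 4
So $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 10/4 = 2.5$
Variance helps to characterize the data concerning a sample by indicating the degree to which individual members within the sample vary from the mean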

Page 18:
Standard Deviation
* An important statistic that is also used to measure variation in biased samples.
* s is the symbol for standard deviation
* Calculated by taking the square root of the variance
* So from the previous example of pea plants: the square root of 2.5; s = 1.6
* Which means the measurements typically vary by about +/- 1.6 cm from the mean
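The pea plant calculation can be reproduced with Python's standard `statistics` module. This is a small sketch using the five heights from the table (10, 7, 6, 8 and 9 cm); `variance` and `stdev` divide by n - 1, while `pvariance` divides by n.

```python
import statistics

heights_cm = [10, 7, 6, 8, 9]   # plants A-E from the pea plant table

mean_height = statistics.mean(heights_cm)       # 40 / 5
sample_var = statistics.variance(heights_cm)    # 10 / (5 - 1)
population_var = statistics.pvariance(heights_cm)  # 10 / 5
s = statistics.stdev(heights_cm)                # square root of the sample variance

print(mean_height, sample_var, population_var, round(s, 2))
```

Dividing by n - 1 instead of n always gives the larger value, which is why the sample variance (2.5) exceeds the population-style variance (2) for the same five numbers.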

Page 19:
What does "s" mean?
* We can predict the probability of finding a pea plant at a predicted height… the probability of finding a pea plant above 12.8 cm or below 3.2 cm is less than 1%
* s is a valuable tool because it reveals predicted limits of finding a particular value
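The "less than 1%" figure can be checked with a short sketch, under the assumption that pea plant heights follow a normal distribution with the mean (8 cm) and standard deviation (1.6 cm) from the earlier example. The limits 3.2 cm and 12.8 cm are three standard deviations below and above the mean.

```python
from statistics import NormalDist

mean_cm, s_cm = 8.0, 1.6
# Assumption: heights are normally distributed with these parameters.
plants = NormalDist(mu=mean_cm, sigma=s_cm)

# Probability of a plant shorter than 3.2 cm or taller than 12.8 cm.
p_outside_3s = plants.cdf(3.2) + (1 - plants.cdf(12.8))

# Share of plants within 1 and within 2 standard deviations of the mean.
within_1s = plants.cdf(mean_cm + s_cm) - plants.cdf(mean_cm - s_cm)
within_2s = plants.cdf(mean_cm + 2 * s_cm) - plants.cdf(mean_cm - 2 * s_cm)

print(f"outside 3 s: {p_outside_3s:.4f}")
print(f"within 1 s: {within_1s:.3f}, within 2 s: {within_2s:.3f}")
```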

Page 20:
Pea Plant Normal Distribution Curve with Std Dev

Page 21:
The Normal Curve and Standard Deviation
A normal curve: each vertical line is a unit of standard deviation
* 68% of values fall within +1 or -1 std dev of the mean
* 95% of values fall within +2 & -2 units
* Nearly all members (>99%) fall within 3 std dev units
[Figure: bell curve with regions labeled "Definitely less than others", "Probably less than others", "Same as others", "Probably more than others", "Definitely more than others"]
http://classes.kumc.edu/sah/resources/sensory_processing/images/bell_curve.gif

Page 22:
Standard Error of the Sample Means, AKA Standard Error
* The mean, the variance, and the std dev help estimate characteristics of the population from a single sample
* So if many samples were taken then the means of the samples would also form a normal distribution curve that would be close to the whole population.
* The larger the samples, the closer the means would be to the actual value
* But that would most likely be impossible to obtain, so use a simple method to compute the means of all the samples

Page 23:
A Simple Method for estimating standard error
$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$
Standard error is the calculated standard deviation divided by the square root of the size (number of members) of the sample
Standard error of the means is used to test the reliability of the data
Example… If there are 10 corn plants with a standard deviation of 0.2:
SE = 0.2 / sq root of 10 = 0.2/3.16 = 0.063
0.063 represents one std dev of the sample means for samples of 10 plants
If there were 100 plants the standard error would drop to 0.02 (0.2/10)
Why? Because when we take larger samples, our sample means get closer to the true mean value of the population. Thus, the distribution of the sample means would be less spread out and would have a lower standard deviation.
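The standard error arithmetic is easy to verify in a few lines. The sketch below uses the slide's standard deviation of 0.2 with sample sizes of 10 and 100; the function name `standard_error` is just an illustrative helper.

```python
import math


def standard_error(s, n):
    """Standard error of the mean: standard deviation / sqrt(sample size)."""
    return s / math.sqrt(n)


se_10 = standard_error(0.2, 10)    # 0.2 / 3.162...
se_100 = standard_error(0.2, 100)  # 0.2 / 10

print(f"n = 10:  SE = {se_10:.3f}")
print(f"n = 100: SE = {se_100:.3f}")
```

Multiplying the sample size by 100 only shrinks the standard error by a factor of 10, because the sample size enters through its square root.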

Page 24:
Probability Tests What to do when you are comparing two samples to each other and you want to know if there is a significant difference between both sample populations (example the control and the experimental setup) How do you know there is a difference How large is a “difference”? How do you know the “difference” was caused by a treatment and not due to “normal” sampling variation or sampling bias?

Page 25:
Laws of Probability
* The results of one trial of a chance event do not affect the results of later trials of the same event. p = 0.5 (a fair coin always has a 50:50 chance of coming up heads)
* The chance that two or more independent events will occur together is the product of their chances of occurring separately (one outcome has nothing to do with the other)
* Example: What's the likelihood of a 3 coming up on a die? Six sides to a die: p = 1/6
* Roll two dice and get 3's on both: p = 1/6 * 1/6 = 1/36, which means there's a 35/36 chance of rolling something else…
* Note: the probabilities of all possible outcomes must add up to 1.0

Page 26:
Laws of Probability (continued)
* The probability that either of two or more mutually exclusive events will occur is the sum of their probabilities (only one can happen at a time).
* Example: What is the probability of rolling a total of either 2 or 12?
* Probability of rolling a 2 means a 1 on each of the dice; therefore p = 1/6 * 1/6 = 1/36
* Probability of rolling a 12 means a 6 and a 6 on each of the dice; therefore p = 1/36
* So the likelihood of rolling either is 1/36 + 1/36 = 2/36 or 1/18
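Both rules can be checked by listing all 36 equally likely outcomes of two dice and counting. This is a small sketch using exact fractions; it assumes fair six-sided dice.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

# Product rule: both dice show a 3.
p_two_threes = Fraction(sum(1 for a, b in rolls if a == 3 and b == 3), len(rolls))

# Sum rule for mutually exclusive events: total of 2 or total of 12.
p_total_2_or_12 = Fraction(sum(1 for a, b in rolls if a + b in (2, 12)), len(rolls))

print(p_two_threes)      # compare with 1/6 * 1/6
print(p_total_2_or_12)   # compare with 1/36 + 1/36
```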

Page 27:
The Use of the Null Hypothesis
* Is the difference in two sample populations due to chance or a real statistical difference?
* The null hypothesis assumes that there will be no "difference" or no "change" or no "effect" of the experimental treatment.
* If treatment A is no better than treatment B then the null hypothesis is supported.
* If there is a significant difference between A and B then the null hypothesis is rejected...

Page 28:
T-test or Chi Square? Testing the validity of the null hypothesis
* Use the T-test (also called Student's T-test) if using continuous variables from normally distributed sample populations (ex. height)
* Use the Chi Square (χ²) if using discrete variables (if you are evaluating the differences between experimental data and expected or hypothetical data)… Example: genetics experiments, expected distribution of organisms.

Page 29:
T-test
* T-test determines the probability that the null hypothesis concerning the means of two small samples is correct
* The probability that two samples are representative of a single population (supporting null hypothesis) OR two different populations (rejecting null hypothesis)

Page 30:
STUDENT'S T TEST
* The student's t test is a statistical method that is used to see if two sets of data differ significantly.
* The method assumes that the results follow the normal distribution (also called student's t-distribution) if the null hypothesis is true.
* This null hypothesis will usually stipulate that there is no significant difference between the means of the two data sets.
* It is best used to try and determine whether there is a difference between two independent sample groups. For the test to be applicable, the sample groups must be completely independent, and it is best used when the sample size is too small to use more advanced methods.
* Before using this type of test it is essential to plot the sample data from the two samples and make sure that it has a reasonably normal distribution, or the student's t test will not be suitable.
* It is also desirable to randomly assign samples to the groups, wherever possible.
Read more: http://www.experiment-resources.com/students-t-test.html#ixzz0Oll72cbi
http://www.experiment-resources.com/students-t-test.html

Page 31:
EXAMPLE
* You might be trying to determine if there is a significant difference in test scores between two groups of children taught by different methods.
* The null hypothesis might state that there is no significant difference in the mean test scores of the two sample groups and that any difference is down to chance. The student's t test can then be used to try and disprove the null hypothesis.
RESTRICTIONS
* The two sample groups being tested must have a reasonably normal distribution. If the distribution is skewed, then the student's t test is likely to throw up misleading results.
* The distribution should have only one main peak (mode) near the center of the group.
* If the data does not adhere to the above parameters, then either a large data sample is needed or, preferably, a more complex form of data analysis should be used.
Read more: http://www.experiment-resources.com/students-t-test.html#ixzz0OlllZOPZ
http://www.experiment-resources.com/students-t-test.html

Page 32:
RESULTS
* The student's t test can let you know if there is a significant difference in the means of the two sample groups and disprove the null hypothesis.
* Like all statistical tests, it cannot prove anything, as there is always a chance of experimental error occurring.
* But the test can support a hypothesis. However, it is still useful for measuring small sample populations and determining if there is a significant difference between the groups.
by Martyn Shuttleworth (2008).
Read more: http://www.experiment-resources.com/students-t-test.html#ixzz0OlmGvVWD
http://www.experiment-resources.com/students-t-test.html

Page 33:
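Use t-test to determine whether or not sample populations A and B came from the same or different populations
$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$
$\bar{x}_1$ (x bar) = mean of A; $\bar{x}_2$ (x bar) = mean of B
$s_{\bar{x}_1}$ = std error of A; $s_{\bar{x}_2}$ = std error of B
$s_{\bar{x}_1 - \bar{x}_2}$ = std error of the difference between the means, $\sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}$
Example:
Sample A mean = 8
Sample B mean = 12
Std error of difference of populations = 1
(12 - 8)/1 = 4 std deviation units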

Page 34:
Comparison of A and B
B's mean lies outside the normal distribution curve of population A (less than 1% chance of belonging to it)
Reject Null Hypothesis

Page 35:
Online calculators:
http://www.physics.csbsju.edu/stats/t-test_bulk_form.html
online calculates for you… and a box plot also
http://www.graphpad.com/quickcalcs/ttest1.cfm

Page 36:
The t statistic to test whether the means are different can be calculated as follows:
$\bar{X}_1$ = mean of sample 1; $N_1$ = # in sample 1
$\bar{X}_2$ = mean of sample 2; $N_2$ = # in sample 2
$s_1^2$ = variance of sample 1; $s_2^2$ = variance of sample 2
If samples are equal in size (N in each):
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2 + s_2^2}{N}}}$

Page 37:
Amount of O2 Used by Germinating Seeds of Corn and Pea Plants
mL O2/hour at 25 °C
Reading Number    Corn     Pea
1                 0.20     0.25
2                 0.24     0.23
3                 0.22     0.31
4                 0.21     0.27
5                 0.25     0.23
6                 0.24     0.33
7                 0.23     0.25
8                 0.20     0.28
9                 0.21     0.25
10                0.20     0.30
Total             2.20     2.70
Mean              0.22     0.27
Variance          .0028    .0106
How to do this all in EXCEL
Excel file located in AccBio file folder

Page 38:
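Ho = null hypothesis
If the t value is larger than the chart value (the yellow regions) then reject the null hypothesis and accept the HA that there is a difference between the means of the two groups… there is a significant difference between the treatment group and the control group.
[Figure: two-tailed distribution curve with shaded rejection regions, titled "Albus Dumbledough vs. Pat Stat"; x-axis: Arsenic (ng/g)]
http://www2.cedarcrest.edu/academic/bio/hale/biostat/session19links/nachocurve2tail.jpg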

Page 39:
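T table of values (5% = 0.05)
For example: for 10 degrees of freedom (2N-2), the chart value to compare your t value to is 2.228
If your calculated t value is between +2.228 and -2.228, then accept the null hypothesis: the means are similar
If your t value falls outside +2.228 and -2.228 (larger than 2.228 or smaller than -2.228), reject the null hypothesis (accept the alternative hypothesis): there is a significant difference.

degrees of    significance level
freedom       20%      10%      5%       2%       1%       0.1%
1             3.078    6.314    12.706   31.821   63.657   636.619
2             1.886    2.920    4.303    6.965    9.925    31.598
3             1.638    2.353    3.182    4.541    5.841    12.941
4             1.533    2.132    2.776    3.747    4.604    8.610
5             1.476    2.015    2.571    3.365    4.032    6.859
6             1.440    1.943    2.447    3.143    3.707    5.959
7             1.415    1.895    2.365    2.998    3.499    5.405
8             1.397    1.860    2.306    2.896    3.355    5.041
9             1.383    1.833    2.262    2.821    3.250    4.781
10            1.372    1.812    2.228    2.764    3.169    4.587
11            1.363    1.796    2.201    2.718    3.106    4.437
12            1.356    1.782    2.179    2.681    3.055    4.318
13            1.350    1.771    2.160    2.650    3.012    4.221
14            1.345    1.761    2.145    2.624    2.977    4.140
15            1.341    1.753    2.131    2.602    2.947    4.073
16            1.337    1.746    2.120    2.583    2.921    4.015
17            1.333    1.740    2.110    2.567    2.898    3.965
18            1.330    1.734    2.101    2.552    2.878    3.922
19            1.328    1.729    2.093    2.539    2.861    3.883
20            1.325    1.725    2.086    2.528    2.845    3.850
21            1.323    1.721    2.080    2.518    2.831    3.819
22            1.321    1.717    2.074    2.508    2.819    3.792
23            1.319    1.714    2.069    2.500    2.807    3.767
24            1.318    1.711    2.064    2.492    2.797    3.745
25            1.316    1.708    2.060    2.485    2.787    3.725
26            1.315    1.706    2.056    2.479    2.779    3.707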

Page 40:
So if the mean of the corn = 0.22 and the mean of the peas = 0.27
The variance (s²) of the corn is 0.000311 and the peas is 0.001178. Each sample population is equal to ten.
Then:
$t = \frac{0.22 - 0.27}{\sqrt{(0.000311 + 0.001178)/10}} = \frac{-0.05}{\sqrt{0.001489/10}} = \frac{-0.05}{\sqrt{0.0001489}}$
(ignore negative sign) t = 4.10
Df = 2N - 2 = 2(10) - 2 = 18
Chart value = 2.101
The calculated t value is higher than the chart value… reject the null hypothesis: there is a difference in the means.
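The same test can be sketched in Python directly from the ten corn and ten pea readings in the oxygen table, using the equal-sample-size formula t = (mean 1 - mean 2) / sqrt((s1² + s2²)/N). The helper name `t_equal_n` is illustrative only. Values computed from the readings as transcribed may differ slightly from the slide's hand-calculated figures, but the conclusion is unchanged. The critical value 2.101 is the standard two-tailed 5% entry for 18 degrees of freedom.

```python
import math
import statistics

# mL O2/hour at 25 °C, readings 1-10 from the table.
corn = [0.20, 0.24, 0.22, 0.21, 0.25, 0.24, 0.23, 0.20, 0.21, 0.20]
pea = [0.25, 0.23, 0.31, 0.27, 0.23, 0.33, 0.25, 0.28, 0.25, 0.30]


def t_equal_n(a, b):
    """t statistic for two independent samples of equal size."""
    if len(a) != len(b):
        raise ValueError("this shortcut formula needs equal sample sizes")
    n = len(a)
    var_a = statistics.variance(a)   # sample variance (divides by n - 1)
    var_b = statistics.variance(b)
    diff = abs(statistics.mean(a) - statistics.mean(b))  # ignore the sign
    return diff / math.sqrt((var_a + var_b) / n)


t_value = t_equal_n(corn, pea)
df = 2 * len(corn) - 2
critical_5pct = 2.101   # two-tailed 5% critical value for df = 18

reject_null = t_value > critical_5pct
print(f"t = {t_value:.2f}, df = {df}, reject null: {reject_null}")
```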

Page 41:
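The "z" test
- Used if your population samples are greater than 30
- Also used for normally distributed populations with continuous variables
- Formula (note: "σ" (sigma) is used instead of the letter "s"):
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
  that is, z = (mean of pop #1 - mean of pop #2) / square root of (variance of pop #1/n1 + variance of pop #2/n2)
Also note that if you only had the standard deviation you can square that value and substitute for variance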

Page 42:
Z table (sample table with 3 probabilities)
α       Zα (one tail)    Zα/2 (two tails)
0.1     1.28             1.64
0.05    1.645            1.96
0.01    2.33             2.576

Use a one-tail test to show that sample mean A is significantly greater than (or less than) sample mean B.
Use a two-tail test to show a significant difference (either greater than or less than) between sample mean A and sample mean B.
Z table use:
α = alpha (the probability of) 10%, 5% and 1%
Zα (z alpha): the rejection region is on one side only of the normal distribution curve; the "one tail" can be left of the mean or right of the mean. Here the alternative hypothesis predicts a direction (one mean is expected to be greater or less than the other).
Zα/2 (z alpha/2): refers to an experiment where your null hypothesis predicts no difference between the means of the control and the experimental groups (no difference expected). Your alternative hypothesis is looking for a significant difference in either direction.
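The entries of a z table like this one come from the inverse of the standard normal cumulative distribution. As a sketch, Python's standard `statistics.NormalDist` can regenerate them; small differences from the printed table are just rounding.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

critical = {}
for alpha in (0.10, 0.05, 0.01):
    one_tail = std_normal.inv_cdf(1 - alpha)       # all of alpha in one tail
    two_tail = std_normal.inv_cdf(1 - alpha / 2)   # alpha split between two tails
    critical[alpha] = (one_tail, two_tail)
    print(f"alpha = {alpha:<5} one tail: {one_tail:.3f}  two tails: {two_tail:.3f}")
```

Note that the two-tail value for a given α is the one-tail value for α/2, which is why 1.645 appears in both columns.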

Page 43:
Example z-test
* You are looking at two methods of learning geometry proofs: one teacher uses method 1, the other teacher uses method 2, and they use a test to compare success.
* Teacher 1: has 75 students; mean = 85; stdev = 3
* Teacher 2: has 60 students; mean = 83; stdev = 2
* $z = \frac{85 - 83}{\sqrt{3^2/75 + 2^2/60}} = 2/0.4321 = 4.629$

Page 44:
Example continued
z = 4.6291
Ho = null hypothesis would be that Method 1 is not better than method 2
HA = alternative hypothesis would be that Method 1 is better than method 2
This is a one-tailed z test (since the null hypothesis doesn't predict that there will be no difference)
So for the probability of 0.05 (5% significance or 95% confidence) that Method 1 is not better than method 2… that chart value = Zα = 1.645
So 4.629 is greater than 1.645 (the null hypothesis states that method 1 would not be better, and the value had to be less than 1.645; it is not less). Therefore reject the null hypothesis: indeed method 1 is better.

Z table (sample table with 3 probabilities)
α       Zα (one tail)    Zα/2 (two tails)
0.1     1.28             1.64
0.05    1.645            1.96
0.01    2.33             2.576
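The geometry example can be reproduced as a short sketch: compute z from the two class summaries, then compare it with the one-tail 5% critical value. The function name `z_statistic` is an illustrative helper, not something from the slides.

```python
import math
from statistics import NormalDist


def z_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """z for the difference between two sample means (large samples)."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# Teacher 1: 75 students, mean 85, stdev 3. Teacher 2: 60 students, mean 83, stdev 2.
z = z_statistic(85, 3, 75, 83, 2, 60)

# One-tail test at the 5% level: all 5% sits in the upper tail.
z_critical = NormalDist().inv_cdf(0.95)

reject_null = z > z_critical
print(f"z = {z:.3f}, critical = {z_critical:.3f}, reject null: {reject_null}")
```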

Page 45:
Chi square
* Used with discrete values
* Phenotypes, choice chambers, etc.
* Not used with continuous variables (like height... use t-test for samples less than 30 and z-test for samples greater than 30)
$\chi^2 = \sum \frac{(O - E)^2}{E}$
* O = observed values
* E = expected values

Page 46:
http://course1.winona.edu/sberg/Equation/chi-squ2.gif
$\chi^2 = \sum \frac{(O - E)^2}{E}$
* χ: Greek letter chi
* O: observed individuals with a given phenotype
* E: expected individuals with a given phenotype
* Σ: summation => add together a term for each condition

Page 47:
Interpreting a chi square
* Calculate degrees of freedom: # of events, trials, phenotypes - 1
* Example: 2 phenotypes - 1 = 1
* Generally use the column labeled 0.05 (which means there is a 95% chance that any difference between what you expected and what you observed is within accepted random chance).
* Any calculated value that is larger than the table value means you reject your null hypothesis: there is a difference between observed and expected values.
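As a minimal sketch of the whole procedure, take the 35 tall and 12 dwarf pea plants mentioned earlier and assume, purely for illustration, that a 3:1 tall-to-dwarf ratio is expected. The 3:1 ratio is an assumption, not something stated in the slides. The critical value 3.84 is the 0.05 entry for 1 degree of freedom.

```python
observed = {"tall": 35, "dwarf": 12}

# Assumption for illustration only: a 3:1 expected ratio.
expected_ratio = {"tall": 3, "dwarf": 1}

total = sum(observed.values())
ratio_sum = sum(expected_ratio.values())
expected = {k: total * r / ratio_sum for k, r in expected_ratio.items()}

# Chi square: add together (O - E)^2 / E for each phenotype.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

df = len(observed) - 1   # number of phenotypes - 1
critical_05 = 3.84       # 0.05 column, 1 degree of freedom

reject_null = chi_sq > critical_05
print(f"chi square = {chi_sq:.4f}, df = {df}, reject null: {reject_null}")
```

Here the calculated value is far below 3.84, so under the assumed ratio the difference between observed and expected counts is well within random chance and the null hypothesis is not rejected.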

Page 48:
How to use a chi square chart
Degrees of    Probability
Freedom       0.30     0.20     0.10     0.05     0.01     0.001
1             1.07     1.64     2.71     3.84     6.64     10.83
2             2.41     3.22     4.60     5.99     9.21     13.82
3             3.66     4.64     6.25     7.82     11.34    16.27
4             4.88     5.99     7.78     9.49     13.28    18.47
5             6.06     7.29     9.24     11.07    15.09    20.52
6             7.23     8.56     10.64    12.59    16.81    22.46
7             8.38     9.80     12.02    14.07    18.48    24.32
8             9.52     11.03    13.36    15.51    20.09    26.12
9             10.66    12.24    14.68    16.92    21.67    27.88
10            11.78    13.44    15.99    18.31    23.21    29.59
Nonsignificant: columns 0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20 and 0.10
Significant: columns 0.05, 0.01 and 0.001

Page 49:
Chi Square
$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
[Example 1: worked chi-square calculation, values illegible]
Degrees of freedom = N - 1

Statistics From BSCS: Interaction of experiments and ideas, 2nd Edition. Prentice Hall, 1970 and Statistics for the Utterly Confused by Lloyd Jaisingh, McGraw-Hill, 2000 r2 … is the fraction of the variation in the values of y that is explained by the least-squares regression line of y on x. Grades Example: If r2 = 0.61 in the graph to the left, this means that about 61% of one’s grade is accounted for by the linear relationship with attendance. The other 39% could be due to a multitude of factors. Class Attendance What is statistics? • a branch of mathematics that provides techniques to analyze whether or not your data is significant (meaningful) • Statistical applications are based on probability statements • Nothing is “proved” with statistics • Statistics are reported • Statistics report the probability that similar results would occur if you repeated the experiment Statistics deals with numbers • Need to know nature of numbers collected – Continuous variables: type of numbers associated with measuring or weighing; any value in a continuous interval of measurement. • Examples: – Weight of students, height of plants, time to flowering – Discrete variables: type of numbers that are counted or categorical • Examples: – Numbers of boys, girls, insects, plants Can you figure out… • Which type of numbers (discrete or continuous?) – Numbers of persons preferring Brand X in 5 different towns – The weights of high school seniors – The lengths of oak leaves – The number of seeds germinating – 35 tall and 12 dwarf pea plants – Answers: all are discrete except the 2nd and 3rd examples are continuous. Populations and Samples • Population includes all members of a group – Example: all 9th grade students in America – Number of 9th grade students at SR – No absolute number • Sample – Used to make inferences about large populations – Samples are a selection of the population – Example: 6th period Accelerated Biology • Why the need for statistics? 
– Statistics are used to describe sample populations as estimators of the corresponding population – Many times, finding complete information about a population is costly and time consuming. We can use samples to represent a population. Sample Populations avoiding Bias • Individuals in a sample population – Must be a fair representation of the entire pop. – Therefore sample members must be randomly selected (to avoid bias) – Example: if you were looking at strength in students: picking students from the football team would NOT be random Is there bias? • A cage has 1000 rats, you pick the first 20 you can catch for your experiment • A public opinion poll is conducted using the telephone directory • You are conducting a study of a new diabetes drug; you advertise for participants in the newspaper and TV • All are biased: Rats-you grab the slower rats. Telephone-you call only people with a phone (wealth?) and people who are listed (responsible?). Newspaper/TV-you reach only people with newspaper (wealth/educated?) and TV( wealth?). Statistical Computations (the Math) • If you are using a sample population – Arithmetic Mean (average) The sum of all the scores divided by the total number of scores. – The mean shows that ½ the members of the pop fall on either side of an estimated value: mean http://en.wikipedia.org/wiki/Table_of_mathematical_symbols Distribution Chart of Heights of 100 Control Plants Looking at profile of data: Distribution • What is the frequency of distribution, where are the data points? 
Distribution Chart of Heights of 100 Control Plants Class (height of plantscm) Number of plants in each class 0.0-0.9 3 1.0-1.9 10 2.0-2.9 21 3.0-3.9 30 4.0-4.9 20 5.0-5.9 14 6.0-6.9 2 Histogram-Frequency Distribution Charts This is called a “normal” curve or a bell curve This is an “idealized” curve and is theoretical based on an infinite number derived from a sample Mode and Median • Mode: most frequently seen value (if no numbers repeat then the mode = 0) • Median: the middle number – If you have an odd number of data then the median is the value in the middle of the set – If you have an even number of data then the median is the average between the two middle values in the set. Variance (s ) 2 • Mathematically expressing the degree of variation of scores (data) from the mean • A large variance means that the individual scores (data) of the sample deviate a lot from the mean. • A small variance indicates the scores (data) deviate little from the mean Calculating the variance for a whole population Σ = sum of; X = score, value, µ = mean, N= total of scores or values OR use the VAR function in Excel http://www.mnstate.edu/wasson/ed602calcvardevs.htm Calculating the variance for a Biased SAMPLE population Σ = sum of; X = score, value, n -1 = total of scores or values-1 (often read as “x bar”) is the mean (average value of Note the sample variance is larger…why? 
http://www.mnstate.edu/wasson/ed602calcvardevs.htm ights in Centimeters of Five Randomly Selected Pea Plants Grown at 8-1 Pla nt Height (cm) Deviations from mean Squares of deviation from mean (xi) (xi- x) (xi- x)2 A 10 2 4 B 7 -1 1 C 6 -2 4 D 8 0 0 E 9 1 1 Σ xi = 40 Σ (xi- x) = 0 Σ (xi- x)2 = 10 Xi = score or value; X (bar) = mean; Σ = sum of Finish Calculating the Variance Σ xi = 40 Σ (xi- x) = 0 Σ (xi- x)2 = 10 There were five plants; n=5; therefore n1=4 So 10/4= 2.5 Variance helps to characterize the data concerning a sample by indicating the degree to which individual members within the sample vary from the mean Standard Deviation • An important statistic that is also used to measure variation in biased samples. • S is the symbol for standard deviation • Calculated by taking the square root of the variance • So from the previous example of pea plants: The square root of 2.5 ; s=1.6 • Which means the measurements vary plus or minus +/- 1.6 cm from the mean What does “S” mean? • We can predict the probability of finding a pea plant at a predicted height… the probability of finding a pea plant above 12.8 cm or below 3.2 cm is less than 1% • S is a valuable tool because it reveals predicted limits of finding a particular value Pea Plant Normal Distribution Curve with Std Dev The Normal Curve and Standard DeviationA normal curve: Each vertical line is a unit of standard deviation 68% of values fall within +1 or -1 of the mean 95% of values fall within +2 & -2 units Nearly all members (>99%) fall within 3 std dev units http://classes.kumc.edu/sah/resources/sensory_processing/images/bell_curve.gif Standard Error of the Sample Means AKA Standard Error • The mean, the variance, and the std dev help estimate characteristics of the population from a single sample • So if many samples were taken then the means of the samples would also form a normal distribution curve that would be close to the whole population. 
• The larger the samples the closer the means would be to the actual value • But that would most likely be impossible to obtain so use a simple method to compute the means of all the samples A Simple Method for estimating standard error Standard error is the calculated standard deviation divided by the square root of the size, or number of the population Standard error of the means is used to test the reliability of the data Example… If there are 10 corn plants with a standard deviation of 0.2 Sex = 0.2/ sq root of 10 = 0.2/3.03 = 0.006 0.006 represents one std dev in a sample of 10 plants If there were 100 plants the standard error would drop to 0.002 Why? Because when we take larger samples, our sample means get closer to the true mean value of the population. Thus, the distribution of the sample means would be less spread out and would have a lower standard deviation. Probability Tests • What to do when you are comparing two samples to each other and you want to know if there is a significant difference between both sample populations • (example the control and the experimental setup) • How do you know there is a difference • How large is a “difference”? • How do you know the “difference” was caused by a treatment and not due to “normal” sampling variation or sampling bias? Laws of Probability • The results of one trial of a chance event do not affect the results of later trials of the same event. p = 0.5 ( a coin always has a 50:50 chance of coming up heads) • The chance that two or more independent events will occur together is the product of their changes of occurring separately. 
(one outcome has nothing to do with the other) • Example: What’s the likelihood of a 3 coming up on a dice: six sides to a dice: p = 1/6 • Roll two dice with 3’s p = 1/6 *1/6= 1/36 which means there’s a 35/36 chance of rolling something else… • Note probabilities must equal 1.0 Laws of Probability (continued) • The probability that either of two or more mutually exclusive events will occur is the sum of their probabilities (only one can happen at a time). • Example: What is the probability of rolling a total of either 2 or 12? • Probability of rolling a 2 means a 1 on each of the dice; therefore p = 1/6*1/6 = 1/36 • Probability of rolling a 12 means a 6 and a 6 on each of the dice; therefore p = 1/36 • So the likelihood of rolling either is 1/36+1/36 = 2/36 or 1/18 The Use of the Null Hypothesis • Is the difference in two sample populations due to chance or a real statistical difference? • The null hypothesis assumes that there will be no “difference” or no “change” or no “effect” of the experimental treatment. • If treatment A is no better than treatment B then the null hypothesis is supported. • If there is a significant difference between A and B then the null hypothesis is rejected... T-test or Chi Square? Testing the validity of the null hypothesis • Use the T-test (also called Student’s Ttest) if using continuous variables from a normally distributed sample populations (ex. Height) • Use the Chi Square (X2) if using discrete variables (if you are evaluating the differences between experimental data and expected or hypothetical data)… Example: genetics experiments, expected distribution of organisms. 
T-test • T-test determines the probability that the null hypothesis concerning the means of two small samples is correct • The probability that two samples are representative of a single population (supporting null hypothesis) OR two different populations (rejecting null hypothesis) STUDENT’S T TEST •The student’s t test is a statistical method that is used to see if to sets of data differ significantly. •The method assumes that the results follow the normal distribution (also called student's t-distribution) if the null hypothesis is true. •This null hypothesis will usually stipulate that there is no significant difference between the means of the two data sets. •It is best used to try and determine whether there is a difference between two independent sample groups. For the test to be applicable, the sample groups must be completely independent, and it is best used when the sample size is too small to use more advanced methods. •Before using this type of test it is essential to plot the sample data from he two samples and make sure that it has a reasonably normal distribution, or the student’s t test will not be suitable. • It is also desirable to randomly assign samples to the groups, wherever possible. Read more: http://www.experiment-resources.com/students-t-test.html#ixzz0Oll72cbi http://www.experiment-resources.com/students-t-test.html EXAMPLE •You might be trying to determine if there is a significant difference in test scores between two groups of children taught by different methods. •The null hypothesis might state that there is no significant difference in the mean test scores of the two sample groups and that any difference down to chance. The student’s t test can then be used to try and disprove the null hypothesis. RESTRICTIONS •The two sample groups being tested must have a reasonably normal distribution. If the distribution is skewed, then the student’s t test is likely to throw up misleading results. 
• The distribution should have only one main peak (mode) near the center of the group.
• If the data do not meet these conditions, then either a larger data sample is needed or, preferably, a more complex form of data analysis should be used.
RESULTS
• Student's t test can tell you whether there is a significant difference in the means of the two sample groups, which allows you to reject the null hypothesis.
• Like all statistical tests, it cannot prove anything, as there is always a chance of experimental error occurring, but the test can support a hypothesis.
• It is still useful for measuring small sample populations and determining whether there is a significant difference between the groups.
by Martyn Shuttleworth (2008). Read more: http://www.experiment-resources.com/students-t-test.html
Use the t-test to determine whether sample populations A and B came from the same population or from different populations
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$
• $\bar{x}_1$ = mean of A; $\bar{x}_2$ = mean of B
• $s_{\bar{x}_1}$ = standard error of A; $s_{\bar{x}_2}$ = standard error of B
• $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}$ = standard error of the difference between the two means
Example:
• Sample A mean = 8
• Sample B mean = 12
• Standard error of the difference = 1
• t = (12 - 8)/1 = 4 standard error units
Comparison of A and B
• B's mean lies outside the normal distribution curve of population A (there is less than a 1% chance that it belongs to it).
• Reject the null hypothesis.
Online calculators:
• http://www.physics.csbsju.edu/stats/t-test_bulk_form.html (calculates the test for you and also draws a box plot)
• http://www.graphpad.com/quickcalcs/ttest1.cfm
The t statistic to test whether the means are different can be calculated as follows (for two samples of equal size N with variances $s_1^2$ and $s_2^2$):
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{(s_1^2 + s_2^2)/N}}$
Amount of O2 Used by Germinating Seeds of Corn and Pea Plants (mL O2/hour at 25 °C)
Reading Number               Corn     Pea
1                            0.20     0.25
2                            0.24     0.23
3                            0.22     0.31
4                            0.21     0.27
5                            0.25     0.23
6                            0.24     0.33
7                            0.23     0.25
8                            0.20     0.28
9                            0.21     0.25
10                           0.20     0.30
Total                        2.20     2.70
Mean                         0.22     0.27
Sum of squared deviations    0.0032   0.0106
Excel file located in AccBio file folder
How to do this all in EXCEL
Ho = null hypothesis
• If the t value is larger than the chart value (the yellow regions), reject the null hypothesis and accept the alternative hypothesis (HA) that there is a difference between the means of the two groups: there is a significant difference between the treatment group and the control group.
http://www2.cedarcrest.edu/academic/bio/hale/biostat/session19links/nachocurve2tail.jpg
T table of values (5% = 0.05)
For example, for 10 degrees of freedom (2N-2):
• The chart value to compare your t value to is 2.228.
• If your calculated t value is between +2.228 and -2.228, fail to reject the null hypothesis: the means are similar.
• If your t value falls outside +2.228 and -2.228 (larger than 2.228 or smaller than -2.228), reject the null hypothesis (accept the alternative hypothesis): there is a significant difference.
So if the mean of the corn = 0.22 and the mean of the peas = 0.27, the variance ($s^2$, the sum of squared deviations divided by N-1) of the corn is 0.000356 and of the peas is 0.001178. Each sample has ten readings. Then:
$t = \dfrac{0.22 - 0.27}{\sqrt{(0.000356 + 0.001178)/10}} = \dfrac{-0.05}{\sqrt{0.001534/10}} = \dfrac{-0.05}{\sqrt{0.0001534}} \approx -4.04$
• Ignore the negative sign: t = 4.04
• Df = 2N-2 = 2(10)-2 = 18
• Chart value = 2.101
• The calculated t value is higher than the chart value, so reject the null hypothesis: there is a difference in the means.
The "z" test
• Used if each of your samples has more than 30 members
• Also used for normally distributed populations with continuous variables
• Formula (note: "σ" (sigma) is used instead of the letter "s"):
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
• Here $\bar{x}_1$ and $\bar{x}_2$ are the means, $\sigma_1^2$ and $\sigma_2^2$ the variances, and $n_1$ and $n_2$ the sizes of samples 1 and 2.
• Also note that if you only have the standard deviation, you can square that value and substitute it for the variance.
Z table (sample table with 3 probabilities)
α       Zα (one tail)   Zα/2 (two tails)
0.1     1.28            1.645
0.05    1.645           1.96
0.01    2.33            2.576
Use a one-tail test to show that sample mean A is significantly greater than (or less than) sample mean B.
Use a two-tail test to show a significant difference (either greater than or less than) between sample mean A and sample mean B.
Z table use:
• α = alpha, the significance level (10%, 5% and 1%)
• Zα (one tail): the rejection region lies on one side of the normal distribution curve only, either left of the mean or right of the mean. Use it when your alternative hypothesis predicts a direction (greater than or less than).
• Zα/2 (two tails): use it when your null hypothesis predicts no difference between the means of the control and experimental groups and your alternative hypothesis is looking for a significant difference in either direction.
Example z-test
• You are looking at two methods of learning geometry proofs. One teacher uses method 1, the other teacher uses method 2, and they use a test to compare success.
• Teacher 1: has 75 students; mean = 85; stdev = 3
• Teacher 2: has 60 students; mean = 83; stdev = 2
$z = \dfrac{85 - 83}{\sqrt{3^2/75 + 2^2/60}} = \dfrac{2}{0.4321} = 4.629$
Example continued
• z = 4.629
• Ho = null hypothesis: method 1 is not better than method 2
• HA = alternative hypothesis: method 1 is better than method 2
• This is a one-tailed z test, since the alternative hypothesis predicts a direction rather than just a difference.
• At the 0.05 level (5% significance, or 95% confidence), the chart value is Zα = 1.645.
• To keep the null hypothesis, z would have to be less than 1.645. Since 4.629 is greater than 1.645, reject the null hypothesis: the data support the claim that method 1 is better.
Chi square
• Used with discrete values
• Phenotypes, choice chambers, etc.
• Not used with continuous variables (like height; use the t-test for samples smaller than 30 and the z-test for samples larger than 30)
• O = observed values
• E = expected values
• $\chi^2 = \sum \dfrac{(O - E)^2}{E}$
http://www.jspearson.com/Science/chiSquare.html
http://course1.winona.edu/sberg/Equation/chisqu2.gif
Interpreting a chi square
• Calculate the degrees of freedom: number of events, trials, or phenotypes, minus 1.
• Example: 2 phenotypes - 1 = 1
• Generally use the column labeled 0.05. A calculated value at or below the chart value means the difference between what you expected and what you observed is within accepted random chance; if the null hypothesis is true, a larger value would occur only 5% of the time.
• Any calculated value that is larger than the chart value means you reject your null hypothesis: there is a difference between the observed and expected values.
How to use a chi square chart
http://faculty.southwest.tn.edu/jiwilliams/probab2.gif
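The chi square steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts (35 tall and 12 dwarf pea plants) are used as hypothetical observed values, the 3:1 expected ratio is an assumption made for this sketch, and the function name `chi_square` is made up for the example. The chart values are the standard 0.05 column for 1 to 3 degrees of freedom.

```python
def chi_square(observed, expected):
    """Return the chi square statistic: the sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts: 35 tall and 12 dwarf pea plants.
observed = [35, 12]

# Assumption for this sketch: the null hypothesis predicts a 3:1 ratio.
total = sum(observed)
expected = [total * 3 / 4, total * 1 / 4]

chi2 = chi_square(observed, expected)

# Degrees of freedom = number of phenotypes - 1.
df = len(observed) - 1

# 0.05 column of a standard chi square chart, keyed by degrees of freedom.
CHART_0_05 = {1: 3.841, 2: 5.991, 3: 7.815}

# Reject the null hypothesis only if the calculated value exceeds the chart value.
reject_null = chi2 > CHART_0_05[df]

print(f"chi square = {chi2:.4f}, df = {df}, reject null: {reject_null}")
```

With these counts the calculated value is far below the chart value, so the observed numbers are consistent with the assumed 3:1 ratio and the null hypothesis is not rejected.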

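The corn and pea t-test and the two-teacher z-test worked earlier in these notes can be re-checked with a short script. This is a minimal sketch, not the Excel method the slides refer to: the function names `t_statistic` and `z_statistic` are made up for the example, and the variances are computed directly from the ten readings with Python's standard `statistics` module (which divides by N-1).

```python
import math
import statistics

# O2 readings (mL O2/hour at 25 °C) from the corn and pea table.
corn = [0.20, 0.24, 0.22, 0.21, 0.25, 0.24, 0.23, 0.20, 0.21, 0.20]
pea = [0.25, 0.23, 0.31, 0.27, 0.23, 0.33, 0.25, 0.28, 0.25, 0.30]


def t_statistic(a, b):
    """Difference of the means divided by the standard error of the difference."""
    se_diff = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / se_diff


def z_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """z for two large samples; standard deviations are squared to get variances."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


t = t_statistic(corn, pea)
df = len(corn) + len(pea) - 2  # 2N - 2 for two samples of equal size

# Teacher 1: 75 students, mean 85, stdev 3. Teacher 2: 60 students, mean 83, stdev 2.
z = z_statistic(85, 3, 75, 83, 2, 60)

# Chart values: two-tailed t at 0.05 for 18 df, one-tailed z at 0.05.
T_CHART = 2.101
Z_CHART = 1.645

print(f"|t| = {abs(t):.2f}, df = {df}, reject null: {abs(t) > T_CHART}")
print(f"z = {z:.3f}, reject null: {z > Z_CHART}")
```

The sign of t only reflects which sample is listed first, so the comparison with the chart value uses its absolute value, matching the "ignore negative sign" step in the worked example.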