Week 4 And 5: Delarna,John Loyd D. BSN 1-2


1. Data Management

1.1. Measures of Variations

1.1.1. Variance and Standard Deviation

1.1.1.1. Sum of squares

1.1.1.1.1. the sum of the squared deviations of each data value from the mean; it is the core quantity used in computing the variance.

1.1.1.2. Sample variance

1.1.1.2.1. the sum of the squared deviations of the data values from the sample mean, divided by n - 1; it estimates the variance of the population.

1.1.1.3. Sample Standard Deviation

1.1.1.3.1. measures the spread of a data distribution.
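The three quantities above can be sketched in a few lines of Python; the data set here is invented for illustration.

```python
# Minimal sketch of sum of squares, sample variance, and sample standard
# deviation. The data values are made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Sum of squares: total of the squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in data)

# Sample variance divides by n - 1 (one degree of freedom is used by the mean).
sample_variance = ss / (n - 1)

# Sample standard deviation is the square root of the sample variance.
sample_std = sample_variance ** 0.5

print(ss, sample_variance, round(sample_std, 4))
```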

1.1.2. Range

1.1.2.1. The difference between the largest and smallest values of a data distribution.

1.1.3. Population Parameters

1.1.3.1. Population Mean

1.1.3.1.1. the average of a characteristic over the entire population, denoted μ.

1.1.3.2. Population Variance

1.1.3.2.1. tells us how data points in a specific population are spread out.

1.1.3.3. Population Standard Deviation

1.1.3.3.1. a measure of the amount of variation or dispersion of the values in the entire population, denoted σ.

1.1.4. Coefficient of Variation

1.1.4.1. also known as relative standard deviation, is a standardized measure of dispersion of a probability distribution or frequency distribution.
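As a quick sketch, the coefficient of variation is the standard deviation expressed relative to the mean, which lets distributions measured on different scales be compared. The mean and standard deviation below are hypothetical.

```python
# Coefficient of variation: standard deviation as a fraction of the mean.
# The mean and standard deviation here are invented for illustration.
mean, std = 50.0, 5.0
cv = std / mean * 100  # usually quoted as a percentage
print(cv)  # 10.0
```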

1.1.5. Chebyshev's Theorem

1.1.5.1. Chebyshev's Interval

1.1.5.1.1. refers to the intervals you want to find when using the theorem.

1.1.5.2. for a wide class of probability distributions, no more than 1/k² of the values can be more than k standard deviations from the mean; equivalently, at least 1 - 1/k² of the data lie within k standard deviations of the mean.
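A Chebyshev interval can be computed directly from the theorem; the mean and standard deviation below are hypothetical.

```python
# Chebyshev's theorem: for any distribution, at least 1 - 1/k**2 of the
# data lie within k standard deviations of the mean (for k > 1).
# The mean and standard deviation are invented for illustration.
mean, std = 100.0, 15.0

def chebyshev_interval(mean, std, k):
    """Return the Chebyshev interval and the guaranteed minimum fraction inside it."""
    fraction = 1 - 1 / k**2
    return (mean - k * std, mean + k * std), fraction

interval, fraction = chebyshev_interval(mean, std, 2)
print(interval, fraction)  # (70.0, 130.0) 0.75: at least 75% of the data lie here
```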

1.1.6. Grouped Data

1.1.6.1. Sample Mean for a Frequency Distribution

1.1.6.2. Sample Standard Deviation for a Frequency Distribution

1.1.6.3. Computation Formula for the Sample Standard Deviation
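The grouped-data formulas above can be sketched as follows, where x is a class midpoint and f its frequency; the classes and counts are invented.

```python
# Sample mean and standard deviation for a frequency distribution,
# using the computation formula. Midpoints and frequencies are made up.
midpoints   = [5, 15, 25, 35]
frequencies = [2, 5, 8, 5]

n = sum(frequencies)  # total number of observations
sum_fx  = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

mean = sum_fx / n  # sample mean for a frequency distribution

# Computation formula: s = sqrt((sum f*x^2 - (sum f*x)^2 / n) / (n - 1))
variance = (sum_fx2 - sum_fx**2 / n) / (n - 1)
std = variance ** 0.5
print(round(mean, 2), round(std, 2))  # 23.0 9.51
```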

1.2. Percentiles and Box and Whisker Plots

1.2.1. Percentiles

1.2.1.1. a value such that P% of the data fall at or below it and (100 - P)% of the data fall at or above it.

1.2.1.2. Quartiles

1.2.1.2.1. Are those percentiles that divide the data into fourths.

1.2.1.2.2. Q1

1.2.1.2.3. Q2

1.2.1.2.4. Q3

1.2.1.2.5. Interquartile Range

1.2.1.2.6. HOW TO COMPUTE QUARTILES
1. Order the data from smallest to largest.
2. Find the median. This is the second quartile, Q2.
3. The first quartile Q1 is the median of the lower half of the data; that is, it is the median of the data falling below the Q2 position (and not including Q2).
4. The third quartile Q3 is the median of the upper half of the data; that is, it is the median of the data falling above the Q2 position (and not including Q2).

1.2.1.2.7. Five Number Summary
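The quartile steps above, together with the five-number summary, can be sketched in Python; the data are invented.

```python
def median(values):
    """Median of an already-sorted list."""
    n = len(values)
    mid = n // 2
    if n % 2 == 1:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2

def five_number_summary(data):
    """Lowest, Q1, median (Q2), Q3, highest, using the halves method."""
    s = sorted(data)            # step 1: order the data
    n = len(s)
    q2 = median(s)              # step 2: the median is Q2
    lower = s[: n // 2]         # data below the Q2 position (Q2 excluded)
    upper = s[(n + 1) // 2 :]   # data above the Q2 position (Q2 excluded)
    q1 = median(lower)          # step 3
    q3 = median(upper)          # step 4
    return s[0], q1, q2, q3, s[-1]

summary = five_number_summary([3, 7, 8, 5, 12, 14, 21, 13, 18])
print(summary)  # (3, 6.0, 12, 16.0, 21); interquartile range = Q3 - Q1
```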

1.2.2. Box-and-Whisker Plot

1.2.2.1. HOW TO MAKE A BOX-AND-WHISKER PLOT
1. Draw a vertical scale to include the lowest and highest data values.
2. To the right of the scale, draw a box from Q1 to Q3.
3. Include a solid line through the box at the median level.
4. Draw vertical lines, called whiskers, from Q1 to the lowest value and from Q3 to the highest value.

1.2.2.2. Provides another useful technique from exploratory data analysis (EDA) for describing data.

1.2.2.3. Whisker

1.2.2.3.1. are the two lines outside the box that extend to the highest and lowest observations.

1.2.3. Outlier

1.2.3.1. is a data point that differs significantly from other observations.

1.3. Measures Of Central Tendency

1.3.1. Mode

1.3.1.1. The value that occurs most frequently.

1.3.2. Median

1.3.2.1. The central value of an ordered distribution.

1.3.2.1.1. Position of the middle value = (n + 1)/2

1.3.2.2. HOW TO FIND THE MEDIAN
The median is the central value of an ordered distribution. To find it:
1. Order the data from smallest to largest.
2. For an odd number of data values in the distribution, the median is the middle data value.
3. For an even number of data values in the distribution, the median is the sum of the two middle values divided by 2.

1.3.3. Weighted Average

1.3.3.1. Weighted average = Σxw / Σw, where x is a data value and w is the weight assigned to that data value.
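A short sketch of the weighted average; the scores and weights are hypothetical (here the last score counts double).

```python
# Weighted average: sum of (value * weight) divided by the sum of weights.
# Scores and weights are invented for illustration.
scores  = [80, 90, 70]  # x values
weights = [1, 1, 2]     # w assigned to each x

weighted_avg = sum(x * w for x, w in zip(scores, weights)) / sum(weights)
print(weighted_avg)  # 77.5
```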

1.3.4. Mean

1.3.4.1. The arithmetic average: the sum of the data values divided by how many there are; it is the average usually used to compute a test average.

1.3.4.2. HOW TO FIND THE MEAN
1. Compute Σx; that is, find the sum of all the data values.
2. Divide the total by the number of data values.
Sample statistic: x̄ = Σx / n, where n = number of data values in the sample.
Population parameter: μ = Σx / N, where N = number of data values in the population.

1.3.4.3. Geometric Mean

1.3.4.3.1. When data consist of percentages, ratios, growth rates, or other rates of change, the geometric mean is a useful measure of central tendency.

1.3.4.4. Harmonic Mean

1.3.4.4.1. When data consist of rates of change, such as speeds, the harmonic mean is an appropriate measure of central tendency.
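The arithmetic, geometric, and harmonic means described above can be compared on one small invented data set:

```python
import math

# Invented values for illustration.
data = [2.0, 8.0]

# Arithmetic mean: sum of values over the count.
arith = sum(data) / len(data)

# Geometric mean: nth root of the product, suited to growth rates and ratios.
geom = math.prod(data) ** (1 / len(data))

# Harmonic mean: n over the sum of reciprocals, suited to rates such as speeds.
harm = len(data) / sum(1 / x for x in data)

print(arith, geom, harm)  # 5.0 4.0 3.2
```

Note that for any data set of positive values, harmonic ≤ geometric ≤ arithmetic, as the output illustrates.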

2. Normal Distribution & Regression and Correlation

2.1. Normal Distribution

2.1.1. The Normal Curve

2.1.1.1. It is a very useful curve in statistics because many attributes, when a large number of measurements are taken, are approximately distributed in this pattern.

2.1.1.2. if X follows the normal distribution with mean μX and standard deviation σX, we write this as X ∼ N(μX, σX²). The symbol σX² is called the variance; it is equal to the square of the standard deviation.

2.1.2. the "normal distribution" is not really a single distribution but a family of distributions that share a common set of properties.

2.1.3. Shape of Distribution

2.1.3.1. Skewed Distribution

2.1.3.1.1. not symmetrical in shape; the distribution has a longer "tail" on one side (for example, a tail of high earners in an income distribution).

2.1.3.2. Uniform Distribution

2.1.3.2.1. the long-term pattern of outcomes is uniform; every value occurs with roughly equal frequency.

2.1.4. Population

2.1.4.1. The whole group of interest.

2.1.5. Sample

2.1.5.1. a selection drawn from the parent population.

2.1.5.2. The sampling distribution of the mean.

2.1.6. Central Limit Theorem

2.1.6.1. states that if a random variable is the sum of n independent, identically distributed random variables, its distribution approaches the normal distribution as n approaches infinity, even when the individual variables are not normal.

2.1.7. Areas Under the Normal Curve

2.1.7.1. Scores

2.1.7.2. Horizontal axis

2.1.7.3. Proportion

2.1.7.3.1. the area under the curve corresponding to an interval of scores represents the percentage (proportion) of scores falling in that interval.

2.1.7.4. Probability

2.1.7.4.1. the probability that a score falls in an interval is represented by the area under the curve above that interval.

2.1.7.5. the area between the mean and the z score

2.1.7.6. the area beyond the z score, called the smaller portion

2.1.7.7. the area up to the z score, the larger portion.

2.1.8. Transforming Raw Scores

2.1.8.1. subtract the mean and divide by the standard deviation.
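The transformation above is a one-liner; the mean and standard deviation used here are hypothetical (an IQ-style scale).

```python
# Transforming a raw score x into a z score: subtract the mean, then divide
# by the standard deviation. Mean and SD are invented for illustration.
def z_score(x, mean, std):
    return (x - mean) / std

z = z_score(130, 100, 15)
print(z)  # 2.0: the raw score lies two standard deviations above the mean
```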

2.1.9. The Standard Normal Distribution Tables

2.1.9.1. provides the probability that a normally distributed random variable Z, with mean equal to 0 and variance equal to 1, is less than or equal to z.

2.2. Regression and Correlation

2.2.1. Regression

2.2.1.1. Dependent Variable

2.2.1.1.1. it depends on what independent value you pick.

2.2.1.2. Independent Variable

2.2.1.2.1. is the one that you use to predict what the other variable will be.

2.2.1.3. answers whether there is a relationship

2.2.1.4. Interpolating

2.2.1.4.1. The process of predicting the values using an x within the range of original x-values.

2.2.1.5. Extrapolating

2.2.1.5.1. Using an x-value that is outside the range of original x-values.
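A least-squares regression line, with one interpolated and one extrapolated prediction, can be sketched as below; the data points are invented.

```python
# Fit a least-squares line y = a + b*x by hand. The data are made up.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.0]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
# Slope: sum of cross-deviations over sum of squared x-deviations.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar  # intercept: the line passes through (x_bar, y_bar)

def predict(x):
    return a + b * x

print(predict(2.5))   # interpolating: 2.5 lies inside the original x range
print(predict(10.0))  # extrapolating: 10 lies outside it, so use with caution
```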

2.2.2. Correlation

2.2.2.1. answers how strong the linear relationship is.

2.2.2.2. Linear Correlation Coefficient

2.2.2.2.1. is the number that describes the strength of the linear relationship between the two variables.

2.2.2.3. Causation

2.2.2.3.1. indicates a relationship between two events where one event is affected by the other.

2.2.2.4. Coefficient of Determination

2.2.2.4.1. r² = explained variation / total variation
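The linear correlation coefficient r and the coefficient of determination r² can be computed from deviation sums; the data below are invented.

```python
import math

# Pearson's linear correlation coefficient r from made-up paired data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2  # coefficient of determination: explained / total variation
print(round(r, 4), round(r_squared, 4))  # 0.7746 0.6
```

Here r² = 0.6 means that 60% of the variation in y is explained by the linear relationship with x.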

2.2.3. Prediction Interval

2.2.3.1. Standard error of estimate

2.2.3.2. Range
