POL221 Minnesota State University Mankato Sigma Summation Sign Questions – Please Answer the problem set paper questions and also try to explain each step you did and explain what you did please.

– I’ll attach the problem set paper and also I will attach the handouts that would help to understand the concepts.

Political Science 221

Problem Set #1

Spring 2020

Please answer all of the following problems on a separate sheet of paper. Your answers may be handwritten. Be sure to show all steps in your calculations. (30 points total)

1. (1 point each) Given that k = {11.2, -4, 17, 11, 0.6, -4.5, 0, -8, -31, 13, 15.9, -3.5}, find the following:

a) $k_3$

b) $k_2 - k_6$

c) $\sum_{u=2}^{11} k_u$

d) $\sum_{u=1}^{5} k_u$

e) $\sum k_u$

f)

g)

h) Mode(k)

i) MD(k)

j) An outlier.

2. (2 points each) Given that l = {4, 2, 1, 6, 8, 7}; solve for m in each of the following. Show all of your work at each step:

a) $m = \sum_{i=3}^{6} l_i$

b)

3. Suppose that v is the data of an entire population and w is a sample taken from v. Given that:

v = {7, 5, 4, 4, 9, -2, 11, 5, 2, 0, -2, 5, -4, 6, 9}

w = {4, 5, -2, 2, 6, 5, -4, 9}

Find each of the following (4 points each):

a) $\sigma_v^2$

b) $\sigma_v$

c) $s_w^2$

d) $s_w$

POL 221: Political Analysis

Scott Granberg-Rademacker

Handout #1

Measures of Central Tendency

Measures of central tendency are mathematical operations which supply information about the “typical” observation in a set or variable. There are

several measures of central tendency, each with different pros and cons: expected values (sometimes called expectations, means or averages), medians,

and modes. Expected values (usually denoted as E (X) or x̄) are most commonly used in practice, but there are applications where medians (denoted

x̃) or modes may prove to be a better indicator of what the “typical” observation is like.

Most of the time, the expected value is identical to the simple average, which is nothing more than the arithmetic mean of a set or variable. Simple averages, however, make the assumption that the probability of each observation is equal: $P(x_1) = P(x_2) = \cdots = P(x_n)$. If X is a discrete stochastic variable, the simple average can be found as follows:

$$E(X) = \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad (1)$$

However, such an assumption may or may not be true. If the probabilities associated with each observation are different, then the expected value is a weighted average. Consider the expected value of a variable, x, where the probability of each possible observation is different. In a case like this, the expected value is the sum of each observation times its probability:

$$E(X) = \bar{x} = \sum_{i=1}^{n} x_i f(x_i) \qquad (2)$$

The problem with weighted averages in practice is that we often do not know the exact probabilities that make up f(x) (remember that f(x) is the probability function of x, which gives the probability of each possible value). When these probabilities are not known, the most common approach is to simply assume that the probabilities are all the same and use the simple average formula.
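As a rough illustration of Equations 1 and 2, the sketch below computes a simple average and a weighted average in Python. The values and probabilities are invented for the example and are not taken from the handout.

```python
# Simple average (Equation 1) vs. weighted average (Equation 2).
# The data and probabilities are hypothetical, chosen only for illustration.
values = [1, 2, 3, 4]
probs = [0.1, 0.2, 0.3, 0.4]  # assumed f(x_i); must sum to 1

assert abs(sum(probs) - 1.0) < 1e-9

# Equation 1: every observation counts equally.
simple_mean = sum(values) / len(values)

# Equation 2: each observation is multiplied by its probability.
weighted_mean = sum(x * p for x, p in zip(values, probs))

# With equal probabilities, Equation 2 gives the same answer as Equation 1.
equal_probs = [1 / len(values)] * len(values)
weighted_equal = sum(x * p for x, p in zip(values, equal_probs))

print(simple_mean, weighted_mean, weighted_equal)
```

Because the larger values carry more probability in this made-up example, the weighted average comes out above the simple average.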


One of the main problems with using expected values is that the influence of

outliers is poorly mitigated. Basically, extreme values which are not “typical” of other observations may heavily skew the expected value. Consider

two variables:

a = {3, 4, −2, 4, 5, 3}

b = {3, 4, −2, 4, 5, 3, 170}

The only difference between the two is that b has one more observation than a, but that single observation is clearly much different from the rest of the observations. Such abnormal observations are outliers, which can badly skew the expected value:

$$\bar{a} = \frac{\sum_{i=1}^{n} a_i}{n} = \frac{3+4+(-2)+4+5+3}{6} = \frac{17}{6} = 2.83$$

$$\bar{b} = \frac{\sum_{i=1}^{n} b_i}{n} = \frac{3+4+(-2)+4+5+3+170}{7} = \frac{187}{7} = 26.71$$

So, how can one consider extreme outliers while still getting a good idea

about the “typical” observation? Another possibility is to use the median.

The median of a set or variable is the value that has just as many values

greater than it as are less than it. When the set or variable has an even

number of observations, the median is the average of the two middle values.

When the set or variable has an odd number of observations, the median is

simply the middle value.

It is important to note for discrete variables that the median will always

satisfy the following condition:

$$P(X \le \tilde{x}) \ge 0.5 \le P(X \ge \tilde{x}) \qquad (3)$$

Finding the median is quite simple. The first step is to arrange the values in

the variable(s) from least to greatest. Let us denote the arranged variables

as a∗ and b∗ .

a∗ = {−2, 3, 3, 4, 4, 5}

b∗ = {−2, 3, 3, 4, 4, 5, 170}

When the total number of observations is odd, the median can be found

using the following formula:

$$\tilde{x} = x^{*}_{\frac{n+1}{2}} \qquad (4)$$

and when the total number of observations is even:

$$\tilde{x} = \frac{x^{*}_{\frac{n}{2}} + x^{*}_{\frac{n}{2}+1}}{2} \qquad (5)$$

Since a has six observations (n = 6), it is necessary for us to use Equation 5

to find the median of a:

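$$\tilde{a} = \frac{a^{*}_{\frac{n}{2}} + a^{*}_{\frac{n}{2}+1}}{2} = \frac{a^{*}_{\frac{6}{2}} + a^{*}_{\frac{6}{2}+1}}{2} = \frac{a^{*}_{3} + a^{*}_{3+1}}{2} = \frac{a^{*}_{3} + a^{*}_{4}}{2} = \frac{3+4}{2} = \frac{7}{2} = 3.5$$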

Finding the median of b is simply a matter of using Equation 4, since b has

an odd number of observations (n = 7):

$$\tilde{b} = b^{*}_{\frac{n+1}{2}} = b^{*}_{\frac{7+1}{2}} = b^{*}_{\frac{8}{2}} = b^{*}_{4} = 4$$

When we compare the means and medians of a and b, one can see that they

are not the same:

ā = 2.83, ã = 3.5

b̄ = 26.71, b̃ = 4

However, both the mean and median are fairly “typical” of a, which is to

be expected since there is no extreme outlier in a. Note that the mean of b

has been heavily skewed by the outlier but the median of b easily mitigates

the impact of the outlier. This illustrates one of the nice properties of the

median–it tends to be resistant to outliers.
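The comparison can be checked with a short Python sketch. The helper `median_by_position` is a hypothetical name; it follows Equations 4 and 5 directly (1-based positions in the sorted data), and Python's built-in `statistics.median` is printed alongside it as a cross-check.

```python
from statistics import mean, median

a = [3, 4, -2, 4, 5, 3]
b = [3, 4, -2, 4, 5, 3, 170]

def median_by_position(xs):
    """Median via Equations 4 and 5, using 1-based positions in sorted data."""
    s = sorted(xs)          # the arranged variable x*
    n = len(s)
    if n % 2 == 1:          # odd n: the middle value, position (n + 1) / 2
        return s[(n + 1) // 2 - 1]
    # even n: average of positions n/2 and n/2 + 1
    return (s[n // 2 - 1] + s[n // 2]) / 2

for name, xs in (("a", a), ("b", b)):
    print(name, round(mean(xs), 2), median_by_position(xs), median(xs))
```

The subtraction of 1 in the index expressions only converts the handout's 1-based positions into Python's 0-based list indices.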

Another measure of central tendency which is not used very often is the

mode. The mode of a set or variable is simply the value that occurs most

frequently within that set or variable. It is possible that for any given set or

variable, there may be one mode, several modes, or no modes. For example,

the mode of a is simply:

Mode (a) = {3, 4}

Modes are seldom used in practice for good reason. They are often unreliable

and misleading, as illustrated in the following example:

c = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 902, 902}

Where the mode of c is:


Mode (c) = 902, which is hardly typical of c.

Consider another example:

d = {1, 2, 3, 4, 5, 6, 7}

In this instance, there is no mode of d, because there is only one instance of

each value.

Mode (d) = ∅, where ∅ denotes an empty set.
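The three mode examples can be reproduced with a small Python sketch. The function `modes` is a hypothetical helper built on `statistics.multimode`; it returns an empty list when no value repeats, which mirrors the handout's convention that d has no mode (`multimode` on its own would return every value in that case).

```python
from statistics import multimode

a = [3, 4, -2, 4, 5, 3]
c = list(range(1, 21)) + [902, 902]
d = [1, 2, 3, 4, 5, 6, 7]

def modes(xs):
    """Most frequent value(s); empty list when every value occurs only once."""
    if not xs:
        return []
    most_common = multimode(xs)  # all values tied for the highest count
    # If the highest count is 1, nothing repeats, so report no mode.
    return most_common if xs.count(most_common[0]) > 1 else []

print(modes(a), modes(c), modes(d))
```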

Measures of Variability are mathematical operations which measure the

amount of dispersion or spread in a given set or variable. While measures of

central tendency tell you what the “typical” observation is like, measures of

variability tell you how dispersed or spread out the data in a set or variable are. There are several measures of variability available to us, each with

advantages and disadvantages.

The most basic measure of variability is the range. The range of a set or

variable is simply the largest value minus the smallest value. The range can

be denoted as:

$$\mathrm{Range}(x) = x_{\max} - x_{\min} \qquad (6)$$

So if we have two variables:

e = {3, 5, 5, 7}

f = {4, 4, 6, 6}

Finding the ranges is quite simple:

Range (e) = 7 − 3 = 4

Range (f ) = 6 − 4 = 2

Ranges are nice but are only informative about the extreme values of a variable. This means that they are susceptible to outliers, and can ultimately

provide a badly skewed picture of the variability of a variable.

A better measure of variability is the mean deviation. The mean deviation

is the average distance an observation in a set or variable is away from the

mean. This makes for a nice interpretation about the “typical” observation.


The mean deviation can be found by using the following formula:

$$MD(x) = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} \qquad (7)$$

Absolute value bars || simply mean that after all operations in the absolute value are finished, negative numbers are turned positive. For example,

|5 − 8| = |−3| = 3. The absolute value of a positive number is a positive

number: |5| = 5.

Despite the nice interpretation, absolute values are not used all that often.

First of all, absolute values are problematic (particularly for computers) when

doing more complex operations. Secondly, it is possible for variables with different distributions to have the same mean deviation. Consider e and f once

again:

e = {3, 5, 5, 7}

f = {4, 4, 6, 6}

Clearly they are distributed differently, but the mean deviation will not reveal this to us. Observe how both mean deviations yield the same result

(keep in mind that both $\bar{e}$ and $\bar{f}$ equal 5):

$$MD(e) = \frac{\sum_{i=1}^{n} |e_i - \bar{e}|}{n} = \frac{|3-5| + |5-5| + |5-5| + |7-5|}{4} = \frac{|-2| + |0| + |0| + |2|}{4} = \frac{2+0+0+2}{4} = \frac{4}{4} = 1$$

$$MD(f) = \frac{\sum_{i=1}^{n} |f_i - \bar{f}|}{n} = \frac{|4-5| + |4-5| + |6-5| + |6-5|}{4} = \frac{|-1| + |-1| + |1| + |1|}{4} = \frac{1+1+1+1}{4} = \frac{4}{4} = 1$$
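A minimal Python sketch of the range (Equation 6) and the mean deviation (Equation 7), applied to e and f. The function names are hypothetical and chosen for readability.

```python
e = [3, 5, 5, 7]
f = [4, 4, 6, 6]

def value_range(xs):
    """Equation 6: largest value minus smallest value."""
    return max(xs) - min(xs)

def mean_deviation(xs):
    """Equation 7: average absolute distance from the mean."""
    x_bar = sum(xs) / len(xs)
    return sum(abs(x - x_bar) for x in xs) / len(xs)

for name, xs in (("e", e), ("f", f)):
    print(name, value_range(xs), mean_deviation(xs))
```

The ranges differ while the mean deviations coincide, which is the limitation of the mean deviation described above.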

This is where the variance (commonly denoted σ 2 which is pronounced

“Sigma squared”) and standard deviation (denoted σ) can help out. The

formula for the variance is very similar to the mean deviation, but it avoids

the problem of taking the absolute value by simply squaring the deviations.

Additionally, it provides us with a measure that is more sensitive to variation


than the mean deviation. The formula for the variance is simply:

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n} \qquad (8)$$

The standard deviation is simply the square root of the variance:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}} \qquad (9)$$

All of these benefits do have a downside, however. Since the deviations are being squared, the variance and standard deviation do not have a clean and simple interpretation like the mean deviation does. They do have some nice qualities, which will be illustrated when we talk about distributions and hypothesis testing.

So how do the variance and standard deviation fare with e and f ? Let’s find

the variances:

$$\sigma_e^2 = \frac{\sum_{i=1}^{n} (e_i - \mu_e)^2}{n} = \frac{(3-5)^2 + (5-5)^2 + (5-5)^2 + (7-5)^2}{4} = \frac{(-2)^2 + 0^2 + 0^2 + 2^2}{4} = \frac{4+4}{4} = \frac{8}{4} = 2$$

$$\sigma_f^2 = \frac{\sum_{i=1}^{n} (f_i - \mu_f)^2}{n} = \frac{(4-5)^2 + (4-5)^2 + (6-5)^2 + (6-5)^2}{4} = \frac{(-1)^2 + (-1)^2 + 1^2 + 1^2}{4} = \frac{1+1+1+1}{4} = \frac{4}{4} = 1$$

And the standard deviations:

$$\sigma_e = \sqrt{\sigma_e^2} = \sqrt{2} = 1.41$$

$$\sigma_f = \sqrt{\sigma_f^2} = \sqrt{1} = 1$$

Notice that the standard deviations are close (or identical in the case of f )

to the mean deviations found, but are still different from each other–better

reflecting the true variability of e and f . In general, the larger the standard

deviation, the greater the variability.
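The population variance and standard deviation of e and f can be verified with a short Python sketch. The hand-written function follows Equations 8 and 9; `statistics.pvariance` and `statistics.pstdev` from the standard library are used only as a cross-check.

```python
from math import sqrt
from statistics import pstdev, pvariance

e = [3, 5, 5, 7]
f = [4, 4, 6, 6]

def population_variance(xs):
    """Equation 8: average squared deviation from the population mean."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

for name, xs in (("e", e), ("f", f)):
    var = population_variance(xs)
    sd = sqrt(var)  # Equation 9
    print(name, var, round(sd, 2), pvariance(xs), round(pstdev(xs), 2))
```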

All of what we have done so far assumes that we are dealing with populations. Populations are complete sets of all observations of interest. In

reality, true populations are often unknown. Most of the time, what we have

in social science is sample data. Samples are simply subsets of a population. Because we often deal with sample data, we need to account for the uncertainty that comes with working from a sample. Think of it like a currency: every observation in a sample is a currency unit, but whenever an estimate is calculated, one unit of currency is “spent”. These units of “currency” are known as degrees of freedom (referred to as “df” for short), and one degree of freedom is lost when we “spend” it to calculate an estimate.

More technically, degrees of freedom are any of the unrestricted, random variables that constitute a statistic. In practicality, this means that we have to

make small adjustments to some of our formulas when dealing with samples.

The biggest change for us right now is to remember that the formulas for

variance and standard deviations need to be slightly corrected. The sample

variance can be found using the following formula:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \qquad (10)$$

And the sample standard deviation is:

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} \qquad (11)$$

You might ask, what really changed? The most noticeable change is that the

Greek letter σ is not used in either formula. Instead, the sample variance

is denoted as s2 and the sample standard deviation is denoted as s. These

are estimates which approximate the unknown population variance σ 2 and

population standard deviation σ. Since these are sample estimates, we lose

one degree of freedom, which is taken off of the denominator. So instead of

dividing by n, we divide by n − 1, when finding s2 and s.
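To see the effect of dividing by n − 1 instead of n, the sketch below treats e = {3, 5, 5, 7} first as a population and then as a sample. Treating e as a sample is an assumption made purely for illustration; the handout itself only analyses e as a population. `statistics.variance` and `statistics.stdev` are the standard-library sample versions.

```python
from math import sqrt
from statistics import pvariance, stdev, variance

e = [3, 5, 5, 7]  # treated here as a sample, for illustration only

def sample_variance(xs):
    """Equation 10: squared deviations from the sample mean, divided by n - 1."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)

s2 = sample_variance(e)
s = sqrt(s2)  # Equation 11

print("population variance (divide by n):  ", pvariance(e))
print("sample variance (divide by n - 1):  ", round(s2, 3), round(variance(e), 3))
print("sample standard deviation:          ", round(s, 3), round(stdev(e), 3))
```

The sample variance is larger than the population variance for the same numbers, because the same sum of squared deviations is divided by 3 rather than 4.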


Also of note is that the typical notations for the population mean and sample

mean are different. The population mean is usually denoted by the Greek

letter µ (pronounced “mu”), and the sample mean is usually denoted with

a bar over the variable name, x̄. Once again, in practice the true value of

µ is often unknown, and the mean of the observed sample data x̄ is only an

estimate of µ.

EXCEL Commands:

Average: =AVERAGE(number1,number2,…)

Median: =MEDIAN(number1,number2,…)

Mode: =MODE(number1,number2,…)

Range: =MAX(number1,number2,…)-MIN(number1,number2,…)

Mean Deviation: =AVEDEV(number1,number2,…)

Population Variance: =VARP(number1,number2,…)

Population Standard Deviation: =STDEVP(number1,number2,…)

Sample Variance: =VAR(number1,number2,…)

Sample Standard Deviation: =STDEV(number1,number2,…)


POL 221: Political Analysis

Scott Granberg-Rademacker

Handout #2

Normal Distribution

The shape of the normal distribution is the famous bell-shape shown below.

Figure 1: Normal Distribution

The normal distribution (also sometimes called the Gaussian distribution) first appeared in print in 1733, in work by Abraham de Moivre. It is easily the single most important distribution in statistics.

The normal distribution has two parameters: mean (denoted µ) and variance

(denoted σ 2 ). The pdf of the normal distribution seems intimidating, but

fortunately we don’t really have to deal with it all that much in this class:

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2}\left[\frac{x-\mu}{\sigma}\right]^2} \qquad (1)$$

for −∞ < x < ∞, where −∞ < µ < ∞ and 0 < σ < ∞. A normally distributed random variable, X, is denoted: X ∼ N(µ, σ²).
The importance of the normal distribution is in how it relates to most other distributions. In fact, the central limit theorem states that if any given distribution (normal or non-normal) has a finite mean µ and variance σ², then the sampling distribution of the mean will approach the normal distribution with mean µ and variance σ²/n as the sample size n increases and approaches infinity (n → ∞).
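The central limit theorem can be explored with a small simulation. The sketch below is only an illustration under assumed settings: it draws repeated samples from an exponential distribution with mean 1 and variance 1 (clearly non-normal), and checks that the sample means have a mean near µ and a variance near σ²/n. The distribution, sample size, number of repetitions, and seed are all arbitrary choices, not values from the handout.

```python
import random
from statistics import mean, pvariance

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50                  # observations per sample (assumed)
reps = 2000             # number of samples drawn (assumed)

# Exponential(1) has mean 1 and variance 1 and is strongly skewed.
sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(sum(sample) / n)

mean_of_means = mean(sample_means)
var_of_means = pvariance(sample_means)

print("mean of sample means:    ", round(mean_of_means, 3), "(theory: 1)")
print("variance of sample means:", round(var_of_means, 4), "(theory:", 1 / n, ")")
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying data are skewed.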
Tests of Hypotheses
What are hypotheses?
Hypotheses are sets of statements (usually two statements) which meet the
following criteria:
1. They are mutually exclusive, which means that it is not possible for
both statements to be true or false at the same time. If one is true
then the other must necessarily be false, and vice versa.
2. They are collectively exhaustive, which means that all possibilities
must be accounted for.
3. There must be adequate data of sufficient quantity and quality by which
the statements in the set can be tested for truth or falsity.
Consider an example whereby you might be interested in knowing whether
the average age of children at your daycare center is significantly different
than the average age of daycare centers nationally. Let the national average
age of children at daycare centers be denoted as µ, and let the average age
of children at your daycare be denoted as x̄.
The relationship between µ and x̄ can be expressed in six possible ways:
1. µ ≠ x̄
2. µ > x̄

3. µ < x̄
4. µ = x̄
5. µ ≥ x̄
6. µ ≤ x̄
Hypothesis sets are typically denoted as two different statements, H0 and H1 .
H0 is what is known as the null hypothesis (H0 is actually pronounced “H
naught”) and H1 (which is pronounced “H one”) is the alternative hypothesis. It is important to remember when constructing a hypothesis set that
the equals sign (which could be expressed as =, ≥, or ≤) is always going to
be in H0 . Alternatively, H1 will never have an equals sign in it. Instead, H1
will be directly relatable to your suspicion about the relationship expressed.
For example, if you believed that your daycare center had younger
children (on average) than daycare centers nationwide, your suspicion would
be:
Age of children at your daycare < Age of children at daycares nationwide
Which is the same as stating:
x̄ < µ
And since this is our suspicion, we can denote it as H1 :
H1 : x̄ < µ
Now that we have H1 , we need to construct H0 . We must include all other
possibilities and we must make sure that the equals sign is included in the
expression in H0 . In H1 , we stated our belief that children are on average
younger at your daycare than at daycares nationwide. If this statement is
false, then one of the following must be the case: children at your daycare
must be older or the same age as children at daycares nationwide. We could
express this formally as H0 :
H0 : x̄ ≥ µ
If we put H0 and H1 together, we have a hypothesis set that is both mutually
exclusive and collectively exhaustive:
H0 : x̄ ≥ µ
H1 : x̄ < µ
DIFFERENCE BETWEEN MEANS OF SAMPLE AND POPULATION WITH LARGE
SAMPLES (n ≥ 30)

2-tailed test

Let’s say that you are interested in knowing whether or not your sample mean is different than

the known mean of your population 1.

Example: Let’s say that you are interested in knowing whether the average age of children at your daycare center is different from the average age of children at daycares nationally. Let x be the ages of the children at your daycare, and suppose your daycare has 30 children (n = 30).

x = {5, 6, 6, 2, 4, 0, 9, 5, 5, 4, 4, 6, 7, 1, 2, 9, 0, 5, 6, 2, 2, 3, 8, 9, 9, 0, 0, 6, 5, 5}

The average child age at your daycare: x̄ = 135/30 = 4.5

The variance: s2 = 8.05, and standard deviation: s = 2.84

Census Bureau data on daycares states that the average age of children at daycare is 5.7 years

old, and the population variance is 5.1 and population standard deviation is 2.26

So our population figures are:

µ = 5.7

σ 2 = 5.1

σ = 2.26

We then state our hypotheses (H0 must always contain an equal sign):

H0:

H1:

Or stated another way:

H0:

H1:

Since we have 30 or more observations, we can use the large-sample approximation to assume

that our sampling distribution is approximately normal. We then use the following formula to

calculate the test statistic:

So then we go through the actual calculation:

1 It is useful to know that most of the time, the true population mean (µ) is not known; neither is the true population variance (σ²).

Once we have the z-score, we must determine whether or not our z-score is inside or outside of

the critical region.

We have to determine what our α-level is going to be. Think of this in terms of: how certain do

you want to be in your result? Most commonly, α = .05, though sometimes scientists want a

higher standard of proof, so they may choose a smaller α level. Basically, what this means is

that you are testing your hypothesis against a certain confidence level. This level of confidence

is:

1 – α = confidence level

So in our example, if we choose α = .05, then our confidence level is 95% (confidence level is

always 1-α, so if α = .05 like it does in this instance, 1-.05=.95, or 95% confidence).

Next we need to look at our z-table.

If we are conducting a two-tailed test (which we are in this case), we would look to see if the

following statement is true or not:

, where

In this instance,

and

are found by looking at the z-table.

, so the trick is to find the values that most closely matches:

, which in our case is .475, since .5 – .025 = .475.

The closest match from the table is 1.96, with a value of .4749. To find 1.96, follow straight across from .4749 on the table to arrive at 1.9, then follow straight up from .4749 to find .06. Put them together and your critical value = 1.96

So in our instance the equation

is actually false, since the expression:

-1.96 < 2.31 < 1.96 is false. When this is false, we REJECT H0, meaning that we can be 95% confident that our sample mean is significantly different from the population mean of µ = 5.7.
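The two-tailed test can be sketched in Python. The test-statistic formula does not survive in this copy of the handout, so the code assumes the common large-sample form z = (x̄ − µ)/(s/√n) with the sample standard deviation s. That assumption reproduces the magnitude 2.31 quoted above, with a negative sign because the sample mean lies below µ; for a two-tailed test only the magnitude matters. The critical value comes from `statistics.NormalDist` rather than a printed z-table.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

x = [5, 6, 6, 2, 4, 0, 9, 5, 5, 4, 4, 6, 7, 1, 2,
     9, 0, 5, 6, 2, 2, 3, 8, 9, 9, 0, 0, 6, 5, 5]
mu = 5.7       # population mean reported in the example
alpha = 0.05   # chosen significance level

n = len(x)
x_bar = mean(x)   # sample mean
s = stdev(x)      # sample standard deviation (divides by n - 1)

# Assumed form of the test statistic; not confirmed by the source text.
z = (x_bar - mu) / (s / sqrt(n))

# Two-tailed critical value: the point with alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z) > z_crit
print(f"n={n}, x_bar={x_bar}, s={s:.2f}, z={z:.2f}, z_crit={z_crit:.2f}, reject H0: {reject_h0}")
```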
1-tailed test
In the previous example, we used the two-tailed test because we didn’t know for sure whether
our mean was going to be smaller or larger than the population mean. 1-tailed tests are used
when you have a good idea which way you want to test.
Let’s say that the follow...