Stats1 made easier! Just have a look at this!

Ok guys, I am covering the first three chapters of Statistics 1 (out of 9 chapters). I am doing this because I think I am good at them and want to give you guys quick revision notes for these 3 chapters, so that you have plenty of time to study the other 6 chapters (which are of more significance, of course!) and also save some time to practise the past papers!! Go through what I’ve written and trust me, you won’t have to waste time studying these 3 chapters!! And guys! Do not forget to pray for me if any of my words/lines help you!! All the best to each one of you appearing for Statistics 1 (62) this Thursday!

STEM AND LEAF
1. Advantage: Provides visual impression; preserves raw data
2. Disadvantage: Not suitable for a very large data set
3. Notes on drawings:
A) Don’t forget key.
B) Leaf contains only one digit
C) Stem can consist of any number of digits
D) Equal class intervals
E) Whole numbers only used
4. Ordered stem and leaf: The leaves in each row are arranged in ascending order (this is usually what is asked for!)
5. Back to back stem and leaf: Shows two data sets side by side sharing one stem; usually given for comparison
6. E.g. of a data set: 1.347, 1.351, 1.353. (1) Why would there be no point in drawing a stem and leaf for this data rounded to 3 s.f.? Ans: Because every value rounds to 1.35, so there would be only one stem holding all the leaves! (2) What information is given by the raw data that does not appear in a stem and leaf? Ans: The raw data shows the order in which the values were collected, but the stem and leaf doesn’t!
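The drawing rules above can be sketched in a few lines of Python. This is only an illustration: the function name `stem_and_leaf` and the marks are made up, and it assumes non-negative whole numbers where the last digit is the leaf and everything before it is the stem.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return an ordered stem-and-leaf diagram as a list of text rows.

    Assumes non-negative whole numbers: last digit = leaf, the rest = stem.
    """
    rows = defaultdict(list)
    for v in sorted(values):             # sorting first gives an ordered diagram
        stem, leaf = divmod(v, 10)       # e.g. 47 -> stem 4, leaf 7
        rows[stem].append(leaf)
    lines = []
    # Walk through every stem, even empty ones, so the class intervals stay equal
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    lines.append("Key: 4 | 7 means 47")  # never forget the key
    return lines

marks = [47, 52, 31, 38, 45, 59, 33, 41, 45, 60]  # made-up data
print("\n".join(stem_and_leaf(marks)))
```

Each row keeps the actual values, which is why the diagram preserves the raw data even though it loses the order in which the values were collected.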

QUALITATIVE, QUANTITATIVE, CONTINUOUS AND DISCRETE DATA
1. Qualitative: non-numerical
2. Quantitative: numerical
3. A quantitative variable which can take any value in a given range is continuous (decimal values!!) Examples: length, speed, mass, time
4. A quantitative variable which can take only particular values (usually integers) with clear steps between its possible values is discrete. Examples: number of items, marks, age in completed years
 
HISTOGRAM
1. Used to represent grouped data

2. Notes:
A) Area of each bar is proportional to frequency
B) Height of each bar is the frequency density
C) Frequency density = Frequency / class width

3. Drawings:
(A) Class intervals must be changed to continuous class boundaries before drawing a histogram (e.g. 10-19 and 20-29 become 9.5-19.5 and 19.5-29.5), so that there are no gaps between the bars!!
(B) If the last class of continuous grouped data is open ended, a reasonable procedure is to take the width of the last interval to be twice that of the previous one.
(C) Never forget to label the axes
(D) Uniform scale
(E) Use 75% of the graph paper
(F) If the class intervals are not equal, put frequency density on the y axis (this is normally the case!)
(G) If the class intervals are equal, frequency can go on the y axis
(H) You can put a kink on the x axis, but the scale must be uniform

4. Why might the class with the highest frequency not be the modal class? Ans: Because it may not have the highest frequency density.
5. Advantage: Can be used to represent a large set of data; easy to see the skewness; the classes do not need to have equal widths; modal class easily seen
6. Disadvantage: Some of the information in the original data is lost (i.e. we only know how many values are in a class rather than what the exact values are); the values of the median, quartiles, mean and standard deviation are estimates rather than exact values.
7. Modal class: The class with the tallest bar (the one with the highest frequency density, not the one with the highest frequency)
8. Skewness: If the distribution has a long tail of values to the right, it is said to be positively skewed. If there is a long tail of values to the left, it is said to be negatively skewed.
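To see why the class with the highest frequency need not be the modal class, here is a small Python sketch. The class boundaries and frequencies are made up for illustration, and `frequency_densities` is just a helper name chosen here.

```python
# Made-up grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 12), (10, 20, 18), (20, 25, 15), (25, 50, 20)]

def frequency_densities(classes):
    """Frequency density = frequency / class width, one value per class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]

densities = frequency_densities(classes)

# Class with the highest frequency (NOT necessarily the modal class)
highest_frequency_class = max(classes, key=lambda c: c[2])

# Modal class: the one with the highest frequency density (tallest bar)
modal_class = classes[densities.index(max(densities))]

print(densities)                # heights of the bars
print(highest_frequency_class)  # the wide class 25-50
print(modal_class)              # the narrow class 20-25
```

Here the class 25-50 has the most values, but it is so wide that its bar is the shortest; the narrow class 20-25 has the tallest bar and is the modal class.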
 
CUMULATIVE FREQUENCY GRAPHS

1. It is usually drawn to estimate the median and quartiles of grouped data.
2. Before drawing, first calculate the cumulative frequencies. You just have to keep a running total of the frequencies!! Hope you know how to do that!!
3. If you’re asked to draw a polygon, join the points with straight lines. This assumes that the data is evenly distributed within each class.
4. If you’re asked to draw a curve, join the points with a smooth freehand curve. This does not assume that the data is evenly distributed within each class.
5. If you’re asked to draw a percentage cumulative frequency graph, then first calculate the percentage cumulative frequencies. (Divide each cf by the total frequency and multiply by 100. Repeat this for every class.)
6. Advantage: easy to draw a box and whisker plot (as we’ll have our values of the median and quartiles from the graph); the box and whisker plot will then allow us to see the skewness!; large data sets can be shown because of the class intervals
7. Disadvantage: The values we get for the median and quartiles are just estimates and not exact, since we assume an even distribution in each class; no visual impression of the shape of the distribution; raw data is lost due to the allocation of classes
8. What has to be dealt with to draw a histogram from the same data as a cumulative frequency graph? Ans: The class widths are unequal and the frequency densities are not given, so they have to be calculated!!
9. If you’re asked for reasons why a value calculated from a cf graph may be inaccurate, then mention: the data is grouped rather than raw, and the data might not be evenly distributed within each class.
10. Example of a data set:
x     60-75   75-90   90-105   105-120   120-150
cf    44      100     172      192       200
The cf values go on the y axis and the class boundaries on the x axis. The first point to be plotted is (60,0). Then plot each cf against the upper boundary of its class: (75,44); (90,100); (105,172); (120,192) and (150,200).
11. You’re frequently asked questions like find the number of people with MORE THAN some value, or the time GREATER than some value, and also LEAST, SMALLER, LOWER etc. In such cases:
(A) If you’re asked for LEAST, SMALLER, LOWER, then just read off the corresponding value from the graph. You do not have to do anything else.
(B) If you’re given a value from the y axis and asked GREATER, HIGHER, MORE THAN etc., then subtract the value from the total frequency and read off the corresponding value on the x axis.
(C) If you’re given a value from the x axis and asked GREATER, HIGHER, MORE THAN etc., then first read off the corresponding cumulative frequency and then subtract it from the total frequency (the opposite order to (B)).
If you did not understand what I just wrote, then just stick to your way of doing it. It’s just that I do it this way and it does not require a lot of thinking!!
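Reading values off a cumulative frequency polygon is the same as linear interpolation between the plotted points, so the example data in point 10 can be checked with a short Python sketch. The helper name `interpolate` is made up here, and the sketch assumes a polygon (straight lines), i.e. values evenly distributed within each class.

```python
def interpolate(xs, ys, x):
    """Read y off the polygon through the points (xs, ys) at position x."""
    points = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)
    raise ValueError("value outside the plotted range")

# Plotted points from the example: upper class boundaries against cf
bounds = [60, 75, 90, 105, 120, 150]
cum_freq = [0, 44, 100, 172, 192, 200]
total = cum_freq[-1]

# Start from the y axis (cf) and read across to the x axis
median = interpolate(cum_freq, bounds, total / 2)   # (N/2)th value
q1 = interpolate(cum_freq, bounds, total / 4)       # (N/4)th value
q3 = interpolate(cum_freq, bounds, 3 * total / 4)   # (3N/4)th value

# Case (C): given x = 100, read off the cf, then subtract from the total
more_than_100 = total - interpolate(bounds, cum_freq, 100)

print(median, round(q1, 1), round(q3, 1), more_than_100)
```

A hand-drawn curve would give slightly different readings, which is exactly why these values are only estimates.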
 
PREDICTION OF SKEWNESS OF DATA USING MEAN AND MEDIAN

1. First, find the mean as well as the median of the data.
2. Then, if the mean is greater than the median, just say that the data is positively skewed.
3. If the median is greater than the mean, say that the data is negatively skewed.
4. However, if the two are approximately equal, then the data is roughly symmetrical!!
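The rule above is easy to turn into a quick check in Python. Note that the notes do not say how close "approximately equal" has to be, so the `tolerance` parameter below (5% of the range) is purely an assumption for illustration, as are the function name and the data.

```python
from statistics import mean, median

def describe_skew(data, tolerance=0.05):
    """Compare mean and median to describe skewness.

    'Approximately equal' is taken to mean within `tolerance` x range,
    which is an arbitrary choice made for this sketch.
    """
    m, med = mean(data), median(data)
    spread = max(data) - min(data)
    if abs(m - med) <= tolerance * spread:
        return "roughly symmetrical"
    return "positively skewed" if m > med else "negatively skewed"

print(describe_skew([1, 2, 2, 3, 10]))   # long tail to the right
print(describe_skew([1, 8, 9, 9, 10]))   # long tail to the left
print(describe_skew([1, 2, 3, 4, 5]))    # mean and median coincide
```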
 
OTHER IMPORTANT NOTES:
1. When you are asked to calculate the median, Q1 or Q3 from ungrouped data (simple data that is just separated by commas), DO NOT FORGET TO ARRANGE THE DATA IN ASCENDING ORDER!!!! This was one of my terrible mistakes in an exam.
2. If it’s grouped data (data with class intervals, usually continuous data), then FIND THE CUMULATIVE FREQUENCIES first.
3. THE FOLLOWING FORMULAE ARE NOT GIVEN IN THE BOOKLET. MEMORISE THEM!!
(A) Median for UNGROUPED data = ((N+1)/2)th term
(B) Median for GROUPED data = (N/2)th term
(C) Combined Mean = (Σx + Σy)/(N + n)
(D) Combined Variance = (Σx^2 + Σy^2)/(N + n) - (Combined Mean)^2
(E) Variance = (Standard Deviation)^2
(F) First quartile for UNGROUPED data (Q1) = ((N+1)/4)th term
(G) Third quartile for UNGROUPED data (Q3) = (3(N+1)/4)th term
(H) First quartile for GROUPED data (Q1) = (N/4)th term
(I) Third quartile for GROUPED data (Q3) = (3N/4)th term
(J) Interquartile range= Q3-Q1
(K) Range= Highest Data-Lowest Data
(L) Mode=Data with the highest frequency
(M) Modal class = Class with the highest frequency when the class widths are equal (with unequal class widths, as in histograms, the modal class is the one with the highest frequency density)
(N) Second Quartile refers to median
(O) In terms of percentile:
Q1 = 25th percentile
Q2 = Median = 50th percentile
Q3 = 75th percentile
(P) Outliers:
Upper fence=Q3+1.5(Q3-Q1)
Lower fence=Q1-1.5(Q3-Q1)
Thus, any value which is bigger than the upper fence or smaller than the lower fence is considered an outlier.
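Here is a rough Python sketch of formulae (A), (C), (D), (F), (G) and (P) for ungrouped data. The function names and numbers are made up, it assumes at least 3 data values, and when a position such as the ((N+1)/4)th term is not a whole number it interpolates between the two neighbouring terms, which is one common convention but not the only one.

```python
def term_at(sorted_data, position):
    """Value at a 1-based position; a fractional position is interpolated
    between its two neighbours (an assumed convention)."""
    whole = int(position)
    frac = position - whole
    value = sorted_data[whole - 1]
    if frac:
        value += frac * (sorted_data[whole] - value)
    return value

def quartile_summary(data):
    ordered = sorted(data)                      # ascending order first!
    n = len(ordered)
    q1 = term_at(ordered, (n + 1) / 4)
    q2 = term_at(ordered, (n + 1) / 2)          # the median
    q3 = term_at(ordered, 3 * (n + 1) / 4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return q1, q2, q3, lower_fence, upper_fence, outliers

def combined_mean_variance(sum_x, sum_x2, N, sum_y, sum_y2, n):
    """Combine two groups from their totals Σx, Σx^2 and Σy, Σy^2."""
    mean = (sum_x + sum_y) / (N + n)
    variance = (sum_x2 + sum_y2) / (N + n) - mean ** 2
    return mean, variance

data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 15, 60]        # made-up values
print(quartile_summary(data))
print(combined_mean_variance(50, 540, 5, 30, 318, 3))   # made-up totals
```

With these 11 values the quartile positions are the 3rd, 6th and 9th terms, and 60 lies above the upper fence, so it is flagged as an outlier.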
 
CHOICE BETWEEN MEAN AND MEDIAN TO DISCUSS AVERAGE TENDENCY WITH REASONS
1. If the data contains outliers (their presence can be checked by the method described in (P)), then choose the median to describe the central tendency. The reason is: the median isn’t affected by extreme values. It only depends on the position of the middle value! But the mean is affected by outliers (extreme values), because we have to add up all the data to calculate the mean, and in the presence of a very high value the sum will be unnecessarily high.
2. However, if the data does not contain any outliers, then choose the mean. The reason is: the mean uses all the data values (i.e. while calculating the mean we have to add up all the data, so it uses all of it!!). But the median does not use all the data values. It just uses the middle position.
3. If you did not understand what I just wrote, then refer to the formulae for mean and median, you’ll understand!!
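A tiny Python check makes the point concrete. The ages below are made up; the second list is the same as the first except that the largest value has been replaced by an extreme one.

```python
from statistics import mean, median

ages = [20, 22, 23, 25, 27]                # made-up data, no outliers
ages_with_outlier = [20, 22, 23, 25, 95]   # one extreme value

mean_before, median_before = mean(ages), median(ages)
mean_after, median_after = mean(ages_with_outlier), median(ages_with_outlier)

print(mean_before, median_before)   # mean and median are close together
print(mean_after, median_after)     # the mean is dragged up, the median is not
```

The single extreme value pulls the mean from 23.4 up to 37, while the median stays at 23 because the middle position has not changed.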
 
AN IMPORTANT QUESTION
1. Calculate the mean of x when Σ(x-5)=226, N=38
Mean(x-5)=226/38
=5.947...
Therefore, Mean(x)=5.947...+5
=10.9 (3 s.f.)

2. Calculate the variance of x when Σ(x-2)^2=200, Σ(x-2)=22, N=5
Variance(x-2)=(200/5)-(22/5)^2
=20.6 (3 s.f.)
Therefore, Variance(x)=20.6
Note: In this case we do not have to add 2 like in question 1.
So, Var(x)=Var(x-a)=Var(x+a), unlike the case for the mean

Also, S.D(x)=S.D(x-a)=S.D(x+a)

3. The standard deviation of the three numbers a,b,c is 3.2
a. State the standard deviation of the three numbers 3a,3b,3c.
b. State the standard deviation of the three numbers a+2,b+2,c+2
c. State the standard deviation of the three numbers 2a+5,2b+5,2c+5

Ans: a. 3.2 X 3 = 9.6 b. 3.2 c. 3.2 X 2 = 6.4

Hope, you’ve understood.
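These three questions can be checked numerically with a short Python sketch. For question 3 the actual numbers a, b, c are not given, so the values below are arbitrary stand-ins; `pstdev` is the population standard deviation (dividing by N), which is the formula these notes use.

```python
from math import isclose
from statistics import pstdev

# Question 1: coded total Σ(x-5) = 226 with N = 38, then undo the coding
mean_x = 226 / 38 + 5

# Question 2: Σ(x-2)^2 = 200, Σ(x-2) = 22, N = 5; no need to undo the coding
var_x = 200 / 5 - (22 / 5) ** 2

# Question 3: arbitrary stand-in values for a, b, c
a, b, c = 1.0, 4.0, 8.5
sd = pstdev([a, b, c])
sd_times_3 = pstdev([3 * a, 3 * b, 3 * c])              # multiplying scales the S.D.
sd_plus_2 = pstdev([a + 2, b + 2, c + 2])                # adding leaves it unchanged
sd_2x_plus_5 = pstdev([2 * a + 5, 2 * b + 5, 2 * c + 5])  # only the factor 2 matters

print(round(mean_x, 1), round(var_x, 1))
print(isclose(sd_times_3, 3 * sd), isclose(sd_plus_2, sd), isclose(sd_2x_plus_5, 2 * sd))
```

Keeping the unrounded 226/38 until the end matters: rounding to 5.95 first and then adding 5 would give 11.0 instead of 10.9.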
 
4. In one company the mean age of workers is 37.5 and the S.D. is 11.9, and in another company with the same number of workers the mean age is 28.4 and the S.D. is 9.9. Compare the two sets of data.
Ans: On average, the workers in the second company are younger than those in the first company (on the basis of the mean!). Also, the spread of ages in the second company is smaller than in the first company (on the basis of the standard deviation!).
 
Ok, so I’m done! This is all I can offer! And I don’t think I’ll be on XPF until next Friday! So if you guys have any queries, then discuss them among yourselves and sort them out! Sorry for that! It’s just that I need to study! All because I procrastinate a lot! Ok, hope I’ve helped!! Once again, all the very best! :)
 
@Princessazru: I have compiled all your notes above into a word file that I am posting here for the convenience of XPF users. I hope you don't mind. :)
 

Attachments

  • S1 notes.docx
    17.2 KB · Views: 344
Someone post some notes about Permutation and Combination. I'm really bad at that. =(
 
princessazru said:

Also,S.D(x)=S.D(x-a)=S.D(x=a)


What do you mean by S.D(x=a)?
Did you mean S.D(x+a)?
 
Dear Hamid Ali,
Firstly, this is an infringement of IPR.
But if you are allowed, then please at least edit it properly so that students can read it more easily.. :)
I would have made a copy in a better form, but that would be so much against princessazru's property rights.

As a moderator, I hope you'll ask for permission before doing any such thing rather than asking for it afterwards.
Have a nice day!
 
So you bothered talking crap to a moderator but couldn't be bothered to answer my question above?
 
If you read the title in the word file I posted, it says S1 notes (by princessazru). I just compiled them, with no intention of stealing her property. I just copy-pasted the exact same words of princessazru without any editing, since changing the format would have been an infringement of property rights.
Lastly, the person whose notes I compiled did not have any problem with it, hence nobody else should have a problem either.
 
Exactly, Hamid is absolutely right!!! Btw, it is for our convenience, not his!!
 