CIST 2500 – Course Project Description
The goal of the project is to do a statistical analysis on
any topic of your choice. My topic is
Ronaldo vs. Messi, two of the greatest soccer players in the world. The key is that
you are performing the analysis. As you
collect your data, make sure to include
many categorical variables. On a project like this, it is easy to just
compare “Ronaldo vs. Messi”;
however, you will find that relying on that single comparison will increase the
challenge of this project toward the end. You still have a ways to go on descriptive
statistics. Remember, descriptive statistics is about the center and spread of our
data. So, in your case, you are looking at in-game times when a goal is
scored. However, what is the average game state for each player? (Does one
tend to score when his team is further ahead or behind?) I would also do basic
descriptive statistics on goals, assists, games played, and minutes played, if you
have them accessible.
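As a minimal sketch of the center-and-spread summary described above, the Python snippet below computes the mean, median, sample standard deviation, and range of goal minute and game state for each player. The records are made-up placeholder values, not real match data, and the field layout (player, minute, goal difference before the goal) is only an assumption about how the dataset might be organized.

```python
import statistics

# Hypothetical placeholder records, NOT real match data:
# (player, minute the goal was scored, team's goal difference just before the goal)
goal_records = [
    ("Ronaldo", 12, 0),
    ("Ronaldo", 45, 1),
    ("Ronaldo", 78, -1),
    ("Ronaldo", 88, 0),
    ("Messi", 23, 0),
    ("Messi", 34, 1),
    ("Messi", 61, 2),
    ("Messi", 90, -1),
]


def describe(values):
    """Return measures of center (mean, median) and spread (sample std dev, range)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "range": max(values) - min(values),
    }


summary = {}
for player in sorted({p for p, _, _ in goal_records}):
    minutes = [m for p, m, _ in goal_records if p == player]
    states = [s for p, _, s in goal_records if p == player]
    summary[player] = {
        "minute": describe(minutes),
        "game_state": describe(states),
    }

for player, stats in summary.items():
    print(player, stats)
```

A positive mean game state would suggest the player tends to score when his team is already ahead; a negative one, when it is behind. The same `describe` helper could be applied to goals, assists, games played, or minutes played.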
Project Structure
1- Introduction (dataset description)
2- Descriptive Statistics
3- Inferential Statistics
4- Regression
5- Discussion
6- Conclusion
Definitions
Research question - The intention behind the analysis.
Example: Are there gender differences in income? Does the number of bedrooms impact
home prices? Are there differences between states when looking at soda
preferences?
Extra Credit Opportunities
Using software other than Excel for analysis (R, Tableau,
SAS, Minitab, etc.). For extra credit, you must submit code, files, etc.
Regression
From Professor-
Project Introduction: There is a plethora of data on the
Internet. Some is ready to go; other data needs manipulation or collection to be made
usable. In this course, you may use data that is already compiled or that you
compile yourself to ask and answer a research question or topic and explore that data
through that lens.
This question/topic can be anything. It could be a
business-related question such as: what market segments will be more effective for a
marketing campaign? It could be a sports-related question such as: why did the
Golden State Warriors win the 2017 NBA Championship? It could also be a social
question such as: what types of companies make better corporate citizens?
If you are struggling to find a useful dataset, bring it to
my attention early. The earlier you obtain your data, the less stressful this
project will be. Kaggle.com can be used to find a dataset. They have a plethora of
datasets that may be of value. As you let the data guide you, your topic will move and
change. THIS IS OK.
The grading rubric will be as follows:
Component
Points
Problem Statement
10 – States research question clearly and concisely
5 – States research question in an ambiguous way
0 – Does not state research question
Dataset Selection
10 – Selects a dataset that consists of a variety of data
elements
5 – Selects a dataset with a limited number of data elements
0 – Fails to select a dataset
Descriptive Statistics – Measures of Central Tendency/Variation
20 – Includes measures of Central Tendency and Variation and
includes them in discussion surrounding research question
15 – Includes Measures of Central Tendency and Variation but
does not include them in discussion surrounding research question
10 – Includes Measures of Either Central Tendency or
Variation and includes them in discussion surrounding research question
5 – Includes Measures of Either Central Tendency or
Variation but does not include them in discussion surrounding research question
0 – Does not include numeric descriptive statistics
Descriptive Statistics – Visual Observations
20 – Employs visualizations not discussed in the textbook that further enhance
the conclusion on the research topic.
15 – Employs only visualizations from the textbook to enhance the
conclusion about the research topic.
10 – Employs visualization techniques from the textbook, but does not
relate them to the research topic.
0 – Does not include charts, graphs, etc. in analysis.
Inferential Statistics
20 – Uses inferential statistics, explains the technique's
assumptions, and uses the results of the inferential analysis to further enhance
the conclusion on the research topic.
15 – Uses inferential statistics to further enhance the
conclusion on the research topic.
10 – Uses inferential statistics but does not tie them back to the
conclusion on the research topic.
0 – Does not include inferential statistics.
Regression Analysis
Up to 10 Points Extra Credit – Include a Regression Analysis
as part of research analysis.
Discussion
0-10 – Discusses in depth how statistical analysis supports
formulation of a conclusion about research topic.
Conclusion
10 – States conclusion and supports it with analysis from
inferential statistics and descriptive statistics
5 – States conclusion
0 – Does not form an end result from the analysis
Total
100
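To illustrate the Inferential Statistics component of the rubric, here is one possible sketch of a two-sided, two-proportion z-test in Python, using only the standard library. It is not the required technique for the project; it is one option that suits categorical data. The counts are hypothetical placeholders, and the function name `two_proportion_z_test` is made up for this example. The test assumes two independent samples and counts large enough for the normal approximation to hold, and those assumptions should be stated in the write-up.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test of H0: p_a == p_b, using the pooled proportion."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, NOT real data:
# goals scored while the team was trailing, out of all goals, for two players
z, p_value = two_proportion_z_test(30, 100, 20, 100)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")

# Compare the p-value with the chosen significance level (0.05 is a common choice)
alpha = 0.05
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Whatever test is chosen, the result should be tied back to the research question, as the 20-point level of the rubric requires.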
Dataset Description
Descriptive Statistics (Chapters 2-3)
Inferential Statistics (Chapters 6-13)
Regression (Chapters 14-15) – Extra Credit
Discussion
Conclusion
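For the optional regression analysis, a simple least-squares fit can be sketched in plain Python as below. The data values are hypothetical placeholders, not real statistics, and the variable names (`games_played`, `goals_scored`) are assumptions chosen for illustration. Submitting code like this would also be an example of using software other than Excel.

```python
def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical placeholder values, NOT real data
games_played = [10, 20, 30, 40, 50]
goals_scored = [6, 13, 17, 26, 30]

slope, intercept, r_squared = simple_linear_regression(games_played, goals_scored)
print(f"goals = {intercept:.2f} + {slope:.2f} * games (R^2 = {r_squared:.3f})")
```

The slope estimates the change in the response for each one-unit increase in the predictor, and R^2 gives the share of variation in the response explained by the fitted line.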