
Cricket Match Winning Prediction
A
Mini Project Report Submitted by
Mr. Harsh Sadashiv Swami 1841064
Mr. Darshan Shivaji Waman 1841007
Mr. Deepak Arvind Khamkar 1841032
In partial fulfillment of the requirements of Laboratory Practice-II of
Ba…elor of Computer Engineering
Under the guidance of
Prof. Mr. Digambar Padulkar (Assistant Professor)
Department of Computer Engineering
Vidya Pratishthan’s Kamalnayan Bajaj Institute of Engineering and Technology
Bhigawan Road, Vidyanagari, Baramati-413133
2018-2019

Vidya Pratishthan’s
Kamalnayan Bajaj Institute of Engineering and Technology, Baramati
Department of Computer Engineering
Certificate
This is to certify that the following students
Mr. Harsh Sadashiv Swami 1841064
Mr. Darshan Shivaji Waman 1841007
Mr. Deepak Arvind Khamkar 1841032
have successfully completed their project work on Cricket Match Winning Prediction
during the academic year 2018-2019 in partial fulfillment towards
the completion of Laboratory Practice-II in Computer Engineering.
Project Guide: (Mr. Digambar Padulkar)
HoD, Deptt. of Comp. Engg.: (Prof. Mrs. S. S. Nandgaonkar)
Principal: (Dr. R. S. Bichkar)
Internal Examiner External Examiner


Acknowledgments
The success and final outcome of the project we have implemented required a lot
of guidance and assistance from many people, and we are extremely privileged to have
received this throughout the completion of our project. All that we have done is due to such
supervision and guidance, and we would not forget to thank them.
We respect and thank Prof. Mr. Digambar Padulkar for giving us the opportunity
to do the project work in Laboratory Practice-II and for all the support and guidance
that enabled us to complete the project duly. We are extremely thankful to him for providing
such support and guidance despite his busy schedule.
We are thankful and fortunate to have received constant encouragement, support and
guidance from all the teaching staff of the Computer Department, which helped us in successfully
completing our project work. We would also like to extend our sincere thanks to all the staff
in the laboratory for their timely support.
Mr. Harsh Sadashiv Swami
Mr. Darshan Shivaji Waman
Mr. Deepak Arvind Khamkar

Abstract
Winning is the goal in any sport. Cricket is one of the most frequently
watched sports nowadays. Winning in cricket depends on various factors such as home crowd
advantage, past performances, the experience the players bring to matches,
performance at the specific venue, performance against the specific team, the toss decision,
and the current form of the team and the players. During the past few years, a lot of work
and research papers have been published that measure the performance of players
and predict winners. In this work we present a model that predicts
the winning team. We maintain information such as the number of matches the two teams have
played against each other, the toss winner, the venue where the match was played, the city,
and who the umpires were.
The prediction mainly depends on the teams playing the match, who wins the
toss, and what the team decides to do after winning the toss. It also depends on
the venue where the match is played and the city. The prediction method has been
implemented using Logistic Regression, K Nearest Neighbour and Gaussian Naive Bayes
Classifier.

Contents
Acknowledgments
Abstract
List of Figures
1 Introduction
  1.1 Overview
  1.2 Brief Description
  1.3 Problem Definition
2 Literature Survey
3 Dataset Description
  3.1 Introduction
    3.1.1 Purpose
    3.1.2 Project Scope
    3.1.3 Design and Implementation Constraints
    3.1.4 Assumptions and Dependencies
4 Data Preprocessing and Visualization
    4.0.1 Steps in Data Preprocessing
    4.0.2 Visualization
5 Classification
  5.1 Logistic Regression
  5.2 KNN Classifier
  5.3 Gaussian Naive Bayes Classifier
6 Confusion Matrix
    6.0.1 Analysis of Confusion Matrix
    6.0.2 Compare Classifier
7 Result Analysis
    7.0.1 Result for Logistic Regression
    7.0.2 Result for KNN Classifier
    7.0.3 Result for GNB Classifier
8 Conclusion and Future Work
    8.0.1 Conclusion
    8.0.2 Future Work
Bibliography
Cricket Match Winning Prediction
VPKBIET, Baramati

List of Figures
1 Histogram
2 Pie Chart
3 Bar Graph
4 Bar Graph
4.1 Bar Graph of KKR v/s RR
6.1 Confusion Matrix of Logistic Regression
6.2 Confusion Matrix of KNN Classifier
6.3 Confusion Matrix of GNB Classifier
Figure 1: Histogram. Shows the count of wins of each team.
Figure 2: Pie Chart. Shows winning the toss and winning the match versus winning the toss and losing the match.
Figure 3: Bar Graph. Shows the performance of two teams, CSK and RCB.
Figure 4: Bar Graph. Shows the performance of two teams, KKR and RR.
1 Introduction
1.1 Overview
As a sport, cricket is played globally across 106 member states of the International Cricket
Council (ICC), with an estimated 1.5 billion fans worldwide (ICC, 2012-2013). However,
much of the global finance and interest is focused on the 10 full ICC member nations,
and more specifically on the big three of England, Australia and India, as their leagues
are very famous. In particular, the Indian league, the Indian Premier League (IPL), has gained
a lot of popularity over the years.
1.2 Brief Description
In this project we develop a model to predict the outcomes of Indian Premier
League matches over the years 2008-2016. We used a multi-step approach to analyze the data,
which comprises over 500 records. The attributes used in the project are the id
number; the season in which the match was played, which ranges from 2008 to 2016; the city
where the match was played; the date on which the match was played; the names of the two
teams participating in the match; the toss winner; the toss decision; and the result of the
match, which can be normal, tie or no result. A no result arises from some interruption in
the game, the main reason for which is rain. Another attribute is whether the D/L
method was applied; D/L stands for Duckworth-Lewis, which comes into play
when rain has occurred. The other attributes are the winner of the game, the margin of victory
in runs if the winning team batted first, and the margin of victory in wickets if the winning
team batted second. Further attributes are the player of the match (man of the match), the venue
(the name of the stadium where the match was played) and the names of the two standing
umpires present on the field.
Before prediction, some of the features are eliminated as part of data
cleaning. For the prediction we have used three classifiers:
Logistic Regression, the K Nearest Neighbour Classifier and the Gaussian Naive Bayes
Classifier. The prediction is made mainly on the teams participating, the toss winner, the toss
decision, the city and the venue.
1.3 Problem Definition
Indian cricket fans have seen the growing popularity of the Indian Premier League (IPL).
There is always some discussion going on about the IPL in world cricket, and
predictions are always being made about who will be the winner. These predictions come from
common people, the media, celebrities, etc., and there is always debate about whose prediction
is more likely to be correct. So we have decided to make the same type of prediction using
statistical records and classifiers.

2 Literature Survey
Cricket has long been the most popular sport in India. To combine
cricket and entertainment, the BCCI started the IPL (Indian Premier League). Nowadays the
popularity of the IPL is at its peak. Every business tycoon and Bollywood actor wants to invest
in an IPL team. Every team has a large number of sponsors. It has become a dream
for almost every millionaire to have an IPL team in his or her name. There is a lot of craze
and buzz around the IPL in India. Every Indian is part of the IPL event in one way or
another. From children to their parents, and to their parents in turn, it is only
the IPL.
Every team has a large fanbase cheering and supporting it. The IPL has become a big
event, so everyone likes to guess the results of IPL matches. News channels often
organize debates on predictions of IPL matches. The primary motivation behind this
project is therefore the increasing popularity of the IPL. This project will be interesting for
the biggest IPL fans and for those who always like to guess the results of matches.

3 Dataset Description
3.1 Introduction
The dataset contains more than 500 records and more than 15 attributes. The attributes
are as follows:
1) id number
2) season: Season in which the match was played.
3) city: City in which the match was played.
4) date: Date on which the match was played.
5) team1: First team participating in the match.
6) team2: Second team participating in the match.
7) toss_winner: Winner of the toss.
8) toss_decision: Decision made by the toss winner.
9) result: Result of the match: normal, tie, or no result (interrupted for some reason).
10) dl_applied: Whether the Duckworth-Lewis (D/L) method was applied or not.
11) winner: Winner of the match.
12) win_by_runs: Number of runs by which the match was won.
13) win_by_wickets: Number of wickets by which the match was won.
14) player_of_match: Player of the match.
15) venue: Venue or name of the stadium where the match was played.
16) umpire1: First standing umpire in the match.
17) umpire2: Second standing umpire in the match.
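The attribute list above can be read as the header of a tabular file. The following is a minimal sketch of loading such records and keeping only the attributes the report says the prediction relies on. It assumes a CSV layout; the sample rows, team names and helper functions are hypothetical placeholders, not the project's data or code, and only the standard library is used.

```python
import csv
import io

# Hypothetical sample rows using a subset of the attribute names listed above.
# The values are made up for illustration and are not taken from the dataset.
SAMPLE_CSV = """id,season,city,team1,team2,toss_winner,toss_decision,result,winner,venue
1,2008,CityA,TeamA,TeamB,TeamA,bat,normal,TeamA,StadiumA
2,2008,CityB,TeamC,TeamA,TeamC,field,normal,TeamA,StadiumB
3,2009,CityA,TeamB,TeamC,TeamC,bat,no result,,StadiumA
"""

# Attributes the report says the prediction mainly depends on.
FEATURES = ["team1", "team2", "toss_winner", "toss_decision", "city", "venue"]
TARGET = "winner"


def load_matches(text):
    """Parse CSV text into a list of dicts, one per match."""
    return list(csv.DictReader(io.StringIO(text)))


def select_columns(rows):
    """Keep the feature columns and the target, skipping matches with no winner."""
    selected = []
    for row in rows:
        if not row[TARGET]:  # e.g. a "no result" match has an empty winner
            continue
        selected.append({name: row[name] for name in FEATURES + [TARGET]})
    return selected


matches = select_columns(load_matches(SAMPLE_CSV))
print(len(matches), "usable matches")
```

Dropping matches without a winner is one possible design choice; such rows cannot serve as labelled examples for a winner classifier.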
3.1.1 Purpose
The purpose is to predict the winner of the match using statistical records and
classifiers.
3.1.2 Project Scope
The cricketing world will start to believe in predictions that are based on
statistical records rather than on theoretical concepts. It will be easier to predict the
winner.
3.1.3 Design and Implementation Constraints
The prediction depends on a few attributes; without those attributes it is difficult
to predict the winner.
3.1.4 Assumptions and Dependencies
Assumptions:
1) In the project we have assumed that the form of a player is temporary,
so we have not made the prediction depend on any individual player. We have relied on the
saying in the cricketing world that form is temporary but class is permanent.
2) We have assumed that the third umpire will not play a vital role in the matches, even
though in real scenarios it has been observed that the role played by the third umpire
is very crucial. The third umpire can completely change the course of the match by his
decision.
Dependencies: The project depends completely on a few attributes, which are the two
teams playing the match, the city, the venue, the toss winner and the decision of the
toss-winning team.

4 Data Preprocessing and Visualization
4.0.1 Steps in Data Preprocessing:
1. Import the libraries
2. Import the dataset
3. Check for missing values
4. Inspect the categorical values
5. Split the dataset into training and test sets
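The steps above can be sketched as follows. The report does not name the libraries it imports, so this illustration uses only the Python standard library. The sample rows, the integer encoding of categories and the 80/20 split ratio are assumptions made for the example.

```python
import random

# Hypothetical matches; team, city and venue names are placeholders.
rows = [
    {"team1": "TeamA", "team2": "TeamB", "city": "CityA", "winner": "TeamA"},
    {"team1": "TeamB", "team2": "TeamC", "city": "CityB", "winner": "TeamC"},
    {"team1": "TeamC", "team2": "TeamA", "city": "", "winner": "TeamA"},
    {"team1": "TeamA", "team2": "TeamC", "city": "CityA", "winner": "TeamC"},
    {"team1": "TeamB", "team2": "TeamA", "city": "CityB", "winner": "TeamB"},
    {"team1": "TeamC", "team2": "TeamB", "city": "CityA", "winner": "TeamB"},
]


def count_missing(rows):
    """Step 3: count empty or None values per column."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            if value is None or value == "":
                counts[name] = counts.get(name, 0) + 1
    return counts


def drop_incomplete(rows):
    """Discard rows that have any missing value."""
    return [r for r in rows if all(v not in (None, "") for v in r.values())]


def build_encoder(rows, column):
    """Step 4: map each distinct category in a column to an integer code."""
    categories = sorted({row[column] for row in rows})
    return {category: code for code, category in enumerate(categories)}


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Step 5: shuffle reproducibly, then hold out a fraction as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


missing = count_missing(rows)
clean = drop_incomplete(rows)
encoders = {col: build_encoder(clean, col) for col in clean[0]}
encoded = [{col: encoders[col][row[col]] for col in row} for row in clean]
train, test = split_train_test(encoded)
print(missing, len(train), len(test))
```

Integer codes are the simplest encoding; they impose an arbitrary order on team names, so one-hot encoding may be a better fit for distance-based classifiers.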
4.0.2 Visualization
Figure 4.1: Bar Graph of KKR v/s RR

5 Classification
5.1 Logistic Regression
Logistic regression is a statistical method for analyzing a dataset in which there are one or
more independent variables that determine an outcome. The outcome is measured with a
dichotomous variable, in which there are only two possible outcomes. The dependent variable is
binary or dichotomous, i.e. it only contains data coded as 1 or 0. The binary logistic
model is used to estimate the probability of a binary response based on one or more
predictor variables. The goal of logistic regression is to find the best-fitting model
to describe the relationship between the dichotomous characteristic of interest and a set
of independent (predictor or explanatory) variables. In the logistic regression equation, p
is the probability of presence of the characteristic of interest. The logistic transformation
is defined as the logged odds:
Odds = p/(1-p) and Logit(p) = ln(p/(1-p))
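As a small illustration of the odds and logit definitions, the sketch below computes both and inverts the logit with the sigmoid function. The linear form used in predict_probability (a bias plus a weighted sum of predictors) is the standard logistic regression model; the weights and feature values are hypothetical and are not fitted to the project's data.

```python
import math


def odds(p):
    """Odds = p / (1 - p) for a probability 0 <= p < 1."""
    return p / (1.0 - p)


def logit(p):
    """Logit(p) = ln(p / (1 - p)), the logged odds."""
    return math.log(odds(p))


def sigmoid(z):
    """Inverse of the logit: maps any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, features):
    """Probability of the positive class under a linear model of the logit."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


print(odds(0.75))            # odds of 3 to 1
print(sigmoid(logit(0.8)))   # recovers approximately 0.8
print(predict_probability([0.5, -0.25], 0.1, [1.0, 2.0]))
```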
5.2 KNN Classifier
In the classification setting, the K-nearest neighbor algorithm essentially boils down
to forming a majority vote between the K most similar instances to a given unseen
observation. Similarity is defined according to a distance metric between two data
points. A popular choice is the Euclidean distance, given by
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
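The majority vote over the K nearest points can be sketched in a few lines. This is a minimal illustration with hypothetical two-dimensional points and labels, not the project's implementation; it assumes the features are already numeric.

```python
import math
from collections import Counter


def euclidean_distance(x, y):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    distances = sorted(
        (euclidean_distance(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training data: two loose clusters with labels "A" and "B".
points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
labels = ["A", "A", "A", "B", "B"]

print(knn_predict(points, labels, (0.2, 0.1)))
print(knn_predict(points, labels, (5.5, 5.0)))
```

The choice of k is a tuning parameter; k=3 here is arbitrary.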
5.3 Gaussian Naive Bayes Classifier
The Naive Bayes algorithm is a classification technique based on Bayes' theorem with an
assumption of independence among predictors. In simple terms, a Naive Bayes classifier
assumes that the presence of a feature in a class is unrelated to the presence of any other
feature. A Naive Bayes model is easy to build and particularly useful for very large data
sets. Despite its simplicity, Naive Bayes is known to sometimes outperform even highly
sophisticated classification methods.
Formula: P(A | B) = P(B | A) P(A) / P(B)
In decision analysis, a decision tree can be used to visually and explicitly represent
decisions and decision making. In data mining, a decision tree describes data (but the
resulting classification tree can be an input for decision making).
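The Gaussian Naive Bayes classifier described in this section can be sketched as follows. Each feature is modelled per class by a normal distribution, and the class with the largest posterior is chosen. By Bayes' theorem the posterior is P(B | A) P(A) / P(B), and the denominator P(B) is the same for every class, so it is dropped when comparing classes. The data and function names are hypothetical, and the small variance constant is an assumption added for numerical safety.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior, per-feature mean and per-feature variance of each class."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def log_gaussian(x, mean, var):
    """Log of the normal density with the given mean and variance at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict_gaussian_nb(model, features):
    """Pick the class maximizing log prior plus summed log likelihoods."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior) + sum(
            log_gaussian(x, m, v) for x, m, v in zip(features, means, variances)
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical numeric training data with two classes.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y = ["A", "A", "A", "B", "B", "B"]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 2.0]))
print(predict_gaussian_nb(model, [5.1, 6.0]))
```

Summing log likelihoods across features is exactly where the independence assumption enters: the joint likelihood is treated as a product of per-feature densities.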

6 Confusion Matrix
Figure 6.1: Confusion Matrix of Logistic Regression
Figure 6.2: Confusion Matrix of KNN Classifier
Figure 6.3: Confusion Matrix of GNB Classifier
6.0.1 Analysis of Confusion Matrix
Logistic Regression
Accuracy: 27.58%
Precision: 0.25
Recall: 0.28
KNN Classifier
Accuracy: 40%
Precision: 0.38
Recall: 0.40
GNB Classifier
Accuracy: 17.24%
Precision: 0.36
Recall: 0.17
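The following sketch shows one way accuracy, precision and recall can be derived from a multi-class confusion matrix. The matrix is hypothetical (rows are actual classes, columns are predicted classes) and is not one of the report's figures. The report does not say how precision and recall were averaged across teams; the sketch uses support-weighted averaging as an assumption, chosen because weighted recall always equals accuracy, which is consistent with the recall values above matching the accuracies (0.28 and 27.58%, 0.40 and 40%, 0.17 and 17.24%).

```python
def accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_precision_recall(cm):
    """Precision uses column sums (predicted); recall uses row sums (actual)."""
    n = len(cm)
    precisions, recalls = [], []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))
        actual = sum(cm[i])
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return precisions, recalls


def weighted_average(values, cm):
    """Average per-class values weighted by the number of actual samples."""
    supports = [sum(row) for row in cm]
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


# Hypothetical 3-class confusion matrix.
cm = [
    [3, 1, 0],
    [1, 2, 1],
    [0, 2, 2],
]
precisions, recalls = per_class_precision_recall(cm)
print(accuracy(cm))
print(weighted_average(precisions, cm))
print(weighted_average(recalls, cm))  # equals the accuracy
```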
6.0.2 Compare Classifier
The accuracy of the three classifiers differs considerably. The KNN Classifier has
the highest accuracy (40%) and is therefore the most useful of the three.

7 Result Analysis
7.0.1 Result for Logistic Regression
The accuracy of Logistic Regression is found to be greater than that of the GNB Classifier
and less than that of the KNN Classifier. Accuracy is 27.58%.
7.0.2 Result for KNN Classifier
The accuracy of the KNN Classifier is found to be the highest, at 40%.
7.0.3 Result for GNB Classifier
The accuracy of the GNB Classifier is found to be the lowest of the three classifiers.
Accuracy is 17.24%.

8 Conclusion and Future Work
8.0.1 Conclusion
Our Cricket Match Winning Prediction project will be useful in the coming time. The
prediction is made from statistical records using the proposed models, specifically
the KNN Classifier, so people can refer to it. Our results show that the
KNN Classifier gives the most accurate predictions of the three classifiers, with an accuracy of 40%.
8.0.2 Future Work
The main focus will be on increasing the accuracy of the model. We also need to consider some
of the main factors that we have not yet considered in this project.
For example, in this project we have not considered the role umpires play in the match, but in
reality their role is very crucial. We can also include the role played by the third umpire.

Bibliography
1 Brooks, R. D., Faff, R. W., Sokulsky, D. (2002). An ordered response model of test cricket performance. Applied Economics, 34(18), 2353-2365.
2 ICC. (2012-2013). ICC Annual Report.
3 Bandulasiri, A. (2008). Predicting the winner in one day international cricket. Journal of Mathematical Sciences & Mathematics Education, 3(1), 6-17.
