Add more sample given sample data
151
examples/data/poker/poker-hand.names
Normal file
@@ -0,0 +1,151 @@
1. Title: Poker Hand Dataset

2. Source Information

a) Creators:

   Robert Cattral (cattral@gmail.com)

   Franz Oppacher (oppacher@scs.carleton.ca)
   Carleton University, Department of Computer Science
   Intelligent Systems Research Unit
   1125 Colonel By Drive, Ottawa, Ontario, Canada, K1S5B6

c) Date of release: Jan 2007

3. Past Usage:
   1. R. Cattral, F. Oppacher, D. Deugo. Evolutionary Data Mining
      with Automatic Rule Generalization. Recent Advances in Computers,
      Computing and Communications, pp. 296-300, WSEAS Press, 2002.
      - Note: This was a slightly different dataset that had more
        classes, and was considerably more difficult.

   - Predicted attribute: Poker Hand (labeled "class")
   - Found to be a challenging dataset for classification algorithms
   - Relational learners have an advantage for some classes
   - The ability to learn high-level constructs has an advantage

4. Relevant Information:
   Each record is an example of a hand consisting of five playing
   cards drawn from a standard deck of 52. Each card is described
   using two attributes (suit and rank), for a total of 10 predictive
   attributes. There is one Class attribute that describes the
   "Poker Hand". The order of cards is important, which is why there
   are 480 possible Royal Flush hands rather than 4: each of the 4
   suits gives one unordered hand, and its five cards can be arranged
   in 5! = 120 orders (explained in more detail below).
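
The ordered-hand count can be checked by brute force. The following is a minimal sketch, not part of the dataset distribution: the constant names are illustrative, and the suit and rank codes are assumed to follow the encoding given under Attribute Information (suits 1-4, ranks 1-13 with Ace = 1).

```python
from itertools import permutations
from math import factorial, perm

# Illustrative constants; codes follow the attribute encoding (Ace = 1).
SUITS = (1, 2, 3, 4)
ROYAL_RANKS = (1, 10, 11, 12, 13)  # Ace, Ten, Jack, Queen, King

# Enumerate every ordering of the five royal-flush cards in each suit.
ordered_royal_flushes = [
    ordering
    for suit in SUITS
    for ordering in permutations([(suit, rank) for rank in ROYAL_RANKS])
]

# 4 unordered hands, each with 5! = 120 orderings.
royal_flush_count = len(ordered_royal_flushes)

# Size of the full domain: ordered draws of 5 cards from 52.
domain_size = perm(52, 5)

print(royal_flush_count, domain_size)
```

Enumerating the orderings rather than multiplying 4 by 120 makes the role of card order explicit: two records holding the same five cards in a different sequence are distinct instances.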

5. Number of Instances: 25,010 training, 1,000,000 testing

6. Number of Attributes: 10 predictive attributes, 1 goal attribute

7. Attribute Information:
   1) S1 "Suit of card #1"
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   2) C1 "Rank of card #1"
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   3) S2 "Suit of card #2"
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   4) C2 "Rank of card #2"
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   5) S3 "Suit of card #3"
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   6) C3 "Rank of card #3"
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   7) S4 "Suit of card #4"
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   8) C4 "Rank of card #4"
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   9) S5 "Suit of card #5"
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   10) C5 "Rank of card #5"
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   11) CLASS "Poker Hand"
      Ordinal (0-9)

      0: Nothing in hand; not a recognized poker hand
      1: One pair; one pair of equal ranks within five cards
      2: Two pairs; two pairs of equal ranks within five cards
      3: Three of a kind; three equal ranks within five cards
      4: Straight; five cards, sequentially ranked with no gaps
      5: Flush; five cards with the same suit
      6: Full house; pair + different rank three of a kind
      7: Four of a kind; four equal ranks within five cards
      8: Straight flush; straight + flush
      9: Royal flush; {Ace, King, Queen, Jack, Ten} + flush
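
The class definitions above can be turned into a small rule-based labeler. This is a sketch under stated assumptions, not code shipped with the dataset: `parse_record` and `classify_hand` are hypothetical names, and the comma-separated record layout (S1,C1,...,S5,C5,CLASS) is assumed rather than stated in this file. The Ace is treated as both low and high in a straight, which is consistent with the 10,200 straights listed in the Statistics section (10 rank sequences × (4^5 − 4) suit assignments).

```python
from collections import Counter

def parse_record(line):
    """Split an assumed comma-separated record into (attributes, class label)."""
    fields = [int(value) for value in line.strip().split(",")]
    if len(fields) != 11:
        raise ValueError("expected 10 predictive attributes plus 1 class")
    return fields[:10], fields[10]

def classify_hand(attributes):
    """Return the class (0-9) for attributes S1,C1,...,S5,C5."""
    suits = attributes[0::2]
    ranks = sorted(attributes[1::2])
    # Multiplicities of each rank, largest first, e.g. [3, 2] for a full house.
    counts = sorted(Counter(ranks).values(), reverse=True)
    flush = len(set(suits)) == 1
    ace_high = ranks == [1, 10, 11, 12, 13]
    straight = ace_high or (len(set(ranks)) == 5 and ranks[4] - ranks[0] == 4)
    if flush and ace_high:
        return 9
    if flush and straight:
        return 8
    if counts == [4, 1]:
        return 7
    if counts == [3, 2]:
        return 6
    if flush:
        return 5
    if straight:
        return 4
    if counts == [3, 1, 1]:
        return 3
    if counts == [2, 2, 1]:
        return 2
    if counts == [2, 1, 1, 1]:
        return 1
    return 0

# Hypothetical record: Ten, Jack, King, Queen, Ace of Hearts, labeled class 9.
attributes, label = parse_record("1,10,1,11,1,13,1,12,1,1,9")
print(classify_hand(attributes) == label)
```

The rules are checked from the strongest hand down, so a hand that satisfies several definitions (for example, a straight flush is also a flush) receives the highest applicable class.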

8. Missing Attribute Values: None

9. Class Distribution:

   The first percentage in parentheses is the representation
   within the training set. The second is the probability in the full domain.

   Training set:

      0: Nothing in hand, 12493 instances, (49.95202% / 50.117739%)
      1: One pair, 10599 instances, (42.37905% / 42.256903%)
      2: Two pairs, 1206 instances, (4.82207% / 4.753902%)
      3: Three of a kind, 513 instances, (2.05118% / 2.112845%)
      4: Straight, 93 instances, (0.37185% / 0.392465%)
      5: Flush, 54 instances, (0.21591% / 0.19654%)
      6: Full house, 36 instances, (0.14394% / 0.144058%)
      7: Four of a kind, 6 instances, (0.02399% / 0.02401%)
      8: Straight flush, 5 instances, (0.01999% / 0.001385%)
      9: Royal flush, 5 instances, (0.01999% / 0.000154%)

   The Straight flush and Royal flush hands are not as representative of
   the true domain because they have been over-sampled. The Straight flush
   is 14.43 times more likely to occur in the training set than in the
   full domain, while the Royal flush is 129.82 times more likely.

   Total of 25010 instances in a domain of 311,875,200.
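
The over-sampling factors quoted above follow from dividing the training-set percentage by the full-domain percentage. A minimal sketch, assuming the rounded domain percentages printed in the list are the ones that were used (the function and constant names are illustrative):

```python
TRAINING_SIZE = 25010

def oversampling_factor(train_count, domain_percent):
    """Ratio of a class's share of the training set to its share of the domain."""
    train_percent = 100.0 * train_count / TRAINING_SIZE
    return train_percent / domain_percent

# Counts and domain percentages copied from the class distribution list.
straight_flush_factor = oversampling_factor(5, 0.001385)
royal_flush_factor = oversampling_factor(5, 0.000154)

print(round(straight_flush_factor, 2), round(royal_flush_factor, 2))
```

Using the unrounded domain probabilities instead would shift the Royal flush factor slightly, since 0.000154% is itself a rounded value.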

   Testing set:

   The value inside parentheses indicates the representation within the test
   set as compared to the entire domain. A value of 1.0 would be perfect
   representation, while values below 1.0 indicate under-representation and
   values above 1.0 indicate over-representation.

      0: Nothing in hand, 501209 instances, (1.000063)
      1: One pair, 422498 instances, (0.999832)
      2: Two pairs, 47622 instances, (1.001746)
      3: Three of a kind, 21121 instances, (0.999647)
      4: Straight, 3885 instances, (0.989897)
      5: Flush, 1996 instances, (1.015569)
      6: Full house, 1424 instances, (0.988491)
      7: Four of a kind, 230 instances, (0.957934)
      8: Straight flush, 12 instances, (0.866426)
      9: Royal flush, 3 instances, (1.948052)

   Total of one million instances in a domain of 311,875,200.
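
The representation values for the test set can be recomputed as the observed class frequency divided by the class probability in the full domain. This sketch assumes the rounded probabilities from the Statistics table were used as divisors; the names are illustrative:

```python
TEST_SIZE = 1_000_000

def representation(test_count, domain_probability):
    """Observed frequency in the test set relative to the domain probability."""
    return (test_count / TEST_SIZE) / domain_probability

# Counts from the testing-set list, probabilities from the Statistics table.
nothing = representation(501209, 0.50117739)
straight_flush = representation(12, 0.00001385)
royal_flush = representation(3, 0.00000154)

print(round(nothing, 6), round(straight_flush, 6), round(royal_flush, 6))
```

With only 3 Royal flush instances, a single extra or missing record changes the ratio by roughly 0.65, which is why the rarest classes deviate most from 1.0.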

10. Statistics

   Poker Hand        # of hands   Probability   # of combinations
   Royal Flush                4    0.00000154                 480
   Straight Flush            36    0.00001385                4320
   Four of a kind           624    0.0002401                74880
   Full house              3744    0.00144058              449280
   Flush                   5108    0.0019654               612960
   Straight               10200    0.00392464             1224000
   Three of a kind        54912    0.02112845             6589440
   Two pairs             123552    0.04753902            14826240
   One pair             1098240    0.42256903           131788800
   Nothing              1302540    0.50117739           156304800

   Total                2598960    1.0                  311875200

   The number of combinations represents the number of instances in the entire domain.
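
The columns of the table are related by simple arithmetic: the probability is the number of hands divided by the total, and the number of combinations is the number of hands multiplied by the 5! orderings of five cards. A short sketch that rebuilds those columns from the hand counts (the dictionary and function names are illustrative):

```python
from math import comb, factorial, perm

# Unordered hand counts copied from the "# of hands" column.
HANDS = {
    "Royal Flush": 4,
    "Straight Flush": 36,
    "Four of a kind": 624,
    "Full house": 3744,
    "Flush": 5108,
    "Straight": 10200,
    "Three of a kind": 54912,
    "Two pairs": 123552,
    "One pair": 1098240,
    "Nothing": 1302540,
}

total_hands = sum(HANDS.values())   # unordered 5-card hands
orderings = factorial(5)            # orderings of one hand

def table_row(name):
    """Return (# of hands, probability, # of combinations) for one class."""
    count = HANDS[name]
    return count, count / total_hands, count * orderings

for name in HANDS:
    count, probability, combinations = table_row(name)
    print(f"{name:<16}{count:>9}  {probability:.8f}  {combinations:>11}")
```

The totals agree with the binomial coefficient C(52, 5) for unordered hands and with the number of ordered 5-card draws from 52 for the full domain.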