ID3 is a machine-learning algorithm that generates a decision tree from a set of examples.
Many legal codes are in effect functions of several parameters and can be expressed as decision trees. For example, consider a gun control law. Some of the parameters might be the type of gun (is it a BB gun, an antique gun, a non-functional gun, an assault weapon, a pistol, etc.?) and whether the weapon is concealed. Then there are the characteristics of the possessor: are they mentally ill, convicted of a crime, subject to an order of protection, etc.? Was the person's life threatened? Does the person's job require a gun, e.g. security guard? And lastly, what is the place: a school, a bar, a place where beer is sold such as some supermarkets?
The answer could be a simple boolean, "permitted" or "not permitted." Or we could get a classification such that the possession might be a misdemeanor or a felony. Many states have categories of felonies. For example, in the State of New York, the term for a class A felony is life imprisonment, while the term for a class B felony does not exceed twenty-five years, etc. (There are other special rules for dispositions, such as those for offenders with prior convictions or youthful offenders.) The bulk of the penal code then defines each offense to be a class D felony, a class B felony, and so on.
A possible decision tree:

| Gun Type | Location | Age | Personal Status | Penalty, if any |
|---|---|---|---|---|
| Antique | | | | No offense |
| BB gun | | | | No offense |
| Pistol | | 15-20 | | Class A misdemeanor |
| Pistol | | 20-30 | | Class B felony |
| Pistol | | 30-50 | | Felony |
| Pistol | | 50-70 | | Class A misdemeanor |
| Rifle | Bar | | | Class B felony |
| Rifle | School | | | Class C felony |
| Rifle | Place where beer is sold | | | Class A misdemeanor |
| Rifle | Other | | | No offense |
| Assault weapon | | | | Class B felony |
| Rifle | Other | | Mentally ill | Class D felony |
| Rifle | Other | | Convicted of a felony | Class C felony |
| Rifle | Other | | Other | Class B misdemeanor |
An ID3 algorithm takes a series of examples such as those in the table above and creates a decision tree that gives the same answers as the examples. It needs to pick which parameter to use for the root node; in the above tree, it was Gun Type. It picks the one that best divides the examples among the categories, often using a formula from information theory (a sketch of this calculation follows the table below). In the above example, the tree was precisely determined from the input data. That is not always true. In a diagnosis application, there would likely be lines that contradict each other, or there might be several lines with very similar attributes, all with the same classification or penalty except one. Is that due to a real difference, or just a random error?
| A | B | C | D | Result |
|---|---|---|---|---|
| 1 | 2 | 3 | 44 | ii |
| 1 | 2 | 4 | 44 | ii |
| 1 | 2 | 5 | 44 | ii |
| 1 | 2 | 3 | 45 | ii |
| 1 | 2 | 4 | 45 | ii |
| 1 | 2 | 5 | 45 | ii |
| 1 | 2 | 3 | 46 | ii |
| 1 | 2 | 4 | 46 | ii |
| 1 | 2 | 5 | 46 | iii |
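To make the root-node selection concrete, here is a minimal sketch in Python of the information-gain calculation, applied to the table above. The encoding as a list of dicts and the function names are my own, not part of ID3 itself.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, target="Result"):
    """How much splitting `rows` on column `attr` reduces entropy."""
    labels = [r[target] for r in rows]
    gain, n = entropy(labels), len(rows)
    for value in set(r[attr] for r in rows):
        subset = [r[target] for r in rows if r[attr] == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

# The table above, encoded as a list of dicts.
rows = [
    {"A": 1, "B": 2, "C": 3, "D": 44, "Result": "ii"},
    {"A": 1, "B": 2, "C": 4, "D": 44, "Result": "ii"},
    {"A": 1, "B": 2, "C": 5, "D": 44, "Result": "ii"},
    {"A": 1, "B": 2, "C": 3, "D": 45, "Result": "ii"},
    {"A": 1, "B": 2, "C": 4, "D": 45, "Result": "ii"},
    {"A": 1, "B": 2, "C": 5, "D": 45, "Result": "ii"},
    {"A": 1, "B": 2, "C": 3, "D": 46, "Result": "ii"},
    {"A": 1, "B": 2, "C": 4, "D": 46, "Result": "ii"},
    {"A": 1, "B": 2, "C": 5, "D": 46, "Result": "iii"},
]

# ID3 would pick the column with the largest gain for the root node.
for attr in ("A", "B", "C", "D"):
    print(attr, round(information_gain(rows, attr), 3))
# A and B never vary, so their gain is 0; C and D tie at about 0.197.
```

With only one "iii" example, the tie between C and D (and the small gain itself) may reflect a random error rather than a real difference, which is exactly the question raised above.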
Mr. Yeluri set up an example to illustrate these ideas. Since it is long, I have put it in an appendix at the end.
In participatory democracy, we do it a little differently. For each node in the tree, the users vote on what the next parameter should be. Thus, at the start, the individuals would vote on which of the above questions regarding place, type of gun, and person should be asked first. Then the group of examples in each resulting category would be classified again. For any node, when it comes up, the individuals can vote to classify it once and for all. E.g., if the category of assault weapon shows up, individuals might vote to always ban it, and for the category of BB gun, they may vote not to consider any other factors and always permit it. This makes it a leaf node, and the people then vote on the classification from the possibilities available.
We established the following terminology for the input to our experiment:

- title: a thing on which one chooses, like gun type.
- attribute: one of the possibilities for a title; thus the attributes of gun type could be antique, BB, assault, rifle, or shotgun.
- node: a series of pairs of title and attribute. For example, the voting might bring us to making a decision on what to do with those caught with an antique gun, who have a history of mental illness, and who are sixteen to twenty years old. This is a list of (title, attribute) pairs: gun type and antique, mental illness status and history of mental illness, and age (which is divided into ranges) and sixteen to twenty.
- root node: the top of the tree, which corresponds to no titles and attributes selected so far.
After the root node, the users can assign a penalty. Observe in the figure that we immediately assign "no offense" to those possessing antique guns. On the other hand, there are three levels until we get to gun type = "rifle", location = "other", and mental health status = "felony conviction". At each node, the users have the option to "vote now"; if that gets more votes than any title, the users vote on a penalty for that node. (A minimal sketch of one possible representation follows.)
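To make the terminology concrete, here is a sketch of one way a node and the "vote now" option could be represented. The names and the simple plurality rule are assumptions of mine, not the experiment's actual software.

```python
from collections import Counter

Node = list  # a node is a list of (title, attribute) pairs; the root is []

def next_step(node, votes):
    """Tally one round of voting at `node`: each voter names either a
    title to split on next, or "vote now" to classify this node as a leaf."""
    tally = Counter(votes.values())
    choice, _count = tally.most_common(1)[0]
    return choice

# The root node: no titles and attributes selected so far.
root: Node = []

# A hypothetical first round: most voters want to split on gun type first.
votes = {"voter1": "gun type", "voter2": "gun type", "voter3": "vote now"}
print(next_step(root, votes))  # -> gun type

# A deeper node, e.g. antique gun + history of mental illness + age 16-20:
node = [("gun type", "antique"),
        ("mental illness status", "history of mental illness"),
        ("age", "16-20")]
```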
How would we apply this to tax law? The ID3 algorithm is not designed to deal with cases that produce a continuous function, like the amount of tax that the taxpayer should pay. (Note that ID3 can be used when there are input parameters that are numeric. The algorithm sorts the examples in numeric order of the parameter and then finds the decision point that gives the greatest information-theory gain, as sketched below.)
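Here is a sketch of that sorting-and-splitting step for a numeric parameter. Taking candidate thresholds at the midpoints between consecutive distinct values is one common convention; the data and names are my own illustrations.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_threshold(examples):
    """Sort (value, label) examples by the numeric value, then try the
    midpoint between each pair of consecutive distinct values and keep
    the threshold with the greatest information gain."""
    examples = sorted(examples)
    labels = [label for _value, label in examples]
    base, n = entropy(labels), len(examples)
    best_gain, best_t = 0.0, None
    for i in range(1, n):
        if examples[i - 1][0] == examples[i][0]:
            continue  # identical values admit no threshold between them
        gain = (base
                - (i / n) * entropy(labels[:i])
                - ((n - i) / n) * entropy(labels[i:]))
        if gain > best_gain:
            best_gain = gain
            best_t = (examples[i - 1][0] + examples[i][0]) / 2
    return best_t, best_gain

# Hypothetical ages labeled with a permit decision:
print(best_threshold([(15, "N"), (18, "N"), (25, "Y"), (40, "Y"), (65, "N")]))
# -> (21.5, ~0.42): the best cut falls between ages 18 and 25
```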
Thus, for a tax system built by ID3, at each leaf node there would be a formula. The titles in a tax problem might be various income ranges, the number of children, and marital status. The users might then vote, for an unmarried head-of-household with three children and income in the range $20,000 to $40,000, to make the tax = $500.00 + 0.10 × (Income − $20,000). That is, $500.00 on the first $20,000 plus ten percent of the remaining income. They might put other components into the formulae for the amount of state tax paid, mortgage interest, etc.
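As a worked check of that leaf formula (the bracket, rate, and function name are just the ones this paragraph proposes, written out as a sketch):

```python
def leaf_tax(income):
    """Tax formula voted in at the hypothetical leaf node: unmarried
    head-of-household, three children, income $20,000 to $40,000."""
    assert 20_000 <= income <= 40_000, "outside this leaf's income range"
    return 500.00 + 0.10 * (income - 20_000)

print(leaf_tax(20_000))  # 500.0  -- $500.00 on the first $20,000
print(leaf_tax(35_000))  # 2000.0 -- $500.00 plus 10% of the remaining $15,000
```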
Reference

Quinlan, J. Ross, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.

Appendix: The Information Theory Formula
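For reference, the formula in question is the standard ID3 entropy and information gain (this statement of it is mine, in the usual notation):

$$\mathrm{Entropy}(S) = -\sum_i p_i \log_2 p_i$$

$$\mathrm{Gain}(S, T) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{values}(T)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$

Here $p_i$ is the fraction of examples in $S$ that have the $i$-th classification, and $S_v$ is the subset of $S$ whose attribute for title $T$ is $v$. At each node, ID3 picks the title $T$ with the greatest gain.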
We have three titles, A, B, and C. The possible choices, or parameters, for each of these are:

| Title | Attributes |
|---|---|
| A | F, G, H |
| B | I, J, K |
| C | M, N, P |
The examples are:

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| F | I | M | N |
| F | I | N | Y |
| F | I | P | N |
| G | I | M | N |
| G | I | P | Y |
| G | I | M | N |
| G | I | P | Y |
| G | I | N | Y |
| G | J | M | Y |
| G | J | P | Y |
| F | J | M | N |
| F | J | N | Y |
| F | J | P | Y |
| H | I | N | N |
| H | I | N | N |
| H | J | N | N |
| H | I | P | Y |
| H | J | M | Y |
| H | J | N | N |
| H | J | P | Y |
| H | K | M | N |
| H | K | N | Y |
| H | K | P | Y |
| F | K | M | N |
| F | K | N | Y |
| F | K | P | Y |
The sample sets for classifying by A are:

A="F"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| F | I | M | N |
| F | I | N | Y |
| F | I | P | N |
| F | J | M | N |
| F | J | N | Y |
| F | J | P | Y |
| F | K | M | N |
| F | K | N | Y |
| F | K | P | Y |
A="G"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| G | I | M | N |
| G | I | P | Y |
| G | I | N | Y |
| G | J | M | Y |
| G | I | M | N |
| G | I | P | Y |
| G | J | P | Y |
A="H"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| H | I | M | N |
| H | I | N | N |
| H | J | N | N |
| H | I | P | Y |
| H | J | M | Y |
| H | J | N | N |
| H | J | P | Y |
| H | K | M | N |
| H | K | N | Y |
| H | K | P | Y |
By comparison, the sample sets for classifying by C are:
C="M"
A | B | C | Classification Yes /No |
F | I | M | N |
G | I | M | N |
G | J | M | Y |
F | J | M | N |
H | I | M | N |
H | I | N | N |
H | J | M | Y |
H | K | M | N |
F | K | M | N |
G | I | M | N |
Splitting the C="M" set further by A:

A="F"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| F | I | M | N |
| F | J | M | N |
| F | K | M | N |

A="G"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| G | I | M | N |
| G | J | M | Y |
| G | I | M | N |

A="H"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| H | J | M | Y |
| H | K | M | N |
C="N"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| F | I | N | Y |
| G | I | N | Y |
| F | J | N | Y |
| H | I | N | N |
| H | J | N | N |
| H | K | N | Y |
| F | K | N | Y |
Splitting the C="N" set further by A:

A="F"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| F | I | N | Y |
| F | J | N | Y |
| F | K | N | Y |

A="G"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| G | I | N | Y |

A="H"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| H | I | N | N |
| H | J | N | N |
| H | K | N | Y |
Or we can split it by B, giving us:
B="I"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| F | I | N | Y |
| G | I | N | Y |
| G | I | M | N |
| H | I | N | N |

B="J"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| F | J | N | Y |
| G | J | P | Y |
| H | J | N | N |

B="K"

| A | B | C | Classification (Yes/No) |
|---|---|---|---|
| H | K | N | Y |
| F | K | N | Y |