Thursday, September 17, 2009

Genetic Algorithms (Thoughtful Thursday Post)

In genetic algorithms, one represent information as a Chromosome, a series of values. E. G., assume we are optimizing a gas pipeline, that has five pumping stations each with a pressure differential and we have five pipe diameters between them. Thus we would have for two possibilities. I will refer to them as Configuration A and Configuration B.
P(1) P(2) P(3) P(4) P(5) D(1) D(2) D(3) D(4) D(5)

300  200  150  75    80   1.2  1.5  1.3  2.2  3.3

P(1) P(2) P(3) P(4) P(5) D(1) D(2) D(3) D(4) D(5)

275  300  125  140   140  1.8  1.2  1.4  2.1  3.1
This might represent the following real world configuration where the p(1)..p(5) in boxes are the pressure of the pumping stations and the d(1) to d(50 are the diameter of the pipes:
                           d(3)
_____      ______    _____   _____     _____
|    |d(1) |    |d(2)|    |  |    |d(4)|    |d(5)                               
|P(1)|==== |p(2)|=== |p(3)|==|p(4)|=== |p(5)|===
|    |     |    |    |    |  |    |    |    |                             
_____      ______    _____   _____     _____
The most important operation is crossover, where two possible pipleine configurations "mate" and produce two offspring. The system randomly chooses a point in the chromose, and the two mates get switched at that point. Assume the random number generate says cross over after D(2). The two children in the next set would be:
P(1) P(2) P(3) P(4) P(5) D(1) D(2) D(3) D(4) D(5)

300  200  150  75    80   1.2  1.5  1.4  2.1  3.1

P(1) P(2) P(3) P(4) P(5) D(1) D(2) D(3) D(4) D(5)

275  300  125  140   140  1.8  1.2  1.3  2.2  3.3
The second important operation is fitness evaluation and preferring high fitnesses. In the pipeline scenario, this fitness would probably the inverse cost. Thus, assume the two original configurations had costs of $100.00 and $75.00. Assume that there were two other configurations, call them Configuration C and Configuration D, with costs of $300.00 and $150.00.

The genetic algorithm uses random number generators to determine the chance of each configuration mating and thus contributing to the next set of chromosomes, the population in the terms of the Illigal algorithms. Then configuration B would have four times the chance of mating as C and D would have half the chance of mating as configuration C.

Application to taxation and budgeting

In our project, the chromosome will include the various tax rates. These are the marginal tax rates for each quintile of income (with a few special ones for the richest.

R(1) - tax rate for those in the bottom twenty percent of income
R(2) - tax  rate for those in the twenty to fourty percent of income
R(3) - tax rate for those in the fourty to sixty percent of income
R(4) - tax rate for those in the sixty to eighty percent of income
R(5) - tax rate for those making between eighty to ninety-five percent
       of income
R(6) - tax rate for those making between ninety-five percent and
       ninety percent of the income
R(7) - tax rate for those in the top one percent of income
And then, we have the budget amounts, how much to be spent on:
R(8) - social security
R(9) - defense
R(10) - education
R(11) - medicare
R(12) - transportation
R(13) - agriculture
R(14) - aid to low income families
R(15) - training labor and unemployment
R(16) - international affairs

Schemata

David Goldberg's book very nicely explains schemata (Genetic algorithms in search, optimization, and machine learning / by David E. Goldberg. Published: Reading, Mass. : Addison-Wesley Pub. Co., 1989, which is where I got the information in this article ). Schemata represent that we have chosen several fields. The other's have not been chosen yet. In our first example, the schemata
P(1)=300 * * * * * D(1)=1.2 * * * * *
represents a specific value for the pressure in pumping station one and a specific value for the pipe to its right. The schemata,
P(1)=300 P(2) = 200 * * *  * * * * *
represents all the possible configurations with pumping station one having pressure of 300 and pumping station two having pressure of 200.

In crossover, which schemata is most likely to be disrupted by the crossover which occurs each cycle, it is the first with about a fifty percent chance of getting disrupted. The second schemata has only a ten percent chance of getting disrupted as of the ten possible divisions only one will disrupt the schemata.

(These are approximations as most genetic algorithms use scaling to get better performance to prevent random variations that happenn to fit from dominating in the early stages and to magnify differences for fine tuning at the end of the optimization, but that doesn't concern the main idea of what we are doing in social choice.)

In taxes, schemata might represent

R(1) = 0.1 R(2) = 0.2 * * * *  * * * * *  * 
The tax rates forthe two lowest income quintiles thus constitute a schema, and since they are next to each other on the chromosome, they would be most likely preserved. On the other hand, a relationship between the tax rate on the lowest quintile and the money for welfare (aid to low-income persons) would not likely to be preserved. , e.g.
R(1) = 0.1 * * * *   * * * * *  * * * R(14) = 10.0E6 *  *
That is because the first is on R(1) and the second is on R(14). Practically every cross-over will ruin this connection.

There are two other operators, mutation, which is randomly changing one value in the chromosome and inversion which is changing the order of operations. In this space, we could allow individuals to make proposals of new chromosomes or changes. Mutations thus introduce new possible budgets into the mix.

An interesting twist would be to proceed the genetic algorithm by a process where individuals give their degree of associativeness between two characteristics. This would be a pseudo-inversion.

MR. Lessard, a graduate student in our department, has started two programs in that direction: He implemented a web system for the above genetic algorithm.

  1. After logging in and do "informed consent" stuff, the participants enters their ideal budget. Thus, the initial population consists of everyone's ideal budgets.
  2. When the system has everyone's ideal budget (or at least from enough people), it will cross them over. Each of these will be presented for rating.
  3. The users will rate five budgets each.
  4. The system will do cross over on the results. These will be based on the fitness ratings in step three.
  5. Return to Step Two
Simulation of the strategic behavior will start from the software provided The second step will be a simulation starting from the University of Illinois Library. I have marked the modifications so that we can test strategic behavior. What happens if one or two sets of raters don't give their honest rating, but attempt to give (strategic behavior in the game theory and algorithmic mechanism design literature) a rating that is more likely to have them end up with a budget more to their liking.

Of course, a more basic question is whether the genetic algorithm will converge to a Lindahl Equilibrium. Thus, the first experiment will be to run the genetic algorithm with no strategy. The fitness chromosome will simply the taxation for each group. Each group will have a linear fitness function to give the d U / dTax = M * Tax + b where U is the utility. Each group will be assumed to rate the tax level proportional to the total benefit it receives from the total tax revenue. Thus, the benefit is the integral of the above function up to the total tax on all groups. Then, we will try strategic voting and a genetic algorithm where the chromosome represents a tax which is a piecewise linear function of the Income.

(I really appreciated the lecture of Dr. Warren Jones of this University on Lindahl Equilibrium when I took his Summer Public Finance course. It is on my to do list to write a more detailed Thoughtful Thursday piece on Lidnahl Equilibrium. The next best is the presentation in David N. Hyman, Public Finance, Fifth Edition page 136 to 144,)

No comments:

Post a Comment