A Q-Learning snail?
How does the robot learn without supervision? At the beginning, the robot lives its childhood and randomly explores its action space. Some actions produce a reward, when the robot moves forward; others produce a punishment, when the robot goes backwards by mistake. This reward/punishment mechanism is fed by the sensory input from the wheels, to which the encoder is attached. After a certain amount of time has passed, the robot becomes an adult and moves using the experience it has acquired rather than at random. The algorithm used is known as Q-learning, a form of reinforcement learning. We can say that every time you start the program, a new life is born. And I dare say that quitting the program without first saving the creature's brain is like killing it! I have to give credit to Dr. Frank Vanden Berghen, who wrote the Robot Java Applet that inspired me.
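To make the childhood/adulthood idea concrete, here is a minimal sketch of tabular Q-learning in Python. It is not the robot's actual program: the two-joint arm model, the four action names, the `step` and `live` functions, and all parameter values are assumptions invented for illustration, and the simulated reward of +1 or -1 merely stands in for the forward or backward movement reported by the wheel encoder.

```python
import random

# Hypothetical action set for a simple two-joint arm (an assumption,
# not the real robot's action space).
ACTIONS = ("raise", "lower", "extend", "retract")


def step(state, action):
    """Toy simulation: return (next_state, reward).

    state is (lowered, extended). Retracting an extended arm while it
    touches the ground pulls the robot forward (+1); extending it while
    on the ground pushes the robot backwards (-1).
    """
    lowered, extended = state
    reward = 0
    if action == "raise":
        lowered = False
    elif action == "lower":
        lowered = True
    elif action == "extend":
        if lowered and not extended:
            reward = -1  # punishment: pushed backwards
        extended = True
    elif action == "retract":
        if lowered and extended:
            reward = 1  # reward: pulled forward
        extended = False
    return (lowered, extended), reward


def live(childhood_steps=5000, adult_steps=40, alpha=0.5, gamma=0.9, seed=0):
    """Run one 'life': random childhood, then greedy adulthood."""
    rng = random.Random(seed)
    q = {}  # the "brain": maps (state, action) to an estimated value
    state = (True, False)  # arm on the ground, retracted

    # Childhood: explore at random and apply the Q-learning update.
    for _ in range(childhood_steps):
        action = rng.choice(ACTIONS)
        next_state, reward = step(state, action)
        best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = next_state

    # Adulthood: always pick the action the brain values most.
    distance = 0
    for _ in range(adult_steps):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, reward = step(state, action)
        distance += reward
    return q, distance


if __name__ == "__main__":
    brain, distance = live()
    print("net forward movement as an adult:", distance)
```

In this toy world the adult settles into the cycle raise, extend, lower, retract, gaining one unit of forward movement every four actions. Saving the brain would amount to writing the `q` dictionary to disk before quitting.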