Kman, one step closer to a release

Warning: Bad English, proceed with caution

Kman started as a toy for one of my personal projects. I needed something that could read RSS feeds and notify me when something interesting showed up. In time, I also wanted it to learn from the feeds. For example, Kman reads feeds, and after a while I ask it about MVC and it sums up some facts. The bot should also direct me to various sources of information based on keywords (MVC, for example). The implementation evolved, and now you can build various applications on top of it.

Potential - Sample Applications

This bot is not just for fun; it's intended to be useful. For example, suppose you have an internal Jabber server. You can customize the bot so that your team can log progress without a web browser, get a list of available tasks, and so on. This can easily be implemented using commands (which are described later).

Another example: if your computer is behind a firewall, you can control it via a bot connected to a Gmail account (of course, you have to be careful that it listens only to you). You can also index your documents, movies, etc. and query the bot for them (using Lucene or something similar).

And, as I already stated, you can create an application that monitors some RSS feeds and notifies you when new posts are available. Before reading them, you will be able to ask the bot a few things about the posts and decide whether a post is worth reading. This feature is not complete yet, but you can get a glimpse of it later in this post.

How can the bot be used like this? (Key Features)


Megahal

First of all (though in time it will be the least important feature), the bot will provide a 100% Megahal implementation. For the moment, it provides only a partial one.

Megahal implementations are based on Markov chains (math stuff), which enable you to generate content based on previously parsed content. For example, if Megahal encountered the word A followed by the word B and, at some other point, the word B followed by the word C, it will be able to generate A B C, A B, or B C. Being based on math rather than on rules, it's language independent.
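
The A, B, C example can be sketched in a few lines of PHP. This is only a minimal illustration, assuming a first-order, word-level chain; it is not Kman's or Megahal's actual code, and the function names markovLearn and markovGenerate are made up for this post:

```php
<?php
// Hypothetical sketch: a first-order, word-level Markov chain.
// Not Kman's or Megahal's real implementation.

/**
 * Record, for every word, the words that were seen directly after it.
 *
 * @param string $text  Training text.
 * @param array  $chain Existing transitions (word => list of followers).
 * @return array Updated transitions.
 */
function markovLearn(string $text, array $chain = []): array
{
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    for ($i = 0; $i < count($words) - 1; $i++) {
        $chain[$words[$i]][] = $words[$i + 1];
    }
    return $chain;
}

/**
 * Walk the chain from a seed word, picking a random follower at each step,
 * until a word with no known follower is reached or $maxWords is hit.
 */
function markovGenerate(array $chain, string $seed, int $maxWords = 20): string
{
    $output  = [$seed];
    $current = $seed;
    while (count($output) < $maxWords && !empty($chain[$current])) {
        $followers = $chain[$current];
        $current   = $followers[array_rand($followers)];
        $output[]  = $current;
    }
    return implode(' ', $output);
}

// "A B" and "B C" were seen separately; "A B C" can now be generated.
$chain = markovLearn('A B');
$chain = markovLearn('B C', $chain);
echo markovGenerate($chain, 'A'), "\n"; // prints "A B C"
```

Nothing in the sketch knows about grammar or vocabulary, which is why the approach does not depend on the language of the training text.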

Here is an example after feeding the bot two wiki pages related to Aliens, Predators and Leonidas:


A trio of Predators seen aboard their spacecraft  is available  under the 
assumption that Leonidas gave the Predator's face, as shown in Predator.

Here is what you can get by feeding the bot an IRC log and some Dune quotes, and then providing a keyword that will be the base for the generated responses:


you>feed http://en.wikiquote.org/wiki/Dune
kman>Mmmmm , that was good
you>feed http://www.irclog.org/freenode/php/2006-05-10/page1.html
kman>Mmmmm , that was good
you>CMS
kman>There is no substitute  on CMS' s 'try to remember, 
there is no substitute for intelligence.
you>CMS
kman>Seeing this and telling of it, he made the audacious leap  based CMS  
for editing and maintaining around 50 pages, with support for password 
protected areas, printer friendly versions 
of all was the price we paid was the basic 
trust that he made the audacious leap of decision.

A final example: here is what I get after feeding the bot my homepage:


Php Megahal 
you>feed http://www.webuml.com
kman>Mmmmm , that was good
you>webuml
kman>You can find here my blog ,some  experiments (various tests that i do for
 personal use) and most of the intersting stuff i found on the net and became
 part of my never ending training, which covers: 
PHP5 Zend Framework Adobe Flex More Links     2007 webuml .
you>blog
kman>Until then , this site will act as my  my blog , some  experiments (various
 tests that i do for personal use) and most of the intersting stuff i found on 
the net and became part of my never ending training, which covers: PHP5 Zend
 Framework Php Frameworks(in general) OOP/OOD Until time will allow me to focus
 on webuml, this site will act as my personal site.

In the future, I will have a Term Extractor that will tie your sentence to the next response. Right now the keyword is picked at random: "What can you tell me about CMS?" can generate sentences based on any word from the question. The Term Extractor will be based on Markov chains as well, so it can be used with any language. This will be a key feature for previewing your RSS feeds via your bot.
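
The current random behaviour can be pictured roughly like this. It is a sketch under my own assumptions (the helper name pickKeyword is hypothetical and the real code may choose the word differently):

```php
<?php
// Hypothetical helper, not part of Kman: choose a seed word from a question.
function pickKeyword(string $question): ?string
{
    // Split on anything that is not a letter or a digit (Unicode-aware).
    $words = preg_split('/[^\p{L}\p{N}]+/u', $question, -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return null; // nothing usable in the question
    }
    // Any word is as likely as any other; a term extractor would rank them.
    return $words[array_rand($words)];
}

// Prints one word of the question, chosen at random.
echo pickKeyword('What can you tell me about CMS?'), "\n";
```

With this approach "What" is just as likely to be picked as "CMS", which is exactly the weakness a Term Extractor is meant to remove.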

Flexibility

Megahal is not all there is to this bot; there will be several other 'brains' available (infobots are scheduled), as well as combinations of them. You will be able to use any of them under any supported protocol. Here is how you can set up a CLI bot:


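// Talk to the bot from the command line.
$kman  = new Kman_Communicator_Cli();
// Use the Megahal brain to generate replies.
$brain = new Kman_Megahal_Brain();
$kman->setBrain($brain);
// Register a command (commands are described later).
$command = new Kman_Communicator_Command_Demo();
$kman->addCommand($command);
// Start reading messages from standard input.
$kman->connect();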

We will talk about commands later; for now, here is how you would set up a Gmail bot:

// Connect to Google Talk (XMPP) with a Gmail account.
$kman = new Kman_Communicator_Xmpp('talk.google.com', 5222, 'user',
    'password', 'xmpphp', 'gmail.com');
// Register the commands the bot should react to.
$command = new Kman_Communicator_Command_Demo();
$kman->addCommand($command);
$command = new Kman_Communicator_Command_Say();
$kman->addCommand($command);
// Plug in the Megahal brain and start the bot.
$megahal = new Kman_Megahal_Brain();
$kman->setBrain($megahal);
$kman->connect();

As you can see, it doesn't matter what kind of protocol or brain you are using: you can achieve different results with the same code base. You can build several different bots while sharing some of the code.

Extensibility

You can customize the bot or add new features to it without having to know how everything is made. You have seen commands in the previous snippets; here is what the say command does:


Php Megahal 
you>say moo
kman>moo

and here is how it is implemented:

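class Kman_Communicator_Command_Say implements SplObserver 
{
    /**
     * Called by Kman for every incoming message.
     */
    public function update(SplSubject $subject)
    {
        $message = trim($subject->getMessage());
        // Ignore messages that have no argument after the command word.
        if (strpos($message, ' ') === false) {
            return;
        }
        // Split into the command word and the rest of the message.
        list($command, $what) = explode(' ', $message, 2);
        if ($command == 'say') {
            $subject->setResponse($what);
        }
    }
}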

Your command will receive something that implements the SplSubject interface. As a side note, this technique is a design pattern called the Observer pattern, and you can read about it here. The full interface will be available in the future, but right now I don't want extensions (even my own) to be strongly coupled to my code.

If you haven't figured it out already: whenever Kman receives a message, your command will be called, and you can do whatever you want there based on the received message or other factors.
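
To make the message flow more concrete, here is a self-contained sketch of the same pattern using PHP's built-in SplSubject and SplObserver interfaces. DemoSubject, its handle() method and ShoutCommand are hypothetical names invented for this illustration; only getMessage() and setResponse() are taken from the say command, and Kman's real subject may well look different:

```php
<?php
// Hypothetical stand-in for the object Kman passes to commands.
class DemoSubject implements SplSubject
{
    private $observers = [];
    private $message   = '';
    private $response  = null;

    public function attach(SplObserver $observer): void
    {
        $this->observers[spl_object_id($observer)] = $observer;
    }

    public function detach(SplObserver $observer): void
    {
        unset($this->observers[spl_object_id($observer)]);
    }

    // Call update() on every registered command.
    public function notify(): void
    {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }

    public function getMessage()
    {
        return $this->message;
    }

    public function setResponse($response)
    {
        $this->response = $response;
    }

    // Assumed entry point: store the message, let commands react to it,
    // and return whatever response they set (null if none did).
    public function handle($message)
    {
        $this->message  = $message;
        $this->response = null;
        $this->notify();
        return $this->response;
    }
}

// A made-up command: "shout moo" answers "MOO".
class ShoutCommand implements SplObserver
{
    public function update(SplSubject $subject): void
    {
        $parts = explode(' ', trim($subject->getMessage()), 2);
        if (count($parts) === 2 && $parts[0] === 'shout') {
            $subject->setResponse(strtoupper($parts[1]));
        }
    }
}

$subject = new DemoSubject();
$subject->attach(new ShoutCommand());
echo $subject->handle('shout moo'), "\n"; // prints "MOO"
```

The subject never needs to know which commands exist; it only notifies whatever has been attached, which is what keeps extensions loosely coupled.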

So far, you have seen that you can use CLI and XMPP. You can add any other protocol you like, and the bot will work the same. To do this, you can extend an abstract class called Kman_Communicator_Abstract, or you can implement something from scratch (you can use Kman_Communicator_Interface as a guideline or, better, implement it). Here is how the CLI engine is implemented:


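class Kman_Communicator_Cli extends Kman_Communicator_Abstract 
{
    /**
     * Read messages from standard input until the end of input is reached.
     */
    public function connect()
    {
        $handler = fopen("php://stdin", "r");
        echo "Php Megahal ";
        while (true) {
            echo "you>";
            $message = fgets($handler);
            // fgets() returns false at end of input; stop instead of
            // passing false on as if it were a message.
            if ($message === false) {
                break;
            }

            $response = $this->getResponse($message);

            if ($response) {
                $this->send($response);
            } else {
                // Neither the brain nor any command produced a reply.
                $this->send('moo moo baa baa , me stupid');
            }
        }
        fclose($handler);
    }

    /**
     * Print the bot's reply to the terminal.
     */
    protected function send($message)
    {
        echo 'kman>', $message;
    }
}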


Not so difficult, is it? In the future (next weekend, perhaps), I will also hook up the IRC protocol.
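
The same shape works for other transports. The sketch below is only an illustration: DemoCommunicatorAbstract is a hypothetical stand-in for Kman_Communicator_Abstract (the real base class is not shown in this post, only the method names connect(), send() and getResponse() come from the CLI engine), and DemoCommunicatorScript is a made-up "protocol" that replays a fixed list of messages:

```php
<?php
// Hypothetical stand-in for Kman_Communicator_Abstract; the constructor and
// the callable responder are assumptions made to keep the sketch runnable.
abstract class DemoCommunicatorAbstract
{
    /** @var callable Produces a reply (or null) for a message. */
    private $responder;

    public function __construct(callable $responder)
    {
        $this->responder = $responder;
    }

    protected function getResponse($message)
    {
        return ($this->responder)(trim($message));
    }

    // A transport only has to say how messages arrive and how replies leave.
    abstract public function connect();

    abstract protected function send($message);
}

// Replays scripted messages and records the replies instead of printing them.
class DemoCommunicatorScript extends DemoCommunicatorAbstract
{
    private $incoming;
    public $sent = [];

    public function __construct(callable $responder, array $incoming)
    {
        parent::__construct($responder);
        $this->incoming = $incoming;
    }

    public function connect()
    {
        foreach ($this->incoming as $message) {
            $response = $this->getResponse($message);
            $this->send($response ? $response : 'moo moo baa baa , me stupid');
        }
    }

    protected function send($message)
    {
        $this->sent[] = $message;
    }
}

$bot = new DemoCommunicatorScript(
    function ($message) {
        return $message === 'ping' ? 'pong' : null;
    },
    ['ping', 'hello']
);
$bot->connect();
print_r($bot->sent);
```

A scripted transport like this one can be handy for trying out commands without a terminal or a network connection.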

How can you get the code?


svn checkout http://kman.googlecode.com/svn/trunk/ kman-read-only

There is no documentation at the moment, only some examples found in the scripts directory. The API is subject to change until the first release, but no major changes are expected. The project uses some code from Zend Framework, which is under the New BSD License, and from xmpphp, which is under a GNU license.

That's it for now. I will come back with 0.1.