23 Sep

Markov Chains

So I have been doing a fair amount of work on Markov chains. They are really cool for modelling random processes. Earlier this year I wrote a program that would take a book (it was “Starship Titanic” by Douglas Adams) and convert it into a Markov chain stored as JSON. All it did was say: if the current word is this, the next word will probably be that. The really interesting part is that if you fed it enough text and then ignored or hashed the words, you would be left with a data structure that acts as a fingerprint of the author, language or time period, which could then be matched against other bodies of work. Think about it . . . you would be able to identify a text’s author, the date it was written or its language purely from the structure and not necessarily the words.
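
Here is a minimal sketch of that idea in Python, not the original program. The function names (`build_chain`, `generate`, `hashed`), the sample sentence and the choice of a truncated SHA-256 digest are all assumptions made for illustration. It counts which word follows which, dumps the result as JSON, and then replaces every word with a hash so that only the structure is left.

```python
import hashlib
import json
import random
from collections import Counter, defaultdict


def build_chain(text):
    """Map each word to a count of the words seen directly after it."""
    words = text.split()
    chain = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        chain[current][following] += 1
    # Plain dicts serialise cleanly to JSON.
    return {word: dict(counts) for word, counts in chain.items()}


def generate(chain, start, length, rng=random):
    """Walk the chain, picking each next word in proportion to its count."""
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        out.append(word)
    return " ".join(out)


def hashed(chain):
    """Replace every word with a short digest, keeping only the structure."""
    def h(word):
        return hashlib.sha256(word.encode("utf-8")).hexdigest()[:8]
    return {h(w): {h(n): c for n, c in f.items()} for w, f in chain.items()}


sample = "the ship sailed and the ship sank and the crew swam"
chain = build_chain(sample)
as_json = json.dumps(chain, indent=2, sort_keys=True)
fingerprint = hashed(chain)
```

The hashed version has the same shape and the same counts as the readable one, which is what would make it usable as a fingerprint. How you would actually compare two fingerprints is a separate question that this sketch does not try to answer.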

I quickly realized that there are caveats and limitations to standard Markov chains.

Firstly, as languages go, the next word that appears may be very different depending on context. I managed to get readable text between 5 and 10% of the time with what I will now call a first order Markov chain. What happens if you increase the order? Readability goes up to 30 to 50%. The catch is that the map becomes an order of magnitude more complicated, because now your chain requires the current word AND the previous word to know the probability of the next word. This extends to 3rd, 4th and higher orders as well. It could also turn your Markov chain into something like a neural net with the addition of the next part.
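
A minimal sketch of what raising the order looks like, again in Python. The state becomes the last `order` words instead of just the current one. Joining those words into a single string key (so the result still fits in JSON) is an assumption made here for illustration, as are the function name and the sample sentence.

```python
from collections import Counter, defaultdict


def build_chain(words, order=2):
    """Count which word follows each run of `order` consecutive words."""
    chain = defaultdict(Counter)
    for i in range(len(words) - order):
        state = " ".join(words[i:i + order])  # e.g. "the ship" when order=2
        chain[state][words[i + order]] += 1
    return {state: dict(counts) for state, counts in chain.items()}


words = "the ship sailed and the ship sank and the crew swam".split()
first = build_chain(words, order=1)
second = build_chain(words, order=2)
```

Even on this tiny sample the second order map has more states than the first order one (7 against 6), and each state has fewer observations behind it. On a whole book that trade-off is what buys the extra readability and costs the extra storage.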

Secondly, if you are using Markov chains as neural nets, you need to vary your probabilities based on either time decay or some other feedback function. This allows the chains to learn positive feedback behaviours and start to ignore negative feedback. There is a catch to this as well: your network has to treat 0 probability links as being linked, whereas standard Markov chains allow 0 probability links to be ignored (or effectively not there). This increases the storage space required, because n nodes have n x n possible connections (remember that a Markov chain has direction and a node can loop to itself). This scenario does not lend itself to JSON storage, which is essentially an efficient sparse matrix, but rather to a dense matrix storage method. A second order version would require (n x n) x n connections, so the approach does not lend itself to higher order chains: each order adds a dimension. 1st order would be 2 dimensions, 2nd order 3, 3rd order 4 etc.
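
To put rough numbers on that, here is a small Python sketch. `dense_entries` just counts the cells in the dense matrix for a given order. `reinforce` is one possible feedback rule, chosen purely for illustration: decay every link in a row, reward the link that was just used, then renormalise. The decay factor, the non-negative reward and the function names are assumptions and not a tested design.

```python
def dense_entries(n_words, order):
    """Cells in a dense transition table: one axis per previous word, plus one for the next word."""
    return n_words ** (order + 1)


def reinforce(row, chosen, reward, decay=0.9):
    """Decay all links in a row, boost the chosen link, renormalise to sum to 1.

    `row` must be dense: zero probability links are kept so they can be
    rewarded later. `reward` is assumed to be non-negative.
    """
    updated = [weight * decay for weight in row]
    updated[chosen] += reward
    total = sum(updated)
    return [weight / total for weight in updated]


# A vocabulary of 1,000 words needs a million cells at first order
# and a billion at second order.
first_order_cells = dense_entries(1000, 1)
second_order_cells = dense_entries(1000, 2)

# The third link starts at probability 0 but can still be learned.
row = [0.5, 0.5, 0.0]
row = reinforce(row, chosen=2, reward=0.1)
```

After one update the zero link has a probability of about 0.1, which is exactly why it cannot be dropped from storage the way a sparse JSON map would drop it.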

I have not yet done any work on the second case as I have no use-case at this time, but I’m sure I’ll get around to running a few tests in the next few months.

15 Sep

Internet of things and the Raspberry Pi

I had an interesting discussion today that I got really impassioned about. It happens sometimes. It was around the Raspberry Pi. Was it supposed to just be an educational tool? Maybe. Maybe not.

So there is a little project that I have built most of the components for and will probably put together over Christmas, but first a bit of background.

At the end of May my phone was stolen through my bedroom window. Really not a great experience. It was a good learning curve and I lost nothing save for one voice recording and the phone. Everything else was in the cloud. It also got me thinking about how I would build an alarm system from the ground up. I did some work with D-latches and shift registers a while back, so I was confident I could build the electronic side, which would need to read 32 inputs or control 32 outputs using a serial bus. This would have turned my Arduino Uno into a PLC, which is nifty in itself. How would I control it? What transport mechanism would be reliable enough to deliver messages? It had to be standard stuff and have good encryption. What messaging service supports reliable delivery of short messages anywhere in the world over the internet and supports bi-directional communication? Any guesses?

Well, as time passed, we upgraded the electric fence and the body corporate installed passive sensors along the perimeter, so my alarm idea went out the window.

Then we got some tropical fish. How do we monitor the temperature? Could we control it over the internet?

Each of the components I am about to describe I have built. I have not yet put all the building blocks together, but that is a matter of time.

  1. I built an interface from a PT100 temperature probe to an Arduino Uno, really easy and really accurate. One LM324 (a bit of overkill but I couldn’t get the LM741 to behave) and 4 resistors. Check The Art of Electronics for a differential op-amp circuit. This outputs a conditioned signal that allowed me to read the temperature to within a few tenths of a degree. There is a trick I learned long ago: you can build very expensive analogue electronics to get the right signal, or you can do all the processing once the signal is digitized. Guess which one is cheaper these days?
  2. You can use the I<sup>2</sup>C interface between a Raspberry Pi and an Arduino Uno. The Arduino is nice because of the built-in 10 bit analog to digital converter. The Pi is nice because of the next few pieces.
  3. The Pi supports Wifi and 3G. Mine is Wifi enabled using an R89 dongle. So my Pi connects over the internet regularly for updates etc.
  4. A really great piece I built was a Raspberry Pi Twitter bot that could send and respond to direct messages. You can send and receive commands to and from the Pi over Twitter.
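
The "do the processing once digitized" trick from item 1 can be sketched in a few lines of Python. This is not the code running on my hardware: it assumes the conditioned signal is roughly linear in temperature over the range of interest, and the two calibration points (count 200 at 20 °C, count 600 at 30 °C) and the function names are made up for illustration. Real values would come from measuring the probe against a reference thermometer.

```python
def make_converter(count_low, temp_low, count_high, temp_high):
    """Build a linear ADC-count-to-Celsius function from two calibration points."""
    slope = (temp_high - temp_low) / (count_high - count_low)

    def to_celsius(count):
        # A 10 bit converter can only report 0..1023.
        if not 0 <= count <= 1023:
            raise ValueError("count outside the 10 bit ADC range")
        return temp_low + slope * (count - count_low)

    return to_celsius


def smooth(counts):
    """Average several raw readings to knock down noise before converting."""
    return sum(counts) / len(counts)


# Hypothetical calibration: count 200 read at 20.0 C, count 600 at 30.0 C.
to_celsius = make_converter(200, 20.0, 600, 30.0)
reading = to_celsius(smooth([398, 400, 401, 401]))
```

With these made-up calibration points each ADC count is worth 0.025 °C, so averaging a handful of readings in software is a cheap way to get stable tenths of a degree without adding analogue filtering.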

Now the really cool reason I settled on Twitter as the messaging mechanism is twofold:

  1. Twitter allows searching, automatically archives data (think logging), time stamps and geo-codes (if enabled).
  2. Twitter integrates with If This Then That (IFTTT), which supports just about anything else web based. If you don’t know about it, IFTTT will change your world; you don’t even need to know how to program.

The second item is where this gets really interesting. You can start to time-base your commands, and tweets can trigger SMS messages or emails. You can geofence your phone and get your Pi to do things.

The essence of this project was to solve a simple problem, but it illustrates the power of the internet of things. Imagine every house able to tweet. Could you stop a crime wave if you had enough location information? Could you include the police on tweets for faster response? If you used this to measure a fish tank, could you provide live data sets for ichthyologists the same way Fitbit does for humans? This is the source of the real-world data avalanche. This is the data that describes the world we live in. Why should financial institutions have all the fun?

One thing is certain: we live in an interesting time where you can follow spacecraft that have landed on comets on Twitter and drones deliver ice-creams. Could your pet fish order their next meal by drone? It is no longer science fiction.