Converting Gatherer to a Bunch of Uncoupled Micro-Services

I’m currently creating a version of Gatherer, the engine behind NerdlyNews, in an uncoupled, micro-service-based design. It’s going pretty smoothly, and the benefits of creating and internally consuming an API are pretty obvious. It’s interesting how fast development starts moving once some API functionality is up and running. I’m starting work on the part that chooses which stories are interesting, and I’m using the chance to expand the knowledge base and the way I use it to make that choice.

Fiddling With the Intelligence Behind NerdlyNews

I’ve spent a bit of time messing around with the algorithms behind NerdlyNews. It was doing what I wanted, picking out interesting articles from a large amount of noise, but given the large number of choices, it was still too noisy itself. I’ve tweaked a couple of parameters over the last couple of days and I’m hoping it’s now going to be a pretty solid but not too busy stream to keep up with.

PageLoadStats is now on GitHub

PageLoadStats is a tool I wrote to grab performance data and other stats about web pages and chart them in a simple and useful way. It can also send alerts when the moving average of a page’s load times climbs past a configurable alert level. It’s written in Python using Django. If you have any interest in digging into the source code, grabbing a copy for ideas, or using it to start a new project, have at it!
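The moving-average alert described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the PageLoadStats source: the class name, the `window_size` and `alert_level_seconds` parameters, and the sample timings are all assumptions made up for the example.

```python
from collections import deque


class LoadTimeMonitor:
    """Track recent page load times and flag a slow moving average.

    Hypothetical sketch: the names and parameters here are illustrative
    and are not taken from PageLoadStats itself.
    """

    def __init__(self, window_size, alert_level_seconds):
        # A bounded deque drops the oldest sample once the window is full.
        self.samples = deque(maxlen=window_size)
        self.alert_level_seconds = alert_level_seconds

    def moving_average(self):
        """Return the mean of the samples in the window (0.0 if empty)."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

    def record(self, load_time_seconds):
        """Store a sample; return True if the average now exceeds the alert level."""
        self.samples.append(load_time_seconds)
        return self.moving_average() > self.alert_level_seconds


# Made-up load times, in seconds, for a single page.
monitor = LoadTimeMonitor(window_size=3, alert_level_seconds=2.0)
alerts = [monitor.record(t) for t in [1.0, 1.5, 2.0, 3.0, 4.0]]
```

Averaging over a window rather than alerting on each sample means a single slow request does not trigger an alert, while a sustained slowdown does.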

A Bit of Modular Web Design in Django

I found myself creating a web page intended to display a set of data objects, each similar in format. A pretty common need. The simple thing to do would be to iterate over the list of data in the Django template, for example:

    {% for o in some_list %}
    <div>#display data here#</div>
    {% endfor %}

I want to be able to re-use and centrally control how the data is displayed, anywhere on the site.

Naive Bayesian Probability is very cool…add bi-grams for extra coolness.

I’ve written a Django web-app that I’m still tinkering with. I have it slowly gathering information from multiple sources and classifying each piece (corpus) for me. I’m really happy with the progress. NLTK made implementation pretty straightforward, though there was a definite learning curve for me. I have no background in this field, so I had to learn a bit. For someone approaching this problem who already has the right linguistics background and some Python experience, I’ll bet that it’s amazingly easy to get started.
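To make the bi-gram idea concrete, here is a minimal naive Bayes sketch in plain Python that uses both single words and adjacent word pairs as features. It deliberately does not use NLTK and is not the code behind the app; the `features` helper, the `NaiveBayes` class, the labels, and the four training headlines are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Return unigram and bi-gram features for a piece of text."""
    words = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()               # documents seen per label
        self.feature_counts = defaultdict(Counter)  # feature counts per label
        self.vocabulary = set()                     # every feature seen in training

    def train(self, text, label):
        self.label_counts[label] += 1
        for feature in features(text):
            self.feature_counts[label][feature] += 1
            self.vocabulary.add(feature)

    def classify(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.label_counts.items():
            # Work in log space so many small probabilities do not underflow.
            score = math.log(doc_count / total_docs)
            total = sum(self.feature_counts[label].values())
            for feature in features(text):
                if feature not in self.vocabulary:
                    continue  # ignore features never seen in training
                count = self.feature_counts[label][feature]
                score += math.log((count + 1) / (total + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training headlines, purely for demonstration.
classifier = NaiveBayes()
classifier.train("new python release adds faster startup", "interesting")
classifier.train("open source database gets major update", "interesting")
classifier.train("celebrity spotted at award show", "boring")
classifier.train("award show ratings fall again", "boring")

tech_guess = classifier.classify("python database update")
gossip_guess = classifier.classify("award show ratings")
```

The bi-grams matter in the second query: "award show" counts as its own feature on top of "award" and "show", so phrases that recur in one class pull the score further than the individual words would.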