python

Timely News. My Latest News Auto-Gathering Experiment

I’ve been experimenting with Bayesian selection to gather news on http://nerdlynews.32hours.com for a while now, and it works pretty well. I decided to try using Google trending data to create a more ‘pop’ type of creation, just to see how well it would work. I’ve set up Timely using Python 3, Flask, and the Python modules newspaper3k and Beautiful Soup. I ran into a problem with Flask handling the launch of external processes, but a minor workaround is working now: cron runs the news gathering as a direct call to my Python methods.

Rabot, My Personal Daemon, Shared on GitHub

I’ve been messing around with a bit of code that aggregates a bunch of little task handlers, each intended to take care of some small thing I’d like to automate for myself. Today, the primary functions in Rabot revolve around getting important weather updates to me. It stores data locally in MongoDB and sends me the weather tidbits I find important via Twitter DMs. That data is especially interesting to me as a motorcyclist, since extreme temperatures and rain are a big deal and require special actions.

Site Migrated to Pelican Static Site Generator

Yesterday I migrated this site from WordPress to the Pelican static site generator. I made the change partly as a learning process and partly to move away from an unnecessarily heavy CMS hosting app. Was it a surprise to me that the migration made my site fast? No. Was it a surprise how much faster? Oh, yes. Look at that chart! Sub-100ms times. And on a tiny little server hosting several other sites.

Converting Gatherer to a Bunch of Uncoupled Micro-Services

I’m currently creating a version of Gatherer, the engine behind nerdlynews, in an uncoupled, micro-service-based design. It’s going pretty smoothly, and the benefits of creating and internally consuming an API are pretty obvious. It’s interesting how fast development starts moving after getting some API functionality up and running. I’m starting work on the bit that chooses what is interesting, and I’m using the chance to expand the knowledge base and the way I use it to pick out interesting news.

Fiddling With the Intelligence Behind NerdlyNews

I’ve spent a bit of time messing around with the algorithms behind NerdlyNews. It was doing what I wanted, picking out interesting articles from a large amount of noise, but given the large number of choices, it was still too noisy itself. I’ve tweaked a couple of parameters over the last couple of days, and I’m hoping it’s now a pretty solid but not too busy stream to keep up with.

PageLoadStats is now on GitHub

PageLoadStats is a tool I wrote to grab performance data and stats about web pages and chart them in a simple and useful way. It can also send alerts when the moving average of page load times rises past a configurable alert level. It’s written in Python using Django. If you have any interest in digging into the source code, grabbing a copy for ideas, or using it to start a new project, have at it!
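
For a rough idea of how a moving-average alert like this can work, here is a minimal sketch. It is not the PageLoadStats code: the class name, the window size, and the threshold are all assumptions made up for illustration.

```python
from collections import deque


class MovingAverageAlert:
    """Track the last `window` load times and flag when their mean exceeds `threshold`.

    Hypothetical helper for illustration; not taken from PageLoadStats.
    """

    def __init__(self, window, threshold):
        self.samples = deque(maxlen=window)  # the oldest sample drops off automatically
        self.threshold = threshold

    def add(self, load_time_ms):
        """Record one sample; return True when the moving average is above the threshold."""
        self.samples.append(load_time_ms)
        average = sum(self.samples) / len(self.samples)
        return average > self.threshold


monitor = MovingAverageAlert(window=3, threshold=500)
for ms in (200, 300, 1500, 900):
    if monitor.add(ms):
        print(f"ALERT: moving average above {monitor.threshold} ms after a {ms} ms sample")
```

Averaging over a window means a single slow page load only trips the alert if it is large enough to drag the whole window over the limit, which keeps one-off blips from being as noisy as alerting on every raw sample.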

Waiting for elements on dynamic webpages Selenium / Webdriver

Below is my current favorite method to wait for an element to appear or become useful on a dynamic web page. In this case, my example avoids the exception thrown when webdriver fails to find an element by using the ‘find_elements’ (plural) method rather than ‘find_element’ (singular). The ‘find_elements’ methods always return a list, even if empty, rather than throwing an exception. Both are useful, but in this case I like the more readable code without the try/except requirements.
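
A minimal sketch of that polling idea, assuming nothing about the original post's code: the function name, the timeout, and the poll interval are hypothetical. It takes any callable that returns a list, so with Selenium you would pass something like `lambda: driver.find_elements(By.ID, "results")`, where the `"results"` ID is only an example.

```python
import time


def wait_for_elements(find, timeout=10.0, poll=0.5):
    """Call `find` until it returns a non-empty list or `timeout` seconds pass.

    `find` is any zero-argument callable that returns a list, for example
    lambda: driver.find_elements(By.ID, "results") with a Selenium driver.
    Returns the list from the last call, which is empty on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        elements = find()  # an empty list means "not there yet", no exception to catch
        if elements or time.monotonic() >= deadline:
            return elements
        time.sleep(poll)
```

Because an empty list is falsy, the caller can write `if wait_for_elements(...):` and skip the try/except entirely.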

Python ErrorList object for use in Webdriver Testing

Here’s a bit of code from a post that was lost when my old site went down, data and all. I don’t recall if the original post was this Python version or my original Java version (sorry if that’s what you’re here for; ask in the comments and I can find and post that too). It’s an implementation of an ‘assert’ statement that allows the test to continue on failure, storing the error.
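
To illustrate the general idea of an assert that records failures instead of stopping the test, here is a minimal sketch. It is an assumption about the shape of such an object, not the ErrorList code from the original post; the method names `verify` and `assert_no_errors` are hypothetical.

```python
class ErrorList:
    """Collect failed checks so a test can keep running and report them together.

    Illustrative sketch only; method names are made up for this example.
    """

    def __init__(self):
        self.errors = []

    def verify(self, condition, message):
        """Like assert, but store the failure message instead of raising."""
        if not condition:
            self.errors.append(message)
        return bool(condition)

    def assert_no_errors(self):
        """Raise one AssertionError listing every recorded failure, if any."""
        if self.errors:
            raise AssertionError("; ".join(self.errors))


checks = ErrorList()
checks.verify(1 + 1 == 2, "arithmetic is broken")
checks.verify("Login" in "Welcome page", "login link missing")
checks.verify(len([]) == 0, "list should be empty")
```

Calling `checks.assert_no_errors()` at the end of the test then fails once, with every stored message in the error text.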

A Couple Personal Projects: NerdlyNews and PageLoadStats

I have worked on several web-based projects. I recently created NerdlyNews, which uses Bayesian logic to grab interesting news from sites that I really like. I’m using a WordPress front-end for that one, along with the Jetpack extension, so I can have the output of the Bayesian algorithm posted to the site using the WordPress APIs. It’s really a nice way to go since I’m not a UI designer. I also created PageLoadStats.
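
As a rough illustration of what Bayesian selection of articles can look like, here is a tiny naive Bayes scorer with add-one smoothing. This is a generic textbook sketch, not the NerdlyNews algorithm; the class name, the two labels, and the training headlines are all invented for the example.

```python
import math
from collections import Counter


class InterestScorer:
    """Tiny naive Bayes text scorer with add-one (Laplace) smoothing.

    Generic illustration; assumes both labels have been trained at least once.
    """

    LABELS = ("interesting", "boring")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, label, text):
        """Add one labeled example to the word and document counts."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def _log_score(self, label, words):
        """Log prior plus summed log likelihoods of the words under `label`."""
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set().union(*self.word_counts.values()))
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in words:
            # Add-one smoothing keeps unseen words from zeroing out the score.
            score += math.log((counts[word] + 1) / (total + vocab))
        return score

    def is_interesting(self, text):
        words = text.lower().split()
        return self._log_score("interesting", words) > self._log_score("boring", words)


scorer = InterestScorer()
scorer.train("interesting", "python release adds new features")
scorer.train("interesting", "new linux kernel released")
scorer.train("boring", "celebrity gossip roundup")
scorer.train("boring", "celebrity fashion news")
```

With this toy training data, `scorer.is_interesting("new python kernel")` comes out true and `scorer.is_interesting("celebrity gossip")` comes out false.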

A Bit of Modular Web Design in Django

I found myself creating a web page intended to display a set of data objects, each object similar in format. A pretty common need. The simple thing to do would be to iterate over the list of data in the Django template, for example: {% for o in some_list %} <div>#display data here#</div> {% endfor %} I want to be able to re-use and centrally control how the data is displayed, anywhere on the site.