Creating With Code

A blog by Robert (Marty) McGuire

Face Detection in Static Images with Python

February 20th, 2009 · 4 Comments

One of the things I’ve been longing to do with my mobile photo-sharing site Camura is to offer image annotations, like objects and faces.  Over the last couple of years I have been increasingly frustrated by the appearance of face tagging on services like Facebook, and the recent addition of face recognition to iPhoto has brought this frustration to the surface once again.  I don’t even want to do something as complex as face recognition - I just want to find faces in an image.

Googling for things like “open source face detector” doesn’t come up with much.  The landscape seems to be composed mostly of expensive for-pay libraries written for Windows, abandoned research projects, and lots of research papers full of equations, but no code that I could get to run.

To make a long post short, it turns out that Intel’s OpenCV computer vision library comes with a face detector example that should work out of the box.  Better yet, there are now some decent Python bindings for OpenCV that come pre-packaged with OpenCV for Ubuntu and Debian.  You can install them with:

$ sudo apt-get install python-opencv

Now, it seems that most OpenCV face detector examples are meant to be run “live”, usually taking the image from a webcam and highlighting faces with a red box in real-time.  However, I have a large database of static images that I want to consider individually, and I simply want to save the face coordinates for later use, rather than altering the picture.

So, with a bit more Googling, I found a Python script that I could chop up and use for this purpose, and here is what I came up with:
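The original listing is missing from this page, so what follows is a minimal sketch of the same idea rather than the exact script. It assumes the newer `cv2` bindings (not the older ones that shipped in `python-opencv` at the time) and the stock frontal-face Haar cascade that comes with OpenCV. The function names `detect_faces` and `format_faces` are made up for illustration, and the `cv2.data.haarcascades` default is an assumption that holds for the pip `opencv-python` packages; pass your own cascade path if it doesn't exist on your install.

```python
import sys


def detect_faces(image_path, cascade_path=None):
    """Return a list of (x, y, w, h) boxes for faces found in a still image."""
    import cv2  # imported lazily so format_faces() works without OpenCV

    if cascade_path is None:
        # Assumption: cv2.data is provided by the pip opencv-python packages.
        cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"

    image = cv2.imread(image_path)
    if image is None:
        raise IOError("could not read image: %s" % image_path)

    # Haar cascades work on a single channel, so convert to grayscale first.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)

    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise IOError("could not load cascade: %s" % cascade_path)

    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
    return [tuple(int(v) for v in face) for face in faces]


def format_faces(faces):
    """Format each (x, y, w, h) box as "[(x1,y1) -> (x2,y2)]", one per line."""
    return "\n".join(
        "[(%d,%d) -> (%d,%d)]" % (x, y, x + w, y + h) for (x, y, w, h) in faces
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python face_detect.py image.jpg [cascade.xml]
    cascade_arg = sys.argv[2] if len(sys.argv) > 2 else None
    print(format_faces(detect_faces(sys.argv[1], cascade_arg)))
```

Nothing is drawn on the image; the script only prints the corner coordinates, which is all I need for storing them alongside each photo.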

An example run of the script looks something like this:

$ python face_detect.py marty_mcguire.jpg
[(50,36) -> (115,101)]

You can overlay that rectangle on an output image with ImageMagick’s “convert”:

$ convert marty_mcguire.jpg -stroke red -fill none -draw "rectangle 50,36 115,101" output.jpg

And the output might look something like this:

My face, it has been detected.

Pretty fun stuff!

→ 4 Comments · Tags: HowTo

Getting the Boarduino working with OS X Leopard

June 12th, 2008 · 2 Comments

I’ve been into hardware hacking on and off for much of my life, but I’ve never really had the time and confidence to design and build my own hardware. In recent years, projects like the open source Arduino have been slowly convincing me that I just might be able to do this stuff.

When a recent Make blog article showed nearly step-by-step instructions for building a breadboard-friendly Arduino clone called the Boarduino, I felt compelled to order one and try to get it working with my new MacBook.

I ordered the DC Boarduino kit from LadyAda’s website, along with the USB TTL-232 cable that would connect to my laptop’s USB port. I also ordered the 9V power supply on the site to power the Boarduino.

I found building the Boarduino to be pretty easy (=ahem= with only one screwup on my part) thanks to LadyAda’s detailed instructions. It was the first soldering project I had done in a while, so I am happy that it went so well.

I next looked to the Arduino OS X guide for the software download and installation instructions. I ended up grabbing version 11 of the Arduino software. After unpacking the .zip file from the site, I installed the FTDI driver and rebooted, then plugged in the Boarduino’s power and connected it to the laptop via USB.

Next, I double-clicked the Arduino application to start it, and nothing happened. After messing around with it on the command line for a while, I determined that it didn’t like the version of Java I’m using (I have the 64-bit version of Java 6 as my default).

So, to run the Arduino software, I had to temporarily set my preferred Java version to 5.0 via the Java Preferences panel in /Applications/Utilities/Java/. Once I had changed my Java version, the Arduino app started right up with a double-click. Once I was done with the app, I could set my Java preferences back to Java 6.

In the Arduino app, I set it up to communicate with the Boarduino by selecting “Arduino NG or older w/ ATmega168” under the Tools | Board menu. I then opened up the Blink test program under File | Sketchbook | Examples | Digital | Blink. To load the program onto the Boarduino, I pressed its reset button, then quickly clicked the “Upload to I/O Board” button on the interface.

The red LED on the Boarduino blinked rapidly as data was received, then it started blinking slowly as it ran the Blink program! Hooray!

I’m excited that it’s so easy to get something as powerful and versatile as the Boarduino up and running in just a couple of hours. I’m going to try and think of a couple of projects for it. Whatever I do, I’ll be sure to post about it here.

→ 2 Comments · Tags: HowTo

A couple of Leopard configuration tricks

June 12th, 2008 · 1 Comment

I find Google to be a more and more valuable resource as time goes on, especially when seeking knowledge about how to make this new MacBook act like I want it to.

One thing I noticed early on was that pressing Tab to move around various interfaces, from Finder dialogs to web pages, would only tab to text entry fields. Coming from a primarily Windows background, I am used to tabbing my way to checkboxes, dropdowns, and buttons in my interfaces, so this crippled Tab navigation quickly became annoying.

Thanks to this blog post, I found the option to make Tab move through “All Controls” in System Preferences | Keyboard & Mouse | Keyboard Shortcuts.

Another big issue I came up against is the weird Java support that comes with OS X. Because Apple releases their own versions of Java, we OS X users are kind of at their mercy with respect to what we can use and how we can configure it. You can install Java 6 from Apple’s download page, but Java 1.5 will still run by default.

Apple provides a “Java Preferences” configuration utility in /Applications/Utilities/Java/ which lets you change the default version of Java that will be used in browsers and when double-clicking to launch Java apps. However, this utility doesn’t change which version of Java will be found by command-line apps in the terminal, such as Apache Ant.

Another blog post to the rescue! It turns out that you can change your default version of Java for command line apps freely by changing the CurrentJDK and Current symlinks in /System/Library/Frameworks/JavaVM.framework/Versions/. For example:

    cd /System/Library/Frameworks/JavaVM.framework/Versions
    sudo ln -fhsv 1.6 CurrentJDK
    sudo ln -fhsv 1.6 Current
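
If you want to see where those two symlinks point before or after changing them, a few lines of Python will do it. This is only a sketch: the function name `current_java_links` is made up for illustration, and it only reads the links, it never changes them.

```python
import os

# The directory from the commands above.
VERSIONS_DIR = "/System/Library/Frameworks/JavaVM.framework/Versions"


def current_java_links(versions_dir):
    """Map each of the two symlink names to its target, or None if it is not a symlink."""
    targets = {}
    for name in ("CurrentJDK", "Current"):
        link = os.path.join(versions_dir, name)
        targets[name] = os.readlink(link) if os.path.islink(link) else None
    return targets


if __name__ == "__main__":
    for name, target in sorted(current_java_links(VERSIONS_DIR).items()):
        print("%s -> %s" % (name, target))
```

On a machine without that directory both entries simply come back as None.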

Hopefully this post will be able to help others (or at least my future self).

→ 1 Comment · Tags: HowTo

Setting up Rails and MySQL on OS X Leopard

June 12th, 2008 · No Comments

I recently purchased a MacBook to use as my primary development system. One of my first plans was to get up and running with Rails on this new machine. Given that Leopard (OS X 10.5) comes with Ruby version 1.8.6 and RubyGems 1.0.1, I thought I would be in good shape, but there were a couple of snags. Thanks to the power of Google and other people with similar problems, I was able to get it sorted out. Here’s what I did:

  1. Updated rubygems to 1.1.1 with sudo gem update --system
  2. Installed the 32-bit (x86, NOT x86_64) version of MySQL (community edition, of course) via the installer on the MySQL download page.
  3. Installed the native MySQL driver for Ruby (this was tricky, see below)
  4. Finished by installing Rails and its friends via gem install rails

Getting the speedier native MySQL driver for Ruby installed was tricky for two reasons. First, it seems that by default the installer attempts to compile for 4 architectures, so you have to set an environment variable to restrict it to one. Second, you have to pass along a parameter that points to the installed copy of MySQL when calling gem:

sudo env ARCHFLAGS="-arch i386" \
   gem install mysql -- \
   --with-mysql-config=/usr/local/mysql/bin/mysql_config

If you’ve never seen it before, the ‘--’ after ‘gem install mysql’ means that gem should pass along the next arguments to the programs it uses to build the driver.

The reason for installing the 32-bit version of MySQL (on your fancy 64-bit OS and machine!) is that the version of Ruby that ships with Leopard is apparently 32-bit only. Yikes.

I’ve noticed several little annoying details like this in getting other things to work “like I like” on OS X. I’ll probably end up posting more about them in the next couple of posts.

I know that “how to” posts of the form “I wanted to do X, but Y happened. Finally I fixed it by Z,” are pretty boring. Still, I hope they are useful for preserving the knowledge of how to work around these problems, both for myself and for others.

→ No Comments · Tags: HowTo

Podcast 3: Are we getting better at this, yet?

June 1st, 2008 · No Comments

I have acquired a fancy microphone to match Loren’s, and we have created another podcast. We started with a list of topics to talk about this time, though of course we used those items more as suggestions than as guidelines. Here are some things we talked about this week along with a bundle of links to keep you busy:

  • Cheap beer seems to follow us around for some reason, but we drink it anyway.
  • Google’s I/O Conference happened this week, and many interesting (though not life-changing) announcements were made, including
    • A new Google Earth API/Plugin which integrates with their existing Maps API and brings all of those KML files and third-party 3D models and data to the web.
    • Google AppEngine went ahead and let everyone in from the waitlist, and announced new pricing competitive with Amazon’s EC2 along with a new Memcache API, which should allow for faster (and cheaper!) performance. There doesn’t seem to be any improvement to ease of development, however.
  • Adobe has announced 3D hardware support in Flash 10, but it’s Windows-only at the moment. For a “universal” platform like Flash, what will come of this feature and other Flash 3D engines like Papervision3D?
  • Loren has finally upgraded many of his projects to Rails 2.0, encountering a few challenges.
  • I’ve been exploring lots of Rails deployment options (such as Nginx, Mongrel, and Thin) but I’m most excited about using the new Passenger (a.k.a. mod_rails) plugin for Apache. Dreamhost seems pretty excited, too.
  • Loren has upgraded his video-on-the-web service from Justin.TV to Stickam, and has been working on new ways of bringing video to his investment club, Ben Hur Investments. Look for them, soon!
  • Embedding and remixing are becoming new themes for text, audio, and video on the web. We talk about several services and their impacts, including RSS, Yahoo! Pipes, and Twitter, and discuss how Loren is using these tools to create a new service to track the Atlanta web entrepreneur community.  We also talk a bit about Twitter’s recent service “brownouts”, and what they might mean for the community.

We’re still sticking with an under-the-radar, links-on-blogs format for sharing this podcast, so you may download this week’s episode here, or listen below:

Please leave us some feedback, either in the comments here or on Loren’s post for this episode. We look forward to hearing your thoughts!

→ No Comments · Tags: Podcasting

Extracting projects from a shared Subversion repository

May 28th, 2008 · 2 Comments

I recently had the need to migrate a project from a Subversion (SVN) repository that was shared among many other projects and groups to a fresh repository where it would be the first of many projects.

My first instinct was to simply use svnadmin dump to dump out the contents of the whole shared repository, transfer that to the new machine, use svnadmin load to load the data into the new repository, and then delete the projects that I did not want.

The first pass at this created a 2.5GB dump file, something which I did not want to send over the network! After poking around at the options for svnadmin dump, I found that I could shrink this down to about 1GB by using the --deltas flag, which saves space by dumping only the differences between each revision in the repository. 1GB was still pretty big, but we have a fast network, so it wasn’t that painful. I transferred it to the new server, created a new repository, and ran svnadmin load to load the dump into the repository.

All I had to do next was delete the directories from the repository that I didn’t want. I knew this would be a little tricky because I didn’t want to keep any code from those projects around, and simply running svn delete on each directory would have kept the other projects in the repository’s history.

As it turns out, you can’t just remove all traces of something from a Subversion repository. The reasons for this are many, but they simply haven’t gotten around to implementing svn obliterate, yet. The current solution is to create a dump with svnadmin dump, and then process that file with a tool called svndumpfilter.

The docs for svndumpfilter are pretty straightforward, so I tried using it on the dump file I had already created, but no matter what I did I kept getting this error:

svndumpfilter: Unsupported dumpfile version: 3

What the docs (and error message) don’t tell you is that svndumpfilter only works on full dump files, and doesn’t support dump files made with the --deltas flag.
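
You can tell which kind of dump file you have without running svndumpfilter at all, because the first line of a dump is a header of the form `SVN-fs-dump-format-version: N`. Here is a small Python sketch that reads it. The function names are made up for illustration, and treating anything above version 2 as unfilterable is my reading of the error above, not something taken from the docs.

```python
import sys

HEADER_PREFIX = "SVN-fs-dump-format-version:"


def dump_format_version(first_line):
    """Parse the version number out of a dump file's first line."""
    if not first_line.startswith(HEADER_PREFIX):
        raise ValueError("not a Subversion dump header: %r" % first_line)
    return int(first_line[len(HEADER_PREFIX):].strip())


def read_dump_version(path):
    """Read only the first line of the dump file at `path` and parse it."""
    with open(path, "rb") as dump:
        return dump_format_version(dump.readline().decode("ascii", "replace"))


def svndumpfilter_can_read(version):
    """Assumption: full dumps are version 2, --deltas dumps are version 3."""
    return version <= 2


if __name__ == "__main__" and len(sys.argv) > 1:
    version = read_dump_version(sys.argv[1])
    print("format version %d, filterable: %s" % (version, svndumpfilter_can_read(version)))
```

Checking this first would have saved me a round of copying a 1GB file across the network.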

Long Story Short (Too Late)

In the end, what I wanted was simple, but not obvious. On the original server, I ran:

svnadmin dump /path/to/original/repository | \
    svndumpfilter include my_project \
               --drop-empty-revs \
               --renumber-revs > dump_file

I was then able to copy the resulting (much, much smaller!) dump file to the new machine, blow away and re-create the new repository, and load it with svnadmin load.

And now maybe you can learn from this example instead of having to figure it out yourself through trial-and-error!

→ 2 Comments · Tags: HowTo

Podcasting about the cloud

May 25th, 2008 · 2 Comments

Despite a lack of comments from the peanut gallery, Loren and I have created another podcast. You can download it here (~11.4MB) or listen below:

We did this podcast with a little less preparation, so the topics wander a little bit more. Some things you can expect to hear:

  • Justin.tv is great, but it may not be the be-all for live web video broadcasting because of the way it prevents you from embedding “episodes” and live chat in your own pages. Given alternatives like Ustream and Stickam, and thanks to the power of embeddable apps, like Backnoise, will it matter what Justin.tv does and doesn’t support?
  • Features aside, content on Justin.tv is crazy! Loren and friends spent Saturday broadcasting themselves brewing beer, playing games, and getting rowdy all on Loren’s “come watch” page where their friends could play along via chat provided by Backnoise. Also, Marty missed the first game of the Pens vs. Red Wings series, but apparently could have caught it from Justin.tv!
  • Google announced FriendConnect, with exciting demos and no other documentation to speak of. Is it vaporware? Is it evil?
  • I talk in way too much detail about my experiences playing with the Google App Engine and its (lack of) support for the excellent Python web framework Django. Long story short: it’s great if you’re willing to write a bunch of infrastructure yourself, but lousy for banging out a weekend project.
  • Also, there’s this new thing called the Dash, an always-connected GPS unit built on the OpenMoko Linux-based cell phone platform. They did a media blitz this week and Loren met them at startupriot just before it happened.

Please have a listen and give us some feedback! What’s interesting? What’s not? What would you like to hear us talk about? Or tell us we’re wrong about? Leave a comment here or on Loren’s post for this entry.

→ 2 Comments · Tags: Podcasting

A minor example of podcasting

May 20th, 2008 · 1 Comment

I’ve recently become interested in forms of “live” online media such as podcasting. Specifically, I am interested in trying to extract the maximum possible value out of podcasts, such as transcripts, time-based tags and links, excerpts, etc. This interest, along with a recent outpouring of Internet media from my friends over at Snowcap Labs, has convinced me to take a look at creating a “podcasting hub” with tools that let podcasters and their listeners easily create, share, and link this information.

As a first step into understanding this problem, my friend Loren and I have decided to try creating our own podcast. Namely, we recorded an episode of an as-yet-unnamed podcast which you can listen to here:

Using the free version of the Pamela Call Recorder for Skype, we recorded three 15-minute segments to create a 45-minute podcast. During that time we talked about many things, from the ins and outs of getting started with podcasting tools to discussing the opportunities and challenges that come with putting oneself online “live” to the power that technology gives us to make meaningful artifacts from our lives. You can read Loren’s description of the podcast on his blog.

Give it a listen if you have a chance. You can listen via the embedded Flash player above, or you can download the ~10.8MB mp3 file here.

If you do listen, please leave us some feedback! I am interested to know your thoughts on the podcast itself, on what you’d like to see from a “podcast hub” site, on what information we talked about in the podcast you’d like to see made easily accessible (such as links or products that we mention), or parts you’d like to see excerpted (if any). Please remember that this is a rough “one-off” recording. After another episode or two we will want to try and choose a name and a general direction for the podcast before we go syndicating ourselves with RSS feeds, Atom feeds, and iTunes subscriptions.

→ 1 Comment · Tags: Podcasting

Simple online storage with Amazon’s S3

May 20th, 2008 · 4 Comments

If you’ve been following all of the latest buzz about data and applications “in the cloud”, then you’re probably already familiar with Amazon’s Simple Storage Service (S3). If you’re not, or if you’ve never had the chance to play with it, then this post should help you get up, running, and able to easily store your files online, where they can be accessed from anywhere, and won’t be lost if your computer breaks or goes missing.

What is Amazon’s S3?

To put it simply, S3 is a pay-as-you-go service for storing data on Amazon’s servers. Roughly speaking, you pay for bandwidth in and out of the service as well as for data stored on the servers. For light use, bandwidth costs are about $0.10 per Gigabyte uploaded to S3, $0.17 per Gigabyte downloaded, and storage is $0.15 per Gigabyte per month. So, for example, if you uploaded 5 Gigabytes worth of pictures to S3 on May 1st and did nothing for the rest of the month, you would owe about $1.25 come June 1st (that’s $0.50 for the upload, $0.75 for the storage). Data stored on S3 is actually stored across many different servers all along Amazon’s infrastructure, making it highly unlikely that anything put there will become lost or corrupted. You can learn more at Amazon’s info page for S3.
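
To make the arithmetic above concrete, here is a small Python sketch. The three rates are the light-use prices quoted above; the function name `monthly_cost` is made up for illustration, and the sketch ignores anything beyond those three rates.

```python
# Light-use rates quoted above, in dollars per Gigabyte.
UPLOAD_PER_GB = 0.10
DOWNLOAD_PER_GB = 0.17
STORAGE_PER_GB_MONTH = 0.15


def monthly_cost(uploaded_gb, downloaded_gb, stored_gb):
    """Estimate one month's bill from transfer in, transfer out, and storage."""
    return (
        uploaded_gb * UPLOAD_PER_GB
        + downloaded_gb * DOWNLOAD_PER_GB
        + stored_gb * STORAGE_PER_GB_MONTH
    )


if __name__ == "__main__":
    # The example from the text: upload 5 GB of pictures, then leave them alone.
    print("$%.2f" % monthly_cost(uploaded_gb=5, downloaded_gb=0, stored_gb=5))  # $1.25
```

Downloading all 5 Gigabytes once in that same month would add 5 × $0.17 = $0.85 to the bill.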

One of the great things about S3 (and Amazon’s other web services) is that they have a well-documented API for allowing people to write programs to interact with it. In fact, many great tools have already been written to allow you to manage data on S3. The rest of this post will help you get signed up for S3 and introduce you to an excellent tool for using S3: the S3 Firefox Organizer (S3Fox).

Getting an S3 Account

If you have an Amazon account (and who doesn’t?) it’s easy to sign up.

  1. If you don’t have an Amazon Web Services (AWS) developer account, sign up for one by visiting http://aws.amazon.com/ and clicking “Sign Up for AWS”. All you need is your Amazon account information and a credit card to use for billing at the beginning of every month.
  2. Once you’ve created your AWS developer account, you can sign up for S3 by visiting http://aws.amazon.com/s3/ and clicking the big shiny “Sign Up For This Web Service” button. After agreeing to the terms of service, your S3 account will be all set up and ready to store your data.

Your S3 Credentials

Before you can actually use any software with your S3 account, you’ll need to get your “AWS Access Identifiers”. These identifiers (called your “Access Key ID” and your “Secret Access Key”) are long strings that verify you (or a piece of software acting on your behalf) as the owner of your S3 account. You can retrieve these by visiting http://aws.amazon.com/, clicking on “Your Web Services Account”, and selecting “AWS Access Identifiers”. Copy your identifiers somewhere safe, as you will need them for the next section.

The S3 Firefox Organizer (S3Fox)

If you use Firefox (and you should!), you can use the excellent S3Fox plugin to browse and manage data stored on one or more S3 accounts. Once you’ve set up your account in S3Fox (by entering the identifiers described above), you can view your S3 account almost like a remote drive or FTP client.

  1. Install the S3Fox plugin either from the S3Fox page on the Mozilla add-ons site or from the S3Fox home page.
  2. Once you’ve got the plugin installed, open it up in Firefox by opening the Tools menu and selecting S3 Organizer.
  3. With the organizer open, select the Manage Accounts link in the upper-left-hand corner of the interface.
  4. Give your account a name (for convenience), then enter the identifiers for your S3 account.
  5. You’re ready to use S3Fox to manage files on S3!

Making your first “bucket”

Now that you’ve set up your account with S3Fox, you can begin storing data there. One of the first things to do before uploading files to S3 is to create at least one “bucket” in which to store your data. While S3Fox gives us a nice file-system-like or FTP-like interface, the fact remains that the root directories shown in S3Fox (those that appear in the right-hand pane for the “/” directory) are actually “buckets” to S3. I’ll explain more about those in a later blog post, but for now just keep in mind that an S3 bucket must be globally unique. That means that only one S3 user anywhere can have a bucket with a given name. This is similar to the concept of domain names.

You can create your first bucket by clicking the Create Directory button (the little shiny blue folder) on the right-hand pane, and giving the new directory a unique name. Some examples of a unique name could be your own domain name, or a subdomain like backups.mydomain.com, or even something random like awesome-files-for-jane-smith. You can read more about working with buckets from the S3 technical docs here.

Enter this new directory by double-clicking on it. You can now use S3Fox to browse around your local files in the left-hand pane, and select any files or directories for upload. Similarly, you can set up S3Fox on other machines you might have (like a laptop or a desktop) and download files or directories to those machines from S3.

Another nice feature of S3Fox is the ability to synchronize local folders to S3. Setting up synchronized folders allows you to easily back up directories by uploading only the files that have changed. This saves both time and money, which is a good thing.
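
I don’t know how S3Fox decides which files have changed, but the general idea of “upload only what’s different” can be sketched in Python by comparing checksums. Everything here is an assumption for illustration: the function names are made up, and `remote_md5s` stands in for whatever record you keep of the MD5 of each file already uploaded.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_needing_upload(local_dir, remote_md5s):
    """List relative paths whose local MD5 differs from (or is missing in) remote_md5s.

    remote_md5s is a hypothetical dict mapping "relative/path" to a hex MD5.
    """
    changed = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            full_path = os.path.join(root, name)
            # Use forward slashes so the keys look the same on every platform.
            rel_path = os.path.relpath(full_path, local_dir).replace(os.sep, "/")
            if remote_md5s.get(rel_path) != md5_of_file(full_path):
                changed.append(rel_path)
    return sorted(changed)
```

Only the paths that come back from `files_needing_upload` would need to be sent again; everything else is already up to date.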

Limitations of S3Fox

S3Fox is excellent, but has a couple of major downsides. The most painful downside is that S3Fox is not written to handle very large numbers of files at one time. If you try to upload or synchronize a directory with a huge number of files, for instance, you will notice that S3Fox basically freezes up your browser for several seconds (maybe minutes!) while it figures out which files need updating. S3Fox is also not the fastest tool for transferring files, as it pauses for a fraction of a second between each upload or download. For these reasons, I think S3Fox should only be used for dealing with a few files at a time, or for browsing to understand what is in your S3 account.

In a future post I’ll talk about S3Sync, an excellent command-line tool for uploading and downloading large numbers of files to and from S3 which has neither of these limitations (though it doesn’t have the pretty interface).

Support the Developer!

If you like S3Fox, why not support the developer by donating via the PayPal link on his page? Perhaps, with your donations and feedback, we will see S3Fox continue to improve!

→ 4 Comments · Tags: HowTo