Adventures With AirTunes

April 28, 2008

The main addition to Signal 1.1 was the ability to control AirTunes speakers, something I was very happy to finally be able to offer, as it was by far the most requested feature over the life of the application. Of course, now that it’s implemented, one of the new most requested features is “show which speakers are actually active!” This is something I’d very much like to see added myself, and the reason it’s not supported stems from the same limitation of the iTunes programming interface that caused AirTunes control to be delayed for so long.

When the AirPort Express first came out, I was incredibly excited. Here was a way to put together a whole-house audio system that integrated seamlessly with iTunes; all that was missing was a slick way to control it via Wi-Fi. Of course, I soon discovered that there was in fact no way to control AirTunes through the iTunes API. OK, no problem, I’ll just file a feature request (rdar://problem/3821346 for any Apple folks). I’m sure they’ll add it soon.

That was nearly four years ago. It’s still not there.

Meanwhile, the requests for AirTunes support kept pouring in, sometimes on a daily basis. Finally I had to relent. So with no support from Apple, and no clever back door into iTunes available, how could I add support for this feature? Really, there’s only one way: simulate the user’s input in the iTunes interface. In other words, send mouse clicks exactly at the positions where you would make them to select a speaker through iTunes. This is known as UI scripting, and those words should make every developer who hears them cringe.

At first glance UI scripting doesn’t seem like such a bad thing. Simulate a click on the speaker selection drop-down. How hard can it be? There’s just one problem. That drop-down? It moves. Try clicking around between different playlists and source types and you’ll see how its position changes. What’s more, in order to send those mouse clicks, iTunes has to be the foreground window, and there are all kinds of reasons why that might not be the case. Here are just a few of the things that Signal has to account for in order to make those simulated clicks work reliably:

  • The user's language
  • The type of playlist selected
  • The iTunes window being hidden
  • The iTunes window being minimized
  • The Mini Player view being selected
  • The screen saver being active

And of course, my personal favorite:

  • The user's language not being English, with the radio tuner playlist selected, with iTunes in the Mini Player view, minimized, with the screen saver active, plus some random process that just happens to act up right then and slow everything down.

You have no idea how much cursing was involved in the creation of this feature.
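
To give a rough sense of what that checklist turns into, here is a minimal Python sketch. It is not Signal’s actual code: the `WindowState` fields, the step names, and the pixel offsets in `DROPDOWN_OFFSETS` are all hypothetical placeholders, and language handling is left out for brevity. It only shows the shape of the problem, which is that every condition in the list adds a preparation step, and the click position depends on what iTunes happens to be displaying.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    """Hypothetical snapshot of the conditions listed above."""
    playlist_type: str = "library"
    hidden: bool = False
    minimized: bool = False
    mini_player: bool = False
    screen_saver_active: bool = False


# Placeholder offsets (in pixels) of the speaker drop-down from the
# bottom-right corner of the window. These numbers are made up; the
# point is only that the offset differs per playlist type.
DROPDOWN_OFFSETS = {
    "library": (-120, -20),
    "radio": (-60, -20),
}


def preparation_steps(state):
    """Return the ordered steps needed before a click can be sent."""
    steps = []
    if state.screen_saver_active:
        steps.append("dismiss screen saver")
    if state.hidden:
        steps.append("unhide iTunes")
    if state.minimized:
        steps.append("restore window")
    if state.mini_player:
        steps.append("switch to full window")
    # The click only lands if iTunes is in the foreground.
    steps.append("bring iTunes to front")
    return steps


def dropdown_position(state, window_right, window_bottom):
    """Compute where to click, given the window's bottom-right corner."""
    dx, dy = DROPDOWN_OFFSETS[state.playlist_type]
    return (window_right + dx, window_bottom + dy)
```

Even in this toy version, the worst case needs five steps before a single click can be sent, and none of those steps can report back whether it actually worked.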

The worst part is that despite all the efforts to make this work as smoothly as possible, it’s still based on a broken model. If Apple decides to update the iTunes interface and move the speaker selection drop-down to a new location, the feature will break. If there is some odd mouse or application behavior that Signal doesn’t account for, it will break. And of course, although Signal can send the mouse clicks, it can’t get any information back about whether those clicks worked or which speakers are actually turned on.

Sure, Signal could start taking little screenshots of iTunes and try to figure this out from the image. “Well, that kind of looks like ‘Computer’ in Simplified Chinese, I guess we can check the box.” But then I would go insane, and there would be no further product development.

Applications will always have bugs, but a good application tries to be as reliable as possible so that it “just works”. When the limitations of an API prevent your application from doing that, it’s incredibly frustrating. All of which is to say to Apple: Please, please, for the love of shiny plastic, put AirTunes control in the iTunes API.