Cloud Computing at Amazon

Wednesday, October 22nd, 2008

I attended a few talks on cloud computing last week, including an overview of Amazon Web Services. My previous view of Amazon’s mission was that they’re out to commoditize everything, from books to products to computing. In fact, they have three distinct lines of business.

The extent of their web services is testimony to their standing as a main pillar of the company: they include a variety of inexpensive a la carte services for data storage, messaging, and raw computing power. Web service traffic actually surpasses that of the retail site. Of course, the real proof comes in some of their success stories:

  • SmugMug stores and serves 700 TB of data through Amazon (case study)
  • The New York Times TimesMachine split and ran optical character recognition on 130 years’ worth of newspaper scans in less than 24 hours for a few hundred dollars
  • Animoto launched their viral video application on Facebook, and quickly scaled from 50 virtual servers in Amazon’s cloud to 3500
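To make the a la carte pricing above concrete, here is a back-of-the-envelope sketch of what storing SmugMug’s 700 TB might cost at S3’s original $0.15 per GB-month list price. The rate is illustrative only - actual pricing is tiered and changes over time.

```python
# Rough monthly-cost sketch for a la carte storage pricing.
# The $0.15/GB-month figure is S3's original list price; treat it as
# illustrative, since real rates are tiered and change over time.
STORAGE_RATE_PER_GB_MONTH = 0.15

def monthly_storage_cost(terabytes, rate=STORAGE_RATE_PER_GB_MONTH):
    """Return an estimated monthly bill for storing `terabytes` of data."""
    gigabytes = terabytes * 1024
    return gigabytes * rate

# SmugMug's 700 TB at the illustrative rate:
print(round(monthly_storage_cost(700), 2))  # → 107520.0
```

Pay-per-use is the whole appeal: no hardware purchase, just a bill that scales with what you actually store.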


Data Recovery, Round 2

Wednesday, August 27th, 2008

Data Recovery, Round 1 left my drive on its way to ESS Data Recovery with “symptoms of a severe head crash” that were beyond the capabilities of Aero Data Recovery. ESS turned around an initial evaluation within a few days:

  • Recovery Chance: Good (70-84%)
  • Medium Failure Details: Disk Head Failure; Rotational Scoring Present (moderate)
  • Evaluated Cost: $995.00 - 10% = $895.50


Data Recovery, Round 1

Monday, August 18th, 2008

For all the computers and crashes I’ve been through, I’ve never lost anything I couldn’t bear to lose, but a few days after my MacBook’s hard drive failed, I decided I really wanted my lost photos from Colorado. A few days before, someone in my delicious network bookmarked a data recovery site, and it turned out he had already found the best deal.

So I opened a case with Aero Data Recovery and shipped them both the dead drive and a new external USB-powered drive for the recovered data. Traditionally, data recovery has cost thousands of dollars and only been affordable to businesses with very valuable data. Now companies seem to be filling their downtime with $279 flat-rate, free-estimate jobs for consumers willing to bite the bullet.

About a week after the drive arrived, I got an email that it “exhibits symptoms of a severe head crash”. That was beyond their capabilities, but they recommended two other companies that could continue the investigation once the drive was direct-shipped. The prognosis is still hopeful, though I have no idea what this next tier will cost after the free estimate - I shudder to think what it would cost to have somebody with an electron microscope transcribe the 1s and 0s that make up my photos of Rocky Mountain.

Continue to Data Recovery, Round 2

Apple Ate My Photos!

Thursday, July 31st, 2008

While Time Machine handily restored my MacBook after its original Apple hard drive failed at the two-year mark, it missed one thing: 10 GB of Colorado photos, half of which I hadn’t published yet!

The reason is infuriatingly simple, yet was kept completely obscured:

Time Machine doesn’t back up your Aperture library while the application is running. This is due to the way Aperture utilizes an SQLite database which needs to be open for reading and writing all the time.
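The locking behavior behind that limitation can be sketched with Python’s built-in sqlite3 module: while one connection holds an exclusive write lock (as a running application might), a second process trying to read the file is simply locked out. The file name below is invented for the demo.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "library.db")  # stand-in for a live library

# isolation_level=None gives us manual transaction control
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT)")
writer.execute("BEGIN EXCLUSIVE")  # hold the write lock, as a running app might
writer.execute("INSERT INTO photos (name) VALUES ('colorado_001.jpg')")

# A second connection, standing in for a naive backup process
reader = sqlite3.connect(path, timeout=0.1)
try:
    reader.execute("SELECT count(*) FROM photos")
    locked = False
except sqlite3.OperationalError:  # "database is locked"
    locked = True

writer.execute("COMMIT")
print(locked)  # → True
```

A file-level backup tool faces the same problem: as long as the application holds the database open for writing, a copy taken mid-flight may be inconsistent, so the safe (and silent) choice is to skip it.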


Sometimes I Hate Being Right

Thursday, July 31st, 2008

I came home to a crashed MacBook Monday and rebooted to ominous clicking sounds and a flashing question mark - smelled like hard drive trouble. A bit of troubleshooting and a quick visit to the Apple store later, and I was shopping for a new hard drive and crossing my fingers that Time Machine saved my data and unpublished photos from Colorado.

The hard drive was a total loss; neither I nor Apple could see it when booting from another location, and the clicks hinted that something mechanical broke after two years. Apple did well on several counts: their troubleshooting article was easy to find and follow, I was able to get the last tech support slot the same evening, and the guy at the Genius Bar was honest enough to tell me that for the $250 they’d charge to replace the 80GB drive, I could pay half and install a bigger one myself.

They missed on one small thing, and one very big one: their reservation form still doesn’t work in Firefox, and there’s a nasty, unadvertised limitation on Aperture and Time Machine that meant many of my recent photos were lost!

Home Depot Does It Old School

Sunday, April 20th, 2008

I went into Home Depot to order new countertops this week, and was impressed to see they have nice widescreen LCD monitors for their kitchen designers. I was less impressed to see them running a massively oversized terminal window into some archaic back-end ordering system.

Still, I figured they must be nice for sketching out layouts and calculating all the costs. Not quite; it’s all still done with worksheets, graph paper, and a hand calculator. And sometimes it’s done twice since different materials are priced by the linear or square foot.
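The double costing comes down to two different formulas for the same layout; all the material rates and dimensions below are made up for illustration.

```python
# Hypothetical countertop pricing sketch: some materials are quoted per
# linear foot of counter run, others per square foot of surface, so the
# same layout has to be costed twice. All numbers are invented.
def cost_per_linear_foot(run_ft, rate):
    return run_ft * rate

def cost_per_square_foot(run_ft, depth_ft, rate):
    return run_ft * depth_ft * rate

# A 12 ft run of standard 2 ft deep counter:
print(cost_per_linear_foot(12, 40.0))       # → 480.0
print(cost_per_square_foot(12, 2.0, 18.0))  # → 432.0
```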

I wonder if Home Depot has an opening in their CIO office; I’d be happy to take a set of granite countertops as a signing bonus ;)

The Technical Resume

Wednesday, April 2nd, 2008

I critiqued a resume for a friend of a friend today and it made me realize how we as technical people always slant our resumes. We write them as exhaustive specifications, with perfect chronology and humble detail of our roles, occasionally indulging our egos by describing a really cool project or solution. The managers and recruiters who read them, though, are looking for something a little different.

Sure, they want to see experience and qualifications that fit the position, but more so they want an overall sense that you’ve got the right personality and technical mindset to join their group. And since most of them write job descriptions as a bit of a wish list, they want to see some skills in tangential areas and in a pay grade above the one they’re actually offering.

The solution? Succinctly emphasize your general technical strengths, cover your bases with a keyword-loaded technology list, and round it out with some management-friendly language that shows you have the “extra” skills to make you the perfect candidate.

It’s All Just About Communication and Respect

Wednesday, February 6th, 2008

We’re doing a week of agile, scrum, and associated tool training for our whole team at work, and it’s really eye-opening. As I mentioned before, despite a year’s experience with all of this, I’ve always seen it colored through others’ eyes and modified implementations.

At its core, most of it has nothing to do with software development or manufacturing, but just communication in general. Communication amongst a team, with their customers & users, and with the other stakeholders in a project. It’s a basic element, but one that easily becomes ignored or overly virtualized, losing the most valuable type of communication: face-to-face discussion. A good proof is that people use these methods for running a variety of non-technical projects and they do it with technology as simple as index cards and post-it notes.

Respect - and trust - are other core values. Businesses tend to treat people as resources, and get similarly dispassionate work in return. Treating people as full members of a team with a stake and say in the project’s outcome has more than doubled productivity and quality for those willing to make the leap. And it is quite a leap: let a self-organizing team run their own work, replace a hierarchy of team leads with scrum masters who serve the team by removing impediments to its work, and give the team the power to select and dismiss that person as necessary.

It’s a big cultural change to sell and execute, but given that “culture eats strategy for breakfast”, it offers potential for truly effective increases in productivity, quality, and morale.

Revelations from the Netflix Prize Winners

Tuesday, January 15th, 2008

Brian pointed out that the AT&T Research Labs team that won this year’s Netflix Progress Prize ($50k out of $1M) for improving movie recommendations had published a number of papers on their winning strategy. It’s interesting reading, and this paper is fairly approachable if you skip the statistics in the middle.

Their final approach combined 107 different models, and though the majority provided only incremental improvement, the total effect propelled them to an 8.43% improvement over Netflix’s own proprietary algorithms. (Wikinomics fans can take a moment to cheer another success for open collaboration.)

One interesting sidenote:

The distribution of movies-per-user is quite skewed. Figure 4 shows this distribution (on a log scale). Ten percent of users rated 16 or fewer movies and one quarter rated 36 or fewer. The median is 93. But there are some very busy customers, two of which rated over 17,000 of the 17,700 movies!

This confirms my previous findings on participation and the 1-9-90 rule. The team also made use of additional models which considered simply the presence or absence of a rating for a movie from a particular user.
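The skewed movies-per-user distribution the authors describe is easy to reproduce in miniature; the (user, movie) pairs below are synthetic stand-ins for the real Netflix ratings.

```python
# Count how many movies each user rated, then look at the distribution.
# The data is synthetic: one light user, one typical, one heavy.
from collections import Counter
import statistics

ratings = (
    [("u1", m) for m in range(5)]
    + [("u2", m) for m in range(40)]
    + [("u3", m) for m in range(120)]
)

per_user = Counter(user for user, _movie in ratings)
counts = sorted(per_user.values())

print(counts)                     # → [5, 40, 120]
print(statistics.median(counts))  # → 40
```

Even in three users the shape shows up: the median sits far below the heaviest rater, just as the paper’s median of 93 sits far below the two customers with 17,000+ ratings.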

Overall, it seems that Netflix and the recommendation community have gotten a lot of mileage out of the prize in an area that will continue to grow:

Because good personalized recommendations can add another dimension to the user experience, e-commerce leaders like Amazon.com and Netflix have made recommender systems a salient part of their web sites.

If such systems interest you, I’d recommend O’Reilly’s Programming Collective Intelligence. Reading it - and working through its examples - really opens your eyes to how simple these algorithms can be and how commonplace they’ve become.
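In that spirit, a minimal user-based recommender fits in a couple of dozen lines: cosine similarity over shared ratings, then a similarity-weighted average of neighbors’ scores for an unseen title. The names and ratings are invented, and this is a sketch of the general technique, not the book’s exact code.

```python
# Minimal user-based collaborative filtering on a toy ratings dictionary.
from math import sqrt

ratings = {
    "ann":   {"Tron": 4.0, "Heat": 3.0, "Big": 5.0},
    "bob":   {"Tron": 4.5, "Heat": 2.5, "Big": 4.5, "Alien": 4.0},
    "carol": {"Tron": 1.0, "Heat": 4.5, "Alien": 2.0},
}

def cosine(a, b):
    """Cosine similarity over the items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    na = sqrt(sum(a[k] ** 2 for k in shared))
    nb = sqrt(sum(b[k] ** 2 for k in shared))
    return dot / (na * nb)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, prefs in ratings.items():
        if other == user or item not in prefs:
            continue
        sim = cosine(ratings[user], prefs)
        num += sim * prefs[item]
        den += sim
    return num / den if den else 0.0

score = predict("ann", "Alien")
print(round(score, 2))  # bob's high similarity pulls the estimate toward his 4.0
```

The Netflix Prize models are far more sophisticated, but this weighted-neighbor idea is the seed most of them grow from.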

More Storage, But Never Enough

Friday, December 28th, 2007

No matter how much space you have, you manage to fill it. This old maxim applies equally to the physical and digital realms, as I learned while cleaning up my computers yesterday. With bigger internal and external drives, it’s easy to keep accumulating photos, videos, and downloads until your drive, backup solution, or both run out of room.

In my own case, mirroring data between computers as well as to an external backup drive is one of the culprits, though for as long as it’s feasible, this provides a nice redundancy. The growing threat to that extra space is my photos. It’s not so much the good, “published” photos in my gallery, which grow modestly as I become more discriminating, but the outtakes, which are larger and more numerous with the new camera. These don’t seem worth sorting through or deleting given how plentiful storage space is and the small but non-zero chance I’ll actually have use for some of them. (In practice, I don’t revisit the good shots that often, let alone the ones that didn’t make the cut.)

Along with some other data like emails and documents more than a year old, it’d be nice to relegate them to a lower tier of backup with perhaps less redundancy, frequency, and accessibility. Apple’s Time Machine does a bit of this based on age; will a more user-defined extension of this idea ease our perpetual storage crunch?
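A user-defined version of that age-based tiering could be as simple as a script that sweeps files untouched for over a year into a “cold” directory that gets backed up less often. The directory names and the one-year cutoff here are illustrative.

```python
# Sketch of age-based backup tiering: move files whose modification time
# is more than a year old from a "hot" directory into a "cold" one.
import os
import shutil
import time

ONE_YEAR = 365 * 24 * 3600  # illustrative cutoff, in seconds

def tier_old_files(hot_dir, cold_dir, now=None):
    """Move year-old files from hot_dir to cold_dir; return moved names."""
    now = time.time() if now is None else now
    os.makedirs(cold_dir, exist_ok=True)
    moved = []
    for name in os.listdir(hot_dir):
        path = os.path.join(hot_dir, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > ONE_YEAR:
            shutil.move(path, os.path.join(cold_dir, name))
            moved.append(name)
    return sorted(moved)
```

Run nightly, something like this would keep the frequently backed-up set small while the outtakes age quietly into cheaper, less redundant storage.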