The Anti’s Cloud

29 04 2009

Personally, I’ve always been interested in distributed computing. It’s one subject I’ve done a lot of reading on. After seeing a bunch of posts on various Google blogs, such as this one, I’ve decided I want to work on making my own little cloud. I know I don’t have 500,000+ computers to work with like Google, but I do have 5 or so (I think I can scrape up a few more out of my extra parts).

I’m still planning everything out, but the setup would consist of a “manager” computer and a set of slave computers to do the actual processing. I’m going to be building a web server specifically for this purpose and using a DBMS I haven’t settled on yet (I’m thinking about SQLite).
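Here’s a rough sketch of how the manager might hand work to the processing computers. The `Manager` class, the worker names, and the “fewest active requests” rule are all placeholders I made up for illustration; nothing about the real scheduling is decided yet.

```python
class Manager:
    """Hypothetical manager that hands each request to exactly one worker."""

    def __init__(self, worker_names):
        # Track how many requests each worker is currently handling.
        self.active = {name: 0 for name in worker_names}

    def assign(self, request):
        # The request itself is ignored in this sketch; only the load matters.
        # Pick the worker with the fewest in-flight requests
        # (ties go to the worker listed first).
        worker = min(self.active, key=self.active.get)
        self.active[worker] += 1
        return worker

    def finish(self, worker):
        # Called when a worker reports that it is done with a request.
        self.active[worker] -= 1
```

Something like `Manager(["box1", "box2"]).assign("GET /")` would return the name of the computer that should handle that request. In a real setup the manager would then forward the request over the network, which this sketch leaves out.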

At least to start with, each request will only be handled by one computer, although I do plan to eventually spread the processing of one request over multiple computers. I’d like to start with smaller goals. The database, however, will store information in a sort of “striping” manner: each database computer will have the same structure, and the data will be split across the computers but not mirrored. There will be an application inside the manager, layered on top of SQLite (or whatever I decide on), to combine the results returned from all the computers.
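To make the striping idea concrete, here’s a minimal sketch using Python’s built-in `sqlite3` module. It makes several assumptions: in-memory databases stand in for the separate database computers, the `items` table and the `StripedStore` class are invented for the example, and rows are placed by a CRC32 hash of the key, which is just one possible way to split the data.

```python
import sqlite3
import zlib


class StripedStore:
    """Hypothetical manager-side layer over several SQLite databases."""

    def __init__(self, shard_count):
        # Each in-memory database stands in for one database computer.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            # Every shard gets the same structure.
            conn.execute("CREATE TABLE items (key TEXT PRIMARY KEY, value TEXT)")

    def shard_for(self, key):
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def put(self, key, value):
        # Each row lives on exactly one shard, so nothing is mirrored.
        conn = self.shards[self.shard_for(key)]
        conn.execute(
            "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)",
            (key, value),
        )
        conn.commit()

    def get(self, key):
        conn = self.shards[self.shard_for(key)]
        row = conn.execute(
            "SELECT value FROM items WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def all_items(self):
        # Fan the query out to every shard and merge the results.
        rows = []
        for conn in self.shards:
            rows.extend(conn.execute("SELECT key, value FROM items").fetchall())
        return sorted(rows)
```

Lookups by key only touch one shard, while anything that needs the whole table (like `all_items`) has to ask every shard and merge the answers, which is the part the manager application would be responsible for. Since nothing is mirrored, losing one computer would lose its share of the data.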

Again, I’m going to start small with a basic web server that serves basic HTML, and then move up to supporting a dynamic language (most likely Python or PHP, not sure which yet, or hell, maybe both). After I get all this working, I’ll distribute it across computers, with each one reporting to its “manager”.
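For a sense of what the “basic HTML” stage could look like, here’s a small sketch. Note that it leans on Python’s standard-library `http.server` instead of a server written from scratch, so it’s a stand-in for the idea and not the server itself. The `PAGES` table, `build_response`, and `run` are names invented for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page table; a real server would read files from disk.
PAGES = {"/": "<html><body><h1>Hello from the cloud</h1></body></html>"}


def build_response(path):
    """Return (status code, HTML body) for a request path."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, "<html><body><h1>Not found</h1></body></html>"


class BasicHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run(port=8000):
    # Blocks forever, handling one request at a time.
    HTTPServer(("", port), BasicHandler).serve_forever()
```

Calling `run()` and opening `http://localhost:8000/` in a browser should show the hello page. Keeping `build_response` separate from the handler means the page logic can be tried out without starting a server.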

The only problem with this plan is that the “manager” computer looks like a bottleneck, since every request has to pass through it. I guess eventually I’ll have to support several managers and use a load balancer to spread traffic between them.
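The simplest kind of load balancing I can think of is round robin, where each new request goes to the next manager in the list. Here’s a tiny sketch of that; the `RoundRobinBalancer` class and the manager names are made up, and a real balancer would also need to notice when a manager goes down.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical balancer that rotates through a fixed list of managers."""

    def __init__(self, managers):
        if not managers:
            raise ValueError("at least one manager is required")
        # cycle() repeats the list endlessly in the original order.
        self._cycle = itertools.cycle(list(managers))

    def next_manager(self):
        # Return the manager that should receive the next request.
        return next(self._cycle)
```

With two managers, calls to `next_manager()` alternate between them, so each one sees roughly half the traffic.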

I haven’t decided what language I’m going to use yet, but I’m leaning towards Python. One might argue that something of this magnitude should be done in a language like C or C++, but I think those languages would make a project like this too complicated and would require far more code to do less. Personally, I think Python would be great for the job. I’m still deciding, though, because I might also want to use C#.

I’ll definitely keep updating as I work through the project. Feel free to make suggestions on how I should go about things.