I have been working on building cloud computing infrastructure as a developer for a few years. Over time I came to realize that distributed systems play a vital role in cloud computing. In some sense, cloud computing is really about building distributed systems and making them available to the public under the utility computing model. Once I realized that, I started to spend more time researching large-scale distributed systems.
Not surprisingly, distributed systems turned out to be pretty complicated, and the learning curve is steep. It really takes time and effort to understand both the fundamentals and real production distributed systems. I have always believed that sharing my learning experience will help people who want to learn and also help me learn more. This is why I want to blog regularly to share my experience.
Distributed systems and cloud computing are huge topics, and there is a lot to talk about. Here is my plan for sharing my learning experience. First I would like to talk about the basics and fundamentals of distributed systems, such as the consensus problem and the Paxos algorithm, the CAP theorem, DHTs (distributed hash tables), distributed file systems and so on. The reason is that without a good understanding of the fundamentals it would be very hard to understand distributed systems. If you don’t believe me, try to read any of these three papers (Azure Storage, Dynamo, Spanner) and see how much you can grasp. If you can understand all of them, you don’t need to spend your time reading my blog. Otherwise I believe that you can learn something from it. One of my goals is that you will be able to at least understand all three papers I mentioned.
After covering the basics and fundamentals, I will talk about real production distributed systems such as Bigtable, Chubby, Spanner, Dynamo, Azure Storage, Cassandra, ZooKeeper, Hadoop, Storm and so on. Then you can understand how real systems are built on top of the fundamentals, what the engineering challenges are, and how they get solved. Believe me, there is a big gap between theory and practice in the distributed systems area. Real systems need to fill that gap, so you will see many interesting things by looking into production systems.
Once we know enough about distributed systems, I will talk about cloud computing, which is a pretty hot area. That said, I will probably not follow the exact order of distributed systems fundamentals first, then real distributed systems, and then cloud computing. Instead, I will move between these topics in no fixed order. I will try to blog on a regular basis.
In the end, people who follow my blog should have a very good understanding of distributed systems and cloud computing. It should be helpful for job interviews, your projects at work, satisfying your curiosity about technology, and many other cases.