I decided to have some fun with this one and use a restaurant kitchen to explain some parts of git.
why use git?
In a restaurant kitchen with different chefs there is one recipe book to make sure everyone knows of each other’s recipes. The recipe book is the “source of truth” for everyone in the kitchen. Any recipes that are changed or new recipes that are developed are updated in the recipe book.
A customer ordering any dish in the restaurant can always expect the dish to be prepared the same, taste the same, and be presented the same when they order it. Thank you, recipe book!
The same principle ought to apply to us as data analysts. As the central team providing data and insights to different customers, we need to make sure that our customers receive the same data, prepared and presented the same way, no matter which person in the team they ask.
Of course we can just save our queries in BigQuery or even in a Word document, but that becomes a pain when we want to share what we create with each other. This is where a git repository comes in.
A git repo(sitory) acts as a central place for the team where we can reference tables and document how they are combined with each other to define metrics. Git makes sure that any changes made to any of these definitions are tracked and updated in the repo.
how we use git
Let’s say a chef wants to develop a new recipe. Developing a new recipe usually happens at home and when the recipe is done, the chef brings it to the restaurant to add to the recipe book. But to know what recipes already exist, they need to know the contents of the recipe book. They make a copy of the recipe book and take it home. The chef develops their recipe at home, and when they’re happy with it they bring the recipe to the restaurant and add it to the original recipe book.
A few days have passed, and some of the other chefs have also developed new recipes and brought them back to the restaurant. No worries though: since everyone is just working on their own recipes, the chefs can simply add them to the recipe book and it keeps growing naturally. Notably though, the copy of the recipe book that each chef took home is no longer the same as the original recipe book, which has grown in the meantime.
Back to git and our work as data analysts. When I want to work on a new metric, I create a copy of the repo on my own computer and create the metric in there. When I’m done I add my changes to the team repo so everyone can access what I created. Check out the image below to see what this looks like.
While I created my new metric (total sales), my workmate Rashid also worked on a new metric (total orders) and added it to the team repo. By the time I add my “total sales” metric to the team repo, the team repo is no longer the same as it was when I copied it 2 days ago, because of Rashid's new “total orders” metric. This is great though, because now our repo has 4 metrics in total that everyone can access!
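If you prefer to see the story as code, here is a small toy model in Python. It is only a sketch of the idea, not how git actually works: I'm assuming a "repo" is nothing more than a dictionary of metric names and queries, and the two starting metrics, the query texts, and the `add_to_team_repo` helper are all made up for illustration.

```python
import copy

# Toy model only: a "repo" here is a dict of metric name -> query text.
# Real git also records history, authors, and much more.
# The two starting metrics and all query texts are made-up placeholders.
team_repo = {
    "placeholder_metric_1": "SELECT 1",
    "placeholder_metric_2": "SELECT 2",
}

# I take a copy of the team repo to my own computer.
my_copy = copy.deepcopy(team_repo)

# While I work, Rashid adds his metric to the team repo.
team_repo["total_orders"] = "SELECT COUNT(*) FROM orders"

# I create my metric in my own copy.
my_copy["total_sales"] = "SELECT SUM(amount) FROM orders"

# The team repo has moved on since I copied it.
only_in_team = sorted(set(team_repo) - set(my_copy))
print("In the team repo but not in my copy:", only_in_team)


def add_to_team_repo(team, mine):
    """Add my metrics to the team repo without overwriting a different definition."""
    for name, query in mine.items():
        if name in team and team[name] != query:
            raise ValueError(f"{name} was changed in both places")
        team[name] = query


add_to_team_repo(team_repo, my_copy)
print("Team repo now holds", len(team_repo), "metrics:", sorted(team_repo))
```

Because Rashid and I each worked on our own metric, nothing clashes and the team repo simply ends up with all four. The check inside the helper is there for the less happy case where two people change the same metric in different ways.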