
How many languages is Git actually written in? I had the impression that it was just C. What is each language used for?


Git's core is C. This has the packfile and object store code, the diffing and patching code, reflog code, generic commit-walking libraries, etc. I read it a few years ago when making a Git packfile reader, and the C code is very understandable. I grokked it much quicker than HG's core, the only other one I read.

Most of Git's functionality is built with scripts. Lots of jobs, like octopus merging, are done in Bash scripts that dispatch all of the real work to the C libraries. The Git source distribution has a contrib section that has Python, elisp, and Perl (among others). It also includes a few subprojects like Gitweb, each done in random languages. All of this makes the overall source tree feel very unorganized.


As of 1.7.5.2, git-core is 119 binaries, 29 shell scripts, 9 perl scripts, and 1 python script.
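A count like the one above can be roughly reproduced by looking at the first line of each installed file. This is only a sketch under assumptions: `classify` and `tally` are hypothetical helper names, "binary" is detected by the ELF magic bytes (so it only recognizes Linux-style executables), and scripts are labeled by the interpreter named in their `#!` line.

```python
import os

def classify(path):
    """Return a rough label for one file based on its first line."""
    with open(path, "rb") as f:
        head = f.readline(256)
    # Assumption: compiled commands are ELF executables (Linux and similar).
    if head.startswith(b"\x7fELF"):
        return "binary"
    if not head.startswith(b"#!"):
        return "other"
    parts = head[2:].decode("ascii", "replace").split()
    if not parts:
        return "other"
    interp = os.path.basename(parts[0])
    # "#!/usr/bin/env perl" names the real interpreter in the second word.
    if interp == "env" and len(parts) > 1:
        interp = os.path.basename(parts[1])
    return interp

def tally(directory):
    """Count the regular files in a directory by their classify() label."""
    counts = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            label = classify(full)
            counts[label] = counts.get(label, 0) + 1
    return counts
```

Pointing `tally` at the directory printed by `git --exec-path` should give a breakdown in the same spirit, though the exact numbers will differ by version and by how the distribution packages git.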

The intention is to eventually replace the remaining shell scripts with C.

Perl is used for add-interactive, most of the import/export functionality (arch, cvs, svn), gitweb, and the send-email script.

Python is a relatively recent addition, and is being used for the so-called remote helpers, which will eventually allow git to push/pull changes from non-git servers.

TCL is used for gitk only.


sloccount says the following for the master branch (391b142):

    Totals grouped by language (dominant language first):

    ansic:       122029 (46.81%)
    sh:           88481 (33.94%)
    perl:         23931 (9.18%)
    tcl:          20351 (7.81%)
    python:        4021 (1.54%)
    lisp:          1785 (0.68%)
    asm:             98 (0.04%)

    Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."
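As a sanity check on the table above, the percentages can be recomputed from the raw line counts. This is a minimal sketch: `shares` is a hypothetical helper, and the counts are copied verbatim from the sloccount output quoted here.

```python
def shares(counts):
    """Return each language's share of the total, as a percentage rounded to 2 places."""
    total = sum(counts.values())
    return {lang: round(100.0 * n / total, 2) for lang, n in counts.items()}

# Line counts copied from the sloccount output above.
sloc = {
    "ansic": 122029,
    "sh": 88481,
    "perl": 23931,
    "tcl": 20351,
    "python": 4021,
    "lisp": 1785,
    "asm": 98,
}

percentages = shares(sloc)
```

The recomputed values agree with the percentages sloccount printed, so the totals are at least internally consistent, whatever one thinks of which files were counted.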


This isn't accurate, as it includes a lot of deprecated code which is no longer installed with git. Many of the git commands which were originally written as shell scripts have been rewritten in C, but the original scripts were left in the source tree as examples of how to use the git plumbing commands.

lisp is there as an emacs mode.

tcl is gitk and git-gui.

asm is under compat, so it's portability stuff for (likely braindead) platforms.


Yes, you're right, the previous results are for everything. Unfortunately the core doesn't seem to live in a clearly separate directory, which somewhat confirms Zed's opinion about git. Still, I tried to include only the core and got the following results:

   $ sloccount *.* block-sha1 builtin

   Totals grouped by language (dominant language first):
   ansic:       100854 (83.83%)
   perl:         12704 (10.56%)
   sh:            6608 (5.49%)
   python:         143 (0.12%)


Here's something that I have been curious about - the biggest perceived disadvantage of git is that it is not written in a single language - this leads to the problem that it is more difficult to set up on Windows than mercurial.

Isn't this a problem that is easily solved? I mean, Github is a pretty large company built on top of git. I know that Github is sponsoring work on libgit2, but JGit is pretty good as well (and so is the pure Python implementation, Dulwich).

Would it cost too much for Github to sponsor a bounty to make a pure Python/Ruby/Java implementation and keep it in sync with Junio's git?

Though it is not the intention, I'm pretty sure if you had a 100% Github-supported Dulwich git... it would be very hard for mercurial to stand up as an alternative. Which is why it is hard for me to understand why such an obvious business lead isn't being taken advantage of. I genuinely want to understand whether it is a very hard technical problem - which would make me respect Linus even more.


GitHub is investing heavily in libgit2 because we think it's the future for git for all the reasons listed at http://libgit2.github.com/


That just makes me cry. Absolutely no sense of style, just a bunch of obscene hacks that only work because a billion people use it.


... and you have just described the state of almost every single successful product and industry.


You must be using a lot of crap then.


So do you:

x86 architecture, HTTP, Microsoft Windows, Adobe Flash, Water based urinals, internal combustion engine based cars, the modern English language, mobile phones.

All of these disappear into the background once you get used to them - until they get replaced by something better or worse (to the point that you can't find the previous one) - and then they are visible again, either with an appreciative or nostalgic look, but only for a short while.

Case in point for improvement - almost all pre-iPhone phones are clunky (by today's standards), and almost all post-iPhone touch screen phones are copies of the iPhone. The clunky phones were just as clunky in 2005, but until the world was aware of the alternative, it wasn't obvious _how_ clunky and style-free they were.

Case in point for degradation - my 1280x1024 17" 4:3 matte LCD monitor that I got in 2005 still looks brilliant and useful on my desk - and I have a working 2001 17" CRT monitor that can do 1600x1200 without blurring or breaking a sweat. Both were not the cheapest available at the time I bought them, but were far from high end. I was recently looking for a replacement and can't find a freaking 4:3 LCD monitor; and all the 16:9 ones are glossy, 17" don't go above ~800 horizontal lines. Where's the style in that?

And zed - since you like programming languages like Python and Lua (which both have a very well defined style) - you might like K (commercial version at kx.com ; open at https://github.com/kevinlawler/kona ) it's the ultimate stylish language. (And you can't look at software engineering seriously again after you've mastered it - everything looks so convoluted and unneeded. Think R on steroids, to the point that plain R seems as verbose as Ada)


I guess everyone is using a lot of crap then, and they're all completely wrong and you're right.

This is possible, of course, but instead of snarky dismissals you could perhaps say why you think what you do. I respect you, but also happen to think git kicks ass, a few minor warts notwithstanding. Your dismissive appraisal as "crap" is basically worthless.


Zed, have you contemplated that git may be written in the good old Unix paradigm, as a set of multilayer tools where each does the smallest possible subset of things, and they are interconnected with scripts?

As in ESR's "TAoUP".

Now, you may dislike this paradigm, but it's hardly a "bunch of obscene hacks with no sense of style".

[edit] Having said that, thanks for saving the infrastructure for your projects, that was a cool move of you.


There's bash in there, and I think there's some Python; some plugins (git-svn) are in Perl. I also wasn't under the impression that there were 10 languages there.


I believe python is required only for building the docs (html/man) as they are written in asciidoc. There are precompiled docs (git-manpages / git-htmlpages) in the kernel.org git dir[1], so most often that isn't needed.

[1]: http://www.kernel.org/pub/software/scm/git/

edit: Looks like there are a few importers written in python as well, but those are entirely optional.



