libgit2: Checkout

So you’ve got this git repository, and it’s got a bunch of stuff in it – refs, trees, blobs, commits – and you want to work with that stuff. One way to think about that stuff is by thinking about how it’s organized into three trees, and moving stuff between those trees. In libgit2, the way you get stuff from a commit into the index and the working tree is by using the checkout API.

This isn’t “git checkout”

The first thing to realize is that libgit2 isn’t just a reimplementation of the git command line tool. That means that some terminology is reused, but doesn’t necessarily work the same way. In libgit2, checkout is all about modifying the index and/or working directory, based on content from the index or a tree in the object database.

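Libgit2’s checkout API has (as of this writing) three modes:

  1. git_checkout_head updates the index and working directory to match the content of the commit HEAD points to.
  2. git_checkout_index updates the working directory to match the content of the index.
  3. git_checkout_tree updates the index and working directory to match the content of a tree-ish you hand it, such as a commit.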

None of those deal with actually moving HEAD around, which is most of what I use git checkout for, but hey. If you want to move refs around, try the refs API.

Wholesale

The general form for calling a checkout API is this:

git_repository *repo;
git_checkout_opts opts = GIT_CHECKOUT_OPTS_INIT;
// customize options...
int error = git_checkout_head(repo, &opts);

That opts structure is where all the good stuff happens. The default mode of operation is to

  1. Check every file in the tree that’s being read for differences with the index and/or working directory, and
  2. do absolutely nothing to the working directory.

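By design, you have to be very explicit when you’re writing stuff to the working directory. To specify what strategy you want the checkout to use, you modify opts.checkout_strategy, usually to one of these values:

  1. GIT_CHECKOUT_SAFE makes only the updates that can’t overwrite uncommitted changes.
  2. GIT_CHECKOUT_SAFE_CREATE does the same, and also creates files that are missing from the working directory.
  3. GIT_CHECKOUT_FORCE does whatever it takes to make the working directory match the target, overwriting local changes.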

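There are some other behavior flags you can include in this field as well:

  1. GIT_CHECKOUT_REMOVE_UNTRACKED removes untracked files (but not ignored ones) from the working directory.
  2. GIT_CHECKOUT_REMOVE_IGNORED removes ignored files from the working directory.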

That’s just a sampling; the header comments are pretty helpful for using the rest.

Progress and notification callbacks

The git_checkout_* calls are blocking. If you want to know how things are going, or display progress to the user, you have to register callbacks. There are two types.

Progress

The progress callback notifies you as checkout actually writes files to the working directory. Here’s how one might look:

static void checkout_progress(
  const char *path,
  size_t current,
  size_t total,
  void *payload)
{
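  // Guard the division in case total is 0; path can be NULL
  // (see the first line of the output below)
  int percent = total ? (int)(100 * current / total) : 0;
  printf("checkout: %3d%% - %s\n",
    percent,
    path ? path : "(null)");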
}

// ...
git_checkout_opts opts = GIT_CHECKOUT_OPTS_INIT;
opts.progress_cb = checkout_progress;
int error = git_checkout_head(repo, &opts);

The output looks something like this:

checkout:   0% - (null)
checkout:  12% - a/a1
checkout:  25% - a/a1.txt
checkout:  37% - a/a2.txt
checkout:  50% - b/b1.txt
checkout:  62% - b/b2.txt
checkout:  75% - c/c1.txt
checkout:  87% - c/c2.txt
checkout: 100% - master.txt

“Notifications”

The other callback you can specify is more specific about what’s going on with the files in the working directory. Checkout actually uses diff to do its work, so it doesn’t always overwrite every file in the working directory. If the contents match, no work is done at all. That little bit of understanding might make it easier to see this callback in action:

static int checkout_notify_cb(
  git_checkout_notify_t why,
  const char *path,
  const git_diff_file *baseline,
  const git_diff_file *target,
  const git_diff_file *workdir,
  void *payload)
{
  printf("path '%s' - ", path);
  switch (why) {
  case GIT_CHECKOUT_NOTIFY_CONFLICT:
    printf("conflict\n");
    break;
  case GIT_CHECKOUT_NOTIFY_DIRTY:
    printf("dirty\n");
    break;
  case GIT_CHECKOUT_NOTIFY_UPDATED:
    printf("updated\n");
    break;
  case GIT_CHECKOUT_NOTIFY_UNTRACKED:
    printf("untracked\n");
    break;
  case GIT_CHECKOUT_NOTIFY_IGNORED:
    printf("ignored\n");
    break;
  default:
    break;
  }

  return 0;
}

// ...
git_checkout_opts opts = GIT_CHECKOUT_OPTS_INIT;
opts.checkout_strategy = GIT_CHECKOUT_SAFE;
opts.notify_flags =
  GIT_CHECKOUT_NOTIFY_CONFLICT |
  GIT_CHECKOUT_NOTIFY_DIRTY |
  GIT_CHECKOUT_NOTIFY_UPDATED |
  GIT_CHECKOUT_NOTIFY_UNTRACKED |
  GIT_CHECKOUT_NOTIFY_IGNORED;
opts.notify_cb = checkout_notify_cb;
git_checkout_head(repo, &opts);

Here’s some example output. I’ve created the .gitignore file so that foo will be ignored, and changed the contents of a/a1.txt.

path '.gitignore' - untracked
path 'a/a1.txt' - dirty
path 'foo' - ignored
checkout:   0% - (null)

I’ve left the progress callback from the previous section registered, so you can see how these two features interact: notifications happen as checkout is determining what to do, and progress callbacks happen as checkout is doing the things.

That’s when the checkout strategy is set to GIT_CHECKOUT_SAFE. Watch what happens when I change it to this:

opts.checkout_strategy =
  GIT_CHECKOUT_FORCE |
  GIT_CHECKOUT_REMOVE_UNTRACKED;

path '.gitignore' - untracked
path 'a/a1.txt' - dirty
path 'a/a1.txt' - updated
path 'foo' - ignored
checkout:   0% - (null)
checkout:  50% - .gitignore
checkout: 100% - a/a1.txt

You can see that a/a1.txt was updated in the index, and the progress callback shows it being written in the working directory.

We also asked checkout to remove untracked files (but not ignored ones), so it deleted the .gitignore file, leaving foo as untracked instead of ignored. If we run it again:

path 'foo' - untracked
checkout:   0% - (null)
checkout: 100% - foo

… it removes the foo file as well.

One other capability that the notification callback gives you is the ability to cancel the checkout before any changes have been written to disk. Just return something other than 0, and the process will simply be aborted.
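To make the "return non-zero to cancel" idea concrete without pulling in the library, here’s a minimal sketch of the pattern. Everything in it is a stand-in I made up for illustration: notify_fn is a simplified callback signature (not libgit2’s), and fake_checkout is a toy driver that only mimics the "decide first, write later" order described above. The part worth noticing is the payload pointer, which is how a callback gets at your own state.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical, simplified callback type: return 0 to continue,
// anything else to cancel. Not the real libgit2 signature.
typedef int (*notify_fn)(const char *path, int is_conflict, void *payload);

// State we want the callback to update; passed through as the payload.
struct notify_state {
    int seen;       // how many paths were reported
    int conflicts;  // how many of those were conflicts
};

static int stop_on_conflict(const char *path, int is_conflict, void *payload)
{
    struct notify_state *state = payload;
    state->seen++;
    if (is_conflict) {
        state->conflicts++;
        fprintf(stderr, "conflict at '%s', cancelling\n", path);
        return -1;  // non-zero: ask the caller to abort
    }
    return 0;       // zero: keep going
}

// Hypothetical driver: reports every path first, and only gets to the
// "write" step if no callback objected.
static int fake_checkout(const char **paths, const int *conflicts, size_t n,
                         notify_fn cb, void *payload)
{
    for (size_t i = 0; i < n; i++) {
        int rc = cb(paths[i], conflicts[i], payload);
        if (rc != 0)
            return rc;  // aborted before anything was written
    }
    // ...writing to disk would happen here...
    return 0;
}
```

With a real checkout you’d put the same kind of struct in the options’ payload and make the same decision inside your notification callback; the header is the authority on the exact field names.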

One file at a time

What if you don’t want to check out the entire working directory? What if you just want to discard the changes made to one file? The options structure has a field for you – it’s named paths, and it’s of type git_strarray.

Despite the name, it’s actually an array of fnmatch-patterns, like "foo/*" – the same format as you’d use in a .gitignore file. Continuing our earlier example, if I wanted to limit the files checkout is looking at to just the files in the a directory, I could do this:

char *paths[] = { "a/*" };
opts.paths.strings = paths;
opts.paths.count = 1;

And the output would look something like this:

path 'a/a1.txt' - dirty
path 'a/a1.txt' - updated
checkout:   0% - (null)
checkout: 100% - a/a1.txt

Note there’s no mention of .gitignore or foo; they’re filtered out by path matching before any of the diff logic is even applied.
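If you want a feel for which paths a pattern like "a/*" selects before handing it to checkout, here’s a rough sketch using POSIX fnmatch(). That’s an assumption on my part: libgit2 does its own matching and the details may differ, so treat this as an approximation. path_selected is a helper name I made up.

```c
#include <fnmatch.h>
#include <stddef.h>

// Returns 1 if path matches any of the patterns, 0 otherwise.
// Uses POSIX fnmatch() as an approximation of the library's matching.
static int path_selected(const char *path,
                         const char *const *patterns, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (fnmatch(patterns[i], path, 0) == 0)
            return 1;
    }
    return 0;
}
```

Called with the single pattern "a/*", this accepts a/a1.txt and rejects .gitignore, foo, and master.txt, which lines up with the filtered output above.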

Not HEAD

All of the examples we’ve seen so far use git_checkout_head. What if you want to pull out content that isn’t from HEAD? We saw in the beginning that you can easily pull content out of the index by doing this:

git_checkout_index(repo, NULL, &opts);

This gets content from the index and writes it to the working directory. It’s similar to doing git checkout [file] without specifying a branch or revision. That NULL parameter could also refer to a separate index, which is a bit beyond the scope of this post.

You can also pull content from elsewhere in the history. For instance, to replicate something like git checkout HEAD~~ master.txt, you could do this:

char *paths[] = {"master.txt"};
opts.paths.strings = paths;
opts.paths.count = 1;

// Get "HEAD~~"
git_commit *commit;
git_revparse_single((git_object**)&commit, repo, "HEAD~~");

// Do the checkout
git_checkout_tree(repo, (git_object*)commit, &opts);

// Clean up
git_commit_free(commit);

That’s about it

NOTE: You should do error checking. You should also check out the documentation comments in the git2/checkout.h header – they’re really well-written, and they cover more than what I’ve got here.

What now?

I dunno. What are you trying to do? You could always check out my other libgit2 posts for some ideas. Or look for help everywhere else.

libgit2: The Repository

In libgit2, the git_repository object is the gateway to getting interesting stuff out of git. There are several ways to get your hands on one.

Clone

If your repository exists on a remote but not on the local machine, you can get it using git_clone, and once it’s done with all the network stuff, it spits out a repository object. Check out my post on cloning for more on that.

Discover

If you know a particular directory is a git repository, you can just hand the path to git_repository_open. The path can be to a bare repository, a .git folder, or a working directory.

git_repository *repo;
int error = git_repository_open(
  &repo,
  "/path/to/repository/on/disk");

In classic C fashion, libgit2 APIs generally return 0 on success, and a negative error code on failure. Occasionally the API documentation will mention the specific error codes that will come back, but you can always check the error header for the values.

If all you have is a path that you think is controlled by git, you can let libgit2 walk the directory structure to find its owning repository (if there is one). This approach works well if your application is dealing primarily with documents, like a text editor.

char path[1024];
int error = git_repository_discover(
  path, sizeof(path),               // buffer & size
  "/path/to/a/controlled/file.md",  // where to start
  1,                                // across filesystems?
  "/path");                         // where to stop
if (error == 0)
{
  git_repository *repo;
  error = git_repository_open(&repo, path);
}

Initialize

If you want to create a new repository, git_repository_init is the call for you.

git_repository *repo;
int error = git_repository_init(
  &repo,                // output
  "path/to/new/repo",   // path
  false);               // bare?

This is kind of like running git init from the command line. If you need more control, you’ll use git_repository_init_ext:

git_repository *repo;
git_repository_init_options options =
  GIT_REPOSITORY_INIT_OPTIONS_INIT;
// ... (configure options)
int error = git_repository_init_ext(
  &repo,                // output
  "/path/to/new/repo",  // path
  &options);            // options

The signature itself looks similar to the simpler version, but that options structure exposes lots of behavior. Things like:

Unfortunately, as of this writing the documentation parser doesn’t output structure-field comment-docs, but the header is pretty helpful.

libgit2: Cloning

Libgit2 aims to make it easy to do interesting things with git. What’s the first thing you always do when learning git? That’s right, you clone something from GitHub. Let’s get started, shall we? Let’s get some of the boilerplate out of the way:

#include "git2.h"
#include <stdio.h>

// Defined further down
static int do_clone(const char *url, const char *path);

int main(int argc, char **argv)
{
    const char *url, *path;

    if (argc < 3) {
        printf("USAGE: clone <url> <path>\n");
        return -1;
    }

    url = argv[1];
    path = argv[2];
    return do_clone(url, path);
}

What does the do_clone function look like? Let’s start simple:

static int do_clone(const char *url, const char *path)
{
    git_repository *repo = NULL;
    int ret = git_clone(&repo, url, path, NULL);
    git_repository_free(repo);
    return ret;
}

git_clone takes some information, and fills in a pointer for us with a git_repository object we can use to do all manner of unholy things. For now, let’s ignore the repository itself, except to be good citizens and release the memory associated with it.

That NULL parameter? That’s for a git_clone_options structure, which defaults to some sensible stuff. The way our code is written right now, these two commands will have the same results:

./clone http://github.com/libgit2/libgit2 ./libgit2
git clone http://github.com/libgit2/libgit2

… except that git tells you what it’s doing. Let’s fix that.

One of the things you can do with git_clone_options is have libgit2 call a function when there is progress to report. A typical callback looks like this:

static void fetch_progress(
        const git_transfer_progress *stats,
        void *payload)
{
    // Avoid dividing by zero if total_objects hasn't been filled in yet
    unsigned int total =
        stats->total_objects ? stats->total_objects : 1;
    int fetch_percent =
        (int)((100 * stats->received_objects) / total);
    int index_percent =
        (int)((100 * stats->indexed_objects) / total);
    int kbytes = (int)(stats->received_bytes / 1024);

    // The object counts are unsigned, hence %u
    printf("network %3d%% (%4d kb, %5u/%5u)  /"
            "  index %3d%% (%5u/%5u)\n",
            fetch_percent, kbytes,
            stats->received_objects, stats->total_objects,
            index_percent,
            stats->indexed_objects, stats->total_objects);
}

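That stats object gives you lots of useful stuff:

  1. total_objects: how many objects the fetch expects to transfer.
  2. received_objects: how many objects have been downloaded so far.
  3. indexed_objects: how many objects the indexer has processed so far.
  4. received_bytes: how many bytes have come over the network so far.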

So let’s rewrite our do_clone function to plug that in:

static int do_clone(const char *url, const char *path)
{
    git_repository *repo = NULL;
    git_clone_options opts = GIT_CLONE_OPTIONS_INIT;
    int ret;

    opts.fetch_progress_cb = fetch_progress;
    ret = git_clone(&repo, url, path, &opts);
    git_repository_free(repo);
    return ret;
}

If you run this now, the program will tell you what it’s doing! You can watch the network transfer happening, and notice that the indexer is doing its job at the same time.

[...]
network  73% (   7 kb,    51/   69)  /  index  71% (   49/   69)
network  75% (   7 kb,    52/   69)  /  index  72% (   50/   69)
network  76% (   7 kb,    53/   69)  /  index  73% (   51/   69)
network  78% (   7 kb,    54/   69)  /  index  75% (   52/   69)
[...]

If you try this with a large repository, you’ll notice a significant pause at the end. All the data has been moved, what’s going on? It turns out that doing a checkout can take a non-trivial amount of time. It also turns out that libgit2 will let you report that progress as well!

But that’s part of checkout, which warrants its own blog post. In the meantime, check out the clone header to see what git_clone can do. If you want to, you could even use the code from this post as a starting point for your own experiments.

2012: Year in Review

My 2012, through the GitHub lens. Inspired by Tim Clem.

[Image: Annotated GitHub Contributions Chart]

I guess it’s not that surprising, but vacations, travel, and holidays show up pretty clearly.

Yes, there are two honeymoons. Both of them piggybacked on business travel — the first was a destination wedding I photographed with my lovely wife a week after we were married, and the second was glued to a conference I spoke at.

The Best Part of Waking Up

It seems like such a small detail: what’s the first thing you do when you wake up? Hop in the shower? Check your email? Hit Reddit for a quick puppy fix before the coffee starts working?

I recently switched to writing code.

Setting the tone

You wake up each day completely fresh. It takes a while before all the worries from yesterday make themselves known again, so for a while you have an empty, clear mind. And the first thing you put in is going to stick.

I used to read email, Twitter, and Facebook first thing in the morning. That got me current with what happened while I was asleep, but it put me in the mindset of keeping up. Following. From that point on, I had to know what was going on, and since people are constantly doing things, I was always behind. I hate being behind.

Mindset

The write-first strategy puts you in the maker’s mindset. You’ve made things all day, you’ve been fixing bugs since before breakfast.

A read-first morning puts you in the mindset of a consumer. You’re looking to be entertained, always out for that next endorphin hit. Not only does this reduce your output, it kills your creativity.

I’ve found that some of my best ideas come to me in the shower, but there needs to be something I’m working on in that part of my mind that lurks just behind the conscious. If that something is “I wonder what Gruber is thinking about,” I’m missing a great opportunity. I’d much rather be solving problems. Besides, the answer is always “Apple, or maybe baseball.”

The Power of Habit

We like to think of ourselves as sentient beings with free will. This is a pleasant fiction, with many practical benefits, but as any psychologist will tell you, it’s not exactly true. On any given day, it’s likely that a person will do the same thing she did yesterday, as opposed to a completely new thing. We’re mammals, and habit is powerful.

What if your habits weren’t harmful, like heroin, or merely benign, like drinking coffee? What if they were constructive, what if they actually made you feel better?

It’s been working. I feel more like a maker if, before I do anything else, I make something. I start checking Twitter and Facebook as break time, rather than using them to avoid real work, and I get more real work done.

It’s amazing what power a small detail can have.

Native Win32 for fun and profit

[Note: this is ported from my old blog, and there’s more discussion there.]

All the cool kids these days are playing with awesome dynamic languages, or on cool frameworks. I’m stuck with C++ at work, but every now and then I get to do something cool with it.

[Image: radial menu]

That’s the Wacom radial menu, which is implemented as a fully alpha-blended window in native Win32. Something like this is dead simple in WPF, but with native code it’s a bit trickier. I used WTL, GDI+, and a handy, little-known Windows feature to get it done, and I’m going to share my secrets with you, dear reader.

Dependencies

WTL

Windowing frameworks are thick on the ground, and I’ve been mostly dissatisfied with the abilities of the Win32-wrapping category. However, they make something like this reusable, so what the heck.

You can grab WTL at the project home on SourceForge. For this project, I’m just taking the files in the include directory and putting them under wtl in my project directory, so I don’t get the Windows SDK versions instead.

I’ve found this to be the best way to include the WTL headers:

#define _SECURE_ATL 1
#define _WTL_NO_AUTOMATIC_NAMESPACE
#define _ATL_NO_AUTOMATIC_NAMESPACE

// These are required to be included first
#include "atlbase.h"
#include "atlwin.h"
#include "wtl/atlapp.h"

#include "wtl/atlgdi.h"   // For WTL::CDC
#include "wtl/atlframe.h" // For WTL::CFrameWindowImpl

Those defines specify that the ATL and WTL classes should stay safely ensconced in their own namespaces. This means you have to reference them as WTL::CFrameWindowImpl, but it keeps the global namespace clean; polluting it is a major failing of windows.h.

GDI+

GDI+ is an immediate-mode drawing API that has shipped with Windows since XP, so I can use it without needing to ship yet another redistributable installer. Here’s all you need to do:

#pragma comment(lib, "gdiplus.lib")
#include <gdiplus.h>

While GDI+ is written in C++ and uses classes, its initialization isn’t RAII-friendly, so I wrote a little wrapper class:

class ScopedGdiplusInitializer
{
public:
  ScopedGdiplusInitializer()
  {
    Gdiplus::GdiplusStartupInput gdisi;
    Gdiplus::GdiplusStartup(&mGdipToken, &gdisi, NULL);
  }
  ~ScopedGdiplusInitializer()
  {
    Gdiplus::GdiplusShutdown(mGdipToken);
  }
private:
  ULONG_PTR mGdipToken;
};

Now I can write my main function like this:

int main()
{
  ScopedGdiplusInitializer gdiplusinit;
  // ...
}

Boost

The production code for this feature uses boost (specifically shared_ptr), but in the interest of simplicity I’ve left it out. If you use boost, or your compiler supports the shared_ptr introduced with TR1 (std::tr1::shared_ptr, which became std::shared_ptr in C++11), I highly recommend you use that instead of raw pointers whenever possible.

A window class

Here’s where it all comes together. Meet me after the code, and I’ll explain more fully.

class AlphaWindow
  : public WTL::CFrameWindowImpl<
      AlphaWindow, ATL::CWindow,
      ATL::CWinTraits< WS_POPUP, WS_EX_LAYERED > >
{
public:
  DECLARE_FRAME_WND_CLASS(_T("WTLAlphaWindow"), 0);

  virtual ~AlphaWindow()
  {
    if (IsWindow())
    {
      SendMessage(WM_CLOSE);
    }
  }

  void UpdateWithBitmap(Gdiplus::Bitmap *bmp_I,
                        POINT *windowLocation_I = NULL)
  {
    // Create a memory DC
    HDC screenDC = ::GetDC(NULL);
    WTL::CDC memDC;
    memDC.CreateCompatibleDC(screenDC);
    ::ReleaseDC(NULL, screenDC);

    // Copy the input bitmap and select it into the
    // memory DC
    WTL::CBitmap localBmp;
    {
      bmp_I->GetHBITMAP(Gdiplus::Color(0,0,0,0),
                        &localBmp.m_hBitmap);
    }
    HBITMAP oldBmp = memDC.SelectBitmap(localBmp);

    // Update the display
    POINT p = {0};
    SIZE s = {bmp_I->GetWidth(), bmp_I->GetHeight()};
    BLENDFUNCTION bf = {AC_SRC_OVER, 0,
                        255, AC_SRC_ALPHA};
    {
      ::UpdateLayeredWindow(m_hWnd, NULL,
                            windowLocation_I,
                            &s, memDC, &p,
                            RGB(0,255,255),
                            &bf, ULW_ALPHA);
    }
    ShowWindow(SW_SHOWNORMAL);

    // Cleanup
    memDC.SelectBitmap(oldBmp);
  }
};

Layered Windows

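The magic ingredients for this class are the window styles and the UpdateLayeredWindow call.

First, the styles. These are specified in the ATL::CWinTraits template argument of the base class. That’s just how you declare your window’s styles in WTL. There are two:

  1. WS_POPUP gives us a bare pop-up window with no border or title bar, so nothing is drawn except what we provide.
  2. WS_EX_LAYERED makes this a layered window, which is what lets UpdateLayeredWindow supply its contents.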

The call to UpdateLayeredWindow in UpdateWithBitmap is what tells Windows what the contents of the display are. There’s some clunky interop code here, since the GDI+ Bitmap object can’t be used directly with the GDI-oriented layered window API. I’m sure there’s a better way, but in my case the overhead of copying my smallish Bitmap into another smallish HBITMAP wasn’t a problem.

WTL complains rather loudly if a window object is destroyed before the HWND it’s wrapping is closed, so the destructor takes care of that.

Pretty Pictures

That UpdateLayeredWindow call is wrapped in a method that takes a GDI+ bitmap, so now all we need to do is provide it with one. GDI+ makes this pretty easy, especially when compared to GDI code:

using namespace Gdiplus;
// Create a bitmap buffer
Bitmap bmp(400,400);
// Context for drawing on the bitmap
Graphics g(&bmp);
g.Clear(Color::Black);
// ...

All together now

Here’s the main function of my little test program.

int main()
{
  ScopedGdiplusInitializer init;

  {
    // Create the display window
    AlphaWindow wnd;
    wnd.Create();
    wnd.SetWindowPos(NULL, 200,200, 0,0,
                     SWP_NOSIZE | SWP_NOREPOSITION);

    // Create a backbuffer
    Gdiplus::Bitmap bmp(400,400);

    // Clear the background of the buffer to
    // translucent black
    Gdiplus::Graphics g(&bmp);
    g.Clear(Gdiplus::Color(100,0,0,0));

    // This tells GDI+ to anti-alias edges
    g.SetSmoothingMode(
       Gdiplus::SmoothingModeAntiAlias);

    // Draw two semi-transparent ellipses
    Gdiplus::Pen redPen(Gdiplus::Color(100,255,0,0),
                        10.);
    Gdiplus::Pen bluePen(Gdiplus::Color(100,0,0,255),
                         10.);
    g.DrawEllipse(&redPen, 50,50, 200,300);
    g.DrawLine(&bluePen, 175,10, 175,390);
    g.DrawEllipse(&redPen, 100,50, 200,300);

    // Update the window's display
    wnd.UpdateWithBitmap(&bmp);

    // Wait to exit
    getchar();
  }
}

I know, programmer demos of this are always ugly. Maybe one day I’ll write about how to store a PNG as a resource, and load it in for use with this. For now, you get an ugly screenshot:

[Image: ugly test image]