Today I’m going to talk about how I make beautiful commits to public projects. I will be focusing more on matters and processes around git usage, rather than technical details of the contribution or social aspects of interacting with random people over Internet.

Note that while this will be about an open-source project — which is definitely a labor of love for me — I strive to maintain the same process at my day job as well. Quality is rarely a matter of budget constraints, actually caring about it and setting the bar high enough is way more important.

How it begins

You start off with an issue. Ideally, one that affects yourself (aka scratch your own itch) but other people’s issues are fine too, everybody likes when their problems go away. If you have absolutely no idea what to improve then try looking for or in the issue tracker. Usually software we write is terrible, it always has some issues. Someone might even be annoyed enough to write these issues down or at least to voice their complaints.

That was the case for me: I was shuffling through my inbox, stumbled upon this person and felt sorry for them.

RAIL issue screenshot from FreeRDP's bugtracker

For real, FreeRDP should have had this feature ages ago but it doesn’t! Displaying remote application icons is important, but no one cared about it aside from that person. Do you care about it? Are you bad enough to rescue the President?

Okay, you have your problem, now we need a solution. Experience definitely helps here as solving similar problems should be easier next time you have them. Thankfully, I do have enough experience with FreeRDP and X11 under my belt so I knew more or less where and what to look for. It should be a matter of reading some specs and calling a few library functions to have those icons displayed.

Initially I did not intend to fix that issue myself, so I left a comment with some hints for whoever is going to tackle that issue. It’s very important to leave this sort of a message trail, especially in open-source work done during you free time, because of the immense help such notes provide in case you flake and someone else has to deal with this later.

Starting my work

Now comes the hard part (at least for me): starting. Once I’ve done that it’s infinitely easier to continue. So I patiently wait for desire to kick in. This time it did not take long, I was kinda frustrated with another project and it seemed to be a good idea to C some Rust off my X11 skills (pun intended).

Make a fork, checkout a new branch, start hacking. Sounds easy. I like to name my first-contact branches with a wip prefix so this gets called wip/rail-icons. I know I’ll need an icon cache so I build something like that, improving it along the way. At some point I want to test things out so I add some debug logging, marking it as such so it’s easy to remove later. Then more code for color conversion follows, with experiments, curses and TODOs all around it.

Hacking around

I make rather crappy commits while developing and experimenting:

$ git log --oneline --decorate --graph
* 45daf77d0 (HEAD -> wip/rail-icons) palette
* 42a05aa67 fix order reading
* 016ffb295 fix alpha
* cbff68a9f fix comment
* c9c7f3f88 better convesions
* f5dd034a5 pixel formats
* 444d8f301 icons are appended
* 93d984a0a todo
* 295fa4165 setting icons
* f2c70c7b0 color conversion
* 9355a0be8 [debug] flush on output
* e7fdb6295 [debug] output
* ec352a211 cache: actually, check 0xFF first
* 5a2c6359d [debug] output
* 60fbab6d2 using icon caches
* 77012fff3 introduce scratch icon
* f3a070680 rename pixel field
* 9f473d89c icon cache values, cleanup and lookup
* 8696117a0 allocating caches
* e44a3e1c2 prepare icon cache structs
*   c5c1bac31 (upstream/master) Merge pull request #4960 from akallabeth/interleaved_fix
|\
| * fff2454ae Make VS2010 happy, reworked UNROLL defines.

And this is fine because they are not meant to go public. Some people would submit a code like that for review as soon as it works, but I find it disgustingly uncultured practice. That’s like when you come to a restaurant and instead of getting your food served on a plate by a waiter you are invited to eat the meal straight from the oven in the kitchen.

I like keeping notes in TODOs when I work on the code, occasionally swearing in them. I split the commits into very small parts so I can get back to them later, or revert to a particular state of the code. I use [debug] tags for commits that need to be removed later. You’re free to do anything you want as long as you’re comfortable with it, and you do it privately without involving other people into it (such as your code reviewers).

Note, however, that while my commits seem to be incoherent they are not made by simply binding Ctrl-S hotkey to git commit. I keep most of the information in my head but the commits are still self-contained pieces of changes. Each of them introduces some small atomic improvement. It is important to keep them small but meaningful. Commit early and often, do not postpone it until the end of the day.

Wrapping up for the day

It’s important to have something working at the end of your hacking session. This feeling of accomplishment keeps you motivated and preserves momentum. In my case I’ve almost managed to get the icons displayed. At least on my machine. In one environment. Sometimes. If you know where to look for them.

Almost-working icon of notepad.exe

Now all that remains is more testing, fix the bugs (like the icons not actually displaying where they should be), cleanup the code, prepare the patches and submit a pull request.

I like to leave incomplete tasks slightly broken by the end of the day. It gives me a good anchor point to resume the work next day. In this case I left the commits and remaining bugs as-is, knowing that they will be the first thing I notice tomorrow. Just don’t forget to git push them to some remote repository.

Keeping commits organized with git rebase

Remember that ugly stream of incoherent commits? That’s obviously not pretty. If you keep pumping out commits like that then you’re digging your own grave when the only remaining option will be to squash them all into one giant ball of mud. In order to avoid that I prune my git trees at least once per day.

git rebase --interactive is an amazing tool for that. It lets you rearrange, split and squash, and meld commits into whatever shape you need.

Another tool that I use for rebasing is gitk. Despite its simplistic Tcl/Tk interface, it does a great job at displaying branch history and all relevant information about commits.

Viewing a branch history via gitk

So I start with this tree. Then I stare at commits in gitk, review their contents and think how they can be grouped up. The following four groups seem to emerge:

  • temporary debugging commits that need to be removed
  • bug fixes in code that is not directly related to my feature
  • mainaining the icon cache and interacting with RDP stack
  • processing images and actually setting the icons

To regroup the commits I run git rebase -i master and edit the rebase worklist like this, constructing four commits out of that mess:

pick 9355a0b [debug] flush on output

pick 42a05aa fix order reading

pick e44a3e1 prepare icon cache structs
s    8696117 allocating caches
s    9f473d8 icon cache values, cleanup and lookup
s    f3a0706 rename pixel field
s    77012ff introduce scratch icon
s    60fbab6 using icon caches
s    5a2c635 [debug] output
s    ec352a2 cache: actually, check 0xFF first
s    e7fdb62 [debug] output

pick f2c70c7 color conversion
s    295fa41 setting icons
s    93d984a todo
s    444d8f3 icons are appended
s    f5dd034 pixel formats
s    c9c7f3f better convesions
s    cbff68a fix comment
s    016ffb2 fix alpha
s    45daf77 palette

:wq, enjoy. I don’t bother with better commit messages for now, it’s good enough to just keep the commit count at bay and get rid of some failed experiments I made on the way. Plus, reviewing the commit messages reminds me what the hell did I do yesterday.

Also, I happen to avoid having to resolve any rebase conflicts this time. That’s usually the case if you rebase and prune commits often. However, if you neglect it for too long then you’re bound to encounter conflicts.

So I start my next coding session with this tree which contains only four commits. I expect to do only bug fixes now so it seems the final patch set will have four commits.

Resolving bugs and improving code style

Now, there were a few bugs that I was aware of and a couple that I discovered along the way. This post is not a guide on debugging C code and not a rant about hysterical raisins which caused the engineers of DIB format to make counterintuitive decisions. Let’s just say that I had to tweak some things to make these icons:

Icons with color corruptions

look like actually normal ones:

Correctly displayed icons

After that the patch set is feature-complete and I can start cleaning up the code. Like, for example, remove all the swearing from TODOs, add actually useful comments, make some code style changes and remove temporary debug logs which are no longer needed.

All of this results in another bunch of commits slapped on top of those four from the day before. Their content and messages do not really matter at this point. The one thing that matters is that I’m happy with the final state of the code. We’ll clean up the commits later to make sure they are readable. Now I can go and report the status on GitHub to keep interested people updated on the progress.

Preparing the patch set

Once again, I start the day with a bunch of commits from yesterday which I want to regroup and squash, melding the bug fixes with new code as if I never made those mistakes.

gitk can highlight commits that add or remove a particular string (like git log -S from the terminal), it helps identifying commits that can be grouped.

Example of gitk highlighting

I end up with the following rebase worklist. (Note the removed debugging commit.)

#### a9d40cb [debug] flush on output

pick 3619a2c fix order reading
s    eecba32 spec reference

pick 2d6c8d8 prepare icon cache structs
s    5e07c09 improve naming
s    d3541cb improve naming
s    4aa3638 drop more debug logs
s    564490a code style
s    ff1d47d return values
s    0e466bd drop comment just use i in the commit message:

pick 1cac1cf color conversion
s    f3c5008 fix a bug with 16x16 icons
s    1dbc6dd fun bugs with color formats
s    db30910 better explanation
s    b8f027d alpha bitmask should actually be a *mask*
s    31d9bdf fix 8-bit palettes (RGBQUAD does not include alpha)
s    5fcf43b use functions instead of swearing about formatting
s    7000dbd cleanup comments, lower amount of smug in them
s    14ce175 simplify error handling
s    7803a52 dont use malloc
s    2d39ffc drop debug logs
s    e7ad030 drop dead code
s    2dd6bd7 add xflush for icon set

This time the rebase did not apply cleany and caused merge conflicts. Commits 5e07c09 improve naming and ff1d47d return values failed to apply because some functions were renamed. These kinds of conflicts are easy to resolve: you look at the conflicting commit, remember its intent, and try redoing the same change to the old code. Keep in mind that fixes you do can cause more conflicts in the following commits. Merge conflicts are bound to happen if you reorder patches during an interactive rebase. That’s why you should prune commits early and often, lowering the possibility of conflicts in the future.

I end up with the following tree after the rebase. It’s always a good idea to retest your code after rebasing, especially after resolving merge conflicts. You can also run a git diff between pre- and post-rebase states to make sure that you did not drop something you did not intend to.

Editing patches for clarity

When I feel that the patch set is ready I focus my attention on the diffs those patches introduce. Each patch should be atomic and solve one specific task. In my case I implement a new feature so I’d expect the diffs to be mostly green (additions).

I review the commits I currently have and look what can be improved in them:

The first one looks good: it’s an easy bug fix. The second one also seems okay: nice, self-contained additions. The last commit is mostly good but there are some weird diff hunks in it. It seems that some these changes should be a part of the second commit instead. Let’s look at these strange spots more closely.

Here we see changes in struct xf_rail_icon which is introduced in the previous commit. It’s an artifact of a failed experiement. This struct can be defined correctly from the start.

@@ -63,9 +64,8 @@ static const char* movetype_names[] =

 struct xf_rail_icon
 {
-        UINT16 width;
-        UINT16 height;
-        BYTE* argbPixels;
+        long *data;
+        int length;
 };
 typedef struct xf_rail_icon xfRailIcon;

Then there’s this weird diff which is actually another artifact of renaming xf_rail_convert_icon() to convert_rail_icon(). Let’s use the correct name right away.

@@ -611,15 +611,206 @@ static xfRailIcon* RailIconCache_Lookup(xfRailIconCache* cache,
         return &cache->entries[cache->numCacheEntries * cacheId + cacheEntry];
 }

-static void xf_rail_convert_icon(ICON_INFO* iconInfo, xfRailIcon *railIcon)
+static inline UINT32 read_color_quad(const BYTE* pixels)
+{
+        return (((UINT32) pixels[0]) << 24)
+             | (((UINT32) pixels[1]) << 16)
+             | (((UINT32) pixels[2]) << 8)
+             | (((UINT32) pixels[3]) << 0);
+}
...

Here’s some debug logging that was not removed in time. It is introduced in the previous commit but it’s not needed anymore.

+static inline UINT32 round_up(UINT32 a, UINT32 b)
+{
+        return b * div_ceil(a, b);
+}
+
+static void apply_icon_alpha_mask(ICON_INFO* iconInfo, BYTE* argbPixels)
 {
-        WLog_DBG(TAG, "convert icon: cacheEntry=%u cacheId=%u bpp=%u width=%u height=%u",
-                iconInfo->cacheId, iconInfo->cacheEntry, iconInfo->bpp, iconInfo->width, iconInfo->height);
+        BYTE nextBit;
+        BYTE* maskByte;
+        UINT32 x, y;
+        UINT32 stride;
+
+        if (!iconInfo->cbBitsMask)
+                return;

Finally, here we add an argument to the function which is first added in the previous commit. We can define the stub correctly earlier.

 static void xf_rail_set_window_icon(xfContext* xfc,
-                                    xfAppWindow* railWindow, xfRailIcon *icon)
+                                    xfAppWindow* railWindow, xfRailIcon *icon,
+                                    BOOL replace)
-        xf_rail_set_window_icon(xfc, railWindow, icon);
+        replaceIcon = !!(orderInfo->fieldFlags & WINDOW_ORDER_STATE_NEW);
+        xf_rail_set_window_icon(xfc, railWindow, icon, replaceIcon);

All these changes effectively revert changes made by the previous diffs. What would your code reviewer think if they saw such patches? “Hm… weird… are you a schizo?” That’s exactly how you pinpoint bad spots in your patches.

So how do I iron out these wrinkles? That’s a job for git rebase --interactive again! I start with git rebase -i master and edit the worklist like this:

pick 1eeff89 fix order reading
edit 91036c6 prepare icon cache structs
pick 9536a2c color conversion

This tells git to stop at commit 91036c6 and let us edit it. I won’t edit it right away but rather insert some new commits between 91036c6 and 9536a2c that follows it. If 9536a2c were smaller then splitting it directly with git reset (as described in manpage for rebase) might have been easier to do, but in this case I won’t do it.

I stop at 91036c6 and re-do the changes from 9536a2c on top of that commit. I use git commit --fixup=91036c6 to create specially named commits which are compatible with --autosquash option of rebase. (I also like to add some comments to fixup commits to note what they fix. You can do that with an --edit flag.)

By the way, did I tell about the awesome git add -p mode where you can review and select the patch hunks that go into your commit? Now you know.

After I’m done I do git rebase --continue to resume rebasing and reapply 9536a2c on top of split changes. As expected, the patch does not apply cleanly, there are some conflicts but they are rather minor. Fix them, git add them and git rebase --continue again. I end up with this new commit which looks much nicer.

Now I do a final git rebase -i --autosquash master. Note how fixup! commits are already in the right place:

pick 1eeff89 fix order reading
pick 91036c6 prepare icon cache structs
fixup f36c0e8 fixup! prepare icon cache structs
fixup a715365 fixup! prepare icon cache structs
fixup b696670 fixup! prepare icon cache structs
fixup ff5e5f4 fixup! prepare icon cache structs
pick 5a97269 color conversion

And here we have our complete patch set. I usually review the patches once again to make sure that I like the diffs. It’s also a good moment to re-test your work, just in case you messed up during all the rebases. Another thing that I can recommend testing is bisectability. Make sure that each of your commits at least compiles and passes unit-tests.

Now it’s time to do something about these awful commit messages.

Writing good commit messages

I use git rebase -i again to reword commit messages (yes, it’s an amazing multitool):

reword 1eeff89 fix order reading
reword aec29ec prepare icon cache structs
reword 5c2a1f4 color conversion

Then git opens a text editor (three times) where you can go wild and write new messages.

How does a good commit message look like? Chris Beams has summarized this for me so I won’t repeat the points which are made over and over. In short, use the following format:

Summarize changes in around 50 characters or less

More detailed explanatory text, if necessary. Wrap it to about 72
characters or so. In some contexts, the first line is treated as the
subject of the commit and the rest of the text as the body. The
blank line separating the summary from the body is critical (unless
you omit the body entirely); various tools like `log`, `shortlog`
and `rebase` can get confused if you run the two together.

Explain the problem that this commit is solving. Focus on why you
are making this change as opposed to how (the code explains that).
Are there side effects or other unintuitive consequences of this
change? Here's the place to explain them.

Further paragraphs come after blank lines.

 - Bullet points are okay, too

 - Typically a hyphen or asterisk is used for the bullet, preceded
   by a single space, with blank lines in between, but conventions
   vary here

If you use an issue tracker, put references to them at the bottom,
like this:

Resolves: #123
See also: #456, #789

You can see how this advice is applied in the commit messages I came up with:

Each of them describes the changes, provides context for them, and explains some non-obvious quirks.

The formatting and line-breaking are quite important so that git log output looks nice:

$ git show 158ac195a
commit 158ac195a343936c43a924544776fc51eb2dd478
Author: ilammy <a.lozovsky@gmail.com>
Date:   Sat Nov 10 22:09:20 2018 +0200

    libfreerdp-core: fix reading TS_ICON_INFO

    The spec says that CbColorTable field is present when Bpp is 1, 4, 8.
    Actually, bpp == 2 is not supported by TS_ICON_INFO according to the
    spec (though, DIB definitely supports 16-color images).

        MS-RDPERP 2.2.1.2.3 Icon Info (TS_ICON_INFO)

        CbColorTable (2 bytes):
            This field is ONLY present if the bits per pixel (Bpp)
            value is 1, 4, or 8.

    Omitting 8-bit value breaks 256-color icons which are incorrectly
    read with color and alpha data mixed up.

diff --git a/libfreerdp/core/window.c b/libfreerdp/core/window.c
index a9b3c00b8..74281a6ff 100644
--- a/libfreerdp/core/window.c
+++ b/libfreerdp/core/window.c
@@ -86,8 +86,8 @@ BOOL update_read_icon_info(wStream* s, ICON_INFO* iconInfo)
        Stream_Read_UINT16(s, iconInfo->width); /* width (2 bytes) */
        Stream_Read_UINT16(s, iconInfo->height); /* height (2 bytes) */

-       /* cbColorTable is only present when bpp is 1, 2 or 4 */
-       if (iconInfo->bpp == 1 || iconInfo->bpp == 2 || iconInfo->bpp == 4)
+       /* cbColorTable is only present when bpp is 1, 4 or 8 */
+       if (iconInfo->bpp == 1 || iconInfo->bpp == 4 || iconInfo->bpp == 8)
        {
                if (Stream_GetRemainingLength(s) < 2)
                        return FALSE;

Submitting a pull request

Finally it’s time to make a pull request. I rename the branch into just rail-icons (without the wip prefix) and write up a nice description of the changes.

In this case I implement a simple feature to solve a specific issue so the description can be quite basic and just reference the issue. It’s also a visual feature so I can add an eye-catcher image.

Submitted pull request #5003

Always describe your work and why you think it would be a good addition to the project. Read and respect the project’s contribution guidelines (here are ones for FreeRDP) before submitting a pull request. These rules are there for a reason.

After creating a pull request I also asked the reporter of the issue to test the fix, which they gladly did and confirmed that it works for them too.

Working with maintainers

Soon the code is going to be reviewed by project’s maintainers. It’s very likely to receive feedback and some change requests. In my case these were mostly nitpicks and some clarifications which I gladly replied to and accepted.

To my pleasant surprise, one of the maintainers was exceptionally nice and actually fixed all the nitpicks himself! Thank you, Armin.

I was also very glad that my humor about DIB format being vertically flipped was not wasted:

/* ɐᴉlɐɹʇsn∀ uᴉ ǝɹ,ǝʍ ʇɐɥʇ ʇǝƃɹoɟ ʇ,uop */
maskByte = &iconInfo->bitsMask[stride * (iconInfo->height - 1 - y)];
nextBit = 0x80;

Code comments are a vessel for oral tradition and they a good opportunity to occasionally have fun and make fun. For example, of stupid things we have to do for the sake of backwards compatibility. Don’t be so serious, take your suit off sometimes. (But don’t insult anybody, even in comments.)

Conclusions

Yay! Now I have (or rather, maintain) this pretty [Contributor] badge on GitHub:

Comment with a contributor badge

Hopefully, the world is on a way to be a better place now. One reason for that is the pull request discussed here, which hopefully makes FreeRDP a bit nicer. The other reason is you, who hopefully have read until this point and learned a trick or two about git for yourself. Please use this knowledge wisely to increase the quality of your submissions.

Cheerio~

Yuyuko picture by @sasamorichou