InnoDB Internals - Consistent Reads


I’ve been doing some research in this area trying to understand how this works in databases (for my upcoming project), so I thought I’d share some of my learnings here.

InnoDB internally uses ReadView to establish snapshots for consistent reads - basically giving you the point-in-time view of the database at the time when the snapshot it is created.

In InnoDB, all changes are immediately made on the latest version of Database regardless whether it has been committed or not, so if you don’t have MVCC, everybody will see the latest version of rows and it’ll be a disaster for consistency. Not to mention you’ll need to be able to rollback the changes. In order to achieve this, InnoDB maintains a undo log to track a link list of changes made by other transactions, so reading in the past with a snapshot means going from the latest record in the BufferPool, and walk backwards to find the first visible change. Rollback is similar.

This also means the undo log can’t be purged if the snapshot is still active, and undo log will get longer and longer, which slows down the reads more and more. This is the infamous long running transaction issue.

The fundamental issue is that you need to be able to determine visibility of changes. This is done with two things:

  1. InnoDB tracks the trx_id_t of each rows and in the undo log
  2. InnoDB internally use a data structure called ReadView to determine if a transaction is visible in the snapshot.

So the algorithm becomes as simple as walking the list backwards and find the first visible record.

Trying and setting up WSL 2

Tips for setting up WSL 2 environment

The year of Linux desktop has finally come. It’s Windows + WSL 2. Seriously.

I use a MBP 16 for my daily work and SSH into linux machines for development/testing. While it’s a fantastic machine (and the track pad is second to none), I just hate the Apple trying to lock down the system so much that even setting up gdb to work is a nightmare, and running any simple script it tries to phone home for validation.

So I tried installing Linux on my machines. I do have a personal laptop X1 Carbon Gen7, but it doesn’t work well with Linux: mostly Linux just doesn’t like the 4 Channel Dolby Surround Speakers - they sound something from a tin-can and volume is much lower. While in Windows the sound I get is actually pretty nice (for a laptop, of course). I have spent countless time on it and I’ve seen many people struggling through the same issues. There are also occasionall hipcup with suspend/resume, but I can live with that. I also have a powerful gaming PC which I mostly play games. WSL sounds like a perfect solution for those machines where I can use Windows for their compatiblity / games, while also use it for development / tinkering on Linux. Yes, you can either dual boot or install a linux VM, but the integration between WSL 2 and Windows seems pretty nice to me, so I decided to try it out - and now all my Windows machines have WSL 2 installed.

Setting it up is not too bad - you do need to follow the official instructions to install it, which I’m not going to repeat here. The installation experience was fairly smooth, though it requires multiple steps.

However, to get it work properly requires a bit of extra work. Once you set it up it’s pretty much all I ever needed. Here is what it looks like when I’m done:


SWIG and Python3 unicode

Anyone familiar with Python probably knew its history of Unicode support. If you add Python3, Unicode, and SWIG together, imagine what might go wrong?

Python3, Unicode, SWIG, and me

I was debugging a test failure written in Python just now and it is failing with this error:

Many of the end-to-end tests here are written in Python because they are convenient - no one wants to write a C++ code to drive MySql and our infra service to do a series of stuff.

UnicodeEncodeError: 'latin-1' codec can't encode character '\udcfa' in position 293: ordinal not in range(256)

The code looks like this:

sql = get_sql_from_some_magic_place()
decoded_sql = cUnescape(sql.decode("latin-1"))
decoded_sql_str = decoded_sql.encode("latin-1")

The code seems straight-forward enough. The offending string looks like this: b"SELECT from blah WHERE col='\\372'.

This string was originally escaped by folly::cEscape which does simple thing rather simple - converts the string to be a C representation where ‘' are double escaped and any non-printable characters are escaped with octal. This is convenient as those escaped strings are safe to pass around without worry for encoding as they are, well, ASCII.

folly is Facebook’s open source standard C++ library collection. See for more information.

It is convenient, until you need to call from Python, for which you’ll need to use SWIG:

If you don’t know SWIG - just think it’s a tool that generates Python wrapper for C++ code so that they can be called from Python code. In this case, folly::cUnescape. Go to to learn more. Many language have equivalent tool/feature built-in, P/invoke in C#, cgo in go, JNI in Java, etc.

std::string cUnescape(const std::string& a) {
  std::string b;
  folly::cUnescape(a, b);
  return b;

I was scratching my ahead trying to understand what is happening as there is no way the strings are converted to ‘\udcfa’, until I realize cUnescape might be at fault.

It turns out, SWIG expects UTF-8 string and returns UTF-8 strings back. “\372” can be converted to UTF-8 without any trouble, but once it is unescaped it becomes “\372” which is 0xfa that is going to be interpreted as UTF-8:

b"\372".decode("utf-8", errors="surrogateescape").encode("latin-1")

And you get:

UnicodeEncodeError: 'latin-1' codec can't encode character '\udcfa' in position 0: ordinal not in range(256)

The fix

To fix the problem, you can encode the buffer again with surrogateescape:

>>> b"\372".decode("utf-8", errors="surrogateescape").encode("utf-8", errors="surrogateescape").decode("latin-1")

Seems rather backwards, isn’t it? Why not just stop messing with the strings?

That’s exactly what was discussed in SWIG doc here: There is a magic macro you can use:

%module char_to_bytes
%begin %{
std::string cUnescape(const std::string& a) {
  std::string b;
  folly::cUnescape(a, b);
  return b;

And the original code can be changed to:

sql = get_sql_from_some_magic_place()
decoded_sql = cUnescape(sql).decode("latin-1")

Much simpler too.

I’m just happy that I mostly write C++ instead of Python…

Python3 filter is evil

Personally I like Python for what it is - a quick prototype language for writing simple utilities/scripts. Unfortunately when you start writing important infrastructure code using Python it quickly falls apart due to its dynamic typing. And all the breaking changes in Python3 didn’t help either.

Go pattern for worker queue - a quick case study for go concurrency

A quick case study for go concurrency

A classical computer science problem is a worker queue - imagine you have N WorkItems to be divided between M workers (of course this implies N » M). This pattern has many application such as generating lots of data and insert into database, processing CPU-intensive work, etc. This can be easily achieved with go routine and channels. I recently spend a bit of time working on a similar problem with go and I’d like to share the pattern that I came up with.