Wednesday, May 7, 2014

Build System Performance on Windows

Over the last three months, I had the pleasure of running Fedora 20 Linux on the laptop I use for work. Last week, I was forced to downgrade to Windows 7. (Mainly because my employer's system administrators don't support anything else. I am quite ready to have the occasional fight for my freedom against the admins, but I won't accept a constant struggle. To name just the most important problem: Accessing an MS Exchange Server without IMAP enabled is, at best, exhausting.)

Why the word "downgrade"? Because my machine is so much slower now. I am a developer. My Eclipse is open for 10 hours a day, and I can't count the number of invocations of Ant, Maven, Make, and other build systems. (Ant and Maven being my personal favourites.) Of course, the machine isn't actually slower. It is the same hardware, after all: the same amount of RAM, still without an SSD. However, and that's a fact: Running one and the same build system against the same project takes more time on Windows 7 than on Linux.

If you don't believe me, try the following: Install a Linux VM on your Windows PC. Then run the following command, first on the VM, then on the Windows host:
git checkout
What are the odds that this command will run faster on the Linux VM than on the Windows host? I'd bet. And I'd win. (It's true: Linux Git on the emulated hardware wins against Windows Git on the raw iron. Btw, for an even more convincing example, try "git svn checkout".)

This week, I decided to waste some time thinking about the issue: How do I get my build system on Windows as fast as on Linux? First, let's identify the guilty party: It's none other than... (drum roll) NTFS! I'm not making this up: Others are quite aware of the problem. See, for example, this page. A Google search for "ntfs performance many small files" returns about 168000 hits. So, let's state this as a fact: NTFS performs extremely poorly when dealing with lots of small files. But that's exactly what a build system is all about. Let's take a typical example:
  1. The first typical step is to remove a build directory (like "target", or "bin", or whatever you name it.)
  2. The compiler reads a lot of small source files (named *.java, *.c, or whatever) from the "src" directory.
  3. For any such file, the compiler creates a corresponding, translated file (named *.class, or *.o, or whatever) in the build directory.
  4. A packager, or linker, like "jar", or "ld", combines all the files we have just created into a single target file.
Notice something? This is the same for all build systems. It really doesn't matter whether your build script uses XML, a DSL, JSON, or a binary format. (No, this holy war won't have my participation.) What matters is this: All current build systems are based on the mantra of an output directory, where lots of small files are created. But that's not a necessity.

So, here's the challenge: Let's modify our build systems in a manner that replaces the output directory with a "virtual file system". If we do it right, we can be much, much faster. As a proof of concept, I wrote a small Java program that extracts the Linux Kernel sources (aka the file "linux-3.14.2.tar.gz") and writes them into implementations of the following interface:
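import java.io.IOException;
import java.io.OutputStream;

public interface IVFS {
	// Creates a new file; the caller writes the contents and closes the stream.
	OutputStream createFile(String pPath) throws IOException;
	// Finishes the whole file system, so an implementation can flush to disk.
	void close() throws IOException;
}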
For each source file (45941 files), the method createFile is invoked, the file is copied into the OutputStream, and the stream is closed. Finally, the method IVFS.close() is invoked. Here's my program's output (times in milliseconds):
   Linux Kernel Extraction, NullVFS: 4159
   Linux Kernel Extraction, SimpleVFS: 1740044
   Linux Kernel Extraction, MapVFS: 78134
The three implementations are:
  1. The NullVFS inherits the idea of /dev/null: It is basically a write-only target. Of course, this isn't really useful. On the other hand, it shows how fast we could be, in theory, if our target were arbitrarily fast: In this case 4159 milliseconds. (This is, mainly, the time for reading the Linux Kernel sources.)
  2. The SimpleVFS is basically what we have now. Files are actually created. As expected, this is really slow: It takes more than 1740 seconds (29 minutes).
  3. Finally, the MapVFS is basically an in-memory store. However, it might be really useful, because its close method creates one big file with the actual contents on disk. At 78 seconds, this implementation is far closer to the NullVFS than to the SimpleVFS. It demonstrates what might really be possible.
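A minimal sketch of what the three implementations might look like. This is not the code that produced the numbers above: the class bodies, the container format written by MapVFS.close() (entry count, then path, length, and bytes per entry), and the VfsDemo harness are all assumptions made for illustration. The harness writes small synthetic files instead of the kernel sources, and NullVFS relies on OutputStream.nullOutputStream(), which needs Java 11 or later.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** The interface from the post, repeated so that the sketch compiles on its own. */
interface IVFS {
    OutputStream createFile(String pPath) throws IOException;
    void close() throws IOException;
}

/** Write-only target in the spirit of /dev/null: every byte is discarded. */
class NullVFS implements IVFS {
    @Override public OutputStream createFile(String pPath) {
        return OutputStream.nullOutputStream();
    }
    @Override public void close() { }
}

/** One real file per entry below a root directory, like a classic output directory. */
class SimpleVFS implements IVFS {
    private final Path root;
    SimpleVFS(Path pRoot) { root = pRoot; }
    @Override public OutputStream createFile(String pPath) throws IOException {
        Path file = root.resolve(pPath);
        if (file.getParent() != null) {
            Files.createDirectories(file.getParent());
        }
        return Files.newOutputStream(file);
    }
    @Override public void close() { }
}

/** Keeps all entries in memory; close() writes them into one big file. */
class MapVFS implements IVFS {
    private final Path target;
    private final Map<String, ByteArrayOutputStream> files = new LinkedHashMap<>();
    MapVFS(Path pTarget) { target = pTarget; }
    @Override public OutputStream createFile(String pPath) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        files.put(pPath, buffer);
        return buffer;
    }
    @Override public void close() throws IOException {
        // Assumed format: [int count] then, per entry, [UTF path][int length][bytes].
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(target)))) {
            out.writeInt(files.size());
            for (Map.Entry<String, ByteArrayOutputStream> entry : files.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeInt(entry.getValue().size());
                entry.getValue().writeTo(out);
            }
        }
    }
}

class VfsDemo {
    /** Writes pCount small synthetic files into pVfs; returns elapsed milliseconds. */
    static long fill(IVFS pVfs, int pCount) throws IOException {
        byte[] content = "int main(void) { return 0; }\n".getBytes(StandardCharsets.UTF_8);
        long start = System.nanoTime();
        for (int i = 0; i < pCount; i++) {
            String path = "src/dir" + (i % 100) + "/file" + i + ".c";
            try (OutputStream out = pVfs.createFile(path)) {
                out.write(content);
            }
        }
        pVfs.close();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("vfs-demo");
        System.out.println("NullVFS:   " + fill(new NullVFS(), 5000) + " ms");
        System.out.println("SimpleVFS: " + fill(new SimpleVFS(tmp.resolve("out")), 5000) + " ms");
        System.out.println("MapVFS:    " + fill(new MapVFS(tmp.resolve("out.bin")), 5000) + " ms");
    }
}
```

Running VfsDemo on both Windows and Linux gives a rough feel for the gap between many small files and one big file. The absolute numbers will differ from those above, since the workload is synthetic and much smaller.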
Conclusion: When creating one file with our actual contents, we need 78 seconds, as opposed to 1740 seconds. That is a factor of about 22. Of course, the IVFS interface is an oversimplification. The implementations certainly aren't thread safe. We have omitted the possibility to modify files that have previously been created. But the numbers are so impressive that I am personally convinced: If we a) modify our build system to use a virtual file system as the output and b) provide fast implementations, then we have much to gain, fellow developers!

In practice, this won't be so easy. The biggest hurdle I am anticipating is the Java Compiler. Even the Java Compiler API is based on real files: We won't be able to use the Java Compiler as it is now. Instead, we have to modify it to use the VFS. ECJ, the Eclipse Java Compiler, might be our best option for that.

Who'll take the first step? Well, Gradlers, Buildrs, SConsers of the world: Here's something where your users could see a real difference!

Thursday, May 1, 2014

The sins of our fathers

"Fathers shall not be put to death for their sons, nor shall sons be put to death for their fathers; everyone shall be put to death for his own sin." (Deuteronomy 24:16)

But, of course, we are paying for our fathers' sins. Not so much our biological fathers or ancestors, but our predecessors. In my case, this is what happened today: I wrote a very small Java program that extracts the Linux Kernel sources. (More on the reasons and background, hopefully, in my next posting. Suffice it for now that I'm not rewriting "tar xzf". I'm not that stupid! I had good reasons.) Now, the Kernel sources contain, in particular, a small file named "aux.c". And my own program threw a FileNotFoundException when creating that file. Reproducibly! The error message was, of course, meaningless, so I began thinking about all kinds of reasons:
  1. Permissions, either those of the file itself, or the containing directory. No, the permissions were just fine!
  2. Length of the path name. Actually, the full path name contained quite some characters, but still far away from the 256 that I am aware of.
  3. Too many open files. No, I have had my share of beginner's mistakes and was properly closing every file.
Any other ideas? I guess you won't get this one: Some JDK programmer actually implemented a check for aux.*, nul.*, prn.* etc. when creating a file, because these file names were in fact a problem on Windows in the past. Of course, the sensible solution would have been:
  1. Wait for the error message from Windows.
  2. Check the file name.
  3. Throw a meaningful error message that explains the problem.
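The three steps above can be sketched roughly as follows. This is an illustration only, not JDK code: the class ReservedNameAwareCreate and its methods are hypothetical names, and the list of device names is deliberately incomplete (COM1 to COM9 and LPT1 to LPT9 are left out for brevity).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

class ReservedNameAwareCreate {
    // Classic DOS device names (incomplete on purpose, see the note above).
    private static final Set<String> DEVICE_NAMES = Set.of("con", "prn", "aux", "nul");

    /** True if the text before the first dot is a DOS device name, e.g. "aux.c". */
    static boolean isReservedDeviceName(String pFileName) {
        int dot = pFileName.indexOf('.');
        String base = dot < 0 ? pFileName : pFileName.substring(0, dot);
        return DEVICE_NAMES.contains(base.toLowerCase(Locale.ROOT));
    }

    static OutputStream create(Path pFile) throws IOException {
        try {
            // Step 1: simply try, and let the operating system decide.
            return Files.newOutputStream(pFile);
        } catch (IOException e) {
            // Step 2: only after a failure, take a look at the file name.
            Path fileName = pFile.getFileName();
            if (fileName != null && isReservedDeviceName(fileName.toString())) {
                // Step 3: explain the probable cause and keep the original error.
                throw new IOException("Cannot create '" + pFile + "': the name '"
                        + fileName + "' may be a reserved device name on this system.", e);
            }
            throw e;
        }
    }
}
```

The name check never blocks a file that the operating system is willing to create; it only improves the error message after a failure.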
That way, everything would have worked fine if the unthinkable happened: Windows eliminating that stupid restriction. Because that is exactly what happened. There is no problem with creating that file. Convince yourself:
  $ touch aux.c

  jwi@MCJWI01 /c/Users/jwi/workspace/afw-vfs
  $ ls -al aux.c
  -rw-r--r--+ 1 jwi Domain Users 0 May  1 16:02 aux.c

  jwi@MCJWI01 /c/Users/jwi/workspace/afw-vfs
So, our JDK programmer has managed to move the problem with the "aux.c" file name from Windows to the JDK. Thanks a lot!