So I'm putting together a Windows 2008 R2 x64 RC Java image for a client (more on that later), and everything's breezing along fine. Install the OS, check. Install JDK 1.6 (u13) into the machine, check. Install Tomcat 6 into the machine, running as a native Windows service, check. Open localhost on port 8080, and... not check. Times out, no response, not good.
Naturally, the first thing to check is the logs, and I get the strangest error I've seen in a while. "Cannot create Java". This is odd—what's happening, in the aggregate, is easy enough to understand, in that the native Windows .exe launcher (ProcRun, a generic service launcher from Apache) is using JNI to create the JVM inside the launched service process and, for some reason, failing; what's not clear is why. Unfortunately, the error codes offered up by the two players involved (Tomcat/ProcRun and the Windows OS) are not helpful—the Windows Event Log basically says "Service failed to start. Check the error code", which reports 0 (not helpful, thanks), and the Tomcat "jakarta_service_date.log" file reports something along the lines of...
... which is not really all that helpful, either.
The fact that it can't create Java is not a really strong clue, so I start searching the Web for some solutions. Several people report running into this same problem, but solutions are not easily found—one web page reports that there's a missing "msvcr71.dll" file from the Windows installer installation script, and that copying the file into C:\WINDOWS\System32 fixes it, but when I go look in that directory, no dice—the DLL's there, and a quick "DUMPBIN" on the file reveals it looks good, no accidental file corruption or anything. Rats.
Maybe the problem's somewhere in the service configuration—it's possible that the Tomcat installer put the wrong configuration in or something. So I fire up the Tomcat configuration (tomcat6w.exe) from the "bin" directory, and just to be sure, I go hunting up the Service entry in the Registry (on the off-chance that the configuration utility is the source of the bug). Granted, this is kind of a stretch, but unfortunately, like I said, there's not much to go on. Sure enough, make a few changes (one of which is to tell the Tomcat native launcher to use the "server" VM, instead of the "client" VM, by default—why, oh why, hasn't Apache changed that yet?!?), verify that the changes are percolating all the way through into the Registry, and try kicking off the service. Still no luck. Still the same error.
While I'm rooting around in the Registry, I notice that there's another node in there that I'm not familiar with—the Wow6432Node. And buried underneath it (thank you, Registry Search, for finding this!) is a node for Apache Software Foundation/ProcRun2.0/Tomcat6, and a whole slew of configuration options under there, as well. Hmm. Errors in the ProcRun configuration perhaps? Sure enough... no, everything's working fine.
But now the synapses are firing in a different direction—the ProcRun bits are underneath the "Wow6432Node", and the "Wow" part of that name has me wondering—in the old 16-bit-to-32-bit transition Windows went through once before, "Wow" was an acronym for "Windows-on-Windows", meaning that the 32-bit version of Windows was opening up an emulation layer to run 16-bit programs. Given that this is an x64 image that I'm working with... is it that the service wants to be using the x64 version of Java rather than the 32-bit version I downloaded out of pure habit? Hmm. Go grab the x64 image, install it, and... still no love.
The WoW64 thing is still tickling at the back of my brain, though, and suddenly a new synapse fires off. If this is a 64-bit version of Windows, then there has to be.... Yep, sure enough, underneath the C:\WINDOWS directory there are not two, but three, "system" directories—the "C:\WINDOWS\System" directory that used to be the hangout place for 16-bit DLLs, the "C:\WINDOWS\System32" directory where 32-bit DLLs were encouraged to reside, and, just as pretty as you please, there it is, a "C:\WINDOWS\SysWOW64" directory, and inside there... no "msvcr71.dll". Copy the "msvcr71.dll" over from System32 into SysWOW64, and.... Voila. Service starts, log file looks good, and "localhost:8080" comes back with the Tomcat home page.
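In hindsight, a quick way to tell which "flavor" of DLL or launcher is sitting in a given directory is to peek at its PE header. What follows is a minimal sketch, not anything Tomcat or ProcRun ships: the class name `PeBitness` and the method `machineOf` are my own hypothetical names, and the only things it relies on are the documented PE layout (the 4-byte offset at 0x3C, the "PE\0\0" signature, and the 2-byte Machine field right after it).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: reports the target architecture recorded in a PE file's header.
public class PeBitness {

    /** Returns "x86", "x64", "ia64", or "unknown (0x....)"; throws if the bytes are not a PE image. */
    static String machineOf(byte[] image) {
        ByteBuffer buf = ByteBuffer.wrap(image).order(ByteOrder.LITTLE_ENDIAN);

        // Every PE file starts with a DOS stub whose first two bytes are "MZ".
        if (image.length < 0x40 || buf.get(0) != 'M' || buf.get(1) != 'Z') {
            throw new IllegalArgumentException("not an MZ executable");
        }

        // The 32-bit value at 0x3C is the file offset of the "PE\0\0" signature.
        int peOffset = buf.getInt(0x3C);
        if (peOffset < 0 || peOffset > image.length - 6
                || buf.getInt(peOffset) != 0x00004550) {
            throw new IllegalArgumentException("PE signature not found");
        }

        // The COFF header follows the signature; its first field is Machine.
        int machine = buf.getShort(peOffset + 4) & 0xFFFF;
        switch (machine) {
            case 0x014C: return "x86";   // IMAGE_FILE_MACHINE_I386
            case 0x8664: return "x64";   // IMAGE_FILE_MACHINE_AMD64
            case 0x0200: return "ia64";  // IMAGE_FILE_MACHINE_IA64
            default:     return String.format("unknown (0x%04X)", machine);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PeBitness <file> [<file> ...]
        for (String path : args) {
            System.out.println(path + ": " + machineOf(Files.readAllBytes(Paths.get(path))));
        }
    }
}
```

Pointing this at the service launcher and at each copy of "msvcr71.dll" would show whether their architectures line up. One caveat: if the JVM running the sketch is itself 32-bit, Windows redirects its view of System32 to SysWOW64, so run it under a 64-bit JVM if you want to see the real System32.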
What have we learned from this little experience? A couple of things, some personal, some observational about the state of the universe and the industry:
- Tomcat still installs itself to depend on a JRE found elsewhere on the system. This isn't a problem, per se, but the Windows installer for Tomcat tries to discover the JRE to use to run the Tomcat bits, and usually comes up with the "public" JRE installed underneath C:\Windows\Java\... . Fact is, I would really prefer if Tomcat made use of a private JRE (one inside the Tomcat directory) rather than the "public" one. Too many times an installer will take liberties with the public JRE, and as a general rule, I really don't want installers messing around with its settings or its deployment picture (the contents of jre/lib/ext, for example).
- I feel a little out-of-touch with x64 operating systems. Fact is, I have gotten a bit rusty on the 64-bit operating systems (Windows in particular), as highlighted by the fact that I really don't know what, if any, differences there are between the 64-bit version of a native executable and its 32-bit cousin, or what the 32/64-bit transition story is. Anybody got any good book recommendations on the 64-bit Windows story?
- I feel a little out-of-touch with the Java 64-bit story. Same thing: anybody have a good overview of what's different between 32-bit and 64-bit Java on Windows, and more importantly, why, even now, when I switch back and try to run the 64-bit version of Java via the service, it fails (this time with a "not a valid Win32 image" error in the log file)? Is it worth trying to diagnose/debug/develop a solution to let Tomcat run with the 64-bit version of Java instead of the 32-bit one it's now using?
- The fact that this was harder to unearth via Google than usual bothers me a bit. Google normally helps with troubleshooting a lot more than it did here, because commonly-hit errors and their fixes are reported all over the place, in blogs and forums and so on. The fact that there were relatively few hits (with potential solutions, anyway) makes me wonder: Are people not running Tomcat on Windows, not running Tomcat as a service on Windows, not running Tomcat on 64-bit Windows, or just not generally having problems? If you're running Tomcat on Windows, I'd love to hear your story.
- Diagnosing Windows services is still a pain. I was a heartbeat away from trying to debug the native parts of the Tomcat service, using either SysInternals' Process Explorer or Visual Studio itself, and really wished there were some better error logging to indicate what the problem was so I didn't have to. Granted, from my time writing Windows services way back when, I remember there not being a lot that a service author can do to make that a more transparent experience, so I can't necessarily fault the authors of ProcRun, since they're (probably) faithfully reporting the return value of CreateProcess or LoadLibrary, but it's still frustrating, and I think more information (maybe the return value of GetLastError?) might have helped out here a bit.
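On the 32-bit versus 64-bit Java question, one thing I can say with confidence is that a Windows process can only load DLLs of its own bitness, so a 32-bit launcher cannot host a 64-bit jvm.dll (or the reverse). Whether that is what's behind the "not a valid Win32 image" error here is my guess, not a diagnosis. A small sketch for checking which flavor a given `java.exe` actually is follows; the class `JvmBitness` and its `describe` method are hypothetical names of mine, and `sun.arch.data.model` is a Sun/Oracle-specific property that other JVMs may not set, which is why the sketch falls back to `os.arch`.

```java
// Hypothetical helper: reports whether the running JVM is a 32-bit or 64-bit build.
public class JvmBitness {

    /**
     * dataModel is the value of the vendor-specific "sun.arch.data.model" property
     * (may be null); osArch is the value of the standard "os.arch" property.
     */
    static String describe(String dataModel, String osArch) {
        if (dataModel != null) {
            return dataModel + "-bit";
        }
        // Fallback heuristic (an assumption): 64-bit JVMs report an arch name containing "64".
        boolean looks64 = osArch != null && osArch.contains("64");
        return (looks64 ? "64-bit" : "32-bit") + " (guessed from os.arch)";
    }

    public static void main(String[] args) {
        System.out.println(describe(
                System.getProperty("sun.arch.data.model"),
                System.getProperty("os.arch")));
    }
}
```

Running it with the `java.exe` from each installed JDK in turn tells you which one matches the bitness of the service launcher.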
Meanwhile, my installations continue....