In Ultra ESB we use a special hot-swap classloader that allows us to reload Java classes on demand. This allows us to literally hot-swap our deployment units - load, unload, reload with updated classes, and phase-out gracefully - without restarting the JVM.
Windows: supporting the forbidden land
In Ultra ESB Legacy the loader was working fine on Windows, but on the newer X-version it seemed to be having some hiccups. We are not supporting Windows as a target platform, so it didn't matter much - until recently, when we decided to support non-production distros on Windows. (Our enterprise integration IDE UltraStudio runs fine on Windows, so Windows devs, you are all covered.)
TDD FTW
Fixing the classloader was a breeze, and all tests were passing; but I wanted to back my fixes up with some extra tests, so I wrote a few new ones. Most of these involved creating a new JAR file in a subditrectory under the system temp directory, and using the hot-swap classloader to load different artifacts that were placed inside the JAR. For extra credit on best practices, I also made sure to add some cleanup logic to delete the temp subdirectory via FileUtils.deleteDirectory()
.
And then, things went nuts.
And the tear-down was no more.
All tests were passing, in both Linux and Windows; but the final tear-down logic was failing in Windows, right at the point where I delete the temp subdirectory.
Being on Windows, I didn't have the luxury of lsof
; fortunately, Sysinternals already had just the thing I needed: handle64
.
Finding the culprit was pretty easy: hit a breakpoint in tearDown()
just before the directory tree deletion call, and run a handle64 {my-jar-name}.jar
.
Bummer.
My test Java process, was holding a handle to the test JAR file.
Hunting for the leak
No. Seriously. I didn't.
Naturally, my first suspect was the classloader itself. I spent almost half an hour going over the classloader codebase again and again. No luck. Everything seemed rock solid.
The "leak dumper"; a.k.a my Grim Reaper for file handles
My best shot was to see what piece of code had opened the handler to the JAR file. So I wrote a quick-n-dirty patch to Java's FileInputStream
and FilterInputStream
that would dump acquire-time stacktrace snapshots; whenever a thread holds a stream open for too long.
This "leak dumper" was partly inspired by our JDBC connection pool that detects unreleased connections (subject to a grace period) and then dumps the stacktrace of the thread that borrowed it - back at the time it was borrowed. (Kudos to Sachini, my former colleague-intern at AdroitLogic.)
The leak, exposed!
Sure enough, the stacktrace revealed the culprit:
id: 174 created: 1570560438355 --filter-- java.io.FilterInputStream.<init>(FilterInputStream.java:13) java.util.zip.InflaterInputStream.<init>(InflaterInputStream.java:81) java.util.zip.ZipFile$ZipFileInflaterInputStream.<init>(ZipFile.java:408) java.util.zip.ZipFile.getInputStream(ZipFile.java:389) java.util.jar.JarFile.getInputStream(JarFile.java:447) sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:162) java.net.URL.openStream(URL.java:1045) org.adroitlogic.x.base.util.HotSwapClassLoader.loadSwappableClass(HotSwapClassLoader.java:175) org.adroitlogic.x.base.util.HotSwapClassLoader.loadClass(HotSwapClassLoader.java:110) org.adroitlogic.x.base.util.HotSwapClassLoaderTest.testServiceLoader(HotSwapClassLoaderTest.java:128) sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) java.lang.reflect.Method.invoke(Method.java:498) org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:86) org.testng.internal.Invoker.invokeMethod(Invoker.java:643) org.testng.internal.Invoker.invokeTestMethod(Invoker.java:820) org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1128) org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129) org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112) org.testng.TestRunner.privateRun(TestRunner.java:782) org.testng.TestRunner.run(TestRunner.java:632) org.testng.SuiteRunner.runTest(SuiteRunner.java:366) org.testng.SuiteRunner.runSequentially(SuiteRunner.java:361) org.testng.SuiteRunner.privateRun(SuiteRunner.java:319) org.testng.SuiteRunner.run(SuiteRunner.java:268) org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52) org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86) org.testng.TestNG.runSuitesSequentially(TestNG.java:1244) org.testng.TestNG.runSuitesLocally(TestNG.java:1169) org.testng.TestNG.run(TestNG.java:1064) org.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:72) org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:123)
Gotcha!
java.io.FilterInputStream.<init>(FilterInputStream.java:13) ... sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:162) java.net.URL.openStream(URL.java:1045) org.adroitlogic.x.base.util.HotSwapClassLoader.loadSwappableClass(HotSwapClassLoader.java:175)
But still, that didn't tell the whole story. If URL.openStream()
opens the JAR, why does it not get closed when we return from the try-with-resources block?
try (InputStream is = jarURI.toURL().openStream()) { byte[] bytes = IOUtils.toByteArray(is); Class<?> clazz = defineClass(className, bytes, 0, bytes.length); ... logger.trace(15, "Loaded class {} as a swappable class", className); return clazz; } catch (IOException e) { logger.warn(16, "Class {} located as a swappable class, but couldn't be loaded due to : {}, " + "trying to load the class as a usual class", className, e.getMessage()); ... }
Into the wild: JarURLConnection
, URLConnection
, and beyond
Thanks to Sun Microsystems who made it OSS, I could browse through the JDK source, right up to this shocking comment - all the way down, in java.net.URLConnection
:
private static boolean defaultUseCaches = true; /** * If <code>true</code>, the protocol is allowed to use caching * whenever it can. If <code>false</code>, the protocol must always * try to get a fresh copy of the object. * <p> * This field is set by the <code>setUseCaches</code> method. Its * value is returned by the <code>getUseCaches</code> method. * <p> * Its default value is the value given in the last invocation of the * <code>setDefaultUseCaches</code> method. * * @see java.net.URLConnection#setUseCaches(boolean) * @see java.net.URLConnection#getUseCaches() * @see java.net.URLConnection#setDefaultUseCaches(boolean) */ protected boolean useCaches = defaultUseCaches;
Yep, Java does cache JAR streams!
From sun.net.www.protocol.jar.JarURLConnection
:
class JarURLInputStream extends FilterInputStream { JarURLInputStream(InputStream var2) { super(var2); } public void close() throws IOException { try { super.close(); } finally { if (!JarURLConnection.this.getUseCaches()) { JarURLConnection.this.jarFile.close(); } } } }
If (well, because) useCaches
is true
by default, we're in for a big surprise!
Let Java cache its JARs, but don't break my test!
JAR caching would probably improve performance; but does that mean I should stop cleaning up after - and leave behind stray files after each test?
(Of course I could say file.deleteOnExit()
; but since I was dealing with a directory hierarchy, there was no guarantee that things would get deleted in order, and undeleted directories would be left behind.)
So I wanted a way to clean up the JAR cache - or at least purge just my JAR entry; after I am done, but before the JVM shuts down.
Disabling JAR caching altogether - probably not a good idea!
URLConnection
does offer an option to avoid caching connection entries:
/** * Sets the default value of the <code>useCaches</code> field to the * specified value. * * @param defaultusecaches the new value. * @see #getDefaultUseCaches() */ public void setDefaultUseCaches(boolean defaultusecaches) { defaultUseCaches = defaultusecaches; }
It would have been perfect if caching could be disabled per file/URL, as above; our classloader caches all entries as soon as it opens a JAR, so it never needs to open/read that file again. However, once a JAR is open, caching cannot be disabled on it; so once our classloader has opened the JAR, there's no getting rid of the cached file handle - until the JVM itself shuts down!
URLConnection
also allows you to disable caching by default for all subsequent connections:
/** * Sets the default value of the <code>useCaches</code> field to the * specified value. * * @param defaultusecaches the new value. * @see #getDefaultUseCaches() */ public void setDefaultUseCaches(boolean defaultusecaches) { defaultUseCaches = defaultusecaches; }
However, if you disable it once, the whole JVM could be affected from that moment onwards - since it probably applies to all URLConnection
-based implementations. As I said before, that could hinder performance - not to mention deviating my test from cache-enabled, real-world behavior.
Down the rabbit hole (again!): purging manually from the JarFileFactory
The least-invasive option is to remove my own JAR from the cache, when I know I'm done.
And good news, the cache - sun.net.www.protocol.jar.JarFileFactory
- already has a close(JarFile)
method that does the job.
But sadly, the cache class is package-private; meaning there's no way to manipulate it from within my test code.
Reflection to the rescue!
Thanks to reflection, all I needed was one little "bridge" that would access and invoke jarFactory.close(jarFile)
on behalf of me:
class JarBridge { static void closeJar(URL url) throws Exception { // JarFileFactory jarFactory = JarFileFactory.getInstance(); Class<?> jarFactoryClazz = Class.forName("sun.net.www.protocol.jar.JarFileFactory"); Method getInstance = jarFactoryClazz.getMethod("getInstance"); getInstance.setAccessible(true); Object jarFactory = getInstance.invoke(jarFactoryClazz); // JarFile jarFile = jarFactory.get(url); Method get = jarFactoryClazz.getMethod("get", URL.class); get.setAccessible(true); Object jarFile = get.invoke(jarFactory, url); // jarFactory.close(jarFile); Method close = jarFactoryClazz.getMethod("close", JarFile.class); close.setAccessible(true); //noinspection JavaReflectionInvocation close.invoke(jarFactory, jarFile); // jarFile.close(); ((JarFile) jarFile).close(); } }
And in my test, I just have to say:
JarBridge.closeJar(jarPath.toUri().toURL());
Right before deleting the temp directory.
So, what's the take-away?
Nothing much for you, if you are not directly dealing with JAR files; but if you are, you might run into this kind of obscure "file in use" errors. (That would hold true for other URLConnection
-based streams as well.)
If you happen to be as (un)lucky as I was, just recall that some notorious blogger had written some hacky "leak dumper" patch JAR that would show you exactly where your JAR (or non-JAR) leak is.
Adieu!