Having trouble viewing this email? Click here

So What *is* the Best-Case for Finding a Method via Reflection?

By Dr Heinz Kabutz

Abstract:
We now look at why the best-case scenario for a getMethod() call is O(n), not O(1) as we would expect. We also discover that the throughput of getMethod() has doubled in Java 9.

Welcome to the 254th edition of The Java(tm) Specialists' Newsletter. As a pimply-faced youth, when I needed to do some deep thinking, I'd gather my books and wander down to Botany Bay. I would sit on the rocks, watching the kelp bob up and down in the surf, and think. Def Leppard's "Pour Some Sugar On Me" bellowed from walkman into eardrums. It was the 80s after all. There's something about staring at salty water that seems to stir my creative juices.

Fast-forward thirty years to Crete. Since I am a "polyteknos" -- I have too many children -- we drive a Mercedes Viano. We use it on Sundays to cart the family off to church and a delicious lunch at Mitsos afterwards. The rest of the week it gathers dust, lots of. But not anymore. It is now my mobile executive suite. It looks rather conspicuous, all black with darkened windows. The previous owner chauffeured famous rock stars and royalty. It makes the perfect thinking place. I drive it to random beaches and sit in the back surrounded by Java books. There I philosophy about how the design patterns of old have a place in modern Java.

I will tell you more about my adventures in future newsletters.

Learning.JavaSpecialists.EU: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.

**So What is the Best-Case for Finding a Method via Reflection?**

As I mentioned yesterday, I was searching for examples of the GoF Builder in the JDK. I thought that perhaps reflection would use it. I knew that the Method returned by Class.getMethod() was a protective copy. Turns out, this is rather an example of the Prototype. In order to understand how Class.getMethod() worked internally, I stepped into the code with IntelliJ. Whilst doing so, I noticed that even when we had found the method we were looking for, it kept on searching. (BTW, if you love IntelliJ or are curious what it can do, be sure to get my free IntelliJ Wizardry Course).

When we call Class.getMethod(name,parameterTypes), it does a linear search through an internal publicMethods array, lazily created. The methods are not sorted in any way and, as I discovered in my tests, they are also not always stored in the same order. This is why a linear search is necessary. Most of the time, the number of methods is going to be fairly small for one class, although we do include superclass methods too. We would expect it to be less than 100 though. Is that reasonable? I did some searching through the Java 8 rt.jar file. I found almost 500 classes that had 100 or more methods, including those from superclasses. The worst two offenders were from CORBA, as you probably expected. The next 38 offenders were Swing or UI classes. The average number of methods per class is 25, which includes methods from Object. If we use getDeclaredMethods(), the average number of methods per class decreases to just 7.35.

Now comes the clincher: Even when we find a matching method in our jumbled up array, we don't stop searching. Why?

Java 5 gave us covariant return types. This means that if class A has method Object foo(), then subclass B can have method Number foo(). Same signature, different return type. A further subclass C of B could contain method Integer foo():

  public class A { public Object foo() { return new Object(); } } public class B extends A { public Number foo() { return new Integer(42); } } public class C extends B { public Integer foo() { return new Integer(42); } }

C.foo()'s return type must be assigneable from B. Thus we cannot make it return String. This also means that class C actually contains three methods foo():

  import java.lang.reflect.*; public class ReflectionTest { public static void main(String... args) throws NoSuchMethodException { System.out.println(A.class.getMethod("foo")); System.out.println(B.class.getMethod("foo")); System.out.println(C.class.getMethod("foo")); System.out.println("All methods in class C:"); for (Method method : C.class.getMethods()) { System.out.println(method); } } }

When we run this, we see the following output:

  public java.lang.Object A.foo() public java.lang.Number B.foo() public java.lang.Integer C.foo() All methods in class C: public java.lang.Object C.foo() public java.lang.Number C.foo() public java.lang.Integer C.foo() ... and all the Object methods

When I saw Class.searchMethods() in operation, I realized why they were continuing their search, even when they had found a good candidate. The reason is that there might be another method further down in the array that had a more specific return type. If we call C.class.getMethod("foo"), we want public java.lang.Integer C.foo()

Since the average number of methods per class is small, they decided to not bother sorting the array. A binary search would find the desired method in O(log n) steps. However, sorting would take O(n * log n) steps, so unless we did a lot of lookups, we would never regain the performance loss from sorting. A similar argument would apply for a HashMap.

Usually, the best-case scenario for a search is O(1). For example, if we search through an unsorted list, we would start at the beginning and scan through until the end. If we were lucky, the first element would be the one we were looking for and we could stop. Average case for an unsorted list would be O(n). On average we would need n/2 lookups, but since it is a constant multiplier of 0.5, we would only consider the slowdown as n became larger. Since the average time would be in relation to the size of the list, it would also be written as O(n). The worst-case is what we consider most of the time for searching, and that is also O(n) for an unsorted list. If we have a sorted ArrayList, then best-case would be O(1) and the average and worst cases would be O(log n). If this is all a bit much for you, make sure to get our Data Structures in Java 9 Course, which explains complexity in a lot more detail.

In Class.searchMethods(), the best-case is O(n), as are the average and worst cases. However, this is only since Java 5, when covariant return types were introduced into the language. Prior to that, the computational complexities were the same as for an unsorted list.

How can we demonstrate this beyond eyeballing the JDK code?

Here is a ClassGenerator that produces classes with increasing numbers of methods. I wrote the code in Java 1.4 syntax so that I could run it on an old machine. No generics, no Java 8 streams, no try-with-resource, no varargs. It felt a bit weird. Here we go:

  import java.io.*; public class ClassGenerator { public static void main(String[] args) throws IOException { for (int methods = 4; methods < 65536; methods *= 5) { PrintStream out = new PrintStream( new FileOutputStream("Methods" + methods + ".java")); out.println("public class Methods" + methods + " {"); for (int i = 0; i < methods; i++) { out.println(" public void foo" + i + "() {}"); } out.println("}"); out.close(); } } }

ClassGenerator spits out 7 classes with the largest containing 62500 methods. The JDK 1.4.0-b92 javac compiler choked with a StackOverflowError on the largest class, so I used the Java 8 compiler with -source 1.4 and -target 1.4. That worked.

To demonstrate the performance degradation post Java 4, I wrote the following class that tests how many times we can find the first, last and middle "foo" methods in our array.

  import java.lang.reflect.*; public class MethodLookupCost { public static void main(String[] args) throws Exception { System.out.println("Warmup"); for (int i = 0; i < 2; i++) { test(); } System.out.println(); System.out.println("Proper run"); for (int i = 0; i < 3; i++) { test(); } System.out.println(blackhole); } private static void test() throws Exception { for (int methods = 4; methods < 65536; methods *= 5) { test(Class.forName("Methods" + methods)); } } private volatile static Method blackhole; private static void test(Class clazz) throws Exception { System.out.println(clazz); System.out.println(clazz.toString().replaceAll(".", "-")); Method[] methods = clazz.getMethods(); int firstIndex = fooIndex(methods, 0, 1); Method first = methods[firstIndex]; int lastIndex = fooIndex(methods, methods.length - 1, -1); Method last = methods[lastIndex]; int middleStart = (lastIndex - firstIndex) / 2; int middleIndex = fooIndex(methods, middleStart, 1); Method middle = methods[middleIndex]; System.out.println("indexes = (" + firstIndex + ", " + middleIndex + ", " + lastIndex + ")"); System.out.println("foos = (" + first.getName() + ", " + middle.getName() + ", " + last.getName() + ")"); System.out.println(" first = " + methodFinds(first)); System.out.println(" middle = " + methodFinds(middle)); System.out.println(" last = " + methodFinds(last)); System.out.println(); } private static int fooIndex(Method[] methods, int start, int inc) { int idx = start; while (!methods[idx].getName().startsWith("foo")) idx += inc; return idx; } private static long methodFinds(Method method) throws Exception { return methodFinds(method.getDeclaringClass(), method.getName()); } /** * Counts how many times we can get the method in a second. */ private static long methodFinds(Class clazz, String method) throws Exception { long time = System.currentTimeMillis(); long methodFinds = 0; while (System.currentTimeMillis() - time < 1000) { blackhole = clazz.getMethod(method, (Class[]) null); methodFinds++; } return methodFinds; } }

The purpose of this benchmark was to confirm my suspicion that prior to Java 5, the best-case lookup cost of Class.getMethod() was O(1) and is now O(n). However, we need to consider the context of the benchmark. It is of almost no relevance in real code, since we will seldom have more than 100 methods in a class.

Here is the output from Java 9.0.4:

  class Methods4 -------------- indexes = (0, 1, 3) foos = (foo0, foo1, foo3) first = 10064862 middle = 10276604 last = 10112908 class Methods20 --------------- indexes = (0, 9, 19) foos = (foo13, foo8, foo4) first = 7009813 middle = 7290529 last = 7367027 class Methods100 ---------------- indexes = (0, 49, 99) foos = (foo25, foo74, foo4) first = 2238002 middle = 2166709 last = 3389455 class Methods500 ---------------- indexes = (0, 249, 499) foos = (foo25, foo249, foo499) first = 585933 middle = 399407 last = 407028 class Methods2500 ----------------- indexes = (0, 1249, 2499) foos = (foo25, foo1649, foo499) first = 72224 middle = 51273 last = 61625 class Methods12500 ------------------ indexes = (0, 6249, 12499) foos = (foo2796, foo9045, foo2795) first = 10671 middle = 10813 last = 10770 class Methods62500 ------------------ indexes = (0, 31249, 62499) foos = (foo2796, foo31545, foo2795) first = 1685 middle = 1174 last = 1732

Interesting is the output for Java 8. It starts off as half the throughput of Java 9 on my machine. At about 100 methods, they have the same throughput and then Java 8 ends up faster than Java 9. I haven't researched this thoroughly, but I suspect that the reason Java 9 is faster is that they don't call intern() on the method name on every single call. I'm not sure why Java 8 is faster after we get to 100 methods, but considering the average number of methods per class, I think that's a reasonable compromise.

  class Methods4 -------------- indexes = (0, 1, 3) foos = (foo0, foo1, foo3) first = 5711796 middle = 5611266 last = 5786185 class Methods20 --------------- indexes = (0, 9, 19) foos = (foo13, foo10, foo3) first = 5242688 middle = 5251910 last = 5254697 class Methods100 ---------------- indexes = (0, 49, 99) foos = (foo13, foo49, foo99) first = 3716464 middle = 3595559 last = 3628375 class Methods500 ---------------- indexes = (0, 249, 499) foos = (foo13, foo249, foo499) first = 1400637 middle = 1358047 last = 1380786 class Methods2500 ----------------- indexes = (0, 1249, 2499) foos = (foo13, foo1733, foo499) first = 228720 middle = 223003 last = 223820 class Methods12500 ------------------ indexes = (0, 6249, 12499) foos = (foo13, foo6733, foo499) first = 40339 middle = 40360 last = 40553 class Methods62500 ------------------ indexes = (0, 31249, 62499) foos = (foo12505, foo44215, foo62224) first = 5232 middle = 5391 last = 5397

I also ran this with Java 1.4.0 on a VMWare Fusion instance with Windows XP. We can see two main differences between pre and post Java 1.4. First is that the best case performance stays the same, irrespective of the number of methods. Second is that it appears that the Method[] is ordered, probably according to where the methods appear in the source file. In the Java 8 output above, in Methods62500 the first method starting with "foo" was foo12505(). The middle was foo44215(). And the last was foo62224(). Java 9 is even stranger, with first foo2796(), middle foo31545() and last foo2795(). Here is the output from Java 1.4.0:

  class Methods4 -------------- indexes = (0, 1, 3) foos = (foo0, foo1, foo3) first = 1243931 middle = 1328329 last = 1372984 class Methods20 --------------- indexes = (0, 9, 19) foos = (foo0, foo9, foo19) first = 1410311 middle = 1482650 last = 1301043 class Methods100 ---------------- indexes = (0, 49, 99) foos = (foo0, foo49, foo99) first = 1305948 middle = 1147514 last = 960986 class Methods500 ---------------- indexes = (0, 249, 499) foos = (foo0, foo249, foo499) first = 1232767 middle = 658452 last = 437830 class Methods2500 ----------------- indexes = (0, 1249, 2499) foos = (foo0, foo1249, foo2499) first = 1230703 middle = 251969 last = 131139 class Methods12500 ------------------ indexes = (0, 6249, 12499) foos = (foo0, foo6249, foo12499) first = 1177412 middle = 56583 last = 29448 class Methods62500 ------------------ indexes = (0, 31249, 62499) foos = (foo0, foo31249, foo62499) first = 1240943 middle = 12028 last = 5926

As we saw, in Java 1.4, when we didn't have covariant return types, we could return from our search as soon as we found our first matching method. There could be only one, so there was no point searching further. The throughput to find the first method thus stays consistent, irrespective of the size of the Method[]. Classic O(1).

We can draw several conclusions from this. Firstly, the cost of finding any method in a class is dependent on how many methods are visible, not where in the class the method is declared. Secondly, it appears that Java 9 has doubled the throughput of Class.getMethod() by getting rid of intern(), although that would need further research to confirm. Thirdly, there was a hidden cost in covariant return types that I had not considered up to now.

Kind regards from Crete

Heinz

If you no longer wish to receive our emails, click the link below:
Unsubscribe

Cretesoft Limited 77 Strovolos Ave Strovolos, Lefkosia 2018 Cyprus