Java 9 stands apart from earlier Java releases in many ways. For a language that has sat near the top of the programming-language charts for the last twenty years, revamping the entire JDK to introduce modular programming was no cakewalk. Java 9 not only supports modular programming but was also refactored at its core.
All Java 9 platform APIs have been refactored to fit the modular programming paradigm.
Java 9 changed not only the way we implement Java applications but also the JDK itself. The module system, developed under Project Jigsaw and formally named the Java Platform Module System (JPMS), was introduced in Java 9 and allows developers to create modular Java applications.
Maintainability
Maintaining an enormous Java application is an onerous task. When multiple teams work in parallel on different sections of the application, the way those sections are built and interconnected makes it hard for any developer to maintain the application in its entirety. Modularization addresses this by segregating the code into modules that can be built and maintained separately.
Memory footprint and resource consumption
Limiting the memory footprint and resource consumption allows modularized Java applications to scale down, making them easier than ever to run on small devices.
Security
Modularization enforces strong encapsulation of the code within a module and exposes only the required APIs, which leaves attackers far less scope to reach the sensitive internals of a module.
Supporting modularization was an ambitious goal given Java's monolithic nature. In addition, preserving backward compatibility while revamping the entire core library was a long-drawn-out effort. Hence it took about nine years and multiple deferments to bring modularization to Java (Project Jigsaw was started in August 2008 and shipped as part of the Java 9 release on September 21, 2017).
The JDK is now organized into modules, so an application can use only the modules it requires and the run-time can be trimmed down further than before. Earlier versions of the JRE required installing the entire run-time. This was a considerable issue, because the JRE was bundled with libraries, tools and classes that support many different kinds of Java applications. Clearly, running a simple Hello World application does not require libraries for Swing/AWT or CORBA, yet all of these classes were bundled in rt.jar irrespective of the application being deployed.
This is where modularization of the JDK came into the picture. Project Jigsaw significantly revamped Java at its core and at the JVM level as well.
Java has been in the industry for over 20 years (since 1995), and over those two decades a great many APIs have been added. As a result, the Java run-time library rt.jar, which contained most of the run-time classes, had grown bloated, exceeding 60 MB. In Java 9 this approach was rethought and the entire JDK was refactored into modules to bring real decoupling and flexibility to Java applications.
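To make the refactoring of the JDK into modules tangible, the following minimal sketch lists the modules present in the running Java 9+ run-time image and checks which module owns a given package. The class name SystemModules and its helper methods are illustrative assumptions; the calls themselves come from the standard java.lang.module API.

```java
import java.lang.module.ModuleFinder;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class (not part of the JDK) for inspecting system modules.
class SystemModules {

    /** Returns the names of all modules in the current run-time image, sorted. */
    static List<String> names() {
        return ModuleFinder.ofSystem().findAll().stream()
                .map(ref -> ref.descriptor().name())
                .sorted()
                .collect(Collectors.toList());
    }

    /** Returns true if the named system module exists and contains the package. */
    static boolean contains(String module, String pkg) {
        return ModuleFinder.ofSystem().find(module)
                .map(ref -> ref.descriptor().packages().contains(pkg))
                .orElse(false);
    }

    public static void main(String[] args) {
        List<String> modules = names();
        System.out.println(modules.size() + " system modules, for example:");
        modules.stream().limit(5).forEach(name -> System.out.println("  " + name));

        // java.lang belongs to java.base; Swing does not.
        System.out.println("java.lang in java.base:   " + contains("java.base", "java.lang"));
        System.out.println("javax.swing in java.base: " + contains("java.base", "javax.swing"));
    }
}
```

The number of modules printed depends on the run-time image in use, which is exactly the point: a trimmed image may contain far fewer modules than a full JDK.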
Alternatives to Modularization?
Is it just about avoiding unused libraries? Of course, there is more to it.
Modularity is not only about removing support for unnecessary technologies but also about providing security. Some technologies are not legacy yet are still unwanted in a given context. For instance, a web application exposing REST API endpoints does not need GUI support from Swing, AWT or the more recent JavaFX. Prior to Java 9, however, that support was still bundled with every deployment.
From a security standpoint, making every class part of the deployment gives attackers a fair chance of reaching sensitive classes. Rigorously encapsulating internal and sensitive classes within the JDK is a major improvement of the Java 9 modularity revamp. Considering all this, the JDK itself needed to be modularized.
Modularity is about dissecting a monolithic application into individual modules, each of which explicitly declares its dependencies on other modules. A module consists of related code, metadata that describes it, and its relations to other modules. An application contains multiple modules working together. Modules exist at compile time as well as at run-time. This enables developers to write reusable, flexible, secure and potentially lightweight code.
At the heart of modularization, every module has to fulfil three basic principles.
1. Strong Encapsulation
Encapsulating a module's code from the remaining modules of the application distinguishes the code that is exposed from the code written purely to fulfil internal responsibilities. Classes that are not exposed cannot be used from outside and can therefore change at will without impacting other modules.
2. Explicit Dependencies
In a modularized application, it is mandatory to state explicitly which other modules a module depends on and which of its packages are shared, either with all modules or only with specific ones. Doing so avoids accidental dependencies on internal and sensitive classes.
3. Well-defined interfaces
Even though a module is self-contained with its encapsulated code, in most scenarios it cannot exist alone and must communicate with other modules of the application. To that end, each module in a modularized application defines and exposes interfaces that dependent modules can use. This creates great flexibility to switch to alternative implementations of the exposed interfaces. The dependent modules consuming the APIs are unaware of any change in implementation and continue to work without issue.
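The three principles can be sketched with a module descriptor. The example below is a minimal sketch under stated assumptions: the module name com.example.orders and its packages are hypothetical, and the descriptor is built programmatically with the standard java.lang.module.ModuleDescriptor builder so that it can run on its own. In a real project the same information would normally live in a module-info.java file, shown in the comment.

```java
import java.lang.module.ModuleDescriptor;
import java.util.Set;

// Hypothetical example; com.example.orders and its packages do not exist.
class OrdersModuleSketch {

    /*
     * Equivalent module-info.java (all names are assumptions):
     *
     *   module com.example.orders {
     *       requires java.logging;            // explicit dependency
     *       exports com.example.orders.api;   // well-defined interface
     *       // com.example.orders.internal is not exported,
     *       // so it stays strongly encapsulated
     *   }
     */
    static ModuleDescriptor describe() {
        return ModuleDescriptor.newModule("com.example.orders")
                .requires("java.logging")
                .exports("com.example.orders.api")
                .packages(Set.of("com.example.orders.internal"))
                .build();
    }

    /** Returns true if the descriptor exports the given package. */
    static boolean isExported(ModuleDescriptor descriptor, String pkg) {
        return descriptor.exports().stream().anyMatch(e -> e.source().equals(pkg));
    }

    /** Returns true if the descriptor declares a dependency on the given module. */
    static boolean requires(ModuleDescriptor descriptor, String module) {
        return descriptor.requires().stream().anyMatch(r -> r.name().equals(module));
    }

    public static void main(String[] args) {
        ModuleDescriptor orders = describe();
        System.out.println("requires java.logging: " + requires(orders, "java.logging"));
        System.out.println("exports api:           " + isExported(orders, "com.example.orders.api"));
        System.out.println("exports internal:      " + isExported(orders, "com.example.orders.internal"));
    }
}
```

Only the api package is visible to other modules; the internal package is part of the module but stays hidden, so it can change freely.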
Even though encapsulation is one of the pillars of the Java programming language, prior to Java 9 it could only be partially achieved. A top-level class declared without an access modifier (package-private) is accessible only within its own package. What if you needed to give access to such a class outside the package, but only to one or two classes among many? That was not achievable. The obvious remaining option was to make the class public. However, the class could then be accessed by every class, not just the ones that needed it, resulting in weak encapsulation.
Another concern with prior Java versions was the reliance on compile-time imports. There was no way at run-time to identify which jar a given dependency came from. We had to depend on external tools such as Maven to work around this limitation.
Jars seemed to be a close alternative to modules prior to Java 9. But careful analysis reveals that jars do not provide true encapsulation. For instance, consider an internal JDK API such as sun.misc.Unsafe, which is intended to be used only by the core Java classes. Yet application code could access it if needed, despite the warnings from the JDK authors. This shows that true encapsulation was missing prior to Java 9.
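For contrast, here is a minimal sketch of how Java 9 and later make the encapsulation of JDK internals visible at run-time. It asks the java.base module whether a package is exported to all other modules. The class name EncapsulationCheck is an illustrative assumption; jdk.internal.misc is used as an example of an internal JDK package.

```java
// Hypothetical helper class for probing what java.base exposes.
class EncapsulationCheck {

    /** Returns true if java.base exports the package to every module. */
    static boolean exportedByJavaBase(String pkg) {
        Module javaBase = Object.class.getModule();
        return javaBase.isExported(pkg);
    }

    public static void main(String[] args) {
        // Public API package: exported unconditionally.
        System.out.println("java.lang exported:         " + exportedByJavaBase("java.lang"));
        // Internal package: not exported to arbitrary application code.
        System.out.println("jdk.internal.misc exported: " + exportedByJavaBase("jdk.internal.misc"));
    }
}
```

Note that sun.misc.Unsafe itself was kept reachable in Java 9 through the jdk.unsupported module for compatibility, while its internal counterpart in jdk.internal.misc is encapsulated.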
Another problem Java developers have long dealt with is the notorious classpath hell. The classpath is used by the JRE to locate classes at run-time, which happens through dynamic linking and resolution. For simplicity, assume that the classpath denotes a list of classes available to run the given application. The JVM searches this list sequentially for each required class.
Dynamic Linking & Resolution
The class files generated by compilation maintain symbolic references to one another, to other application classes and to the Java API classes. The JVM loads the classes and connects them through a process called dynamic linking. At run-time, the JVM builds an internal web of interconnected classes and interfaces.
A class file keeps all its symbolic references in the constant pool. Each class file has a constant pool, and each class or interface loaded by the Java virtual machine has an internal version of its constant pool called the runtime constant pool. The runtime constant pool is an implementation-specific data structure that maps to the constant pool in the class file. Thus, after a type is initially loaded, all the symbolic references from the type reside in the type's runtime constant pool.
At some point during the running of a program, if a particular symbolic reference is to be used, it must be resolved. Resolution is the process of finding the entity identified by the symbolic reference and replacing the symbolic reference with a direct reference. Because all symbolic references reside in the constant pool, this process is often called constant pool resolution.
Although a JVM is permitted to perform resolution at different times during the execution of a program, either early or late, every JVM must give the impression that it uses late resolution. Whichever strategy the JVM chooses, a failure to resolve a symbolic reference is reported only when that reference is actually used for the first time. So, even if early resolution has already discovered that a referenced class is missing, the error is thrown only at the point of first use.
This makes applications error-prone at run-time: the JVM does not verify the completeness of the classpath when the application starts, and it raises no error until the missing class is actually used.
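The effect can be sketched as follows. This is only an approximation under stated assumptions: it uses reflective loading with Class.forName as a stand-in for the JVM's own constant pool resolution (a failed resolution of a compiled-in reference would typically surface as NoClassDefFoundError instead), and the class name com.example.missing.Widget is hypothetical and deliberately absent.

```java
// Hypothetical demo class; com.example.missing.Widget does not exist anywhere.
class LateFailureDemo {

    /** Returns true if the class can be found and loaded right now. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // The missing class is reported only here, at the point of use.
            return false;
        }
    }

    public static void main(String[] args) {
        // The program starts and runs normally although a class is missing.
        System.out.println("Application started without any classpath check.");
        System.out.println("java.lang.String loadable: " + isLoadable("java.lang.String"));

        // Only this first use reveals the problem.
        System.out.println("Widget loadable:           " + isLoadable("com.example.missing.Widget"));
    }
}
```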
Classpath errors can also stay hidden until the last moment when multiple versions of the same library are on the classpath, leading to intermittent errors at run-time. Different classes might rely on different versions of the library, and this can easily escape your notice. Because the class loader picks the first matching class it encounters on the classpath, the version that gets loaded depends on classpath ordering, and the loaded class might be incompatible with code that expects a class from a different version of the library. Because of late resolution, such failures occur unpredictably and inconsistently. When they do occur, they disrupt execution with run-time errors such as ClassNotFoundException or NoSuchMethodError.
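The first-match behaviour can be illustrated with a toy model. This is a simplified sketch, not how the JVM class loader is implemented: each jar is reduced to a set of class names, and the jar and class names (json-lib, org.example.json.Parser and so on) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Toy model of a classpath: an ordered list of jars, each holding class names.
class ClasspathModel {

    // Insertion order is the search order, as on a real classpath.
    private final Map<String, Set<String>> entries = new LinkedHashMap<>();

    /** Appends a jar with the given class names to the end of the classpath. */
    ClasspathModel add(String jar, String... classNames) {
        entries.put(jar, Set.of(classNames));
        return this;
    }

    /** Returns the first jar in search order that contains the class, if any. */
    Optional<String> resolve(String className) {
        return entries.entrySet().stream()
                .filter(entry -> entry.getValue().contains(className))
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        ClasspathModel classpath = new ClasspathModel()
                .add("json-lib-1.0.jar", "org.example.json.Parser")
                .add("app.jar", "com.example.app.Main")
                .add("json-lib-2.0.jar", "org.example.json.Parser", "org.example.json.Streamer");

        // Parser comes from version 1.0 because that jar is listed first ...
        System.out.println(classpath.resolve("org.example.json.Parser").orElse("not found"));
        // ... while Streamer exists only in 2.0, so both versions end up mixed.
        System.out.println(classpath.resolve("org.example.json.Streamer").orElse("not found"));
    }
}
```

In this model, a Streamer class written against the 2.0 Parser would be paired with the 1.0 Parser at run-time, which is the kind of mismatch that surfaces only when the incompatible code path is executed.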
Partial modularization of Java was introduced in Java 8 with a concept called compact profiles. By definition, a compact profile is a subset of the Java SE Platform API. With compact profiles, the Java SE 8 platform was modularized in a coarse-grained way.
There are three compact profiles: compact1, compact2 and compact3. Each compact profile contains a subgroup of packages, and each profile is a superset of the preceding one. compact1 is the smallest of the three, followed by compact2 and compact3. compact2 contains all packages of compact1 plus a few packages specific to itself. Similarly, compact3 contains all of compact2 plus packages specific to compact3.
In other words, each profile is a superset of the profile with the lower number. The following figure illustrates this.
Fig 1: Representation of Compact Profiles in Java SE 8.
The Java SE 8 documentation lists which packages fall into each profile.
Fig 2: Categorization of compact profiles
Using a specific compact profile instead of the entire Java SE API reduces the memory footprint and therefore suits resource-constrained environments better. However, if even one package outside the targeted profile is needed, there is no alternative but to move up to the next larger profile (or the full Java SE API) in its entirety, which adds unnecessary content to the run-time. Java 8 introduced tools such as jdeps, which helps identify the minimal compact profile a program needs, and jrecreate (part of Java SE Embedded), which builds a custom run-time for a chosen profile. For compilation, javac comes with the -profile option, which specifies the profile the Java application is compiled against.
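The all-or-nothing nature of profiles can be sketched with a small model. This is an illustration only, not how jdeps works: the package lists are a heavily abridged sample of the Java SE 8 profile contents, and the class name ProfilePicker is an assumption.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model: which compact profile is the smallest one covering a set of packages?
class ProfilePicker {

    // Packages each profile adds on top of the previous one (abridged sample).
    private static final Map<String, Set<String>> ADDED_BY_PROFILE = new LinkedHashMap<>();
    static {
        ADDED_BY_PROFILE.put("compact1", Set.of("java.lang", "java.io", "java.util", "java.net"));
        ADDED_BY_PROFILE.put("compact2", Set.of("java.sql", "java.rmi", "javax.xml.parsers"));
        ADDED_BY_PROFILE.put("compact3", Set.of("java.lang.management", "javax.naming", "java.util.prefs"));
    }

    /** Returns the smallest profile containing every used package. */
    static String smallestProfileFor(Set<String> usedPackages) {
        Set<String> unresolved = new HashSet<>(usedPackages);
        String smallest = "compact1";
        for (Map.Entry<String, Set<String>> profile : ADDED_BY_PROFILE.entrySet()) {
            // removeAll returns true if this profile supplied at least one used package.
            if (unresolved.removeAll(profile.getValue())) {
                smallest = profile.getKey();
            }
        }
        // Anything still unresolved lies outside all three profiles.
        return unresolved.isEmpty() ? smallest : "full Java SE";
    }

    public static void main(String[] args) {
        System.out.println(smallestProfileFor(Set.of("java.lang", "java.util")));
        // A single package from compact2 pulls in the whole of compact2.
        System.out.println(smallestProfileFor(Set.of("java.lang", "java.util", "java.sql")));
        // A single Swing dependency forces the full Java SE API.
        System.out.println(smallestProfileFor(Set.of("java.lang", "javax.swing")));
    }
}
```

One extra package is enough to jump a whole profile, which is the coarse granularity that the Java 9 module system later addressed.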