Java through the Eyes of a C++ Programmer


Suchitra Gupta
Jeff Hartkopf
Suresh Ramaswamy

This article was originally published in 1997 in Java Developer's Journal in two parts, Volume 2, Issues 1 and 2, by Sys-Con Publications.


Introduction

Much of the excitement about Java comes from C++ programmers looking for a better way to develop software. In this article we take an in-depth look at Java from the perspective of a C++ expert who wishes to quickly come up to speed and transfer hard-gained knowledge of C++ to the new language.

Our purpose is not to disparage any particular programming language. Contrary to what some would have you believe, Java is not the answer to all the world's problems, and every language has its place. However, Java provides an elegant solution to many problems in computing, and we feel that an objective comparison of C++ and Java will both assist C++ experts in more quickly adapting to a new programming paradigm, and identify the strengths and weaknesses of each language.

This article examines various object-oriented concepts as well as other programming language concepts. Each concept has its own section where we first present the concept in commonly-used terms. Then, given this context, we compare how ANSI C++ and Java 2 implement or can be used to implement the concepts. We briefly define the terminology as we use it, and provide a complete glossary at the end.

Our diagrams are intended to be language-neutral, and use a variant of the Object Modeling Technique (OMT) notation as summarized in Figure 1. Each class is represented by a box, with solid lines between the boxes denoting inheritance. The class name appears at the top of the box in bold, and begins with an uppercase letter. Operations and attributes appear below, their names beginning with lowercase letters. Subclasses show only newly added and overridden operations. The names of abstract classes appear in italics, and the names of pure abstract classes appear in small capitals. The names of abstract operations appear in italics as well. Pseudocode is given for some operations in a dog-eared box connected by a dashed line.

Figure 1: Notation Summary

Classes

A class is a template which defines an object's interface and implementation, and every object has exactly one class; thus, an object is sometimes referred to as an instance of a class. A class is a construct to introduce user-defined types. It is the basic unit of data encapsulation and the basic means of supporting data abstraction. Access control for operations and data is also specified in a class.

An abstract class is a class whose primary purpose is to define a common interface for its subclasses. Typically an abstract class defers the implementation of at least one of its operations to its subclasses. An abstract class may not be instantiated. Instantiating a concrete class, a class that implements all its operations, yields an object.

An interface from an object-oriented perspective is a set of operation signatures. It is possible for an object to support more than one interface; thus, an object may be of more than one type. A type is a name used to denote a particular interface. Interfaces are discussed further in the Inheritance section.

Java and C++ share the same meaning of a class in the above sense. Java offers an additional construct called interface that primarily specifies a public interface. It contains principally public operation signatures. It can optionally contain public constants that are not instance specific. An interface is an alternate way in Java to introduce user-defined types. It is a pure abstract class that is discussed further in the Inheritance and Multiple Inheritance sections.

In C++, non-nested classes are visible in namespace scope whereas in Java, non-nested classes are visible in package scope. Constructors can be specified in Java à la C++. There is no need for destructors in Java because it does automatic garbage collection. See the section Object and Class Initialization and Finalization for more information.

Java does not support enumerated types. Java also does not support the equivalent of the C++ struct which is an artifact preserved in C++ for compatibility with C. Java makes a clean break and does not carry with it such encumbrances from the past while it attempts to build on the best features from C++, Smalltalk, and other object-oriented languages.


Inheritance

Inheritance is a key object-oriented concept which allows for an interface, implementation, or class to be defined in terms of one or more other interfaces, implementations, or classes, respectively. Inheritance is a mechanism for factoring out common aspects of diverse types in a supertype. The types closer to the root of an inheritance hierarchy capture general behaviors and states. The more derived types exhibit more specialized behavior.

Interface inheritance defines one interface in terms of one or more existing interfaces. In interface inheritance, the subtype is said to inherit its interface from its supertype(s). An object that supports the interface of the subtype can be used in place of an object that supports the interface of the supertype. Implementation inheritance defines the implementation of one class in terms of the implementation of another, and is a mechanism for code and data reuse. Class inheritance combines interface and implementation inheritance; it defines one class in terms of one or more existing classes. In class inheritance, the subclass is said to inherit its interface and implementation from its superclass(es). Inheritance from a single parent is called single inheritance, while inheritance from more than one parent is called multiple inheritance.

Both Java and C++ support the notion of inheritance. A potential source of confusion is that often the term inheritance is used to mean class inheritance specifically. This came about because class inheritance is the only kind of inheritance C++ supports. In this article, we are careful to use the term inheritance in its generic sense. We say that both C++ and Java support multiple inheritance. C++ supports multiple class inheritance only, while Java does not support multiple class inheritance, but does enable multiple interface inheritance through the use of its interface construct.

Because an object may inherit more than one interface through interface inheritance, its total interface may actually be made up of more than one interface, each of which has a name and is thus a type. Therefore, as mentioned earlier, an object may be of more than one type. The ability to transparently substitute one object for another, because they have an interface in common (hence having a type in common), is another key object-oriented concept known as polymorphism.

Java supports interface inheritance and class inheritance. For interface inheritance, a class may implement zero or more interfaces via the implements keyword. Additionally, an interface itself may extend zero or more interfaces. For class inheritance, a class may explicitly extend a single class via the extends keyword. If no extends is specified, it implicitly extends Object. Thus, every class implements, and every interface extends, zero or more interfaces, and every class except Object extends exactly one other class. This allows single class inheritance but multiple interface inheritance.

Interface and Implementation Inheritance

Table 1 summarizes the inheritance mechanisms in C++ and Java, and the effects of those mechanisms on the subtype. In particular, for each inheritance mechanism, the table shows whether the interface is inherited and whether the implementation is inherited by the subtype.

Inheritance mechanism    Interface inherited?    Implementation inherited?
C++ public               Yes                     Yes
C++ private              No                      Yes
Java extends             Yes                     Yes (if class extends class)
Java implements          Yes                     No
Table 1: Comparison of C++ and Java Inheritance

The equivalent of C++ private inheritance is not supported in Java. Instead, to reuse the implementation of another type in Java without subtyping, one can use composition to hold an object of the type whose implementation is to be used. Note that if a Java interface extends another interface, there is no implementation to inherit.
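The composition approach can be sketched in Java roughly as follows. IntList and IntStack are hypothetical names chosen for illustration and do not appear in the figures; the point is only that IntStack reuses an implementation without becoming a subtype of it.

```java
// Hypothetical helper whose implementation we want to reuse.
class IntList {
    private int[] items = new int[4];
    private int size = 0;

    void add(int value) {
        if (size == items.length) {
            // grow the backing array when it is full
            int[] bigger = new int[items.length * 2];
            System.arraycopy(items, 0, bigger, 0, size);
            items = bigger;
        }
        items[size++] = value;
    }

    int removeLast() {
        return items[--size];
    }

    int size() {
        return size;
    }
}

// IntStack reuses IntList's implementation by holding an instance,
// but it is not a subtype of IntList and exposes none of its interface.
class IntStack {
    private final IntList list = new IntList();

    void push(int value) { list.add(value); }
    int pop()            { return list.removeLast(); }
    boolean isEmpty()    { return list.size() == 0; }
}
```

Because IntStack is not an IntList, clients cannot pass it where an IntList is expected, which is the same restriction C++ private inheritance imposes.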

Abstract Classes

As mentioned, an abstract class primarily defines a common interface for its subclasses, and generally defers part of its implementation to them. An abstract class may not be instantiated. A pure abstract class is an abstract class with no implementation for any of its operations, and no state.

For a class to be abstract in Java, the class must be explicitly declared as such with the abstract keyword. The abstract class and interface constructs capture the notion of an abstract class and a pure abstract class, respectively. An abstract class in C++ is a class with at least one pure virtual function, while a pure abstract class is a class with only pure virtual functions and no state. In Java, the abstract modifier is implied for an interface and is optional in the declaration. The only kind of data that an interface can have is public, static, and final data. It is useful to use interfaces when designing class libraries so as to allow for maximum flexibility for future modifications and reuse, as explained below.

The preferred way to capture a set of operations that specify a protocol is via an interface in Java and via a class containing only pure virtual operations in C++. Using a C++ class requires more discipline, since the compiler does not require one to have only pure virtual operations in such a class. The Java interface, on the other hand, captures the designer's intent of having only abstract operations in the class. Any attempt to add data or operation implementation results in a compile error. As the application changes over a period of time, the interface construct ensures that one does not inadvertently forget the original intent of the designer and add a non-abstract method. This is a possibility in C++ where one may add a function that is not pure virtual to a pure abstract class.

A pure abstract class that implements a protocol provides good insulation between classes that depend on it since the compilation unit that defines a class need only import the pure abstract class itself and not any additional baggage in the form of implementation and dependencies thereof. Insulation reduces compilation dependencies. This reduced coupling results in faster compilation and improved testability of the design since modules are better separated. When a Java interface is used to implement a protocol, it ensures that the benefit obtained from insulation is preserved as the application evolves, since a Java interface will stay as such.

To view these concepts in an example, consider a type hierarchy as shown in Figure 2. Animal and TerrestrialAnimal are pure abstract classes that represent animals and terrestrial animals, respectively. Reptile is an abstract class subclassing from TerrestrialAnimal. Alligator is a concrete class that implements all the operations that it specifies and inherits. The most general and abstract behaviors are captured in interfaces and classes close to the root of the hierarchy and the more specific and concrete classes are towards the bottom of the hierarchy.

Figure 2: Animal Type Hierarchy

In Java, Animal and TerrestrialAnimal could be implemented using the abstract class construct or the interface construct. Using the interface construct has the advantage that if, at a later time, a separate class such as Cockroach that derives from Arthropod, an abstract class, was required to also support the operations of TerrestrialAnimal, it could subclass from TerrestrialAnimal since TerrestrialAnimal is an interface. If TerrestrialAnimal had been an abstract or concrete class, this subclassing would not be possible since Java allows one to subclass from exactly one class and zero or more interfaces. Thus having a pure abstract class implemented as a Java interface instead of as a Java abstract class leaves the door open for future inheritance from that interface. The Java interface construct thus enables reuse and extensibility.
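A minimal sketch of the Cockroach scenario, assuming the interface-based design. The operation names (name, legCount, hasExoskeleton) are invented for illustration, since the figure's operations are not reproduced here.

```java
// Operation names are hypothetical placeholders.
interface Animal {
    String name();
}

interface TerrestrialAnimal extends Animal {
    int legCount();
}

// Arthropod is an abstract class carrying some shared implementation.
abstract class Arthropod {
    boolean hasExoskeleton() { return true; }
}

// Cockroach already has a superclass, yet it can still support
// TerrestrialAnimal because that type is an interface.
class Cockroach extends Arthropod implements TerrestrialAnimal {
    public String name()  { return "cockroach"; }
    public int legCount() { return 6; }
}
```

Had TerrestrialAnimal been declared as an abstract class instead, the declaration of Cockroach would be rejected by the compiler, because a class may name only one class after extends.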

Casting

A subtype can be assigned to a supertype in both Java and C++. Both languages support type-safe downcasting, in which the supertype can be checked at run time before casting it to the subtype. The example below illustrates how this is done in both languages. We assume that Frog inherits from TerrestrialAnimal.

Java:

TerrestrialAnimal t = new Frog();
if (t instanceof Frog) {
    Frog f = (Frog) t;   // safe: the run-time type was checked first
}

C++:

TerrestrialAnimal *t = new Frog();
Frog *f = dynamic_cast<Frog *>(t);
if (!f) {
    // cast failed
}

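In Java, a ClassCastException is thrown if a cast is bad, whereas in C++ a dynamic_cast to a pointer type returns a null pointer (zero) when the cast fails. A failed dynamic_cast to a reference type throws std::bad_cast instead.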
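The Java behavior for a bad cast can be sketched as below. Toad and CastDemo are hypothetical names introduced only for this illustration, and the types are reduced to empty declarations.

```java
interface TerrestrialAnimal { }
class Frog implements TerrestrialAnimal { }
class Toad implements TerrestrialAnimal { }   // hypothetical sibling type

class CastDemo {
    // Returns true if the unchecked downcast to Frog succeeds.
    static boolean castsToFrog(TerrestrialAnimal t) {
        try {
            Frog f = (Frog) t;      // no instanceof check first
            return f != null;
        } catch (ClassCastException e) {
            return false;           // the run time rejected the bad cast
        }
    }
}
```

Checking with instanceof before casting avoids the exception altogether and is usually the clearer choice.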

The C++ standard specifies other type cast operators in addition to dynamic_cast. The Dynamic Behavior section contains more details on dynamic type information.


Multiple Inheritance

Multiple inheritance is inheritance from multiple sources. Multiple class inheritance is useful when a class needs to inherit data and behavior from two or more classes. Multiple interface inheritance occurs when a type inherits interface only from two or more types.

Multiple inheritance has acquired something of a bad reputation because of C++ support for only multiple class inheritance and not multiple interface-only inheritance. The notoriety of C++ multiple class inheritance stems from the complexity it introduces in the form of ambiguity in the access of derived data and methods from within a class; the possibility of having multiple copies of an ancestor class in a subclass if there is more than one way of reaching the ancestor class by traversing the inheritance hierarchy; and the additional performance penalty for virtual functions.

Java has made things simpler by disallowing multiple class inheritance. Instead it allows inheritance from exactly one class and zero or more interfaces. This leaves the door open for a class to implement functionality from multiple interfaces. Java thus supports multiple interface inheritance.

Figure 3: Animal Type Hierarchy Showing Multiple Inheritance

Figure 3 demonstrates two common ways of using multiple inheritance:

  • To provide mixin behavior by derivation from a mixin class, which is an abstract class whose role is to provide an optional interface or functionality to other classes. A subtype is predominantly of a certain base type from which it is derived, but also exhibits optional behavior that is specified in a mixin type. For example, Amphibian derives from Nocturnal which is a mixin type since it specifies behavior that is optional for an Amphibian. This means that the intrinsic characteristics of Amphibian are not affected by virtue of deriving from Nocturnal. Once Amphibian derives from Nocturnal, all instances of type Amphibian will support the Nocturnal behavior interface. Mixin types can be shared with other type hierarchies since they are not tightly coupled to a single type hierarchy.
  • To provide behavior essential to a type, parts of which are distributed across two or more base types. For example, Amphibian derives from TerrestrialAnimal and AquaticAnimal, since an Amphibian is both, equally.

If the example shown in Figure 3 were implemented in C++, Animal, TerrestrialAnimal, and AquaticAnimal would be abstract classes. If the Animal class were to declare attributes and were not inherited as a virtual base class, then these attributes would show up twice in Amphibian. If TerrestrialAnimal and AquaticAnimal each had an additional attribute with the same name, accessing them in Amphibian would result in ambiguity as well. Neither problem occurs in Java, because Animal, TerrestrialAnimal, and AquaticAnimal would be interfaces and having instance attributes in them would be prohibited by the compiler. By disallowing the presence of instance attributes in an interface and by preventing inheritance from more than one class, Java provides a simpler model while slightly limiting some possibilities that have proven to be more problematic than useful with C++. In Java, multiple interface inheritance is possible, but not multiple class inheritance.

If both TerrestrialAnimal and AquaticAnimal have an operation that is identical, then Amphibian or one of its derived types can implement such a method just once and satisfy both interfaces. Only if there is a semantic difference in the two interface operations will there be a problem. However this is not a new problem introduced in Java.
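As a small illustration of one method satisfying two interfaces, consider the sketch below. The breathe operation is a hypothetical placeholder, and the Nocturnal mixin from Figure 3 is omitted for brevity.

```java
interface TerrestrialAnimal { String breathe(); }   // hypothetical operation
interface AquaticAnimal     { String breathe(); }   // identical signature

class Amphibian implements TerrestrialAnimal, AquaticAnimal {
    // A single implementation satisfies both interfaces.
    public String breathe() { return "lungs and skin"; }
}
```

If the two interfaces intended different meanings for breathe, the compiler would still accept this class; the conflict would be purely semantic.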


Operations

Operations, also known as methods or member functions, are the behavioral part of an object. Every operation has a signature, which specifies its external interface, and an implementation. Instance operations are operations on objects, while class operations are operations on classes. An instance operation can be invoked only through a particular object, but a class operation can be invoked either through an object or its class. In fact, class operations may be invoked even when no instances of the class exist.

Operations may be overridden by subclasses, which means that a subclass can replace the implementation of its superclass's operation with its own. An abstract operation is one without any implementation; hence, subclasses of the class containing the abstract operation that wish to be instantiable must provide an implementation.

Operations may also be overloaded, which means that there may be more than one operation with the same name but different signatures in the same class.

Operation Overriding

An operation in C++ may be marked virtual to indicate that it can be overridden. Virtual operations are bound at run time. The actual operation implementation invoked is determined by the class that was instantiated to create the object being referenced. If a nonvirtual operation is redefined, there is no dynamic binding, so if an object of one type is assigned to a variable which has been declared with the type of one of its superclasses, the implementation of the superclass will be invoked.

Figure 4: Operation Overriding

For example, given the inheritance hierarchy shown in Figure 4, consider the following C++ code fragment:

A *a = new B();
a->foo();

If foo is a nonvirtual operation, A's implementation of foo is invoked. If foo is virtual, B's implementation is invoked. Notice that for nonvirtual operations, the implementation invoked is determined by the type of the identifier through which the object is referenced, and for virtual operations it is determined by the class from which the object is instantiated.

C++ provides this level of control primarily because virtual operations are less efficient, so savings may be realized by not using them where their dynamic behavior is not needed. All Java instance operations behave like C++ virtual operations.
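A Java counterpart of the fragment above might look like the following sketch. The method bodies and the BindingDemo class are assumptions made for illustration.

```java
class A {
    String foo() { return "A.foo"; }
}

class B extends A {
    String foo() { return "B.foo"; }   // overrides A.foo
}

class BindingDemo {
    static String call() {
        A a = new B();      // declared type A, instantiated class B
        return a.foo();     // dispatches on the instantiated class
    }
}
```

Here call() yields B's result, matching the behavior of a C++ virtual operation; Java offers no way to request the nonvirtual behavior for an instance operation.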

Both C++ and Java provide a way to invoke the implementation provided by a superclass which has been overridden. Java provides the super keyword for this purpose. In the above class hierarchy, A's implementation may be invoked in B as super.foo(). In C++, the operation can be qualified by its class name. Even if foo is virtual, A's implementation of foo may be invoked in B as A::foo(), because a qualified call bypasses dynamic binding. In fact, in C++ the superclass's operation may be invoked from anywhere, not just a subclass, as follows:

B *b = new B();
b->A::foo();

To achieve a similar effect in Java, an operation would need to be added to B, say superFoo, to invoke the superclass's operation using super. Although this is an intrusive technique that requires modification of the subclass, it does have the advantage of not requiring knowledge of the superclass name as in C++.
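The forwarding technique could be sketched as follows, with superFoo being the hypothetical operation named in the text and the method bodies invented for illustration.

```java
class A {
    String foo() { return "A.foo"; }
}

class B extends A {
    String foo() { return "B.foo"; }

    // Forwarding operation: exposes the superclass's implementation
    // to clients without naming the superclass.
    String superFoo() { return super.foo(); }
}
```

A client holding a B can then choose between b.foo() and b.superFoo(), although only because the author of B anticipated the need.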

Operator Overloading

One nice feature of C++ is operator overloading, whereby the standard operators (+, *, %, <<, and so forth) can be overloaded to have a different meaning when applied to a class than to a primitive data type. Although this feature can certainly be abused, it makes numerical programming very convenient. Consider the following C++ example; only the second technique shown would be legal in Java.

// instantiate n by n matrices A, B, and C
...

// technique 1: legal only in C++
C = A * B;      // matrix multiplication returning new matrix
C *= B;         // in-place matrix multiplication
C *= 2;         // in-place multiplication by a scalar

// technique 2: legal in C++ and Java
C = A.mul(B);   // matrix multiplication returning new matrix
C.mulBy(B);     // in-place matrix multiplication
C.mulBy(2);     // in-place multiplication by a scalar

The first technique is more natural and convenient, because it uses the standard arithmetic operators for performing three different types of matrix multiplication. Unfortunately, Java does not allow operator overloading, although the designers of Java used the + operator themselves to allow string concatenation.

Operation Chaining

Operation chaining provides the ability for a subclass to override an operation defined by its superclass, augmenting the superclass's implementation instead of totally replacing it. For example, consider a hierarchy of persistent classes, shown in Figure 5. Inheriting from mixin interface Persistent, class A provides its own implementation of the write operation. Class B overrides this operation and does backward chaining by calling its superclass's write. Class C does not define any new persistent attributes and relies on the write operation implemented by B. Class D overrides the operation and also does backward chaining by calling its superclass's write. Since C does not provide its own implementation of write, when D calls write, the implementation provided by B is invoked.

Figure 5: Backward Chaining of Operations

In Java, B and D can invoke their superclass's implementation as super.write(). In C++, the same can be accomplished in B and D as A::write() and C::write(), respectively; the latter resolves to the implementation that C inherits from B.
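The hierarchy of Figure 5 might be written in Java along these lines. The write signature, which appends to a StringBuilder, is an assumption made so that the chaining order is observable.

```java
interface Persistent {
    void write(StringBuilder out);   // hypothetical signature
}

class A implements Persistent {
    public void write(StringBuilder out) { out.append("A"); }
}

class B extends A {
    public void write(StringBuilder out) {
        super.write(out);            // chain back to A's implementation
        out.append("B");
    }
}

class C extends B { }                // no new persistent attributes

class D extends C {
    public void write(StringBuilder out) {
        super.write(out);            // C inherits write, so B's runs
        out.append("D");
    }
}
```

Writing a D therefore runs A's, then B's, then D's portion of the work, with C contributing nothing of its own.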

In both C++ and Java, constructors are backward chained. In Java, a constructor can explicitly invoke the constructor of its superclass by using super() as its first statement. If not explicitly invoked, Java implicitly inserts a call to the default constructor of the superclass. In C++, the constructor of the superclass can be explicitly specified in the initialization list; if not specified explicitly, the compiler implicitly invokes the default constructor. Because of multiple class inheritance and virtual derivation rules, determination of constructor chaining order is more involved in C++. In Java, these issues do not exist because multiple class inheritance is not supported.
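Constructor chaining in Java can be illustrated with the sketch below. Base, ImplicitChild, and ExplicitChild are hypothetical classes, and the log field exists only to record the order in which constructor bodies run.

```java
class Base {
    final StringBuilder log = new StringBuilder();

    Base() { log.append("Base()"); }
    Base(String tag) { log.append("Base(").append(tag).append(")"); }
}

class ImplicitChild extends Base {
    ImplicitChild() {
        // no explicit super(...): the no-argument Base() is called first
        log.append(" ImplicitChild()");
    }
}

class ExplicitChild extends Base {
    ExplicitChild() {
        super("x");                 // must be the first statement
        log.append(" ExplicitChild()");
    }
}
```

In both subclasses the superclass constructor body completes before the subclass constructor body begins.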

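In C++, destructors are backward chained automatically: after a subclass's destructor runs, the destructors of its superclasses are invoked. A destructor must be declared virtual, however, for the chain to begin at the subclass when an object is deleted through a pointer to its superclass. In Java, every finalize method must explicitly call its superclass's finalize for backward chaining to occur.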

Abstract Operations

An abstract or "pure virtual" operation may be created in C++ by appending "= 0" to the operation declaration. Java provides the keyword abstract for the same purpose. In C++, any class with one or more abstract operations is automatically an abstract class. In Java, such a class must itself be declared with the abstract keyword; omitting it is a compile error. In addition, Java allows a class with implementations for all its operations to be declared abstract, which prevents it from being instantiated.

Class Operations

Both C++ and Java provide class operations by using the keyword static. In both languages, when a subclass redefines a class operation defined by a parent class, no dynamic binding is performed. The behavior is the same as for C++ nonvirtual operations, as described in the Operation Overriding section.
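The absence of dynamic binding for class operations can be seen in a sketch like the following; Parent, Child, and StaticDemo are hypothetical names.

```java
class Parent {
    static String kind() { return "Parent"; }
}

class Child extends Parent {
    static String kind() { return "Child"; }   // hides, does not override
}

class StaticDemo {
    static String viaParentReference() {
        Parent p = new Child();
        return p.kind();    // bound at compile time to Parent.kind()
    }
}
```

Invoking a class operation through an object reference, as done here, is legal but tends to obscure this behavior; writing Parent.kind() or Child.kind() makes the choice explicit.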

Final Operations

Java provides the keyword final as a way for an instance operation to declare that it may not be overridden. This is enforced by the compiler. C++ provides no such mechanism. In addition to the added semantics final provides, the compiler may inline final methods for greater efficiency.

Parameter Passing

All parameters in C++, objects as well as primitive types, may be passed by value or reference.

One of the limitations of Java is that primitive types may only be passed by value, and objects may only be passed by reference (more precisely, the object reference itself is passed by value). To achieve the semantics of passing an object by value, the object must be cloned before manipulating it in the operation it is passed to. There is no simple way to achieve reference semantics for primitive types, which is somewhat annoying because there are times when it is useful to pass a primitive type by reference. The only workaround is to wrap the primitive in a mutable object, such as a one-element array or a small user-defined holder class, and pass that object instead of the primitive type. The standard wrapper classes, such as Integer for int, are immutable and so do not serve this purpose.
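These passing semantics can be sketched as below. IntHolder is a hypothetical user-defined wrapper; it is made mutable so that the called operation can change the wrapped value in a way the caller observes.

```java
// Hypothetical mutable wrapper around an int.
class IntHolder {
    int value;
    IntHolder(int value) { this.value = value; }
}

class PassingDemo {
    // The primitive is copied; the caller's variable is unaffected.
    static void incrementPrimitive(int n) { n++; }

    // The reference is copied, but it refers to the caller's object,
    // so the change to the object's state is visible to the caller.
    static void incrementHolder(IntHolder h) { h.value++; }

    // Clones the array first to get by-value semantics for an object.
    static int sumAfterZeroingFirst(int[] data) {
        int[] copy = data.clone();
        copy[0] = 0;
        int sum = 0;
        for (int i = 0; i < copy.length; i++) sum += copy[i];
        return sum;
    }
}
```

After incrementPrimitive the caller's int is unchanged, after incrementHolder the holder's value has grown by one, and sumAfterZeroingFirst leaves the caller's array intact.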

In C++, but not Java, a parameter may have a default value, so that all possible parameters need not be specified by the caller. This allows for flexible operations whose more esoteric parameters may be learned as needed. A similar effect may be achieved in Java (and C++) by overloading an operation with progressively more parameters.
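The overloading technique might look like this in Java. Greeter and its parameters are hypothetical, chosen only to show each shorter overload supplying a default and delegating to the next.

```java
class Greeter {
    // Fully general form.
    static String greet(String name, String greeting, String punctuation) {
        return greeting + ", " + name + punctuation;
    }

    // Overloads supply the "default" values for omitted parameters.
    static String greet(String name, String greeting) {
        return greet(name, greeting, "!");
    }

    static String greet(String name) {
        return greet(name, "Hello");
    }
}
```

The cost relative to C++ default values is one extra operation per omitted parameter, though each overload remains a single line.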

Return Types

In both Java and C++, the void keyword may be used for the return type to indicate that an operation has no return value. In C and in pre-standard C++, int is assumed if no return type is specified. ANSI C++, like Java, requires the return type to be specified.


Object and Class Initialization and Finalization

Objects and classes both need to be given the opportunity to initialize their contents before they are used, and to clean up before they are destroyed. For objects, this is typically done using constructors and destructors.

Because Java has garbage collection, destructors are not needed. However, there is still the need to allow an object to be cleaned up before being deallocated---for example, to release resources other than memory, such as file handles. To allow for this, Java classes may have a finalize operation. This operation is invoked automatically when the object is about to be garbage collected.

In Java, one constructor can call another constructor in the same class using this, passing the arguments for the desired constructor. As with chained constructors, this statement must occur first in the constructor. This is not possible in C++.
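A short sketch of one constructor calling another through this, using a hypothetical Point class:

```java
class Point {
    final int x;
    final int y;

    Point() {
        this(0, 0);          // must be the first statement
    }

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```

All initialization logic lives in the two-argument constructor, so the no-argument form cannot drift out of step with it.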

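C++ offers more limited class initialization capability: each class attribute is initialized by a single initializer expression in its definition, which generally must appear outside the class definition, and there is no construct for running arbitrary initialization statements on behalf of a class.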

Java allows more general class initialization. Just as Java has constructors for initializing instance variables when an object is created, it also has static initializers for initializing class variables. Static initializers are class operations with no name, parameters, or return value that are invoked when a class is loaded. Static initializers can be very useful when any form of computation is needed for initialization, such as to initialize an array in an algorithmic fashion. This is shown in the following example, which initializes a large matrix to the identity matrix.

public class Matrix {
    private static int[][] identityMatrix = new int[8][8];

    static {
        for (int i = 0; i < 8; i++)
            for (int j = 0; j < 8; j++)
                // initialize diagonal elements to 1
                identityMatrix[i][j] = (i == j ? 1 : 0);
    }

    // ...
}

Class attributes are initialized when the class is loaded. In C++, the order in which classes with static components are initialized is indeterminate, because all the classes are loaded in before main is called and the dependencies between the classes are not known at this time. Java loads classes on demand and thus the order in which classes are loaded is determinate---the order is based on the dependencies between the classes.
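The dependency-driven ordering in Java can be observed with a sketch such as the one below. InitLog, First, and Second are hypothetical classes; the log records when each static initializer runs, which happens the first time the class is actually used.

```java
class InitLog {
    static final StringBuilder log = new StringBuilder();
}

class First {
    static int value;
    static {
        InitLog.log.append("First;");
        value = 1;
    }
}

class Second {
    static int value;
    static {
        InitLog.log.append("Second;");
        // Using First here triggers First's initialization
        // if it has not already run.
        value = First.value + 1;
    }
}
```

If a program touches Second.value before anything else, Second's initializer starts first and First's initializer runs in the middle of it, at the point where First is needed.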


Memory Management

Every substantial program needs to manage dynamic memory. Memory management in C++ is done via pointers to raw memory. Java does not provide access to raw memory; it only provides symbolic handles to memory in the form of object references.

Object Allocation and Deallocation

C++ objects can be allocated on the stack or the heap. In C++, the programmer is aware of this distinction, and must handle each case differently. At run time, when an identifier of a particular class type comes into scope, the memory is allocated on the stack to store an object of that type. Objects allocated on the stack are automatically deallocated when they go out of scope. Objects are allocated on the heap by using the new operator, which returns a pointer to raw memory. Whenever an object is allocated, either automatically on the stack or on the heap by invoking new, the object's constructor is invoked. If an object is allocated on the heap, the programmer is responsible for deleting the object using delete. Whenever an object is deallocated, either automatically or with delete, the object's destructor is first invoked to allow internal cleanup and deletion of aggregate variables.

Java provides only one way of allocating objects. No storage is allocated for an object when an object reference is declared. The programmer is responsible for allocating storage using the new operator. Java's new operator returns an object reference as opposed to a raw memory pointer. However, since Java uses garbage collection, there is no need to explicitly deallocate objects---hence there is neither the need for destructors nor the delete operator.

Garbage Collection

Java provides automatic garbage collection, which eliminates the need to deallocate storage explicitly as in C++. The Java garbage collector runs as a separate low-priority thread collecting objects that are no longer needed.

Custom Memory Management

C++ allows the programmer to overload the new and delete operators to provide a custom memory management scheme. Java does not allow overloading of operators, including new. Although the ability to overload new might occasionally be useful, it is most useful when memory allocation primitives are provided in which case it can be used to implement a custom memory management scheme. Java does not provide such primitives for security reasons, so allowing overloading of new would complicate the language needlessly.


Memory References

Memory references can be put into two categories: data references and operation references. At run time, a data reference points to a block of data. An operation reference points to the executable code in memory that corresponds to an operation.

C++ provides access to raw memory in the form of pointers. Java does not provide pointers to raw memory---its model for allocating and referencing memory hides raw memory from the programmer. This model presents the raw memory to the programmer as objects and symbolic handles to them in the form of object references.

Data References

In C++, data references (pointers) are needed to allocate and deallocate memory off the heap, to access the data in this allocated memory, to manipulate arrays, and to implement data structures such as trees and linked lists.

The model for allocating and referencing memory provided by Java does not permit access to raw data memory---data may be accessed only through symbolic handles. Arrays are provided as objects, and array elements are accessed using indices. Consequently, there is no need for pointers for arrays.

In Java, the resolution of a symbolic handle to a memory address takes place at run time. This indirect memory access model allows the language to support compile-time type safety and the run-time verification of object access code. This model also eliminates the possibility of having corrupted memory which is a major problem in C++.

Operation References

C++ allows the programmer to take the address of a function and pass it around as a function pointer. The function can then be invoked by dereferencing the pointer. Function pointers are often used to implement callbacks. For example, consider a C++ declaration of a pointer to a function that generates random numbers between a minimum and maximum value:

double (*getRandomNum)(double min, double max);

Now consider a method for estimating the mean value for the random numbers generated by a random number generator passed as an argument:

double mean(double (*getRandomNum)(double, double), double min, double max);

Any random number generator having the prototype of getRandomNum can be passed to the function mean. The function mean can use this random number generator to generate a sequence of random numbers which can be used to estimate the value of the mean.

In Java, the same effect can be achieved by defining an interface as follows:

interface RandomNumberGenerator {
    double getRandomNum(double min, double max);
}

A class designed for generating random numbers can be implemented using this interface. For estimating the value of the mean, a class MathUtil containing a class operation mean can be defined:

class MathUtil {
    public static double mean(RandomNumberGenerator randGen,
        double min, double max);
}

The class operation mean accepts an instance of any class that implements interface RandomNumberGenerator. For example, if a class UniformRand for generating random numbers with uniform distribution implements the interface RandomNumberGenerator, the operation mean can be invoked as follows:

double estimatedMean = MathUtil.mean(new UniformRand(), 5.0, 10.0);
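
The fragments above can be assembled into a complete program. The following is a minimal sketch under stated assumptions: the seeded constructor of UniformRand, the SAMPLES constant, and the averaging loop inside mean are our illustrative choices, since the listings above give only the signatures.

```java
import java.util.Random;

// Callback interface from the text: any object of this type can be passed to mean.
interface RandomNumberGenerator {
    double getRandomNum(double min, double max);
}

// Hypothetical uniform generator; the seeded constructor is an assumption for repeatability.
class UniformRand implements RandomNumberGenerator {
    private final Random random;

    UniformRand() {
        random = new Random();
    }

    UniformRand(long seed) {
        random = new Random(seed);
    }

    public double getRandomNum(double min, double max) {
        // nextDouble() lies in [0, 1), so the result lies in [min, max).
        return min + (max - min) * random.nextDouble();
    }
}

class MathUtil {
    // Assumed sample count; the article does not specify one.
    private static final int SAMPLES = 10000;

    // Estimates the mean by averaging SAMPLES values drawn through the callback.
    public static double mean(RandomNumberGenerator randGen,
            double min, double max) {
        double sum = 0.0;
        for (int i = 0; i < SAMPLES; i++) {
            sum += randGen.getRandomNum(min, max);
        }
        return sum / SAMPLES;
    }
}
```

Because mean depends only on the interface, any other distribution can be supplied without changing MathUtil, which is the role the function pointer plays in the C++ version.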

It is possible to take a similar approach for implementing callbacks in C++. However, a large number of C++ class libraries rely on function pointers because they result in fewer classes and can be more efficient. Likewise, it is possible to get an indirect reference to a method using Java's reflection API, through which the method may be invoked. However, this is more expensive, and is neither object-oriented nor type safe.


Errors and Exceptions

Exceptions allow change in the normal flow of control in a program when some abnormal event, usually an error, occurs. They are useful for writing safe programs by providing a well-defined alternate flow of control to deal with errors.

The exception handling mechanism provided by Java is similar to that in C++. When a program does something illegal or an abnormal run-time condition occurs, the runtime can raise an exception. The program can also raise an exception explicitly using the throw statement. Both languages provide a mechanism for detecting and handling exceptions.

Resource Deallocation

In C++, when an exception is thrown, the runtime searches for a handler up through the calling chain. The call stack is unwound to the stack frame of the function or block that contains the handler. In the process, destructors for objects that are allocated on the stack are called automatically as they are removed from the stack. However, this is not the case with objects dynamically allocated on the heap. The burden of keeping track of dynamic memory and deallocating it is on the programmer. Generally, this task is accomplished by using smart pointers and proxies such as those provided by the ANSI C++ Standard Library. Since Java does automatic garbage collection, this problem does not arise for memory.

Threaded Exceptions

Multithreading is provided as part of the Java language, and the Java exception handling mechanism is thread-aware. When an exception is thrown in a thread, the runtime searches for a handler up through the calling chain for that thread. If no handler is found, the uncaughtException method of class ThreadGroup for that thread group is called and the thread terminates. When an exception is thrown, locks held by the abruptly terminating synchronized blocks and operations are released. Threads are not part of the C++ language, and thus multithreaded exceptions are not part of its specification.
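
The fallback to the thread group can be observed directly. The sketch below is illustrative only: RecordingGroup and UncaughtDemo are hypothetical names, and the example assumes that no per-thread uncaught exception handler has been installed, so that the group's uncaughtException method is the one consulted.

```java
// Hypothetical thread group that records the last uncaught exception of its threads.
class RecordingGroup extends ThreadGroup {
    private volatile Throwable lastUncaught;

    RecordingGroup(String name) {
        super(name);
    }

    // Invoked by the runtime when a thread in this group ends with an uncaught exception.
    @Override
    public void uncaughtException(Thread t, Throwable e) {
        lastUncaught = e;
    }

    Throwable getLastUncaught() {
        return lastUncaught;
    }
}

class UncaughtDemo {
    // Starts a thread that throws, waits for it to end, and returns what the group saw.
    static Throwable runFailingThread() {
        RecordingGroup group = new RecordingGroup("workers");
        Thread worker = new Thread(group, new Runnable() {
            public void run() {
                // No handler exists anywhere in this thread's calling chain.
                throw new IllegalStateException("no handler in this thread");
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return group.getLastUncaught();
    }
}
```

Only the failing thread terminates; the thread that called runFailingThread continues normally and can inspect the recorded exception.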

Exception Hierarchy

C++ provides an exception class hierarchy of default exception types, but in fact any variable, of a class or primitive type, may be thrown as an exception. Java also provides an exception class hierarchy, but requires that every exception be an instance of Throwable, the root class in this hierarchy, or one of its subclasses.

In C++ if an exception is not caught at a given level in the calling chain, it is passed on to the caller. If the exception is not caught at any level, the program will terminate. Operations may optionally declare exceptions they can be expected to throw. The same is true of Java, except that Java has two types of exceptions: checked and unchecked. Unchecked exceptions behave like C++ exceptions. However, when an operation raises a checked exception, it must declare it as part of its signature, or a compile-time error will occur. Such a declaration makes it known to the caller that it must either handle the exception or declare it in its own signature, thereby propagating it to its caller. Using checked exceptions is recommended because the compiler ensures that the programmer must consider each exception raised by every operation that is called and make a conscious decision as to how to handle it. C++ has no such compile-time checking.

Java provides another orthogonal classification for exceptions, recoverable and irrecoverable. Irrecoverable exceptions are used to signal an abnormal situation from which the application cannot recover, such as a virtual machine error. Even though an application can catch these exceptions, it is not recommended. Recoverable exceptions are used in all other situations.

The Java exception hierarchy is shown in Figure 6. Exception classes predefined in Java are shown as solid boxes, while the dotted boxes show the checked/unchecked and recoverable/irrecoverable status of exceptions derived from various points in the hierarchy. The names of the provided classes are somewhat misleading. Although all are really exceptions, the Exception class is actually the root of the recoverable exception hierarchy, while the Error class is the root of the irrecoverable exception hierarchy. Throwable is the root class of the entire hierarchy. As shown, subclasses of Exception are checked, while subclasses of RuntimeException and Error are unchecked.

Figure 6: Exception Class Hierarchy
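
The distinction between checked and unchecked exceptions is easiest to see in code. In this sketch the class names ConfigMissingException and ExceptionKinds, and the methods they contain, are hypothetical and exist only for illustration.

```java
// Hypothetical checked exception: it extends Exception, so callers must handle or declare it.
class ConfigMissingException extends Exception {
    ConfigMissingException(String message) {
        super(message);
    }
}

class ExceptionKinds {
    // Checked: omitting the throws clause here would be a compile-time error.
    static String readSetting(String key) throws ConfigMissingException {
        if (key == null || key.isEmpty()) {
            throw new ConfigMissingException("no key given");
        }
        return "value-of-" + key;
    }

    // Unchecked: IllegalArgumentException extends RuntimeException, so no declaration is needed.
    static int parsePositive(String text) {
        int n = Integer.parseInt(text); // may throw NumberFormatException, also unchecked
        if (n <= 0) {
            throw new IllegalArgumentException("not positive: " + text);
        }
        return n;
    }

    // The caller makes a conscious decision: the checked exception is handled here.
    static String settingOrDefault(String key, String fallback) {
        try {
            return readSetting(key);
        } catch (ConfigMissingException e) {
            return fallback;
        }
    }
}
```

If settingOrDefault neither caught ConfigMissingException nor declared it, the compiler would reject the call to readSetting, whereas parsePositive may be called without any such ceremony.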

Parameterized Types

C++ provides templates to allow definition of parameterized types. Templates provide a way of specifying how a family of related classes can be made. Templates are most useful in creating type-safe collection classes. A class template for a family of collection classes specifies how a collection class can be made for a particular element type. A template is written in terms of one or more type parameters. For example, for collection classes, a type parameter may refer to the type of elements. A specific collection class is created from the template when the element type is specified at the time of use. For different element types, different collection classes belonging to the same family are generated at the time of use. This can result in bloated compiled code. This problem can be eliminated by designing intrusive template collection classes, so-called because the classes of elements in the collection must inherit from a common superclass. This restriction makes it difficult to use a collection class to store instances of a preexisting class.

Although Java does not support parameterized types, it is possible to implement type-safe collection classes. For example, the constructor of a type-safe collection class can take a prototype element, an object of the same type as elements that will be inserted. Now, before allowing insertion, the collection can check if the element to be inserted and the prototype are of the same type, thus enabling it to enforce type safety. If the element is not of the desired type, it can reject the element by throwing an exception. This approach is nonintrusive, and this collection class can be used to store objects of any specified type by providing an appropriate prototype at the time of instantiation. The class can be implemented so that it does not do type checking if no prototype is provided when the constructor is invoked, thus allowing the use of the collection class for implementing heterogeneous as well as homogeneous collections. It is possible to accomplish the same in C++ using typeid.
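
The prototype technique might be sketched as follows. PrototypeCollection is a hypothetical class, not part of any library, and two details are our assumptions: "the same type" is interpreted as exact class equality, and a typed collection rejects null elements.

```java
// Hypothetical collection that enforces its element type at run time using a prototype.
class PrototypeCollection {
    private final Object prototype; // null disables type checking
    private Object[] elements = new Object[4];
    private int count = 0;

    // Homogeneous collection: elements must have the same class as the prototype.
    PrototypeCollection(Object prototype) {
        this.prototype = prototype;
    }

    // Heterogeneous collection: no prototype is given, so nothing is checked.
    PrototypeCollection() {
        this(null);
    }

    void add(Object element) {
        if (prototype != null
                && (element == null || element.getClass() != prototype.getClass())) {
            throw new IllegalArgumentException("element is not of the prototype's type");
        }
        if (count == elements.length) {
            // Grow the backing array when it is full.
            Object[] bigger = new Object[elements.length * 2];
            System.arraycopy(elements, 0, bigger, 0, count);
            elements = bigger;
        }
        elements[count++] = element;
    }

    Object get(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return elements[index];
    }

    int size() {
        return count;
    }
}
```

A collection created with new PrototypeCollection("") accepts only String instances, while one created with the no-argument constructor accepts anything. Note that the caller still has to cast the result of get, which is the compile-time checking that C++ templates provide and this approach does not.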

The approach for a type-safe Java collection described here provides significant run-time flexibility as compared to most C++ template classes. However, C++ template collection classes enable more compile-time checking. Additionally, C++ template classes can be parameterized by primitive types as well as classes, while only objects may be stored in a collection class using the approach described for Java.


Naming

C++ allows classes to be nested, and the name of a nested class is scoped by the class containing it. C++ also provides the namespace construct for providing another level of scoping for classes. If a namespace foo contains a class container which in turn contains a class contained, contained would be fully qualified as foo::container::contained. In this manner, it is possible to have more than one class with the same name, as long as they reside in different namespaces. To avoid ambiguity, the classes must be referred to by their fully qualified names. Namespaces help keep names simple and natural, while preventing naming conflicts.

Java also supports nested classes in the form of inner classes, which have a richer semantics than C++ nested classes. Additionally, Java has a package construct which provides functionality similar to C++'s namespace. In Java, a fully qualified class name consists of the package name followed by a dot followed by the class name: package.class. A class is made part of a package by declaring the package name in a package statement at the beginning of the source file. If no package statement appears, the class becomes part of the anonymous package. Classes in the anonymous package may not be imported into other files.

C++ namespaces may be nested, whereas Java packages may not. However, Java package names may have dots in them, which provides the illusion of nested packages. An example is the best way to explain this. Part of the standard library included with Java is the java.awt package, which contains the abstract windowing toolkit classes such as java.awt.Component. There is another logically related package called java.awt.image. This is a completely independent package, in the sense that the statement import java.awt.* will not import the java.awt.image package. However, it is named to imply that it is a "subpackage" of java.awt. This naming convention gives Java packages a hierarchical appearance comparable to that of nested C++ namespaces.

Note that while both C++ namespaces and Java packages are used for name scoping, Java packages are also used for access control. See the Access Control section for more information on this.


Access Control

An important feature that supports encapsulation in object-oriented languages is access control, whereby a programmer may control who is allowed to access the constructs defined. For example, attributes may be marked private to prevent them from being accessed anywhere except by operations in the defining class.

Access Control for Classes

In C++, a class declared at namespace scope is accessible anywhere, while a nested class is subject to the access specifier under which it is declared in its enclosing class. In Java, classes belong to a package and may be marked public to indicate they are accessible to anyone, or not marked to indicate they are accessible only within the package.

Access Control for Operations and Attributes

A C++ class can declare its operations or attributes public, protected, or private, making them accessible from any other class, only to subclasses, and only within the class, respectively. A class may declare another class its friend, allowing the latter to access all operations and attributes of the former.

Java provides a richer set of access control specifiers but does not support the friend mechanism. Access to operations or attributes of a class depends on the level of access granted by the class and on whether the class wishing access is in the same package as the class that declares them. Thus the package is used not only as a name scoping mechanism but also to provide access control.

Java provides four levels of access on operations and attributes: public, protected, default, and private. Default applies if none of the keywords are used. The semantics of the various access levels are summarized in Table 2, for both public and nonpublic classes.

Access Level | Semantics for Public Classes | Semantics for Nonpublic Classes
public | Allows access by any class in any package. | Allows access by any class in the same package.
protected | Allows access by subclasses in any package, and by any class in the same package. | Allows access by any class in the same package.
default | Allows access by any class in the same package. | Allows access by any class in the same package.
private | Allows access only from within the class. | Allows access only from within the class.
Table 2: Java Access Levels

Although Java does not support friends, if both classes can be put in the same package and use the default level, then a similar effect can be achieved. The difference is that the C++ friend construct provides unidirectional access control, while this solution in Java would allow bidirectional access. There is no way in Java for two classes in different packages to have access to one another without using public variables. While such access may occasionally be useful, this is not a serious limitation. In fact, it may be seen as encouraging good design: using friends violates encapsulation, and it makes sense to put all such classes as closely together as possible (such as in the same package).


Arrays

Java provides arrays as objects. It eliminates the need for pointer arithmetic.

For each primitive type, built-in class and interface, and user-defined class and interface, the language implicitly provides an array class. For example, int[] is provided as an array class to store elements of type int. If Frog is a subclass of Amphibian, then array Frog[] is a subclass of array Amphibian[]. Figure 7 illustrates these concepts.

Figure 7: Array Inheritance

In both C++ and Java, the first element of an array has index 0. Unlike C++, Java checks the index whenever an array element is accessed, and throws an ArrayIndexOutOfBoundsException if it is out of bounds.

Neither Java nor C++ provides a separate type for multidimensional arrays; instead, they may be implemented as arrays of arrays.
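
The three properties described in this section (array subclassing, bounds checking, and arrays of arrays) can be demonstrated in a few lines. The Amphibian and Frog classes come from the text; ArrayDemo and its methods are hypothetical helpers.

```java
class Amphibian { }

class Frog extends Amphibian { }

class ArrayDemo {
    // A Frog[] can be passed wherever an Amphibian[] is expected.
    static int countElements(Amphibian[] animals) {
        return animals.length;
    }

    // Returns true if reading values[index] triggers Java's run-time bounds check.
    static boolean isOutOfBounds(int[] values, int index) {
        try {
            int unused = values[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // A "two-dimensional" array is an array whose elements are themselves int[] arrays.
    static int[][] triangle(int rows) {
        int[][] result = new int[rows][];
        for (int r = 0; r < rows; r++) {
            result[r] = new int[r + 1]; // rows may have different lengths
        }
        return result;
    }
}
```

Because each row is an independent array object, the rows returned by triangle need not have the same length, something a true rectangular multidimensional type would not permit.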


Dynamic Behavior

When a program is executed, several things happen behind the scenes in the runtime to make a program execute correctly. A language compiler ensures that the machine code generated results in the right behavior to support the model that a language supports.

In C++, dynamic binding on invocation of virtual functions to support polymorphism is an example of dynamic behavior. Java supports a more dynamic model than C++ which opens up some novel possibilities. Understanding the dynamic nature of Java is crucial to truly appreciate some of its behavioral nuances. Armed with this understanding, useful tradeoffs can be made that enable one to trade some performance for increased flexibility where it is appropriate in an application.

Interpreted versus Compiled

Java source code is compiled into byte code that contains instructions for a virtual machine called the Java Virtual Machine. Byte compiled code is then interpreted at program execution time. The fact that Java is interpreted makes it somewhat slower than comparable compiled C++ code. This performance problem may be mitigated by just-in-time Java compilers that promise to soon become ubiquitous.

One implication of the fact that Java is interpreted and the fact that linking is only done at run time is that recompilation of subclasses because of changes made to superclasses is not necessary in most cases. This common C++ problem where subclasses need to be recompiled to take into account the change in size and the changed offsets to access instance variables does not exist in Java. Elimination of this "fragile superclass" problem results in faster development cycles when coding in Java, since the impact on recompilation when a change is made is typically less than in C++. The combination of architecture neutral byte code and interpretation of the same contributes to the portable nature of Java.

Run-Time Type Information

The Java reflection API enables one to discover information about the type of an object, as well as its operations and attributes. Java also provides a way to check if an object is of a certain type and deduce the entire type hierarchy. C++ support for run-time type information has been progressively enhanced to support safe type casting and type comparison. However, type information is not as comprehensive as in Java. It is possible to check if an object in C++ is of a certain type specified at compile time, but one cannot reflect on an object to determine its operations and attributes.
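
As a brief illustration of what reflection makes available, the following sketch deduces an object's superclass chain and asks whether it offers a public operation with a given name. TypeInspector is a hypothetical helper; the reflection calls it uses (getClass, getSuperclass, getMethods) are part of the standard library.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class TypeInspector {
    // Walks up the superclass chain, for example Integer, then Number, then Object.
    static List<String> superclassChain(Object obj) {
        List<String> names = new ArrayList<String>();
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            names.add(c.getName());
        }
        return names;
    }

    // Reports whether the object's class declares or inherits a public method with this name.
    static boolean hasPublicMethod(Object obj, String methodName) {
        for (Method m : obj.getClass().getMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }
}
```

Neither query requires the type to be known at compile time, which is the capability that C++ run-time type information lacks.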

Dynamic Loading

Many of the dynamic behavioral features of Java can be traced to how the language does symbol resolution, so this process is examined in some detail.

In Java, before a class or interface can be actively used, it has to be loaded, linked, verified, prepared and initialized by the Java Virtual Machine. Verification ensures that the binary code for a class is correct. During preparation, static fields are created and initialized to default values. Static initializers are executed during the initialization phase that follows the preparation.

The first step in loading a class or interface happens somewhat differently in Java than in C++, where all symbols need to be resolved at link time. The C++ compiler ensures that for each symbol that is referenced in the program, there exists a definition. If the definition, such as a function implementation, exists in a shared library, it does not load the definition into the executable at link time. However, it ensures that there exists a shared library that will provide this definition at run time. In Java, it is more appropriate to think of a program as incrementally growing as needed, instead of the conventional notion of a monolithic executable. A range of possibilities exists with dynamic loading. On one end is shared library-like behavior: when a class reference is encountered, the Java runtime looks for the class in a set of directories specified in the CLASSPATH environment variable. This is programmer transparent dynamic loading just like shared library symbol loading in C++. Furthermore, classes in Java can be loaded by programmer request, both locally and remotely. C++ does not provide this facility as part of the language.

The fact that a symbol can be loaded on demand can be used to advantage in Java, for example, when a stream of bytes is being assembled into an object. The stream of bytes may be a persistent representation of an object in a database. When the stream is parsed to reconstruct the objects in memory, a symbol encountered in the stream may require a class that is not loaded. Such a class can be loaded at the programmer's request, and the object corresponding to the stream can be constructed based on the loaded definition.
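
Loading at the programmer's request might look like the sketch below, in which the class name arrives as a string, as it would when read from a stream. DynamicLoader is a hypothetical helper, and the example assumes the named class has an accessible no-argument constructor.

```java
class DynamicLoader {
    // Loads the named class on request and creates an instance through its
    // no-argument constructor. Returns null if the class cannot be found
    // or cannot be instantiated that way.
    static Object instantiate(String className) {
        try {
            Class<?> loaded = Class.forName(className);
            return loaded.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }
}
```

For example, instantiate("java.util.ArrayList") yields a new list even though ArrayList is never mentioned by type in the calling code, so the class need not have been known when the caller was compiled.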

Dynamic Specification of Shared Libraries

Like C++, Java enables loading of shared libraries dynamically. However, the name and location of a shared library need not be specified at compile time. In C++, the shared library name is bound at compile time.

As an example, consider loading a shared library in Java to access a native method. Java provides a mechanism to access native code on a platform. A method that is implemented in C is called a native method. A Java method that accesses such native code is declared native. The compiled native code is placed in a shared library which must be loaded before it is accessed. This can be done by loading the library in the static initializer of the class that declares the native methods. For example:

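class SignalProcessor {
    static {
        // Load the native library named by the MyTransformLib system property.
        Runtime.getRuntime().loadLibrary(System.getProperty("MyTransformLib"));
    }
    // Return type and parameters depend on the native implementation; void is assumed here.
    public native void FourierTransform();
}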

The name of the library passed to the loadLibrary method of java.lang.Runtime is obtained from a system property, which is Java's counterpart to an environment variable.

Object Memory Layout

Since objects in Java are accessed via handles as discussed in the Memory References section, code that accesses data in an object is insulated from changes made to the class that defines the object. In C++, changing any data in a class would require recompiling the code that accesses the data, since memory offsets would have changed. In Java, this recompile is not required.

Meta Information

One feature of Java is that every class has a common parent at the root of the class inheritance hierarchy, the Object class. You may be confused by the name of this class; at first blush it may seem that, being a class, it would be more appropriately named Class instead of Object. However, upon deeper inspection, the reasoning behind the name becomes clear.

In general, a class is named for the type of object that it defines. For example, a class which defines a stack might be named Stack. The Stack class itself is not a stack; instances of the Stack class are stacks. Note that instances of subtypes of Stack are also stacks. By analogy, the Object class itself is not an object (it is a class); instances of the Object class are objects. Object is a concrete class and can be instantiated directly, although in practice most objects are instances of its subclasses. Since all classes are subclasses of Object, all instances of all classes are objects.

Java actually does provide a class named Class. This class, however, truly represents classes rather than objects. Every class and interface is represented at run time by an instance of Class. A class like Class, whose instances are themselves classes or maintain information about classes, is known as a metaclass.
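
The relationship between objects, classes, and Class can be probed with a few calls. MetaDemo is a hypothetical helper used only for illustration.

```java
class MetaDemo {
    // Every object can report the Class instance that represents its class.
    static String classNameOf(Object obj) {
        return obj.getClass().getName();
    }

    // The Class instance for String is itself an object, and its class is java.lang.Class.
    static String classOfClass() {
        Class<?> stringClass = String.class;
        return stringClass.getClass().getName();
    }

    // Interfaces are represented at run time by Class instances as well.
    static boolean isInterface(Class<?> type) {
        return type.isInterface();
    }
}
```

Calling classNameOf on a string yields java.lang.String, while classOfClass yields java.lang.Class, showing that the representation of a class is an ordinary object that can be passed around and queried.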


Security

C++ does not support security as part of the language, while security is one of the defining properties of Java. Java has no language features which allow undefined behavior, such as pointers and incorrect memory accesses, making Java programs very robust. Malicious or unwanted behavior, such as accessing memory not allocated to the program, is prevented. In addition, the Java runtime verifies byte code before executing it, so that code which was modified or corrupted after it was generated cannot violate the language's safety rules.


Concurrency

Unlike C++, Java provides language-level support for concurrent programming in the form of threads. Synchronization is supported via the synchronized keyword and the wait and notify methods on the Object class.
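
A small sketch shows these facilities working together. Mailbox and MailboxDemo are hypothetical classes; the example uses notifyAll, the sibling of notify that wakes every waiting thread, because it is the simpler choice when producers and consumers wait on the same object.

```java
// Hypothetical one-slot mailbox guarded by its own monitor.
class Mailbox {
    private Object message; // null means the slot is empty

    // Blocks while the slot is full, then stores the message and wakes waiting threads.
    synchronized void put(Object newMessage) throws InterruptedException {
        while (message != null) {
            wait();
        }
        message = newMessage;
        notifyAll();
    }

    // Blocks while the slot is empty, then removes and returns the message.
    synchronized Object take() throws InterruptedException {
        while (message == null) {
            wait();
        }
        Object result = message;
        message = null;
        notifyAll();
        return result;
    }
}

class MailboxDemo {
    // Sends the integers 1..count from a producer thread and sums them in the calling thread.
    static int sumThroughMailbox(final int count) {
        final Mailbox box = new Mailbox();
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; i <= count; i++) {
                        box.put(Integer.valueOf(i));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < count; i++) {
                sum += ((Integer) box.take()).intValue();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }
}
```

The wait calls sit inside while loops so that a thread rechecks its condition after waking, which guards against being woken when the slot is still in the wrong state.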


Portability

Although C and C++ make claims to portability, they support portability only at the source code level, and then only if the programmer takes special pains to write portable code. In contrast, a defining feature of Java is that it guarantees portability, and at the byte code level.

Specifically, Java has four features that guarantee portability:

  • All primitive data types are defined in an architecture neutral manner, with their representations and operations defined consistently regardless of the platform.
  • Compilation of Java code generates Java virtual machine instructions, which are mapped to instructions of a real microprocessor by an interpreter (or just-in-time compiler) at execution time. This allows the byte code to be executed on any machine that has a Java interpreter.
  • Java provides a set of GUI components which are mapped at run time to the native windowing environment, to provide for portability of the user interface.
  • The java.lang.System class provides platform-independent access to system functions.
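
The first of these features can be checked from within a program. In this sketch, PrimitiveFacts is a hypothetical helper; the SIZE constants it reads are defined by the standard wrapper classes.

```java
class PrimitiveFacts {
    // Widths in bits of byte, short, char, int, and long, fixed by the language
    // rather than by the host platform.
    static int[] widths() {
        return new int[] {
            Byte.SIZE, Short.SIZE, Character.SIZE, Integer.SIZE, Long.SIZE
        };
    }

    // int arithmetic wraps around in two's complement on every platform.
    static int incrementMax() {
        int max = Integer.MAX_VALUE;
        return max + 1;
    }
}
```

In C++ the corresponding widths and the result of signed overflow are left to the implementation, which is one reason portable C++ code requires special care.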

Conclusion

Java builds upon the best concepts of C++ and many other programming languages and environments to provide an elegant computing environment that makes what the authors believe to be intelligent, pragmatic choices. Simultaneously supporting the object-oriented model in a clean manner, guaranteeing secure and robust software, and providing many distribution transparency features, we believe Java provides a very significant step forward in computing technology.

We hope that the comparison between Java and C++ provided in this paper will have piqued the interest of those not already immersed in Java, and will provide an easier transition for the programmer in moving from C++ to the exciting new phenomenon that is Java.


Appendix: Glossary of Object Terminology

This glossary provides a handy reference of terms used.

Objects

Figure 8: Ways to Partition an Object
object
The run-time representation of an entity, which packages both the state and behavior of that entity. Synonym: instance.
state
That part of an object which models the state of the represented entity. Composed of the object's attribute values.
behavior
That part of an object which models the behavior of the represented entity, possibly accessing its state. Composed of the object's operations.
attribute
A data property of an object. May be constant or variable. The set of all attribute values of an object collectively comprise the object's state. Synonym: data member (C++), field (Java).
constant attribute
An immutable attribute. Abbreviation: constant.
variable attribute
An attribute which may be modified. Abbreviation: variable.
operation
A procedure that an object provides as a way to interact with it, possibly accessing its attributes. Carried out when an object receives a corresponding request. The set of all operations of an object collectively constitute the object's behavior. An operation has both a signature and an implementation. Synonyms: method (Smalltalk, Java), member function (C++).
request
A message sent to an object in order to cause the object to perform a corresponding operation. Synonym: message.
member
An attribute or operation.
encapsulation
The property of an object whereby its state may be modified only via its operations.
signature
The external specification of an operation, including its name, parameters, return value, and exceptions raised. The set of all signatures of an object collectively comprise the object's interface.
implementation (1)
The code that implements an object's operation.
interface (1)
The external specification of an object's behavior, composed of the signatures of the object's operations.
implementation (2)
The state and operation implementations of an object.

Classes and Types

class
A template which defines an object's interface and implementation. An object is an instance of exactly one class, created by instantiating the class.
instance
An object, from the perspective of the class from which it was instantiated. Synonym: object.
instantiate
To bring into existence an object of a given class.
interface (2)
A set of operation signatures. An object may support one or more interfaces through interface inheritance.
type
A name assigned to an interface. In typed languages, declaring a class creates a new type, and an object may be of one or more types through interface inheritance.

Inheritance

inheritance
Definition of a new construct in terms of one or more existing constructs, such that the new construct implicitly has the characteristics of the existing constructs. In particular, an interface, implementation, or class may inherit from one or more other interfaces, implementations, or classes, respectively. Synonym: specialization.
interface inheritance
Definition of a new interface in terms of one or more existing interfaces, such that the new interface is a superset of the existing interfaces. An object that supports the new interface implicitly supports the inherited interfaces as well, and is said to be of the type of the new interface and all inherited interfaces, thus allowing the object to be used wherever an object of any of its types is expected. Synonym: subtyping.
implementation inheritance
Definition of a new implementation in terms of one or more existing implementations. Allows for reuse of code and attribute definitions.
class inheritance
Definition of a new class in terms of one or more existing classes. Combines interface inheritance and implementation inheritance. Synonym: subclassing.
single inheritance
Inheritance from a single existing construct.
multiple inheritance
Inheritance from more than one existing construct.
subtype
A type whose interface supports all the operations supported by another type, its supertype. In interface inheritance, the subtype is said to inherit its interface from its supertype(s). Synonyms: derived type, descendant type.
supertype
A type whose interface is completely supported by another type, its subtype. In interface inheritance, the subtype is said to inherit its interface from its supertype(s). Synonyms: parent type, ancestor type, base type.
subclass
A class defined in terms of another class, its superclass. In class inheritance, the subclass is said to inherit its interface and implementation from its superclass(es). Synonyms: derived class, descendant class.
superclass
A class in terms of which another class is defined, its subclass. In class inheritance, the subclass is said to inherit its interface and implementation from its superclass(es). Synonyms: parent class, ancestor class, base class.
polymorphism
The ability to substitute one object for any other object with which it shares an interface (and hence a type). Allows different behaviors to be obtained transparently through a common interface.

Class and Instance Members

instance attribute
An attribute. Abbreviation: attribute. Synonyms: nonstatic data member (C++), nonstatic field (Java).
class attribute
A data property of a class, which exists regardless of whether any instance of the class exists. Synonyms: static data member (C++), static field (Java).
instance operation
An operation. Abbreviation: operation. Synonyms: instance method (Smalltalk), nonstatic member function (C++), nonstatic method (Java).
class operation
A procedure that a class provides as a way to interact with it, which may be invoked regardless of whether any instance of the class exists. Synonyms: class method (Smalltalk), static member function (C++), static method (Java).

Abstract Classes

abstract class
A class whose role is to define a common interface for its subclasses, and which may not itself be instantiated. An abstract class may defer the implementation of some of its operations to its subclasses.
abstract operation
An operation which has a signature but whose implementation is deferred to a subclass. A class with an abstract operation is implicitly abstract. Synonyms: abstract method (Smalltalk, Java), pure virtual member function (C++).
pure abstract class
An abstract class supplying no state or operation implementations. A pure abstract class solely defines an interface.
mixin class
An abstract class whose role is to provide an optional interface or functionality for its subclasses.
concrete class
A class which may be instantiated.

Miscellaneous

metaclass
A class whose instances are themselves classes.
overloading
Having more than one operation in the same scope with the same name but different signatures.
overriding
Replacing the implementation of an operation provided by a superclass in a subclass.
parameterized type
A type defined in terms of other types which are provided as parameters at the point of use. Synonym: template (C++).

Cross References

abstract method
See abstract operation.
ancestor class
See superclass.
ancestor type
See supertype.
base class
See superclass.
base type
See supertype.
class method
See class operation.
data member
See instance attribute.
derived class
See subclass.
derived type
See subtype.
descendant class
See subclass.
descendant type
See subtype.
generalization
The opposite of specialization. See specialization.
instance method
See instance operation.
member function
See operation.
message
See request.
method
See operation.
nonstatic attribute
See instance attribute.
nonstatic data member
See instance attribute.
nonstatic member function
See instance operation.
nonstatic method
See instance operation.
parent class
See superclass.
parent type
See supertype.
pure virtual member function
See abstract operation.
specialization
See inheritance.
static data member
See class attribute.
static attribute
See class attribute.
static member function
See class operation.
static method
See class operation.
static operation
See class operation.
subclassing
See class inheritance.
subtyping
See interface inheritance.

Appendix: Language Comparison Summary

Table 3 lists the object-oriented terms discussed in this article in the left-hand column. The other columns give the analogous constructs in Java and C++.

OO Term | Java | C++
attribute | field | data member
operation | method | member function
instance attribute | nonstatic field | nonstatic data member
class attribute | static field | static data member
instance operation | nonstatic method | nonstatic member function
class operation | static method | static member function
class | class | class
abstract class | class preceded by keyword abstract | class with at least one pure virtual member function
pure abstract class | interface, or abstract class with no implementation and no state | class with no implementation and no state
abstract operation | abstract method: method with no implementation, preceded by keyword abstract | pure virtual member function: member function with no implementation whose declaration is followed by "= 0"
implementation inheritance (only) | not available | private inheritance
interface inheritance (only) | class implementing an interface, or interface extending an interface | not available
class inheritance | class extending a class | public inheritance
single inheritance | class extending a class, or interface extending an interface | private or public inheritance
multiple inheritance | interface inheritance only, via extends or implements | class inheritance via public inheritance, implementation inheritance via private inheritance
type | built-in primitive types, and user-defined types introduced through declaration of class or interface | built-in primitive types, and user-defined types introduced through declaration of class
Table 3: Language Comparison Summary
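
The inheritance rows of Table 3 can be illustrated on the Java side with a short sketch. All of the names below (`Source`, `Sink`, `Channel`, `TextBuffer`, `TextFile`) are hypothetical and serve only as an example.

```java
// Two pure abstract classes, written as Java interfaces.
interface Source {
    String read();
}

interface Sink {
    void write(String text);
}

// Multiple inheritance in Java: interface inheritance only,
// here an interface extending two interfaces.
interface Channel extends Source, Sink {
}

// An ordinary class providing state and implementation.
class TextBuffer {
    protected String contents = "";

    int length() {
        return contents.length();
    }
}

// Class inheritance (single): TextFile extends exactly one class.
// Interface inheritance: TextFile also implements Channel.
class TextFile extends TextBuffer implements Channel {
    @Override
    public String read() {
        return contents;
    }

    @Override
    public void write(String text) {
        contents = contents + text;
    }
}

class InheritanceDemo {
    public static void main(String[] args) {
        TextFile file = new TextFile();
        file.write("ab");
        file.write("c");

        // The same object can be used through any of its supertypes.
        Source source = file;
        System.out.println(source.read());
        System.out.println(file.length());
    }
}
```

A C++ version would express `TextFile` through public inheritance from several base classes, since C++ has no separate interface construct.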

References

  1. Flanagan, D. Java in a Nutshell, Second Edition, O'Reilly & Associates, Inc., 1997.
  2. Gamma, E., R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.
  3. Gosling, J. and H. McGilton. The Java Language Environment: A White Paper, Sun Microsystems, 1995.
  4. Gosling, J., B. Joy, and G. Steele. The Java Language Specification, The Java Series, Addison-Wesley, 1996.
  5. ISO/ANSI C++, April 1997.
  6. Stroustrup, B. The C++ Programming Language, Third Edition, Addison-Wesley, 1997.

Author Profiles

Suchitra Gupta received his Ph.D. from the State University of New York at Stony Brook. He worked at the Pennsylvania State University at University Park as an Assistant Professor of Computer Science. He also worked at Shell Oil Company and contributed to the design and development of reservoir simulators. At present he works at U S WEST Communications. His recent work has focused on design and development of distributed applications using object-oriented techniques. He can be reached at bobbygupta@yahoo.com.

Jeffrey M. Hartkopf received his B.S. and M.S. in Computer Science from the University of Colorado at Boulder. He works at U S WEST Communications on a groundbreaking project in computing systems management. He is also participating in the development of an enterprise platform for large scale client-server applications. He can be reached at jeffhartkopf@yahoo.com.

Suresh Ramaswamy received his B.S. in Electrical Engineering from the University of Bombay and his M.S. in Electrical Engineering from the University of Southern California. He worked on the design and development of object-oriented systems for electronic CAD applications at Mentor Graphics Corporation. Presently, he works at U S WEST Communications, where he has been working on a development platform for enterprise class client-server applications. He can be reached at Lucid Field.