Collections in general

Here is the basic question:

What are Collections?

A collection - sometimes called a container - is simply an object that groups multiple elements into a single unit. Collections are used to store, retrieve, manipulate, and communicate aggregate data. Typically, they represent data items that form a natural group, such as a collection of strings, or a mapping of names to addresses.

Then how are collections (collection implementation classes) different from arrays?

Conceptually, they are not different. An array is itself a collection, as the definition above shows. In fact, collection implementation classes such as ArrayList internally use arrays to store objects. Vector, Hashtable and array were the collection implementations in earlier (pre-1.2) versions of the Java platform; unfortunately, they were not easy to extend and did not implement a standard member access interface. They were not designed as part of the Collections Framework (Vector and Hashtable were only retrofitted in 1.2 to implement List and Map), but they are still collections.

Then why do we need Collections Framework?

All three of these collections, namely Vector, Hashtable and array, have different methods and syntax for accessing members: arrays use square brackets ([]), Vector uses the elementAt method, and Hashtable uses the get and put methods. These differences forced programmers to implement their own, mutually inconsistent collections: some imitated the Vector access methods and some imitated the Enumeration interface. To make it worse, in those early versions most of the Vector methods were marked final, so you could not override them in a subclass to implement a similar sort of collection. You could create a collection class that looked like a Vector and acted like a Vector, but it could not be passed to a method that takes a Vector as a parameter. Finally, none of the collections (array, Vector or Hashtable) implemented a standard member access interface. When programmers developed algorithms (like sorting) to manipulate collections, what object should be passed to the algorithm: an array, a Vector, or should the algorithm support both? A lot of such questions popped up.
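The access differences described above can be sketched in a few lines. This is only an illustration: the class name LegacyAccessDemo, its method names and the sample data are made up for this example. It contrasts one lookup method per legacy container with a single method written against the List interface.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.LinkedList;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class; the names here are illustrative only.
class LegacyAccessDemo {

    // Pre-1.2 style: one lookup method per container, each with its own syntax.
    static String fromArray(String[] names, int index) {
        return names[index];                // square brackets
    }

    static String fromVector(Vector<String> names, int index) {
        return names.elementAt(index);      // elementAt
    }

    static String fromHashtable(Hashtable<String, String> addresses, String name) {
        return addresses.get(name);         // get (and put to store)
    }

    // Framework style: one method written against the List interface.
    static String firstOf(List<String> names) {
        return names.get(0);
    }

    public static void main(String[] args) {
        String[] array = {"Ann", "Bob"};

        Vector<String> vector = new Vector<String>();
        vector.addElement("Ann");
        vector.addElement("Bob");

        Hashtable<String, String> table = new Hashtable<String, String>();
        table.put("Ann", "12 Main Street");

        System.out.println(fromArray(array, 1));
        System.out.println(fromVector(vector, 1));
        System.out.println(fromHashtable(table, "Ann"));

        // The same method accepts any List implementation.
        System.out.println(firstOf(new ArrayList<String>(vector)));
        System.out.println(firstOf(new LinkedList<String>(vector)));
        System.out.println(firstOf(vector)); // Vector has implemented List since 1.2
    }
}
```

The last three calls are the point: once code is written against an interface, the caller is free to choose the implementation.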

Thankfully, the Java Collections Framework (JCF) solves these problems and offers a number of advantages over using no framework or using Vector and Hashtable directly.

What is Java Collections Framework (JCF)?

A collections framework is a unified architecture for representing and manipulating collections. It provides a well-designed set of interfaces and classes for storing and manipulating groups of data as a single unit, a collection. The framework provides a convenient API to many of the abstract data types familiar from computer science data structures courses: maps, sets, lists, trees, arrays, hash tables and other collections. Because of their object-oriented design, the Java classes in the Collections Framework encapsulate both the data structures and the algorithms associated with these abstractions. The framework provides a standard programming interface to many of the most common abstractions, without burdening the programmer with too many procedures and interfaces. The operations supported by the collections framework nevertheless permit the programmer to easily define higher-level data abstractions, such as stacks, queues, and thread-safe collections.
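As a small illustration of building higher-level abstractions on the framework, here is a sketch that uses ArrayDeque once as a stack and once as a queue. The class StackQueueDemo and its method names are invented for this example, and both methods assume at least one item is passed in.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

// Hypothetical demo class; the names here are illustrative only.
class StackQueueDemo {

    // Stack behaviour (last in, first out) through the Deque interface.
    static String lastIn(String... items) {
        Deque<String> stack = new ArrayDeque<String>();
        for (String item : items) {
            stack.push(item);
        }
        return stack.pop();
    }

    // Queue behaviour (first in, first out) through the Queue interface.
    static String firstIn(String... items) {
        Queue<String> queue = new ArrayDeque<String>();
        for (String item : items) {
            queue.offer(item);
        }
        return queue.poll();
    }

    public static void main(String[] args) {
        System.out.println(lastIn("a", "b", "c"));  // prints "c"
        System.out.println(firstIn("a", "b", "c")); // prints "a"
    }
}
```

The same implementation class serves both roles; only the interface the code is written against changes.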

The main design goal of JCF was to produce an API that was reasonably small, both in size and, more importantly, in "conceptual weight". There are only 14 collection interfaces, and the most basic one is Collection. To keep the number of core interfaces small, the interfaces do not attempt to capture such subtle distinctions as mutability, modifiability, and resizability. Instead, many of the modification methods in the collection interfaces are labeled optional, allowing implementations to throw an UnsupportedOperationException to indicate that they do not support a specified optional operation. An interface contains a method only if either (a) it is a fundamental operation or (b) there is a compelling performance reason why an important implementation would want to override it.
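To see what an optional operation looks like in practice, here is a minimal sketch using Collections.unmodifiableList. The class OptionalOperationDemo and the helper tryAdd are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class OptionalOperationDemo {

    // Returns true if the list supports add, false if it rejects the operation.
    static boolean tryAdd(List<String> list, String value) {
        try {
            list.add(value);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> mutable = new ArrayList<String>();
        List<String> readOnly = Collections.unmodifiableList(mutable);

        System.out.println(tryAdd(mutable, "a"));  // prints "true"
        System.out.println(tryAdd(readOnly, "b")); // prints "false"
    }
}
```

Both variables have the same static type, List<String>. Whether add is supported is a property of the implementation, which is exactly the trade-off the small set of interfaces makes.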

All reasonable representations of collections should interoperate well. Hence the framework includes methods to allow collections to be dumped into arrays, arrays to be viewed as collections, and maps to be viewed as collections.
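The three conversions mentioned above map onto standard library calls: Collection.toArray, Arrays.asList and the Map view methods such as keySet. The following is a sketch only; the wrapper class InteropDemo and its method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; the names here are illustrative only.
class InteropDemo {

    // Collection dumped into an array (an independent copy).
    static String[] toArray(Collection<String> names) {
        return names.toArray(new String[0]);
    }

    // Array viewed as a fixed-size List; writes go through to the array.
    static List<String> asListView(String[] names) {
        return Arrays.asList(names);
    }

    // Map viewed as a collection of its keys; removals go through to the map.
    static Set<String> keys(Map<String, String> addresses) {
        return addresses.keySet();
    }

    public static void main(String[] args) {
        String[] array = {"Ann", "Bob"};

        List<String> view = asListView(array);
        view.set(0, "Zoe");
        System.out.println(array[0]);                  // prints "Zoe"

        String[] copy = toArray(view);
        System.out.println(copy.length);               // prints "2"

        Map<String, String> addresses = new HashMap<String, String>();
        addresses.put("Ann", "12 Main Street");
        keys(addresses).remove("Ann");
        System.out.println(addresses.isEmpty());       // prints "true"
    }
}
```

Note the difference between a copy and a view: toArray produces a new array, while Arrays.asList and keySet stay connected to the data they were created from.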

What does Java Collections framework (JCF) contain?

So what is the benefit of going for Java Collections Framework (JCF) instead of self defined and implemented classes or array?

Well, Java Collections Framework

The following diagram shows the collections framework interface hierarchy.

Interfaces Hierarchy

There are fourteen collection interfaces. The Set, List, SortedSet, NavigableSet, Queue, Deque, BlockingQueue and BlockingDeque interfaces extend Collection. The Map, SortedMap, NavigableMap, ConcurrentMap and ConcurrentNavigableMap interfaces do not extend Collection, as they represent mappings rather than true collections. However, these interfaces contain collection-view operations, which allow them to be manipulated as collections. Some collection implementations restrict which elements may be stored (for example, only non-null values or only elements of a specific type). Attempting to add, remove, or test for the presence of an element that violates an implementation's restrictions results in a runtime exception, typically a ClassCastException, an IllegalArgumentException or a NullPointerException.
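Two such restrictions are easy to trigger. The sketch below assumes Java 7 or later; the class RestrictionDemo and its method names are hypothetical. ArrayDeque is documented to reject null elements, and a TreeSet with natural ordering needs its elements to implement Comparable.

```java
import java.util.ArrayDeque;
import java.util.TreeSet;

// Hypothetical demo class; the names here are illustrative only.
class RestrictionDemo {

    // ArrayDeque does not permit null elements.
    static String nullIntoArrayDeque() {
        try {
            new ArrayDeque<String>().add(null);
            return "accepted";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    // A TreeSet using natural ordering cannot compare plain Object instances.
    static String nonComparableIntoTreeSet() {
        try {
            TreeSet<Object> set = new TreeSet<Object>();
            set.add(new Object());
            set.add(new Object());
            return "accepted";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(nullIntoArrayDeque());       // prints "NullPointerException"
        System.out.println(nonComparableIntoTreeSet()); // prints "ClassCastException"
    }
}
```

Neither problem is caught by the compiler, which is why the interface documentation spells these exceptions out.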

The following table shows the general-purpose collection implementations hierarchy.

Interface  Hash Table  Resizable Array  Balanced Tree  Linked List  Hash Table & Linked List
Set        HashSet     -                TreeSet        -            LinkedHashSet
List       -           ArrayList        -              LinkedList   -
Deque      -           ArrayDeque       -              LinkedList   -
Map        HashMap     -                TreeMap        -            LinkedHashMap

Classes that implement the collection interfaces typically have names of the form <implementation-style><Interface>. The general-purpose implementations are unsynchronized, but the Collections class contains static factories called synchronization wrappers that may be used to add synchronization to many unsynchronized collections. The AbstractCollection, AbstractSet, AbstractList, AbstractSequentialList and AbstractMap classes provide skeletal implementations of the core collection interfaces, to minimize the effort required to implement them. You can extend these classes to create your own collection implementation class.
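Both ideas fit in one short sketch. IntRange below is a hypothetical read-only list built on AbstractList, which only requires get and size to be implemented. The main method also shows a synchronization wrapper from the Collections class. All names other than the JDK ones are invented for this example.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class SkeletalDemo {

    // Read-only list of the integers start, start + 1, ..., start + size - 1.
    static final class IntRange extends AbstractList<Integer> {
        private final int start;
        private final int size;

        IntRange(int start, int size) {
            this.start = start;
            this.size = size;
        }

        @Override
        public Integer get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index: " + index);
            }
            return start + index;
        }

        @Override
        public int size() {
            return size;
        }
    }

    public static void main(String[] args) {
        // contains, indexOf, iterator and toString are inherited from the skeleton.
        List<Integer> range = new IntRange(3, 4);
        System.out.println(range);              // prints "[3, 4, 5, 6]"
        System.out.println(range.contains(5));  // prints "true"
        System.out.println(range.indexOf(6));   // prints "3"

        // Synchronization wrapper around an unsynchronized ArrayList.
        List<String> shared = Collections.synchronizedList(new ArrayList<String>());
        shared.add("safe to call from several threads");
        System.out.println(shared.size());      // prints "1"
    }
}
```

Because IntRange does not override add or set, those calls fall back to the AbstractList defaults and throw UnsupportedOperationException. Also note that iterating over a synchronized wrapper still needs a manual synchronized block on the wrapper, as its documentation explains.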

A few interesting points to note:
