Serialization is the process of translating data structures or objects into a format or a sequence of bytes (often called marshalling) that can later be unpacked (often called unmarshalling) in the same or a different environment. Since the two environments might be completely different, both sides need an agreed method of serialization and deserialization.
Questions:
- What is the use of serialization?
- How do we use the Java serialization API?
Where Is Serialization Used?
- Client/server model: Suppose we need to transfer some information from client to server, then we can serialize the object and send it over the network for the server to unpack it and perform necessary operations. Remember that the client/server may be two different computers.
- Data persistence for later use (may be a file/database blob record).
- Java's RMI and JNDI use serialization techniques.
- To flatten objects into an array of bytes in memory.
- Suppose we would like to continue a game after some time. The current state can be serialized and saved into a file so that when the user plays the same game next time, it resumes from the saved state instead of starting from scratch.
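As a rough illustration of the "flatten into an array of bytes" and "saved game" use cases above, here is a minimal sketch that serializes an object into memory and reads it back. The class names InMemoryDemo and GameState and the helper methods are made up for this example; only the java.io classes are part of the Java API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the names here are illustrative only.
class InMemoryDemo {

    // A small serializable value standing in for a saved game state.
    static class GameState implements Serializable {
        private static final long serialVersionUID = 1L;
        int level;
        String player;

        GameState(int level, String player) {
            this.level = level;
            this.player = player;
        }
    }

    // Flatten any serializable object into a byte array held in memory.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object from the bytes produced by toBytes().
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] saved = toBytes(new GameState(3, "Test"));
        GameState restored = (GameState) fromBytes(saved);
        System.out.println(restored.level + " " + restored.player); // prints "3 Test"
    }
}
```

The same byte array could just as well be sent over a socket or stored in a database blob. The in-memory buffer is used here only to keep the sketch self-contained.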
Java Serialization API
How do we serialize an object and make sure we deserialize it correctly? Fortunately, Java takes care of the internal implementation of serializing and unpacking. All we need to do is tell Java that an object may be serialized, and for that purpose we use the java.io.Serializable marker interface. It is called a marker interface because it declares no methods and is used only as a marker/placeholder.
import java.io.Serializable;

class SerializeEx implements Serializable {
    int perAge;
    String perName;
}
A simple SerializeEx class implementing the java.io.Serializable interface. As noted earlier, Serializable is just a marker interface. Having marked the class as serializable, we now need to actually store the object, and Java provides APIs for that as well.
The following example class serializes an object into a file. The steps to serialize an object into a file are:
- Make sure the object to be serialized implements java.io.Serializable.
- Open the target file using a FileOutputStream object.
- Create a new ObjectOutputStream object and pass the FileOutputStream instance.
- Write the object to the file using writeObject() method.
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class JavaSerializableEx {
    public static void main(String[] args) throws IOException {
        SerializeEx javaEx = new SerializeEx();
        javaEx.perAge = 28;
        javaEx.perName = "Test";
        // try-with-resources flushes and closes the stream automatically
        try (ObjectOutputStream objOut = new ObjectOutputStream(new FileOutputStream("serial.out"))) {
            objOut.writeObject(javaEx);
        }
    }
}
What exactly is stored when an object is serialized? The stream holds the state of the object, i.e., the fields that are necessary to reconstruct it, together with a short description of its class (the class name, the serialVersionUID, and the field names and types). The methods and bytecode of the class are not stored. So, in our case, the instance variables "perAge" and "perName" are saved into the file.
The saved file is binary, so its contents are best inspected in a hex editor.
(The Object Serialization Stream Protocol dictates how the object is written so that it can later be retrieved successfully.)
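To get a feel for the stream format without a hex editor, the sketch below serializes an object into memory and prints the first four bytes in hex. Every stream produced by ObjectOutputStream starts with the magic number 0xACED followed by the stream version 0x0005. The class StreamHeaderDemo, its nested Person class, and the helper methods are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the names here are illustrative only.
class StreamHeaderDemo {

    // Stand-in for the article's SerializeEx class.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        int perAge = 28;
        String perName = "Test";
    }

    // Serialize the object into a byte array instead of a file.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Render the first `count` bytes as two-digit upper-case hex values.
    static String hexPrefix(byte[] bytes, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count && i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new Person());
        System.out.println(hexPrefix(bytes, 4)); // prints "AC ED 00 05"
    }
}
```

The bytes after this header carry the class description and the field values.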
Let's look at another example. Suppose I change the SerializeEx class to look like this:
class SerializeEx implements Serializable {
    int perAge;
    String perName;
    private Child child = new Child();
}

class Child implements Serializable {
    int childAge;
    String childFather;
    String childMother;
}
SerializeEx now holds a reference to another object. Do we need to do something special? No. What do you think would be serialized? To recreate SerializeEx we need the Child object as well, so the Child object and its instance variables are also serialized when SerializeEx is serialized.
We needn't worry about serializing the dependent objects ourselves. Java takes care of it: we serialize one object, and every object reachable from it is serialized too. The only thing to remember is that every dependent object must also implement Serializable. Otherwise a java.io.NotSerializableException is thrown at run time.
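The failure case is easy to reproduce. In the minimal sketch below, the hypothetical Owner class is Serializable but holds a reference to a Pet class that is not, so writeObject() fails with a NotSerializableException. All class and method names are invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the names here are illustrative only.
class DependentObjectDemo {

    // Deliberately NOT Serializable.
    static class Pet {
        String name = "Rex";
    }

    // Serializable, but it references a Pet, which is not.
    static class Owner implements Serializable {
        private static final long serialVersionUID = 1L;
        String ownerName = "Test";
        Pet pet = new Pet();
    }

    // Returns the exception message (the offending class name) if some object
    // in the graph is not serializable, or null if serialization succeeded.
    static String tryToSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // Prints the name of the class that blocked serialization (Pet).
        System.out.println(tryToSerialize(new Owner()));
    }
}
```

Making Pet implement Serializable, or marking the pet field transient, would let the call succeed.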
Transient
What if a particular instance field needn't be serialized for some reason? How do we tell Java not to serialize a field? Just use the keyword "transient" for fields that needn't be serialized. In the example below, childAge is not serialized, whereas the other instance variables are. When the object is read back, a transient field holds its default value (0 for an int, null for an object reference).
class Child implements Serializable {
    transient int childAge;
    String childFather;
    String childMother;
}
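To see the effect of transient, the sketch below sends a Child through a serialize/deserialize round trip in memory and prints the fields of the copy. The wrapper class TransientDemo, the roundTrip() helper, and the sample values are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the names here are illustrative only.
class TransientDemo {

    static class Child implements Serializable {
        private static final long serialVersionUID = 1L;
        transient int childAge; // skipped during serialization
        String childFather;
        String childMother;
    }

    // Serialize the object into memory and immediately read it back.
    static Child roundTrip(Child original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Child) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Child child = new Child();
        child.childAge = 5;
        child.childFather = "Father";
        child.childMother = "Mother";

        Child copy = roundTrip(child);
        System.out.println(copy.childAge);    // prints 0, the int default
        System.out.println(copy.childFather); // prints "Father"
    }
}
```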
Deserialization
Having briefly looked at serialization, let's now see how to read back the serialized file, which is called deserialization. Deserialization is exactly the opposite process of serialization. The example class for deserialization would look something like this:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

public class DeSerializeEx {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (ObjectInputStream objIn = new ObjectInputStream(new FileInputStream("serial.out"))) {
            // readObject() returns Object, so cast back to the expected type
            SerializeEx exam = (SerializeEx) objIn.readObject();
            System.out.println(exam.perAge);
            System.out.println(exam.perName);
        }
    }
}
The example class uses ObjectInputStream and FileInputStream, the counterparts of ObjectOutputStream and FileOutputStream, to read data from the serialized file. Since we stored "28" and "Test" in the serialized file, the output is "28" and "Test", the same values that were serialized.
serialVersionUID
Having looked at the basics of serialization, we will now look at the concept of serialVersionUID. Every serialized class carries a version ID in the stream. If the class does not declare one, Java calculates it automatically from the class structure. During deserialization, Java compares the version ID in the stream with that of the local class and throws an InvalidClassException if they do not match. Why does Java perform this check? Consider a scenario where we have serialized an object but later removed the transient keyword from one of the instance variables, so the calculated version ID changes. Now assume we deserialize the old file without having serialized the object again after this change. How does the deserializing class know about the change?
Java detects this discrepancy by checking whether the version ID of the serialized class matches that of the class used for deserialization. The serialVersionUID thus acts like a version control for the Serializable class by checking that the serialized object is compatible with the current definition of the same class.
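The version ID that Java associates with a class can be inspected through java.io.ObjectStreamClass. The sketch below compares a class that leaves the ID to Java with one that declares it explicitly. The classes Computed and Declared, the helper versionIdOf(), and the value 42L are hypothetical and chosen only for this illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo class; the names here are illustrative only.
class VersionIdDemo {

    // No serialVersionUID declared: Java calculates one from the class structure.
    static class Computed implements Serializable {
        int perAge;
        String perName;
    }

    // serialVersionUID declared explicitly: Java uses this value as-is.
    static class Declared implements Serializable {
        private static final long serialVersionUID = 42L; // arbitrary example value
        int perAge;
        String perName;
    }

    // Look up the version ID that would be written to the stream for a class.
    static long versionIdOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // The calculated value depends on the class structure, so it is not shown here.
        System.out.println("calculated: " + versionIdOf(Computed.class));
        System.out.println("declared:   " + versionIdOf(Declared.class)); // 42
    }
}
```

The JDK also ships a command-line tool named serialver that prints the same value for a compiled class.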
Why worry about this if Java takes care of it? In the Eclipse IDE, once a class implements java.io.Serializable, we will notice a warning that the class does not declare a static final serialVersionUID field of type long. Clicking on that warning offers two ways to add a serialVersionUID: a "default" version ID (the constant 1L) or a "generated" version ID (a value calculated from the class structure). A common recommendation is to avoid the default version ID and use the generated one instead.
class SerializeEx implements Serializable {
    private static final long serialVersionUID = -6181741326791855053L;
    int perAge;
    String perName;
}
Our class now has a serialVersionUID. Are we done? What if our class definition changes? Assume I change the class to:
class SerializeEx implements Serializable {
    private static final long serialVersionUID = -6181741326791855053L;
    int perAge;
    String perName;
    String newField;
}
I have added an instance variable, "newField", but the serialVersionUID remains the same because we have not changed that value. Now, assume we perform the following: deserialize SerializeEx using the "serial.out" file that was generated earlier, without this new instance variable.
Will this work? Yes, because the serialVersionUID matches. Java treats an added field as a compatible change, so newField is simply left at its default value, null. But is the deserialized object really compatible with our current class definition? And whose fault is it if deserialization succeeds even though we haven't maintained compatibility with the current class version? It is the developer's responsibility to update the serialVersionUID every time a change is made to the Serializable class that can affect the uniqueness of the object. If we fail to update it, Java will not flag the incompatibility, because the old version ID still matches. Hence developers have to make sure to update serialVersionUID whenever the Serializable class changes significantly.
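For contrast, here is a rough sketch of what happens when the version IDs do not match. Two versions of the same class cannot live in one file, so the example simulates a stream written by a different class version by flipping one bit of the serialVersionUID stored in the serialized bytes. This byte patching is a demonstration trick, not something to do in real code, and it assumes the standard stream layout (4-byte header, two type bytes, a 2-byte class-name length, the class name, then the 8-byte version ID). All class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the names here are illustrative only.
class UidMismatchDemo {

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String perName = "Test";
    }

    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Simulate a stream from another class version: flip the lowest bit of the
    // version ID stored in the stream (assumes the layout described above).
    static byte[] withDifferentUid(byte[] stream) {
        byte[] copy = stream.clone();
        int nameLength = ((copy[6] & 0xFF) << 8) | (copy[7] & 0xFF);
        int lastUidByte = 8 + nameLength + 7;
        copy[lastUidByte] ^= 0x01;
        return copy;
    }

    // Returns false if Java rejects the stream because of a version ID mismatch.
    static boolean canDeserialize(byte[] stream) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            in.readObject();
            return true;
        } catch (InvalidClassException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] original = serialize(new Person());
        System.out.println(canDeserialize(original));                   // true
        System.out.println(canDeserialize(withDifferentUid(original))); // false
    }
}
```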
In this article, I have covered just a few aspects of serialization in Java, as it is an extensive subject. I hope this article will serve as a foundation for further reading on Java serialization. Thanks for reading.
This article was originally published at Java tutorials – Lets jump into the ocean, and is re-posted here with the author's permission as part of the JBC program.