Transaction Management in Spring Batch Components

This article is based on Spring Batch in Action, to be published in July 2011. It is reproduced here by permission of Manning Publications. Visit Manning.com for more information.

Spring Batch handles transactions at the step level. This means that Spring Batch never uses a single transaction for a whole job (unless the job has a single step). You’re likely to implement a Spring Batch job in one of two ways: using a tasklet or a chunk-oriented step. Let’s see how Spring Batch handles transactions in both cases.

Transaction management in tasklets

You use a tasklet whenever you need custom processing. This differs from the usual read-process-write behavior that Spring Batch’s chunk-oriented step handles well. Here are cases where you can use a tasklet: launching a system command, compressing files in a ZIP archive, decompressing a ZIP archive, digitally signing a file, uploading a file to a remote FTP server, and so on. The Tasklet interface is:

public interface Tasklet {
	RepeatStatus execute(StepContribution contribution,
	ChunkContext chunkContext) throws Exception;
}

By default, the execute method of a tasklet is transactional. Each invocation of execute takes place in its own transaction. Here is a simple example implementation:

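import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

class MyTasklet implements Tasklet {
	// Spring Batch calls execute inside a transaction.
	@Override
	public RepeatStatus execute(
			StepContribution contribution,
			ChunkContext chunkContext) throws Exception {
		// your custom processing here
		return RepeatStatus.FINISHED;
	}
}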

A tasklet is repeatable: Spring Batch calls the execute method of a tasklet as long as the method returns RepeatStatus.CONTINUABLE. As we mentioned, each execute invocation takes place in its own transaction. When the execute method returns RepeatStatus.FINISHED or null, Spring Batch stops calling it and moves on to the next step.
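The repeat behavior can be pictured with a small plain-Java model. This is only a sketch of the loop described above, not Spring Batch's implementation: Status, SimpleTasklet, RepeatLoopModel, and OneFilePerCallTasklet are hypothetical stand-ins invented for illustration, and the transaction boundaries are reduced to a counter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-ins for this sketch; these are NOT Spring Batch types.
enum Status { CONTINUABLE, FINISHED }

interface SimpleTasklet {
    Status execute();
}

// Models the repeat loop: one transaction per execute() call.
class RepeatLoopModel {
    int transactionsCommitted = 0;

    void run(SimpleTasklet tasklet) {
        Status status;
        do {
            // A real step would begin a transaction here.
            status = tasklet.execute();
            // ...and commit it here, before deciding whether to repeat.
            transactionsCommitted++;
        } while (status == Status.CONTINUABLE); // FINISHED or null ends the loop
    }
}

// A repeatable tasklet that handles one file name per invocation.
class OneFilePerCallTasklet implements SimpleTasklet {
    private final Deque<String> pending;
    final List<String> handled = new ArrayList<>();

    OneFilePerCallTasklet(List<String> files) {
        this.pending = new ArrayDeque<>(files);
    }

    @Override
    public Status execute() {
        String file = pending.poll();
        if (file != null) {
            handled.add(file);
        }
        return pending.isEmpty() ? Status.FINISHED : Status.CONTINUABLE;
    }
}
```

With three file names queued, the loop calls execute three times, so the model counts three committed transactions. That per-call cost is what the note below warns about.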

Note: Be careful when implementing repeatable tasklets, because Spring Batch creates a new transaction for each invocation of the execute method. If a tasklet doesn’t use a transactional resource, such as when decompressing a ZIP archive, you can set the propagation level to PROPAGATION_NEVER.

To summarize, a tasklet is a potentially repeatable, transactional operation. Let’s now see how Spring Batch handles transactions in a chunk-oriented step.

Transaction management in chunk-oriented steps

A chunk-oriented step follows the common read-process-write behavior for a large number of items. You know by now that you can set the chunk size. Transaction management depends on the chunk size: Spring Batch uses a transaction for each chunk. Such transaction management is:

Efficient: Spring Batch uses a single transaction for all items in a chunk. One transaction per item isn’t an appropriate solution because it doesn’t perform well for a large number of items.
Robust: An error affects only the current chunk, not all items.

When does Spring Batch roll back a transaction in a chunk? Any exception thrown from the item processor or the item writer triggers a rollback. This isn’t the case for an exception thrown from the item reader. This behavior applies regardless of the retry and skip configuration.
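To make the per-chunk commit and rollback concrete, here is a plain-Java model. It is a sketch under simplifying assumptions, not Spring Batch code: ChunkStepModel is a hypothetical class, the "database" is an in-memory list, the writer is any Consumer that may throw, and no skip or retry is configured, so the first failure ends the step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java model of chunk-scoped transactions; hypothetical, not Spring Batch code.
class ChunkStepModel {
    final List<Integer> committed = new ArrayList<>(); // stands in for the database
    int commits = 0;
    int rollbacks = 0;

    // Returns true if every chunk was committed, false if the step failed.
    boolean run(List<Integer> items, int chunkSize, Consumer<List<Integer>> writer) {
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            List<Integer> chunk = new ArrayList<>(items.subList(start, end));
            try {
                writer.accept(chunk);    // may throw, like an item writer
                committed.addAll(chunk); // commit: the whole chunk becomes durable
                commits++;
            } catch (RuntimeException e) {
                rollbacks++;             // rollback: nothing from this chunk is kept
                return false;            // assumption: no skip/retry, so the step fails
            }
        }
        return true;
    }
}
```

For items 1 to 10 with a chunk size of 3 and a writer that fails on item 5, the first chunk (1, 2, 3) stays committed, the second chunk (4, 5, 6) is rolled back as a whole, and the model stops there. Items 4 and 6 are lost along with 5, but the already committed chunk is untouched.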

You can have transaction management in a step; you can also have transaction management around a step. Remember that you can plug listeners into job and step executions, for example to log skipped items. If a listener logs to a database, that logging needs proper transaction management to avoid losing data or recording the wrong information.

Transaction management in listeners

Spring Batch provides many types of listeners to respond to events in a batch job. When Spring Batch skips items from an input file, you may want to log them. To do so, you can plug in an ItemSkipListener in the step. How does Spring Batch handle transactions in these listeners? Well, it depends. (The worst answer a software developer can get.) There’s no strict rule on whether a listener method is transactional or not; you always need to consider each specific case. Here is one piece of advice: always check the Javadoc. (You’re in luck; the Spring Batch developers documented their source code well.)

If we take the ChunkListener as an example, its Javadoc states that Spring Batch executes its beforeChunk method inside the chunk transaction but its afterChunk method outside the chunk transaction. Therefore, if you use a transactional resource such as a database in a ChunkListener’s afterChunk method, you should handle the transaction yourself using the Spring Framework’s transaction support.

Spring Batch also includes listeners for the item reading, processing, and writing phases. Spring Batch calls these listeners before and after each phase and when an error occurs. The error callback is transactional, but it happens in a transaction that Spring Batch is about to roll back. Therefore, if you want to log the error to a database, you should handle the transaction yourself and use the REQUIRES_NEW propagation level. This makes the logging transaction independent of the chunk transaction, so the log entry survives when the chunk transaction rolls back.
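The reason for a separate transaction can be shown with a small in-memory model. This sketch does not use Spring at all: TxModel and ErrorLoggingDemo are hypothetical classes, and a "transaction" is just a buffer that is copied to durable storage on commit and discarded on rollback.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory "database" with buffered transactions; not a Spring API.
class TxModel {
    final List<String> durable = new ArrayList<>();

    class Tx {
        private final List<String> pending = new ArrayList<>();

        void insert(String row) { pending.add(row); }

        void commit() { durable.addAll(pending); pending.clear(); }

        void rollback() { pending.clear(); }
    }

    Tx begin() { return new Tx(); }
}

class ErrorLoggingDemo {
    // The error entry is written in the failing chunk's own transaction,
    // so it disappears when that transaction rolls back.
    static List<String> logInChunkTx() {
        TxModel db = new TxModel();
        TxModel.Tx chunkTx = db.begin();
        chunkTx.insert("item-1");
        chunkTx.insert("ERROR: item-2 failed");
        chunkTx.rollback();
        return db.durable;
    }

    // The error entry is written in an independent transaction, which is the
    // effect REQUIRES_NEW is meant to give, so it survives the rollback.
    static List<String> logInNewTx() {
        TxModel db = new TxModel();
        TxModel.Tx chunkTx = db.begin();
        chunkTx.insert("item-1");
        TxModel.Tx logTx = db.begin(); // independent of chunkTx
        logTx.insert("ERROR: item-2 failed");
        logTx.commit();
        chunkTx.rollback();
        return db.durable;
    }
}
```

In the first method nothing is left in durable storage; in the second only the error entry remains. In a real listener, one way to get the same effect is Spring’s TransactionTemplate with its propagation behavior set to TransactionDefinition.PROPAGATION_REQUIRES_NEW.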

Summary

Transaction management is a key part of job robustness and reliability. Because errors will happen, you need to know how Spring Batch handles transactions, figure out when a failure can corrupt data, and learn to use appropriate settings. Remember that Spring Batch handles transactions at the step level. A tasklet is transactional: each invocation of its execute method runs in its own transaction. In a chunk-oriented step, Spring Batch creates and commits a transaction for each chunk.

About Krishna Srinivasan