Configure Spring Batch Retry on Error

This article is based on Spring Batch in Action, to be published in July 2011. It is reproduced here by permission from Manning Publications. Manning publishes MEAP (Manning Early Access Program) ebooks and pbooks. MEAPs are sold exclusively through Manning.com. All print book purchases include an ebook free of charge. When mobile formats become available, all customers will be contacted and upgraded. Visit Manning.com for more information.

Spring Batch Retrying on Error

By default, an exception in a chunk-oriented step causes the step to fail. You can skip the exception if you don’t want to fail the whole step. Skipping works well for deterministic exceptions, for example, an incorrect line in a flat file. Exceptions aren’t always deterministic; sometimes they can be transient. We call an exception transient when an operation fails at first, but a new attempt—even immediately after—is successful.

What are transient exceptions in batch applications? Concurrency exceptions are a typical example. If a batch job tries to update a row that another process holds a lock on, the database can raise an error. Retrying the operation immediately afterward can succeed because the other process may have released the lock in the meantime. Any operation involving an unreliable network, like a web service call, can also throw transient exceptions. A new attempt, with a new request (or connection), can succeed.

You can configure Spring Batch to retry operations transparently when they throw exceptions, without any impact on the application code. Because transient failures cause these exceptions, we call them retriable exceptions.

Configuring retriable exceptions

You configure retriable exceptions inside the chunk element using the retriable-exception-classes element, as shown in listing 1.

Listing 1 Configuring retriable exceptions

<job id="importProductsJob">
  <step id="importProductsStep">
    <tasklet>
      <!-- retry-limit sets the max number of retries -->
      <chunk reader="reader" writer="writer" commit-interval="100"
             retry-limit="3">
        <!-- Sets the exceptions to retry on -->
        <retriable-exception-classes>
          <include class="org.springframework.dao.OptimisticLockingFailureException" />
        </retriable-exception-classes>
      </chunk>
    </tasklet>
  </step>
</job>

Notice the retry-limit attribute, used to specify how many times Spring Batch should retry an operation. Just as for skipping, you can include a complete exception hierarchy with the include element and exclude some specific exceptions with the exclude element. You can use both XML elements several times. The following snippet illustrates the use of the exclude element for retry:

<retriable-exception-classes>
  <include class="org.springframework.dao.TransientDataAccessException" />
  <exclude class="org.springframework.dao.PessimisticLockingFailureException" />
</retriable-exception-classes>

Figure 1 shows the relationship between the exceptions TransientDataAccessException and PessimisticLockingFailureException. In the previous snippet, we tell Spring Batch to retry when Spring throws transient exceptions, except when the exceptions are related to pessimistic locking.

Spring Batch only retries the item processing and item writing phases. By default, a retriable exception triggers a rollback, so you should be careful with retry: retrying too many times for too many items can degrade performance. You should use retriable exceptions only for exceptions that are nondeterministic, not for exceptions related to format or constraint violations, which are typically deterministic. Figure 2 summarizes the retry behavior in Spring Batch.


Override equals() and hashCode() when using retry

In a chunk-oriented step, Spring Batch handles retry on the item processing and writing phases. By default, a retry implies a rollback, so Spring Batch needs to restore the context of retried operations across transactions. Spring Batch needs to track items closely to know which item could have triggered the retry. Remember that Spring Batch can’t always know which item triggers an exception during the writing phase because an item writer handles a list of items. It relies on the identity of items to track them. For Spring Batch retry to work correctly, you should therefore override the equals and hashCode methods of your items’ classes, using a database identifier, for example.
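
As an illustration of the advice above, here is a minimal sketch of an item class whose identity is based on an identifier. The Product class and its id and name fields are hypothetical names chosen for this example; the id is assumed to hold the database identifier.

```java
import java.util.Objects;

// Hypothetical item class; "id" stands in for a database identifier.
final class Product {
    private final String id;
    private String name;

    Product(String id, String name) {
        this.id = Objects.requireNonNull(id, "id");
        this.name = name;
    }

    String getId() { return id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }

    // Two products are the same item if they carry the same identifier,
    // even when they are distinct object instances.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Product)) return false;
        return id.equals(((Product) other).id);
    }

    // Consistent with equals: derived from the identifier only.
    @Override
    public int hashCode() {
        return id.hashCode();
    }
}
```

With this definition, an item that is read again after a rollback compares equal to the instance seen during the failed attempt, as long as both carry the same identifier.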

Combining retry and skip

You can combine retry with skip: a job retries an unsuccessful operation several times and then skips it. Remember that once Spring Batch reaches the retry limit, the exception causes the step to exit and, by default, fail. Combine retry and skip when you don’t want a persistent transient error to fail a step. Listing 2 shows how to combine retry and skip.

Listing 2 Combining retry and skip

<job id="job">
  <step id="step">
    <tasklet>
      <chunk reader="reader" writer="writer" commit-interval="100"
             retry-limit="3" skip-limit="10">
        <!-- Specifies retriable exceptions -->
        <retriable-exception-classes>
          <include class="org.springframework.dao.DeadlockLoserDataAccessException" />
        </retriable-exception-classes>
        <!-- Specifies skippable exceptions -->
        <skippable-exception-classes>
          <include class="org.springframework.dao.DeadlockLoserDataAccessException" />
        </skippable-exception-classes>
      </chunk>
    </tasklet>
  </step>
</job>
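
The idea of retrying first and skipping afterward can be sketched in plain Java. This is a simplified model, not Spring Batch's implementation: it ignores transactions and chunk rollback, and it treats every RuntimeException as both retriable and skippable. The class RetryThenSkipSketch and its process method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified model of "retry, then skip"; not Spring Batch's implementation.
final class RetryThenSkipSketch {

    // Tries each item up to retryLimit times. An item that still fails is
    // skipped while the skip limit allows it; otherwise the failure propagates.
    // Returns the list of skipped items.
    static <T> List<T> process(List<T> items, Consumer<T> operation,
                               int retryLimit, int skipLimit) {
        if (retryLimit < 1) {
            throw new IllegalArgumentException("retryLimit must be >= 1");
        }
        List<T> skipped = new ArrayList<>();
        for (T item : items) {
            RuntimeException lastFailure = null;
            boolean done = false;
            for (int attempt = 1; attempt <= retryLimit && !done; attempt++) {
                try {
                    operation.accept(item);
                    done = true;
                } catch (RuntimeException e) {
                    lastFailure = e; // remember the failure and try again
                }
            }
            if (!done) {
                if (skipped.size() >= skipLimit) {
                    throw lastFailure; // skip limit exhausted: fail
                }
                skipped.add(item);
            }
        }
        return skipped;
    }
}
```

In this model, an item whose operation keeps failing costs retryLimit attempts before it is skipped, which is one reason to reserve retry for failures that are likely to be transient.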

Automatic retry in a chunk-oriented step can make jobs more robust. It would be a shame to fail a step because of an unstable network when retrying a few milliseconds later could have worked. You now know about the default retry configuration in Spring Batch, which should be enough for most cases. The next section explores how to control retry by setting a retry policy.

Controlling retry with a retry policy

By default, Spring Batch lets you configure retriable exceptions and the retry count. Sometimes, retry is more complex: some exceptions deserve more attempts than others, or you want to keep retrying as long as the operation doesn’t exceed a given timeout. Spring Batch delegates the decision to retry or not to a retry policy. When configuring retry in Spring Batch, you can use the retriable-exception-classes element and retry-limit pair or provide a RetryPolicy bean instead.

Table 1 lists the RetryPolicy implementations included in Spring Batch. You can use these implementations as is or choose to implement your own retry policy for specific needs.

Let’s see how to set a retry policy with an example. Imagine you want to use retry on concurrency exceptions, but you have several kinds of concurrency exceptions to deal with and you don’t want the same retry behavior for all of them. Spring Batch should retry all generic concurrency exceptions three times, whereas it should retry deadlock exceptions five times, which is more aggressive.

The ExceptionClassifierRetryPolicy implementation is a perfect match: it delegates the retry decision to different policies depending on the class of the thrown exception. The trick is to encapsulate two SimpleRetryPolicy beans in the ExceptionClassifierRetryPolicy, one for each kind of exception, as shown in listing 3.

Listing 3 Using a retry policy for a different behavior with concurrent exceptions

<job id="retryPolicyJob"
     xmlns="http://www.springframework.org/schema/batch">
  <step id="retryPolicyStep">
    <tasklet>
      <!-- Sets retry policy on chunk -->
      <chunk reader="reader" writer="writer" commit-interval="100"
             retry-policy="retryPolicy" />
    </tasklet>
  </step>
</job>

<bean id="retryPolicy"
      class="org.springframework.batch.retry.policy.ExceptionClassifierRetryPolicy">
  <!-- Maps policies to exception classes -->
  <property name="policyMap">
    <map>
      <entry key="org.springframework.dao.ConcurrencyFailureException">
        <bean class="org.springframework.batch.retry.policy.SimpleRetryPolicy">
          <!-- Sets max number of attempts for concurrent exceptions -->
          <property name="maxAttempts" value="3" />
        </bean>
      </entry>
      <entry key="org.springframework.dao.DeadlockLoserDataAccessException">
        <bean class="org.springframework.batch.retry.policy.SimpleRetryPolicy">
          <!-- Sets max number of attempts for deadlock exceptions -->
          <property name="maxAttempts" value="5" />
        </bean>
      </entry>
    </map>
  </property>
</bean>

Listing 3 shows that setting a retry policy allows for flexible retry behavior: the number of retries can be different, depending on the kind of exceptions thrown during processing.
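
To make the idea of choosing a limit by exception class concrete, here is a plain Java sketch. It is not the Spring Batch class: MaxAttemptsByExceptionSketch is a hypothetical name, and the sketch assumes that the most specific registered exception type should win, which it does by walking up the class hierarchy of the thrown exception.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a classifier-based policy; not the Spring Batch class.
final class MaxAttemptsByExceptionSketch {
    private final Map<Class<? extends Throwable>, Integer> maxAttemptsByType =
            new HashMap<>();

    // Associates an exception type with its maximum number of attempts.
    void register(Class<? extends Throwable> type, int maxAttempts) {
        maxAttemptsByType.put(type, maxAttempts);
    }

    // Walks up the class hierarchy of the thrown exception and returns the
    // limit registered for the closest matching type, or 0 if none matches.
    int maxAttemptsFor(Throwable thrown) {
        for (Class<?> type = thrown.getClass(); type != null;
                type = type.getSuperclass()) {
            Integer limit = maxAttemptsByType.get(type);
            if (limit != null) {
                return limit;
            }
        }
        return 0;
    }
}
```

Registering java.io.IOException with 3 and java.io.FileNotFoundException with 5, for instance, mirrors the generic and specific entries of listing 3: the subclass gets its own limit, and any other IOException falls back to the generic one.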

Transparent retries make jobs more robust. Listening to retries also helps to learn about the causes of retries. Let’s see then how to plug in a retry listener in a step.

Listening to retry

Spring Batch provides the RetryListener interface to react to any retried operation. A retry listener can be useful to log retried operations and to gather information. Once you know more about transient failures, you’re more likely to change the system to avoid them in subsequent executions (remember, retried operations always degrade performance).

You can directly implement the RetryListener interface; it defines two lifecycle methods, open and close, that often remain empty because we usually care only about the error thrown in the operation. A better way is to extend the RetryListenerSupport adapter class and override the onError method, as shown in listing 4.

Listing 4 Implementing a retry listener to log retried operations

package com.manning.sbia.ch09.retry;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.retry.RetryCallback;
import org.springframework.batch.retry.RetryContext;
import org.springframework.batch.retry.listener.RetryListenerSupport;

public class Slf4jRetryListener extends RetryListenerSupport {

    private static final Logger LOG =
            LoggerFactory.getLogger(Slf4jRetryListener.class);

    // Logs the exception thrown by each failed attempt
    @Override
    public <T> void onError(RetryContext context, RetryCallback<T> callback,
            Throwable throwable) {
        LOG.error("retried operation", throwable);
    }
}

Our retry listener uses the SLF4J logging framework to log the exception the operation throws. It could also use JDBC to log the error to a database. The next step is to register the listener in the step using the retry-listeners XML element, as shown in listing 5.

Listing 5 Registering a retry listener

<!-- Declares retry listener bean -->
<bean id="retryListener"
      class="com.manning.sbia.ch09.retry.Slf4jRetryListener" />

<job id="job" xmlns="http://www.springframework.org/schema/batch">
  <step id="step">
    <tasklet>
      <chunk reader="reader" writer="writer"
             commit-interval="10" retry-limit="3">
        <retriable-exception-classes>
          <include class="org.springframework.dao.OptimisticLockingFailureException" />
        </retriable-exception-classes>
        <!-- Registers retry listener -->
        <retry-listeners>
          <listener ref="retryListener" />
        </retry-listeners>
      </chunk>
    </tasklet>
  </step>
</job>

Any time you need to know about retried operations—for example, to get rid of them!—Spring Batch lets you register retry listeners and log errors.

Retry is a built-in feature of chunk-oriented steps. What can you do if you need to retry in your own code, for example, in a tasklet?

Retrying in application code with the RetryTemplate

Imagine you use a web service in a custom tasklet to retrieve data that a subsequent step will then use. A call to a web service can cause transient failures, so being able to retry this call would make the tasklet more robust. You can benefit from Spring Batch’s retry feature in a tasklet, with the RetryOperations interface and its RetryTemplate implementation. The RetryTemplate allows for a programmatic retry in the application code.

The online store uses a tasklet to retrieve the latest discounts from a web service. The discount data are small enough to keep in memory for later use in the next step. The DiscountService interface hides the call to the web service. Listing 6 shows a tasklet that retrieves the discounts. (We omit the setter methods for brevity.) The tasklet uses a RetryTemplate to retry in case of a failure.

Listing 6 Programmatic retry in a tasklet

package com.manning.sbia.ch09.retry;

import java.util.List;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.batch.retry.RetryCallback;
import org.springframework.batch.retry.RetryContext;
import org.springframework.batch.retry.policy.SimpleRetryPolicy;
import org.springframework.batch.retry.support.RetryTemplate;

public class DiscountsWithRetryTemplateTasklet implements Tasklet {

    private DiscountService discountService;
    private DiscountsHolder discountsHolder;

    @Override
    public RepeatStatus execute(StepContribution contribution,
            ChunkContext chunkContext) throws Exception {
        // Configures the RetryTemplate
        RetryTemplate retryTemplate = new RetryTemplate();
        SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
        retryPolicy.setMaxAttempts(3);
        retryTemplate.setRetryPolicy(retryPolicy);

        // Calls the web service with retry
        List<Discount> discounts = retryTemplate.execute(
            new RetryCallback<List<Discount>>() {
                @Override
                public List<Discount> doWithRetry(RetryContext context)
                        throws Exception {
                    return discountService.getDiscounts();
                }
            });

        // Stores the result for later use
        discountsHolder.setDiscounts(discounts);
        return RepeatStatus.FINISHED;
    }

    // (...) setter methods omitted for brevity
}

The use of the RetryTemplate is straightforward. Note how we configure the RetryTemplate with a RetryPolicy directly in the tasklet. We could also have defined a RetryOperations property in the tasklet and used Spring to inject a RetryTemplate bean as a dependency. Thanks to the RetryTemplate, we shouldn’t fear transient failures on the web service call anymore.
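
To show what a programmatic retry amounts to, here is a minimal plain Java sketch of a retry loop with a fixed number of attempts. It is not Spring's RetryTemplate and has none of its features (policies, listeners, back-off); SimpleRetrySketch and its execute method are hypothetical names used only for illustration.

```java
import java.util.concurrent.Callable;

// Minimal retry helper in plain Java; a sketch, not Spring's RetryTemplate.
final class SimpleRetrySketch {
    private final int maxAttempts;

    SimpleRetrySketch(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
    }

    // Calls the operation until it succeeds or maxAttempts is reached,
    // then rethrows the exception of the last failed attempt.
    <T> T execute(Callable<T> operation) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = e; // keep the failure and try again
            }
        }
        throw last;
    }
}
```

A caller would wrap the fragile call in a lambda, for example new SimpleRetrySketch(3).execute(() -> service.fetch()), where service.fetch() stands for any operation subject to transient failures.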

Using the RetryTemplate is simple, but the retry logic is hard-coded in the tasklet. Let’s go further to see how to remove the retry logic from the application code.

Retrying transparently with the RetryTemplate and AOP

Can we remove all the retry logic from the tasklet? It would make it easier to test, because the tasklet would be free of any retry code and could focus on its core logic. Furthermore, a unit test wouldn’t necessarily deal with all retry cases.

Aspect-oriented programming (AOP)

Aspect-oriented programming is a programming paradigm that allows modularizing crosscutting concerns. The idea of AOP is to remove crosscutting concerns from an application’s main logic and implement them in dedicated units called aspects. Typical crosscutting concerns are transaction management, logging, security, and retry. The Spring Framework provides first-class support for AOP, thanks to its interceptor-based approach: Spring intercepts application code and calls aspect code to address crosscutting concerns. Thanks to AOP, boilerplate code does not clutter the application code, and aspects address crosscutting concerns in their own units, which also avoids code scattering.

Spring Batch provides an AOP interceptor for retry called RetryOperationsInterceptor. By using this interceptor, the tasklet is able to use a DiscountService object directly. The interceptor delegates calls to the real DiscountService and handles the retry logic. No more dependency on the RetryTemplate in the tasklet; the code becomes simpler! Listing 7 shows the new version of the tasklet, which doesn’t handle retries anymore.

Listing 7 Calling the web service without retry logic

package com.manning.sbia.ch09.retry;

import java.util.List;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class DiscountsTasklet implements Tasklet {

    private DiscountService discountService;
    private DiscountsHolder discountsHolder;

    @Override
    public RepeatStatus execute(StepContribution contribution,
            ChunkContext chunkContext) throws Exception {
        // Calls the service directly; no retry code in the tasklet
        List<Discount> discounts = discountService.getDiscounts();
        discountsHolder.setDiscounts(discounts);
        return RepeatStatus.FINISHED;
    }

    // (...) setter methods omitted for brevity
}

If we want to keep the tasklet this simple, we need the magic of AOP to handle the retry transparently. Spring AOP will wrap the target DiscountService—the one that makes the web service call—in a proxy. This proxy will handle the retry logic thanks to the retry interceptor. The tasklet ends up using this proxy. Listing 8 shows the Spring configuration for transparent, AOP-based retry.

Listing 8 Configuring transparent retry with Spring AOP

<!-- Declares target discount service -->
<bean id="discountService"
      class="com.manning.sbia.ch09.retry.DiscountServiceImpl" />

<!-- Declares retry interceptor with RetryTemplate -->
<bean id="retryAdvice"
      class="org.springframework.batch.retry.interceptor.RetryOperationsInterceptor">
  <property name="retryOperations">
    <bean class="org.springframework.batch.retry.support.RetryTemplate">
      <property name="retryPolicy">
        <bean class="org.springframework.batch.retry.policy.SimpleRetryPolicy">
          <property name="maxAttempts" value="3" />
        </bean>
      </property>
    </bean>
  </property>
</bean>

<!-- Applies interceptor on target service -->
<aop:config>
  <aop:pointcut id="retriedOperations"
      expression="execution(* com.manning.sbia.ch09.retry.DiscountService.*(..))" />
  <aop:advisor pointcut-ref="retriedOperations"
      advice-ref="retryAdvice" />
</aop:config>

<bean class="com.manning.sbia.ch09.retry.DiscountsTasklet">
  <property name="discountService" ref="discountService" />
  <property name="discountsHolder" ref="discountsHolder" />
</bean>

<bean id="discountsHolder"
      class="com.manning.sbia.ch09.retry.DiscountsHolder" />

That’s it! Not only should you no longer fear transient failures when calling the web service, but the calling tasklet doesn’t even know that there’s some retry logic on the DiscountService. In addition, retry support isn’t limited to batch applications: you can use it in a web application, whenever a call is subject to transient failures.
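
The proxy idea behind this configuration can be illustrated without Spring, using a JDK dynamic proxy. The sketch below is only an approximation of the principle, not how Spring AOP or RetryOperationsInterceptor is implemented; RemoteService, RetryProxySketch, and withRetry are hypothetical names, and every exception thrown by the target is assumed to be retriable.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical service interface standing in for a remote call.
interface RemoteService {
    String call() throws Exception;
}

// Wraps an interface implementation in a proxy that retries failed calls.
final class RetryProxySketch {

    static <T> T withRetry(T target, Class<T> serviceInterface, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        InvocationHandler handler = (proxy, method, args) -> {
            Throwable last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    // Delegates to the real object
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    last = e.getCause(); // unwrap the target's own exception
                }
            }
            throw last; // all attempts failed
        };
        Object proxy = Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] { serviceInterface },
                handler);
        return serviceInterface.cast(proxy);
    }
}
```

The caller receives an object that implements the same interface as the target, so it cannot tell that calls are being retried, which is the property the tasklet in listing 7 relies on.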

Summary

This ends our coverage of retry. Spring Batch allows for transparent, configurable retry, which allows you to decouple the application code from any retry logic. Retry is useful for transient, non-deterministic errors, like concurrency errors. The default behavior is to retry on given exception classes until Spring Batch reaches the retry limit. Note that you can also control the retry behavior by plugging in a retry policy.

Skip and retry help to avoid job failures; they make jobs more robust. Thanks to skip and retry, you’ll have fewer red-light screens in the morning.

About Krishna Srinivasan

He is the Founder and Chief Editor of JavaBeat. He has more than 8 years of experience developing Web applications. He writes about Spring, DOJO, JSF, Hibernate, and many other emerging technologies in this blog.
