This article is based on Spring Batch in Action, to be published July 2011. It is being reproduced here by permission from Manning Publications. Manning publishes MEAP (Manning Early Access Program) ebooks and pbooks. MEAPs are sold exclusively through Manning.com. All print book purchases include an ebook free of charge. When mobile formats become available, all customers will be contacted and upgraded. Visit Manning.com for more information.
What is a bulletproof job?
A bulletproof job is able to handle errors gracefully; it won’t fail miserably because of a minor error like a missing comma. Nor will it fail abruptly on a major problem like a constraint violation in the database. Before giving some guidelines on the design of a robust job, let’s list some requirements that a job must meet.
What makes a job bulletproof?
A bulletproof batch job should meet the following general requirements:
- Robust—The job should fail only for fatal exceptions and should recover gracefully from any nonfatal exception. As software developers, we can’t do anything about a power cut, but we can properly handle incorrectly formatted lines or a missing input file.
- Traceable—The job should record any abnormal behavior. A job can skip as many incorrectly formatted lines as it wants, but it should log them to record what did not make it into the database and allow someone to do something about it.
- Restartable—In case of an abrupt failure, the job should be able to restart properly. Depending on the use case, the job could restart exactly where it left off or even forbid a restart because it would process the same data again.
Good news: Spring Batch provides all the features to meet these requirements! You can activate these features through configuration or by plugging in your own code through extension points (to log errors, for example). A tool like Spring Batch isn’t enough on its own to write a bulletproof job; you also need to design the job properly before leveraging the tool.
Designing a bulletproof job
To make your batch jobs bulletproof, you first need to think about failure scenarios. What can go wrong in this batch job? Anything can happen, but the nature of the operations in a job helps to narrow the failure scenarios. Say a batch job starts by decompressing a ZIP archive to a working directory before reading the lines of the extracted file and inserting them into the database. Many things can go wrong: the archive can be corrupt (if it’s there at all!), the OS may not allow the process to write to the working directory, some lines in the file can be in an incorrect format, and the list goes on.
Testing failure scenarios
Remember that Spring Batch is a lightweight framework. This means you can easily test failure scenarios in integration tests. You can simulate many failure scenarios using testing techniques such as mock objects, for example.
Once you’ve identified failure scenarios, you must think about how to deal with them. If there’s no ZIP archive at the beginning of the execution, there’s not much the job can do, but that’s not a reason to fail abruptly. How should the job handle incorrectly formatted lines? Should it skip them or fail the whole execution as soon as it finds a bad line? In our case, we could skip incorrect lines and ensure that we log them somewhere.
Spring Batch has built-in support for error handling, but it doesn’t mean you can make batch jobs bulletproof by setting some magical attribute in an XML configuration file (even if sometimes that’s the case). Rather, it means that Spring Batch provides infrastructure and deals with tedious plumbing, but you must always know what you’re doing: when and why to use Spring Batch error handling. That’s what makes batch programming interesting! Let’s now see how to deal with errors in Spring Batch.
Techniques for bulletproofing jobs
Unless you control your batch jobs as Neo controls the Matrix, you’ll always end up getting errors in your batch applications. Spring Batch includes three features to deal with errors: skip, retry, and restart. Table 1 describes these features.
Table 1 Error handling support in Spring Batch

Feature | Description
Skip    | Skips an incorrectly processed item and moves on to the next item
Retry   | Attempts a failed operation again; useful for transient errors
Restart | Restarts a failed job execution where the previous execution left off
The features listed in table 1 are independent of one another: you can use one without the others or combine them. Remember that skip and retry are about avoiding a crash on an error, whereas restart is useful when the job has crashed, to restart it where it left off.
Skipping lets processing move on to the next line in an input file when the current line is in an incorrect format. If the job doesn’t process a line, perhaps you can live without it, and the job can process the remaining lines in the file.
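As a sketch of what this looks like in Spring Batch’s XML namespace (the reader and writer bean names and the commit interval are assumptions for illustration), a chunk can declare which exceptions are skippable and how many skips to tolerate:

```xml
<!-- Sketch: skip up to 10 lines that fail to parse.
     Bean ids "reader" and "writer" are assumed to be defined elsewhere. -->
<job id="importProductsJob">
  <step id="importStep">
    <tasklet>
      <chunk reader="reader" writer="writer" commit-interval="100"
             skip-limit="10">
        <skippable-exception-classes>
          <include class="org.springframework.batch.item.file.FlatFileParseException"/>
        </skippable-exception-classes>
      </chunk>
    </tasklet>
  </step>
</job>
```

When the reader throws a FlatFileParseException for a malformed line, Spring Batch skips that line and carries on; once the skip limit is exceeded, the step fails.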
Retry attempts an operation several times: the operation can fail at first, but another attempt can succeed. Retry isn’t useful for errors like badly formatted input lines; it’s useful for transient errors, like concurrency errors. Skip and retry contribute to making job executions more robust, because they deal with error handling during processing.
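A retry policy can be sketched in the same chunk element (again, the bean names and the retry limit are illustrative assumptions): you declare which exceptions are worth retrying and how many attempts to make per item.

```xml
<!-- Sketch: retry transient concurrency failures up to 3 times per item.
     Bean ids "reader" and "writer" are assumed to be defined elsewhere. -->
<chunk reader="reader" writer="writer" commit-interval="100"
       retry-limit="3">
  <retryable-exception-classes>
    <include class="org.springframework.dao.ConcurrencyFailureException"/>
  </retryable-exception-classes>
</chunk>
```

Note that only transient exceptions belong here: retrying a FlatFileParseException three times would fail three times, because the bad line doesn’t change between attempts.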
Restart is useful after a failure, when the execution of a job crashed. Instead of starting the job from scratch, Spring Batch allows for restarting it exactly where the failed execution left off. Restarting can avoid potential corruption of the data in case of reprocessing. Restarting can also save a lot of time if the failed execution was close to the end. Before covering each feature, let’s see how skip, retry, and restart can apply to our import products job.
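Restartability is largely a property of the job configuration and the job repository. As a sketch (the job, step, and bean names are illustrative), a Spring Batch job is restartable by default, and you can forbid restart explicitly when reprocessing could corrupt data:

```xml
<!-- Sketch: restartable="true" is the default; set it to "false" to
     forbid restarting a job whose reprocessing could corrupt data. -->
<job id="importProductsJob" restartable="true">
  <step id="importStep">
    <tasklet>
      <chunk reader="reader" writer="writer" commit-interval="100"/>
    </tasklet>
  </step>
</job>
```

Restarting means launching the job again with the same job parameters; Spring Batch consults its job repository metadata to find out where the failed execution stopped and resumes from there.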
Skip, retry, and restart in action
Suppose you have an import products job: the core of the job reads a flat file containing one product description per line and updates the online store database accordingly. Here is how skip, retry, and restart could apply to this job.
- Skip—A line in the flat file is not in the correct format. We don’t want to stop the job execution because of a couple of bad lines: this could mean losing an unknown number of updates and inserts. We can tell Spring Batch to skip the line that caused the item reader to throw an exception on a formatting error.
- Retry—Because some products are already in the database, the flat file data is used to update the products (description, price, and so on). Even if the job runs during a period of low activity in the online store, users sometimes access the updated products, causing the database to lock the corresponding rows. The database throws a concurrency exception when the job tries to update a product in a locked row, but retrying the update a few milliseconds later succeeds. You can configure Spring Batch to retry automatically.
- Restart—If Spring Batch has to skip more than 10 products because of badly formatted lines, we consider the input file invalid, and it should go through a validation phase. The job fails as soon as we reach 10 skipped products, as defined in the configuration. An operator will analyze the input file and correct it before restarting the import. Spring Batch is able to restart the job at the line that caused the previous execution to fail. The work performed by the previous execution isn’t lost.
The import products job is robust and reliable thanks to Spring Batch’s built-in support. Spring Batch jobs can meet the requirements of reliability, robustness, and traceability, which are essential for the automatic processing of large amounts of data.