Spring Batch Processing

Lakshan Madhuranga
5 min read · Dec 15, 2022

Hello Solvers…,

In this blog post, I discuss the fundamental concepts of batch processing, why it is needed and why it matters, and finally how to implement a simple batch application using Spring Boot and Spring Batch.

What is batch processing

Simply put, batch processing is a technique that processes data in large groups instead of one element at a time, so you can handle a high volume of data with minimal human interaction. We also often need to implement business applications or features that run without end-user interaction, or that are scheduled to run at a specific time. Those kinds of processes can also be implemented using batch processing.

Behind-the-scenes process

Runs without user interaction

Executes over a fixed dataset

Now let’s consider some real-world use cases for batch processing.

Reporting: Processing large datasets to calculate and distribute information. Examples include monthly bank statement generation and quarterly financial reports.

Information Exchange: Sending and receiving data between systems.

Billing: Telecom companies run a monthly batch job that processes call data records, which include the details of millions of phone calls, to calculate charges.

What is Spring Batch

Spring is a very popular Java framework that can be used to develop large-scale enterprise applications. When we develop such applications, user requirements often call for different kinds of batch processing, so we need a batch-processing framework to support them.

To fill that gap, Spring provides the Spring Batch framework: a lightweight, comprehensive batch framework that enables the development of robust batch applications vital for the daily operations of enterprise systems.

Features of Spring Batch

State Management: The framework stores metadata regarding jobs and their execution out of the box.

Restartability: The framework can restart failed jobs at the appropriate step.

Readers and Writers: The framework provides out-of-the-box components to integrate with popular data sources, for example FlatFileItemReader (CSV) and JpaPagingItemReader.

Spring Batch Architecture

Spring Batch Architecture

The Job Repository stores metadata regarding JobParameters, JobInstance, JobExecution, and StepExecution.

Jobs combine multiple steps that execute in a logical flow.

A Step contains the batchable business logic to run for a portion of the batch job. Typically, a batch step contains code to read a record from a batch data stream, perform business logic with that record, and then continue to read the next record.
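To make the read, process, and write cycle of a step more concrete, here is a minimal plain-Java sketch. It is not the Spring Batch API: the class name ChunkStepSketch, the run method, and the sample data are all made up for illustration, and the way it counts a chunk is a simplification of what the framework does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java illustration of a chunk-oriented step; NOT the Spring Batch API.
class ChunkStepSketch {

    // Reads items one at a time, processes each one, and hands the results to
    // the writer in groups of at most chunkSize. Returns the number written.
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        int written = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            O result = processor.apply(reader.next());
            if (result != null) {              // a null result means "skip this item"
                chunk.add(result);
            }
            if (chunk.size() == chunkSize) {   // chunk is full: write it out
                writer.accept(new ArrayList<>(chunk));
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {                // flush the final, partial chunk
            writer.accept(new ArrayList<>(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob", "cid", "dee", "eve");
        int total = ChunkStepSketch.<String, String>run(
                names.iterator(),
                name -> name.toUpperCase(),
                chunk -> System.out.println("writing " + chunk),
                2);
        System.out.println("items written: " + total);
    }
}
```

Writing in groups rather than one record at a time is what keeps the number of database round trips low when the input is large.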

Demonstration

Coding time… 🤩

From here on, I discuss how to implement a simple batch-processing application using Spring Boot and Spring Batch. So let’s get started…

Architecture of our application

Application architecture

What happens in this application? Simply put, our application reads a CSV file that contains 1,000 data records and then writes all the records to the database within a few seconds with the help of Spring Batch. That is the main goal of our application.

Our CSV file contains the details of 1,000 customers in eight columns: id, firstName, lastName, email, gender, contactNo, country, and dob.
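As a rough illustration of how one line of such a file maps onto the eight columns, here is a small plain-Java sketch. The class name CustomerRow, the sample line, and the choice to keep dob as a plain string are assumptions made for this example; the real file's exact format is not shown in this post. The naive split on commas also assumes that no field contains a quoted comma.

```java
// Hypothetical value class holding the eight columns listed above.
class CustomerRow {
    final int id;
    final String firstName;
    final String lastName;
    final String email;
    final String gender;
    final String contactNo;
    final String country;
    final String dob; // kept as text because the date format is an unknown here

    CustomerRow(int id, String firstName, String lastName, String email,
                String gender, String contactNo, String country, String dob) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
        this.gender = gender;
        this.contactNo = contactNo;
        this.country = country;
        this.dob = dob;
    }

    // Maps one comma-separated line onto the eight fields, in column order.
    static CustomerRow fromCsvLine(String line) {
        String[] f = line.split(",", -1); // -1 keeps trailing empty fields
        if (f.length != 8) {
            throw new IllegalArgumentException(
                    "expected 8 columns but found " + f.length);
        }
        return new CustomerRow(Integer.parseInt(f[0].trim()), f[1].trim(),
                f[2].trim(), f[3].trim(), f[4].trim(), f[5].trim(),
                f[6].trim(), f[7].trim());
    }

    public static void main(String[] args) {
        // Made-up sample line, not taken from the real data set.
        CustomerRow row = fromCsvLine(
                "1,John,Doe,john@example.com,Male,0771234567,Germany,1995-04-12");
        System.out.println(row.firstName + " " + row.lastName + " from " + row.country);
    }
}
```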

As a first step, we need to create a simple Spring Boot application with the following dependencies. You should also create a MySQL database (for example, using MySQL Workbench), which is used to store the customer data.

Spring Web

Spring Data JPA

Spring Batch

MySQL Driver

Then I created the following packages: Entity, Config, Controller, and Repository.

Inside the Entity package, I created a Java class called Customer that contains the customer attributes needed to create the database table. I named the database table Customer_Info.

Then I created the BatchConfig class inside the Config package. The read operation, the write operation, the step creation, and the job definition are all created inside this class. The class uses a special annotation, @EnableBatchProcessing, which enables Spring Batch processing in this application.

By default, Spring Batch runs synchronously, not asynchronously. That means it does not use multithreading to perform the batch operations. However, we can make the processing concurrent by using an asyncTaskExecutor, which I have implemented at the end of the BatchConfig class. It lets us run 10 tasks concurrently, which reduces the time our process spends reading and writing.
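The idea of capping the work at 10 concurrent tasks can be shown with the JDK alone. The sketch below is only an analogy built on java.util.concurrent, not Spring's task executor and not the code from BatchConfig; the class name ConcurrentChunksSketch and the sample chunks are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only analogy for running chunks concurrently; NOT the Spring Batch API.
class ConcurrentChunksSketch {

    // Handles each chunk as its own task, with at most maxThreads tasks
    // running at the same time. Returns the total number of items handled.
    static int processConcurrently(List<List<String>> chunks, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger processed = new AtomicInteger(); // safe to update from many threads
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<String> chunk : chunks) {
                // A real job would write the chunk to the database here.
                pending.add(pool.submit(() -> {
                    processed.addAndGet(chunk.size());
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for the task and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a chunk failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<List<String>> chunks = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d"),
                Arrays.asList("e"));
        System.out.println("items handled: " + processConcurrently(chunks, 10));
    }
}
```

Once chunks run in parallel, the order in which records reach the database is no longer guaranteed, which is fine for a load like this one but matters when order is significant.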

Then I implemented the process operation by creating the CustomerProcess class inside the Config package. This class is where we can control what the batch process does with each record. For example, if we want to store only the customers who live in Germany, we can implement that logic inside this class. Because this is a demo, I have not implemented any logic and simply return the customer object as it is.

After that, I created the controller class, JobController, inside the Controller package. In this class I implemented the REST API that triggers the job. After we send a POST request, the JobLauncher runs the job.
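The Germany example could look roughly like the sketch below. In Spring Batch, an ItemProcessor that returns null tells the framework to drop that item before it reaches the writer, and this sketch mirrors that convention with a hypothetical stand-in interface so that it runs without any Spring dependency. RecordProcessor, CustomerSketch, and GermanyOnlyProcessor are names invented for this example; they are not the classes from the demo project.

```java
// Hypothetical stand-in for a processor contract:
// return the item to keep it, or null to filter it out.
interface RecordProcessor<I, O> {
    O process(I item);
}

// Minimal customer type with only the fields this example needs.
class CustomerSketch {
    final String firstName;
    final String country;

    CustomerSketch(String firstName, String country) {
        this.firstName = firstName;
        this.country = country;
    }
}

// Keeps customers whose country is Germany and drops everyone else.
class GermanyOnlyProcessor implements RecordProcessor<CustomerSketch, CustomerSketch> {

    @Override
    public CustomerSketch process(CustomerSketch customer) {
        if ("Germany".equalsIgnoreCase(customer.country)) {
            return customer; // passed on to the writer unchanged
        }
        return null;         // filtered out, never written
    }

    public static void main(String[] args) {
        GermanyOnlyProcessor processor = new GermanyOnlyProcessor();
        CustomerSketch kept = processor.process(new CustomerSketch("Anna", "Germany"));
        CustomerSketch dropped = processor.process(new CustomerSketch("Marc", "France"));
        System.out.println("kept: " + (kept != null) + ", dropped: " + (dropped == null));
    }
}
```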

We need to provide our database details inside the properties file as follows.

spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/spring_batch_db
spring.datasource.username=root
spring.datasource.password=root123
spring.jpa.show-sql=true
spring.jpa.hibernate.ddl-auto=update
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
server.port=9191
spring.batch.jdbc.initialize-schema=ALWAYS

# Disable running jobs at startup
spring.batch.job.enabled=false

Now we can run our application and send a POST request. Here are the results.

Batch logs

You can see that all the data inside the CSV file has been stored in the database table within 5 seconds.

Database

You can see that many tables have been created apart from customer_info. Those tables are created automatically by Spring Batch. They give us more information, such as how many jobs have run, when they ran, and which jobs failed.

Customer_info Table

Here is the customer_info table after the job ran. It contains all the details from the CSV file.

I hope this blog gives you a better understanding of how batch processing works and how easily it can be implemented using the Spring Boot and Spring Batch frameworks.

Thank you very much for reading my post.

