Review
In the previous section, we wrote and discussed the Spring Batch-related classes. In this section, we will write and declare the Spring Batch-related configuration files.

Table of Contents
Part 1: Introduction and Functional Specs
Part 2: Java classes
Part 3: XML configuration
Part 4: Running the Application
Configuration
Properties File
The spring.properties file contains the database name and the CSV files that we will import. A job.commit.interval property is also specified, which sets how many records are processed before each commit (the chunk size).

Spring Batch
To configure a Spring Batch job, we have to declare the infrastructure-related beans first. Here are the beans that need to be declared:
- A job launcher, for starting jobs
- A task executor, to run jobs asynchronously
- A job repository, for persisting job status
What is Spring Batch?
Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems. Spring Batch builds upon the productivity, POJO-based development approach, and general ease of use capabilities people have come to know from the Spring Framework, while making it easy for developers to access and leverage more advanced enterprise services when necessary. Spring Batch is not a scheduling framework.
Source: Spring Batch Reference Documentation
What is a JobRepository?
JobRepository is the persistence mechanism for all of the Stereotypes mentioned above. It provides CRUD operations for JobLauncher, Job, and Step implementations.
Source: Spring Batch - Chapter 3. The Domain Language of Batch
What is a JobLauncher?
JobLauncher represents a simple interface for launching a Job with a given set of JobParameters.
Source: Spring Batch - Chapter 3. The Domain Language of Batch
Here's our main configuration file:
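Since the file itself is not reproduced inline, here is a minimal sketch of these infrastructure beans (Spring Batch 2.x-style XML; the bean names, the dataSource and transactionManager references, and the writer classes' package are assumptions, not the author's exact declarations):

```xml
<!-- Sketch only: bean names and referenced dataSource/transactionManager are assumed -->

<!-- Job repository: persists job and step execution status -->
<bean id="jobRepository"
      class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="dataSource" ref="dataSource"/>
    <property name="transactionManager" ref="transactionManager"/>
</bean>

<!-- Task executor: runs each launched job in its own thread -->
<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>

<!-- Job launcher: starts jobs asynchronously via the task executor -->
<bean id="jobLauncher"
      class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository"/>
    <property name="taskExecutor" ref="taskExecutor"/>
</bean>

<!-- JDBC template shared by the custom ItemWriters -->
<bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
    <constructor-arg ref="dataSource"/>
</bean>

<!-- Custom writers from Part 2; the package name here is hypothetical -->
<bean id="userWriter" class="com.example.batch.UserItemWriter">
    <property name="jdbcTemplate" ref="jdbcTemplate"/>
</bean>
<bean id="roleWriter" class="com.example.batch.RoleItemWriter">
    <property name="jdbcTemplate" ref="jdbcTemplate"/>
</bean>
```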
Notice we've also declared the following beans:
- A JDBC template
- The User and Role ItemWriters
Job Anatomy
Before we start writing our jobs, let's first examine what constitutes a job.

What is a Job?
A Job is an entity that encapsulates an entire batch process. As is common with other Spring projects, a Job will be wired together via an XML configuration file.
Source: Spring Batch: The Domain Language of Batch: Job
Each job contains a series of steps. Each step also includes a reference to an ItemReader and an ItemWriter. The reader's purpose is to read records for further processing, while the writer's purpose is to write the records out (possibly in a different format).
What is a Step?
A Step is a domain object that encapsulates an independent, sequential phase of a batch job. Therefore, every Job is composed entirely of one or more steps. A Step contains all of the information necessary to define and control the actual batch processing.
Source: Spring Batch: The Domain Language of Batch: Step
Each reader typically contains the following properties (the lineTokenizer and fieldSetMapper are nested within the lineMapper):
- resource - the location of the file to be imported
- lineMapper - the mapper used for mapping each line of the file to a record
- lineTokenizer - the tokenizer used to split each line into tokens
- fieldSetMapper - the mapper used for mapping the resulting tokens to an object
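As a sketch of how these properties fit together (the bean id, resource path, and the FieldSetMapper's package are assumptions), a reader for the first user file might be wired like this:

```xml
<bean id="userReader1" class="org.springframework.batch.item.file.FlatFileItemReader">
    <!-- resource: the CSV file to import (path assumed) -->
    <property name="resource" value="classpath:input/user1.csv"/>
    <!-- lineMapper: maps each line via a tokenizer plus a FieldSetMapper -->
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <!-- lineTokenizer: splits a comma-delimited line into named tokens -->
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names" value="username,firstName,lastName,password"/>
                </bean>
            </property>
            <!-- fieldSetMapper: maps the tokens to a User object (package assumed) -->
            <property name="fieldSetMapper">
                <bean class="com.example.batch.UserFieldSetMapper"/>
            </property>
        </bean>
    </property>
</bean>
```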
What is an ItemReader?
Although a simple concept, an ItemReader is the means for providing data from many different types of input. The most general examples include: Flat File, XML, Database
Source: Spring Batch: ItemReaders and ItemWriters
What is an ItemWriter?
ItemWriter is similar in functionality to an ItemReader, but with inverse operations. Resources still need to be located, opened and closed but they differ in that an ItemWriter writes out, rather than reading in.
Source: Spring Batch: ItemReaders and ItemWriters
The Jobs
As discussed in Part 1, we have three jobs.

Job 1: Comma-delimited records
This job contains two steps:
- userLoad1 - reads user1.csv and writes the records to the database
- roleLoad1 - reads role1.csv and writes the records to the database
Both steps use their own respective FieldSetMapper: UserFieldSetMapper and RoleFieldSetMapper.
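Using the batch XML namespace, this job could be declared along these lines (the job id and the reader/writer bean ids are assumptions; the commit-interval placeholder refers to the property declared in spring.properties):

```xml
<batch:job id="loadCommaDelimitedFiles">
    <batch:step id="userLoad1" next="roleLoad1">
        <batch:tasklet>
            <!-- Reads user1.csv and writes each chunk of records to the database -->
            <batch:chunk reader="userReader1" writer="userWriter"
                         commit-interval="${job.commit.interval}"/>
        </batch:tasklet>
    </batch:step>
    <batch:step id="roleLoad1">
        <batch:tasklet>
            <!-- Reads role1.csv and writes each chunk of records to the database -->
            <batch:chunk reader="roleReader1" writer="roleWriter"
                         commit-interval="${job.commit.interval}"/>
        </batch:tasklet>
    </batch:step>
</batch:job>
```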
What is DelimitedLineTokenizer?
Used for files where fields in a record are separated by a delimiter. The most common delimiter is a comma, but pipes or semicolons are often used as well.
Source: Spring Batch: ItemReaders and ItemWriters
Job 2: Fixed-length records
This job contains two steps:
- userLoad2 - reads user2.csv and writes the records to the database
- roleLoad2 - reads role2.csv and writes the records to the database
Notice userLoad2 uses a FixedLengthTokenizer, and the properties to be matched are the following: username, firstName, lastName, password. However, instead of matching them based on a delimiter, each token is matched on a specified column range: 1-5, 6-9, 10-16, 17-25, where 1-5 represents the username, and so forth. The same idea applies to roleLoad2.
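The tokenizer for userLoad2 might therefore be declared as follows (only the field names and column ranges come from the description above; everything else is a sketch):

```xml
<bean class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
    <property name="names" value="username,firstName,lastName,password"/>
    <!-- username occupies columns 1-5, firstName 6-9, lastName 10-16, password 17-25 -->
    <property name="columns" value="1-5,6-9,10-16,17-25"/>
</bean>
```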
What is FixedLengthTokenizer?
Used for files where fields in a record are each a 'fixed width'. The width of each field must be defined for each record type.
Source: Spring Batch: ItemReaders and ItemWriters
Job 3: Mixed records
This job contains two steps:
- userLoad3 - reads user3.csv and writes the records to the database
- roleLoad3 - reads role3.csv and writes the records to the database
Job 3 is a mix of Job 1 and Job 2. In order to mix both formats, we have to set our lineMapper to a PatternMatchingCompositeLineMapper.
What is PatternMatchingCompositeLineMapper?
Determines which among a list of LineTokenizers should be used on a particular line by checking against a pattern.
Source: Spring Batch: ItemReaders and ItemWriters
For the FieldSetMapper, we use a custom implementation, MultiUserFieldSetMapper, which removes a semicolon from the String. See Part 2 for the class declaration.
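A sketch of such a composite lineMapper follows (the match patterns and the referenced bean ids are assumptions; the idea is that a line's shape decides which tokenizer handles it):

```xml
<bean id="userLineMapper3"
      class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
    <property name="tokenizers">
        <map>
            <!-- Lines containing commas go to the delimited tokenizer -->
            <entry key="*,*" value-ref="delimitedTokenizer"/>
            <!-- Everything else is treated as fixed-length -->
            <entry key="*" value-ref="fixedLengthTokenizer"/>
        </map>
    </property>
    <property name="fieldSetMappers">
        <map>
            <!-- MultiUserFieldSetMapper handles every matched line -->
            <entry key="*" value-ref="multiUserFieldSetMapper"/>
        </map>
    </property>
</bean>
```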
Next
In the next section, we will run the application using Maven. Click here to proceed.