Consideration about end of wave in DataStage

In DataStage® parallel jobs, some stages can send an end-of-wave (EOW) marker, which indicates the end of a unit of work or transaction. When all the records that are extracted from the input links are included in one unit of work (a single wave), the Excel stage fills each Microsoft Excel sheet up to the maximum number of records and continues in additional files until all of the records are written.

For example, suppose that the Excel stage has two input links, DSLink1 and DSLink2. DSLink1 is associated with Sheet1 and DSLink2 is associated with Sheet2. The maximum number of records in a sheet is 65,536; DSLink1 has 100,000 records and DSLink2 has 150,000 records. Neither sheet has column names in the first row. In this case, each sheet of each file contains the following number of records.

| File name | Sheet1 | Sheet2 |
| --- | --- | --- |
| Workbook001.xlsx | 65,536 | 65,536 |
| Workbook002.xlsx | 34,464 | 65,536 |
| Workbook003.xlsx | 0 | 18,928 |

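The single-wave arithmetic above can be checked with a short Python sketch. This is only an illustration of the counting, not the Excel stage's implementation; the function name `split_single_wave` and the constant `MAX_ROWS` are hypothetical names introduced here.

```python
import math

MAX_ROWS = 65_536  # per-sheet maximum taken from the example (assumed fixed)


def split_single_wave(link_counts, max_rows=MAX_ROWS):
    """Return (file name, {link: records}) pairs for a single wave.

    Hypothetical helper: each link fills its sheet to max_rows in one file
    before spilling into the same sheet of the next file.
    """
    # The link with the most records decides how many files are needed.
    n_files = max(1, max(math.ceil(c / max_rows) for c in link_counts.values()))
    files = []
    for i in range(n_files):
        name = f"Workbook{i + 1:03d}.xlsx"
        counts = {
            link: min(max_rows, max(0, total - i * max_rows))
            for link, total in link_counts.items()
        }
        files.append((name, counts))
    return files


result = split_single_wave({"DSLink1": 100_000, "DSLink2": 150_000})
for name, counts in result:
    print(name, counts)
```

Under these assumptions the printed counts match the table: DSLink1 runs out of records in the second file, so Sheet1 of Workbook003.xlsx holds 0 records.
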
When the records that are extracted from the input links are divided into two or more units of work (multiple waves), the Excel stage stops writing records to a Microsoft Excel sheet and creates a new Microsoft Excel file if the number of records in a wave exceeds the maximum number in at least one sheet. After a new file is created, the Excel stage does not write records from later waves to the previous file, even if a sheet in that file can hold more records. For example, assume that each link contains the following number of records in each wave.

| Wave # | DSLink1 | DSLink2 |
| --- | --- | --- |
| 1 | 90,000 | 50,000 |
| 2 | 5,000 | 90,000 |
| 3 | 5,000 | 10,000 |

In the first wave, the Excel stage creates a Microsoft Excel file named Workbook001.xlsx that has two sheets, Sheet1 and Sheet2. It writes records from DSLink1 to Sheet1 until the sheet reaches the maximum number of records (65,536), and writes all the records (50,000) from DSLink2 to Sheet2. Next, the Excel stage creates a Microsoft Excel file named Workbook002.xlsx and writes the rest of the first-wave records (24,464) from DSLink1 to Sheet1.

In the second wave, the Excel stage writes all the records (5,000) from DSLink1 to Sheet1 of Workbook002.xlsx. Even though Sheet2 of Workbook001.xlsx has not reached the maximum number of records, the Excel stage writes the second-wave records from DSLink2 to Sheet2 of Workbook002.xlsx, not Workbook001.xlsx. When Sheet2 of Workbook002.xlsx reaches the maximum number of records (65,536), the Excel stage creates a Microsoft Excel file named Workbook003.xlsx and writes the rest of the second-wave records (24,464) from DSLink2 to Sheet2 of Workbook003.xlsx.

In the third wave, because both sheets of Workbook003.xlsx have enough room, the Excel stage writes all of the records from DSLink1 (5,000) and DSLink2 (10,000) to Sheet1 and Sheet2 of Workbook003.xlsx.

As a result, the following number of records is written in each sheet of each file:

| File name | Sheet1 | Sheet2 |
| --- | --- | --- |
| Workbook001.xlsx | 65,536 from 1st wave | 50,000 from 1st wave |
| Workbook002.xlsx | 29,464 (= 24,464 from 1st wave + 5,000 from 2nd wave) | 65,536 from 2nd wave |
| Workbook003.xlsx | 5,000 from 3rd wave | 34,464 (= 24,464 from 2nd wave + 10,000 from 3rd wave) |

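The multi-wave totals above can be reproduced with a small Python simulation. It encodes one reading of the walkthrough, stated here as an assumption: every wave starts writing in the most recently created file, and a sheet that fills up spills into the next file, which is created if it does not exist yet. The names `simulate_waves` and `MAX_ROWS` are hypothetical, and the sketch is not the Excel stage's actual implementation.

```python
MAX_ROWS = 65_536  # per-sheet maximum taken from the example (assumed fixed)


def simulate_waves(waves, links, max_rows=MAX_ROWS):
    """Return a list of {link: records} dicts, one per workbook file.

    Assumed model: each wave begins in the newest file; overflow from a
    full sheet goes to the same sheet in the next file.
    """
    files = [{link: 0 for link in links}]  # files[i][link] = records in that sheet
    current = 0  # index of the file in which a new wave starts writing
    for wave in waves:
        for link in links:
            remaining = wave.get(link, 0)
            idx = current
            while remaining > 0:
                if idx == len(files):
                    # Sheet overflowed in the last file: create a new file.
                    files.append({name: 0 for name in links})
                room = max_rows - files[idx][link]
                written = min(room, remaining)
                files[idx][link] += written
                remaining -= written
                if remaining > 0:
                    idx += 1
        # Later waves never go back to an earlier file.
        current = len(files) - 1
    return files


waves = [
    {"DSLink1": 90_000, "DSLink2": 50_000},
    {"DSLink1": 5_000, "DSLink2": 90_000},
    {"DSLink1": 5_000, "DSLink2": 10_000},
]
totals = simulate_waves(waves, ["DSLink1", "DSLink2"])
for i, counts in enumerate(totals, start=1):
    print(f"Workbook{i:03d}.xlsx", counts)
```

With the wave sizes from the example, this model yields three files whose per-sheet totals agree with the table, including the 50,000-record Sheet2 in Workbook001.xlsx that is never topped up by later waves.
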