DataStage Interview Questions and Answers


1. What are the components of DataStage?
DataStage consists of a number of client and server components. It has four client components :
• DataStage Designer
• DataStage Director
• DataStage Manager
• DataStage Administrator

2. What are the main features of DataStage?
DataStage has the following features to aid the design and processing required to build a data warehouse :
• Uses graphical design tools. With simple point-and-click techniques you can draw a scheme to represent your processing requirements.
• Extracts data from any number or types of database.
• Handles all the metadata definitions required to define your data warehouse.
• Lets you view and modify the table definitions at any point during the design of your application.
• Aggregates data.
• Lets you modify the SQL SELECT statements used to extract data.
• Transforms data. DataStage has a set of predefined transforms and functions that you can use to convert your data. You can easily extend the functionality by defining your own transforms.
• Loads the data warehouse.

3. What are the types of jobs available in DataStage?
• Server Job
• Parallel Job
• Sequencer Job
• Container Job

4. What is the difference between Server Job and Parallel Jobs?
Server jobs process data sequentially, while parallel jobs process data in parallel: Parallel Extender works on the principles of pipeline parallelism and partition parallelism for I/O processing.
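
The two principles can be illustrated with a small Python sketch. This is not DataStage code: the function names are hypothetical, and generators only imitate the way rows stream from one stage to the next without waiting for the previous stage to finish the whole data set.

```python
def extract(source):
    """Stage 1: emit rows one at a time instead of building a full list."""
    for row in source:
        yield row


def double_amount(rows):
    """Stage 2: transform each row as soon as it arrives from stage 1."""
    for row in rows:
        yield {"id": row["id"], "amount": row["amount"] * 2}


def round_robin(rows, n_partitions):
    """Deal rows across partitions in turn (one simple partitioning method)."""
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions


source = [{"id": i, "amount": 10 * i} for i in range(1, 5)]

# Pipeline: rows flow extract -> double_amount with no intermediate table.
pipelined = list(double_amount(extract(source)))

# Partition: the same transform is applied to each subset independently.
# A real engine would run the subsets at the same time; here they run in turn.
partitioned = [list(double_amount(p)) for p in round_robin(source, 2)]
```

Both routes produce the same rows; partitioning only changes how the work is divided.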

5. Where does DataStage store its repository?
DataStage stores its repository in an IBM UniVerse database.

6. What are the types of DataStage clients?
• DataStage Administrator
• DataStage Designer
• DataStage Manager
• DataStage Director

7. What is metadata?
Metadata is data about data. A table definition, which describes the structure of a table, is an example of metadata.

8. What is the difference between maps and locales?
Maps : Define the character sets that the project can use.
Locales : Define the local formats for dates, times, sorting order, and so on that the project can use.

9. Define job?
A collection of linked stages, data elements, and transforms that define how to extract, cleanse, transform, integrate, and load data into a target database.

10. What is DataStage Director?
The DataStage Director is the client component that validates, runs, schedules, and monitors jobs run by the DataStage Server.

11. What is the difference between DataStage and Informatica?
• DataStage supports parallel processing, which Informatica does not.
• In DataStage, links are objects; in Informatica, connectivity is port to port.
• Slowly Changing Dimensions are easy to implement in Informatica but a little more complex in DataStage.
• DataStage does not support complete error handling.

12. What is data aggregation?
An operational data source usually contains records of individual transactions such as product sales. If the user of a data warehouse only needs a summed total, you can reduce records to a more manageable number by aggregating the data.
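
As a rough illustration of the idea (plain Python, not DataStage; the sample sales records are made up), individual transactions can be reduced to one summed total per product:

```python
from collections import defaultdict

# Hypothetical transaction records: (product, sale amount).
sales = [("widget", 5.0), ("gadget", 12.5), ("widget", 7.5), ("gadget", 2.5)]


def total_by_product(transactions):
    """Collapse individual transactions into one total per product."""
    totals = defaultdict(float)
    for product, amount in transactions:
        totals[product] += amount
    return dict(totals)


totals = total_by_product(sales)
```

Four transaction rows become two summary rows, which is the reduction the answer describes.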

13. What are the types of server components?
• Repository
• DataStage Server
• DataStage Package Installer

14. What is a DataStage job?
DataStage jobs consist of individual stages. Each stage describes a particular database or process. For example, one stage may extract data from a data source, while another transforms it. Stages are added to a job and linked together using the Designer.

15. What are the types of stage?
A stage can be passive or active. A passive stage handles access to databases for the extraction or writing of data. Active stages model the flow of data and provide mechanisms for combining data streams, aggregating data, and converting data from one data type to another. Separately, stages fall into two types according to how they are supplied :
Built-in stages : Supplied with DataStage and used for extracting, aggregating, transforming, or writing data.
Plug-in stages : Additional stages defined in the DataStage Manager to perform tasks that the built-in stages do not support.

16. What is a UniVerse stage?
A stage that extracts data from or loads data into a UniVerse database using SQL. It is used to represent a data source, an aggregation step, or a target data table.

17. What is a dynamic array?
Dynamic arrays map the structure of DataStage file records to character string data. Any character string can be a dynamic array. A dynamic array is a character string containing elements that are substrings separated by delimiters.
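
A minimal Python sketch of the idea follows. It assumes the conventional UniVerse delimiters, character 254 as the field mark and character 253 as the value mark; the record contents and the helper name are hypothetical.

```python
FM = chr(254)  # field mark: separates fields within a record (assumed convention)
VM = chr(253)  # value mark: separates values within a field (assumed convention)


def parse_dynamic_array(record):
    """Split a delimited string into fields, and each field into its values."""
    return [field.split(VM) for field in record.split(FM)]


# One string holding three fields; the third field has two values.
record = "1001" + FM + "Smith" + FM + "red" + VM + "blue"
fields = parse_dynamic_array(record)
```

The record itself stays an ordinary string; the structure exists only in where the delimiters fall.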

18. What is the difference between DataStage version 5.2 and 6.0?
Version 5.2 does not have the IPC stage, the Link Partitioner stage, the Link Collector stage, or Parallel Extender.

19. What are the components of Ascential DataStage?
Client Components : Administrator, Director, Manager, and Designer.
Server Components : Repository, Server, and Plug-ins.

20. What are the types of Containers?
There are two types of containers :
• Local Container
• Shared Container

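21. What is a hash file stage?
A hash file stage works with a hashed file, a binary file that is typically used as a lookup (reference) table because reading a row by its key gives better performance.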
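
The performance point can be pictured with a Python dictionary standing in for the hashed file. This is only an analogy, with made-up customer and order rows: each lookup goes straight to the row by its key instead of scanning the reference data.

```python
# Hypothetical reference data: (customer key, customer name).
reference_rows = [("C1", "Alice"), ("C2", "Bob")]

# Build the keyed lookup once, the way a hashed file is loaded once.
lookup = {key: name for key, name in reference_rows}


def enrich(orders, lookup, default=None):
    """Attach the looked-up name to each order; unmatched keys get `default`."""
    return [(order_id, cust, lookup.get(cust, default)) for order_id, cust in orders]


orders = [(1, "C2"), (2, "C9")]
enriched = enrich(orders, lookup)
```

Order 2 refers to a customer that is not in the reference data, so its looked-up name is `None`.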

22. What is a staging variable?
Staging variables are temporary variables created in the Transformer stage to hold intermediate calculations.

23. What are Routines?
Routines are functions that we develop in BASIC code for required tasks that DataStage does not fully support on its own (typically complex ones).

24. How do you convert columns to rows in DataStage?
Using the Pivot stage.
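
What "columns to rows" means can be shown with a short Python sketch. It is not the Pivot stage itself; the function name, the `source_column` field, and the quarterly sample row are assumptions made for illustration.

```python
def pivot_columns_to_rows(row, key, pivot_columns, value_name="value"):
    """Turn one wide row into one output row per pivoted column."""
    return [
        {key: row[key], "source_column": col, value_name: row[col]}
        for col in pivot_columns
    ]


# One wide input row with three quarterly columns.
wide = {"cust": "C1", "q1": 10, "q2": 20, "q3": 30}
tall = pivot_columns_to_rows(wide, "cust", ["q1", "q2", "q3"])
```

The single input row becomes three output rows, each carrying the key and one of the original column values.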

25. What is the Merge stage?
The Merge stage combines a sorted master data set with one or more sorted update data sets. The columns from the records in the master and update data sets are merged so that the output record contains all the columns from the master record plus any additional columns from each update record.
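
The column behaviour described above can be sketched in Python. This is a simplification under stated assumptions: it matches rows with nested loops rather than a true sorted merge, it keeps master rows that have no matching update, and the sample data is made up.

```python
def merge(master, update_sets, key):
    """Add each matching update row's extra columns to the master row."""
    merged = []
    for master_row in master:
        out = dict(master_row)  # start with every master column
        for update_set in update_sets:
            for update_row in update_set:
                if update_row[key] == master_row[key]:
                    for col, val in update_row.items():
                        # Only additional columns are added; master values win.
                        out.setdefault(col, val)
        merged.append(out)
    return merged


master = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
updates = [
    [{"id": 1, "city": "X"}],
    [{"id": 2, "phone": "555", "name": "Z"}],
]
result = merge(master, updates, "id")
```

Row 2 keeps its master `name` of "B" even though an update row also carries a `name` column.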

26. What is DataStage Designer?
A design interface used to create DataStage applications (known as jobs). Each job specifies the data sources, the transforms required, and the destination of the data. Jobs are compiled to create executables that are scheduled by the Director and run by the Server.

27. What is DataStage Manager?
A user interface used to view and edit the contents of the repository.

28. What is DataStage Administrator?
A user interface used to set up DataStage users, create and move projects, and set up purging criteria.

29. What is the repository?
A central store that contains all the information required to build a data mart or data warehouse.

30. What is the DataStage Server?
Runs executable jobs that extract, transform, and load data into a data warehouse.

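31. What is a fact table?
The fact table is the table at the center of a star schema and is that schema's chief feature. It holds the measures being analyzed, along with keys that link to the surrounding dimension tables.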

32. What are the advantages of data warehousing?
A data warehousing strategy provides the following advantages :
• Capitalizes on the potential value of the organization’s information.
• Improves the quality and accessibility of data.
• Combines valuable archive data with the latest data in operational sources.
• Increases the amount of information available to users.
• Reduces the requirement of users to access operational data.
• Reduces the strain on IT departments, as they can produce one database to serve all user groups.
• Allows new reports and studies to be introduced without disrupting operational systems.
• Encourages users to be self-sufficient.

33. What is a container?
A container is a reusable set of stages.

34. What is the difference between local and shared container?
A Local Container is local to the particular job in which it was developed.
A Shared Container can also be used in other jobs.

35. What is the Orabulk stage?
This stage is used to bulk load an Oracle target database.

36. What is the difference between active and passive stages?
Passive stages are used for data extraction and loading.
Active stages are used to implement and process the business rules.

37. What is a metadata repository?
Metadata is data about data. A metadata repository also contains :
• Query statistics
• ETL statistics
• Business subject area
• Source information
• Target information
• Source-to-target mapping information

38. What is the difference between join stage and merge stage?
JOIN : Performs join operations on two or more data sets input to the stage and then outputs the resulting data set.
MERGE : Combines a sorted master data set with one or more sorted update data sets. The columns from the records in the master and update data sets are merged so that the output record contains all the columns from the master record plus any additional columns required from each update record.

39. What is the default cache size?
Default cache size is 256 MB.

40. How do you schedule or monitor a job?
Using the DataStage Director we can schedule or monitor a job.

41. How can we reuse components?
By using Shared and Local Containers.

42. What is the difference between primary key and partition key?
A primary key is a column, or combination of columns, whose values are unique and not null; a key made of several columns is called a composite primary key. A partition key is the column, often just a part of the primary key, that decides which partition a row is sent to. There are several partitioning methods, such as Hash, DB2, and Random. When using Hash partitioning we specify the partition key.
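
How a partition key drives Hash partitioning can be sketched in Python. The hash function here (`zlib.crc32`) is a stand-in chosen because it is stable between runs; it is not the function DataStage uses, and the helper names and rows are hypothetical.

```python
import zlib


def partition_for(key_value, n_partitions):
    """Map a partition-key value to a partition number using a stable hash."""
    return zlib.crc32(str(key_value).encode("utf-8")) % n_partitions


def hash_partition(rows, key, n_partitions):
    """Send each row to the partition chosen by hashing its key column."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[partition_for(row[key], n_partitions)].append(row)
    return partitions


rows = [
    {"cust": "C1", "amt": 1},
    {"cust": "C2", "amt": 2},
    {"cust": "C1", "amt": 3},
]
parts = hash_partition(rows, "cust", 4)
```

The property that matters is that rows with the same key value always land in the same partition, so key-based operations such as joins or duplicate removal can run on each partition independently.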

43. How do you eliminate duplicate rows?
DataStage Enterprise Edition provides a Remove Duplicates stage. Using that stage we can eliminate duplicates based on a key column.
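
Key-based duplicate removal can be sketched in Python as follows. The `retain` option and the sample rows are assumptions for illustration; this is not the stage's own implementation.

```python
def remove_duplicates(rows, key, retain="first"):
    """Keep one row per key value: the first seen, or the last if requested."""
    kept = {}
    for row in rows:
        k = row[key]
        if retain == "last" or k not in kept:
            kept[k] = row
    return list(kept.values())


rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}]
first_kept = remove_duplicates(rows, "id")
last_kept = remove_duplicates(rows, "id", retain="last")
```

Only the key column decides what counts as a duplicate; the other columns of the discarded rows are simply dropped.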

44. How can we reuse components?
By using Shared and Local Containers.

45. What is the Transformer stage?
Transformer stages do not extract data or write data to a target database. They are used to handle extracted data, perform any conversions required, and pass data to another Transformer stage or a stage that writes data to a target data table.

46. What are the command line functions that import and export the DS jobs?
dsimport.exe : imports the DataStage components.
dsexport.exe : exports the DataStage components.

47. Did you use conditional scheduling in your project?
Using a Sequencer Job we can create conditional scheduling.

48. How do you handle version control?
Ascential has a separate product called Version Control, which is used to handle version control.

49. What are stage variables?
Stage variables are variables declared in the Transformer stage to store values. They are active at run time, because their memory is allocated at run time.
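
The role of a stage variable, a value that survives from one row to the next while the job runs, can be imitated in Python. The variable names and sample rows below are hypothetical; in DataStage the equivalent logic would be written as Transformer derivations.

```python
def flag_key_changes(rows, key):
    """Carry state across rows to flag key changes and keep a per-key total."""
    prev_key = None      # plays the role of one stage variable
    running_total = 0    # plays the role of another
    out = []
    for row in rows:
        is_new = row[key] != prev_key
        running_total = row["amount"] if is_new else running_total + row["amount"]
        out.append({**row, "is_new_key": is_new, "running_total": running_total})
        prev_key = row[key]  # remembered for the next row
    return out


rows = [
    {"cust": "A", "amount": 5},
    {"cust": "A", "amount": 7},
    {"cust": "B", "amount": 1},
]
flagged = flag_key_changes(rows, "cust")
```

The sketch assumes the input is already sorted by the key column, since it only compares each row with the one before it.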

50. What can we do with DataStage Director?
Validate, schedule, run, and monitor jobs (server jobs).
