Friday, January 29, 2010

DB2

DB2 Tutorial :-

Introduction to Databases

What is data?

A representation of facts or instruction in a form suitable for communication.

What is a database?

It is a repository for stored data.

What is a database System?

An integrated and shared repository for stored data or collection of stored operational data used by application systems of some particular enterprise.

Nothing more than a computer-based record keeping system

Advantages of DBMS over File Management Systems

Data redundancy

Multiple Views

Shared data

Data independence (logical/Physical)

Data dictionary

Search versatility

Cost effective

Security and Control

Recovery, restart & Backup

Concurrency

Types of Databases ( or Models)

Hierarchical Model

Network Model

Relational Model

Object-Oriented Model

HIERARCHICAL MODEL

Top down structure resembling an upside-down tree.

Parent child relationship

First logical database Model

Available on most of the Mainframe computers

Examples: IMS

NETWORK MODEL

Does not distinguish between parent and child. Any record type can be associated with any number of arbitrary record types Enhanced to overcome limitations of other models but in reality there is minimal difference due to frequent enhancement

Example: IDMS

RELATIONAL MODEL

Data stored in the form of table consists of multiple rows and columns.

Examples: DB2,ORACLE, SYBASE

OBJECT-ORIENTED MODEL

Data attributes and methods that operate on those attributes are encapsulated instructions called objects.

Types of Integrity

Entity Integrity

Referential Integrity

Domain Integrity

Entity Integrity

Is a state where no column that is part of a primary key can have a null values.

Referential Integrity

Is a state where every foreign key in the first table must either match a primary key value in the second table or must be wholly null.

Domain Integrity

Integrating of information allowed in column.

Entity Relationship Model

E-R model is a logical representation of data for a business area.

Represented as entities relationship between entities and attributes of both relationships and entities.

E-R models are outputs of analysis phase.

Example of relational Structure
CUSTOMER places ORDERS
ORDER has PRODUCTS

Each order relates to only one customer (one-to-one)

Many orders can contain many products (many-to-many)

A customer can place any number of orders ( one-to-many)

In last example customer,order & product are called entities

An entities may transform into tables.

The unique identity for information stored in an entity is called a primary key

Attributes which define the characteristics of the table.

DB2

OBJECTS

Stogroup(Storage group)

Database

Table Space

Table

View

Index

Stogroup ( Storage Group)

It is a collection of direct access volume, all of the same device type

The option is defined as a part of table space definitions

When a given space needs to be extended, storage is acquired from the appropriate stogroup.

Database

A collection of logically related objects – like table spaces, index spaces, tables etc.

Not a physical kind of object, may occupy more than one disk space.

A STOGROUP & BUFFER POOL must be defined for each database.

In a given database, all the spaces need not have the same STOGROUP.

TABLE SPACES

Logical address space on secondary storage to hold one or more tables.

It is the storage unit for recovery and reorganizing purpose

Three types of table spaces are :

Simple

Partitioned

Segmented

Simple Table Spaces

Can contain more than one stored tables

Depending on application, storing more than one tables might enable faster retrieval for joins using these tables.

Usually only one table is preferred. This is because a single page can contain rows from all tables defined in the database.

Segmented table spaces

Can contain more than one stored tables, but in a segmented space.

A ‘segment’ consists of a logically contiguous set of ‘n’ pages.

No segment is allowed to contain records for more than one table.

Sequential access to a particular table is more efficient.

Lock Table on table locks only the table, not the entire table space.

If a table is dropped, the space for that table can be reclaimed within minimum reorganization.

Partitioned Table Space

Primarily used for very large tables

Only one table in a partitioned table space

Individual partitions can be independently recovered and reorganized.

Different partitions can be stored on different storage groups for efficient access.

TABLES

A table is a collection of rows and columns.

The data is stored on magnetic disks in a series of TABLES.

Each COLUMN contains some specific information about suppliers and each ROW contains all the information about a particular supplier.

For example SUPPLIER table looks like :

S#

SNAME

STATUS

CITY

001

PRASAD

20

CHENNAI

002

VASU

30

DELHI

003

SIVA

10

BOMBAY

VIEWS

Views can be very practical ways of simplifying queries by reducing the number of different tables.

Views can also hide sensitive data from users who don’t need access to it.

CREATE VIEW EMP_VIEW
AS
SELECT EMPNO,NAME,DEPT,JOB FROM EMP

SYNONYMS

Synonym is like a nick name to a table name and when no longer needed it can be dropped. Synonym access is specific to the user who has created it.

STRUCTURED QUERY LANGUAGE(SQL)

A powerful database management language that performs the function of data manipulation, data definition and data control.

A non-procedural language.

The capability to act on a set of data and the lack of need to know how to retrieve it.

It allows to specify WHAT the result should be, NOT HOW the result obtained.

SQL TYPES(Based on functionality)

DDL - CREATE, ALTER and DROP

DML - SELECT,INSERT, UPDATE & DELETE

DCL - GRANT and REVOKE

SQL TYPES(Others)

Static or Dynamic SQL

Embedded or Stand-alone SQL

HOW DO YOU UTILIZE SQL?

It is usually accessed on-line.

It can be used interactively, such as SPUFI,QMF or it can be embedded in COBOL,PL/I,ASSEMBLER or FORTRAN program.

WHAT MIGHT YOU DO WITH SQL?

As an end-user you will be using SQL on-line to access data and produce formatted reports.

As an application programmer, you will use SQL to extract data from one or more tables and pass these data to another program for additional processing.

SQL task is center about four basic functions & these functions correspond to the four basic SQL command (SELECT, UPDATE, INSERT, DELETE)

HOW TO CREATE A SIMPLE SQL QUERY?

Search the table of your choice

Select specific columns to be displayed from a table

Select specific rows to be displayed from a table

Specify the order in which to display data

How do you specify which tables you want to select from?

Which tables contain data?

Which columns to display from the table?

Which rows to display?

In which order to display the columns and/or rows.

Any SELECT statement must specify a table

If the table belongs to another user, you may have to prefix the table name with the other user’s ID.

How do you specify which COLUMNS you want to display?

Listing the name of the columns on a SELECT statement, separated by commas.

SELECT

SNAME, STATUS

FROM

SUPPLIERS

Why not use the ‘*’ in all your queries?

Using the ‘*’ to indicate all columns should be displayed is inefficient if you don’t need to see all the columns.

You can put the columns in the order you want them.

It is better to specify just those column you want.

How do you specify which rows you want to display?

Using WHERE keyword.

The format is WHERE followed by a condition or list of conditions.

SELECT

SNAME

FROM

SUPPLIER

WHERE

STATUS=30

Can you put the data in Order?

Using ORDER BY keyword to your query

SQL automatically orders data in ascending order if you have any ORDER BY statement in your query.

If you want your data in descending order, use the DESC keyword.
SELECT STATUS FROM SUPPLIERS ORDER BY STATUS DESC

Can you order by more than one column?

You can order by as many columns as you want.

Simple add the columns you want to sort to the ORDER BY clause, separated by comma.

Example:
SELECT SNAME ,STATUS, CITY FROM SUPPLIERS ORDER BY STATUS,SNAME

Use of ‘AND’ and ‘OR’ to make a query more specific.

Sometimes a single search condition is not precise enough to select only the rows you need. So you need to use the logical operator AND and OR.

You can link as many search expression as you want using AND and OR.

Using both operator together, searched expression linked by AND are paired up first. Parentheses can be used to force expression to be paired together.

How do you avoid duplicate data in your output?

By adding the keyword DISTINCT immediately following SELECT , you suppress any duplicates. SELECT DISTINCT STATUS FROM SUPPLIERS

What are the relational operators?

Here is the full list of operators:

SYMBOL

CONDITION

=

equal to

>

greater than

<

less than

>=

greater than or equal to

<=

lesser than or equal to

<>

less than or greater than

LIKE

containing certain characters

BETWEEN

within a range of values

IN

one of a set of values

Each condition also has a negative version.

IN

Lets you retrieve data from each row whose columns has a value equal to one of the several listed values.

Example: WHERE STATUS IN (20,10,30)

Any row with STATUS column value of 20,10, 30 is selected.

Each character value must be in quotes.

Values are separated by comma & values list is enclosed in parenthesis.

BETWEEN

  • Lets you retrieve data from each row whose columns has a value with two limits.
  • Example: WHERE STATUS BETWEEN 10 AND 30
  • Any row with STATUS column value of 10 thru 30 is selected.
  • It can be used for both numeric and character values.
  • The two values must be separated by AND, the lower value must be the first value.

LIKE

  • Lets you retrieve data from each row whose columns has a value similar to the value specified in the WHERE CLAUSE.
  • Example: WHERE SNAME LIKE ‘B%C’
  • Any row with name column beginning with ‘B’ followed by ANY VALUE of ANY LENGTH, and ending with ‘C’ is selected.
  • A ‘%’ represents any value of any length, where as an underscore ‘_’ represents any single character LIKE can’t use for any numeric column.

What is NULL?

  • A NULL is a column value that does not exists.
  • It is not a blank, not is it zero – it is a ‘non-value’.
  • The columns must predefined to the DBMS as a column that may contain NULL values.

HOW DOES SQL TREAT NULL VALUES?

  • Generally, NULL values are ignored.
  • If you perform a calculation using a column with NULL values, those rows containing NULL's are ignored and a dash will output as the result.
  • NULL means “VALUE UNKNOWN”, so it cannot meet any of the standard search conditions.

Example of NULL values

  • To select all rows with a NULL value in the COMM column, you must use WHERE COMM IS NULL as your search condition
  • To select all rows that do NOT have a NULL value, you must use WHERE COMM IS NOT NULL as your search condition
  • Note:

WHERE COMM = 0 …

Who has earned no commission?

WHERE COMM IS NULL

Who is ineligible for commission?

ARITHMETRIC EXPRESSIONS

  • An arithmetic expression is a calculation using numeric values and one or more of the arithmetic operators :addition (+), subtraction (-), multiplication (*), and division (/).
  • Example: SELECT SALARY + COMM,NAME,DEPT
  • Arithmetic expressions can also be used in a WHERE clause: WHERE SALARY + COMM >5000.00

RULES FOR CODING ARITHMETIC EXPRESSIONS

  • Spaces on either side of the arithmetic symbols are optional.
    SALARY/12 OR SALARY / 12
  • You can combine column names and constants in arithmetic expressions.
    1998 - JOIN_YR, 1.0*SALARY

HOW DOES SQL EVALUATE ARITHMETIC EXPRESSIONS?

  • SQL works from left to right, evaluating multiplication and division before addition and subtraction (*, / , +, -).
  • However any expressions in parentheses are evaluated first.

BUILT IN FUNCTION

  • SUM
  • MAX
  • MIN
  • AVG
  • COUNT(*)
  • COUNT(DISTINCT col name)

SUM

  • Calculate the total value for a given column or arithmetic expression
    SELECT SUM(SALARY) FROM EMP

MAX and MIN

  • Display either the highest or lowest value found in the rows selected by the query. SELECT MAX(SALARY),MIN(SALARY) FROM EMP

AVG

  • Display the average of all non-NULL columns
    SELECT AVG(SALARY) FROM EMP
  • AVG disregards any rows with NULLs in any column involved in the averaging expression
  • Integer columns are not rounded, so result may be inaccurate.
  • To convert an integer column to a decimal column multiply by 1.0 as part of the AVG expression.
    AVG(1.0 * STATUS)

COUNT(*)

  • It counts all the rows selected by a query
    SELECT COUNT(*) FROM SUPPLIERS

COUNT(DISTINCT col name)

  • It counts only those rows that have a unique value in the specified column
    SELECT COUNT(DISTINCT STATUS) FROM SUPPLIERS
  • COUNT (*) counts all the rows selected
  • COUNT(DISTINCT col name) does not count rows with a duplicate value in the specified column.
  • COUNT(DISTINCT col_name) does not count rows that have NULLS in the specified column.

HOW DOES SQL CREATES SUMMARY DATA

  • When you request a built-in summary functions, SQL goes through several steps to reach a result.
  • First, all the rows and columns are selected. This intermediate result is kept in a temporary table.
  • Next, the system process your GROUP BY request, creating a temporary table for each GROUP and putting the appropriate rows in each table.
  • Then the built-in functions are calculated.
  • If, for example you requested an AVG, then the average for each temporary sub group is calculated.

GROUP BY

  • GROUP BY outputs a summary table – one line for each group.
  • GROUP BY can only be used in conjunction with built-in functions.
  • GROUP BY follows FROM and WHERE and precedes ORDER BY
  • Example: SELECT DEPT, MAX(SALARY) FROM EMP GOUP BY DEPT
  • It will produce a table listing the maximum salary value for each department.

HAVING

  • It is similar in function to the WHERE clause
  • Its purpose is to further define a ‘group’ specified in the GROUP BY statement SELECT DEPT, AVG(SALARY) FROM EMP GROUP BY DEPT HAVING COUNT(*)>3
  • HAVING can only be used in conjunction with GROUP BY
  • HAVING must immediately follow the GROUP BY clause
  • HAVING can only compare built-in functions not individual columns

JOINING

  • Joining tables involves matching the rows of one table with related rows of another table to produce a table consisting of rows with columns from both the original tables.

BASIC PRINCIPLES OF JOINING TABLES

  • It will combine related information from two tables into a single row.
  • If a value appears once in the “joining column” of one table, but several times in the other, the higher number of occurrences will appear in the output table.
  • If a value exists in the “joining column” of one table, but not in the other, no row appears in the output table with that value.

JOINING COLUMN

  • It is a column that is common to all the tables being joined and contains the row values that tie one table to next.

HOW DO YOU ACTUALLY TIE THE TABLES TOGETHER?

  • With a special type of WHERE clause called the joining condition
    WHERE EMP.DEPTNO = DEPT.DEPTNO
  • Example: SELECT ITEM,PRICE,COLOR FROM Table1,Table2 WHERE Table1.ITEM = Table2.ITEM

MERGING

  • Merging tables also involves combining data from two tables, but the rows are not joined together.
  • Example:SELECT *FROM EMP UNION SELECT * FROM SUPPLIER

MERGING QUERY CODING

  • Select data from one table
  • Specify UNION or UNION ALL to indicate MERGE
  • Select data from second table
  • Example: SELECT * FROM Table_1 UNION SELECT * FROM Table_2
  • The UNION or UNION ALL keyword must be entered after the first SELECT statement

UNION vs UNION ALL

  • UNION sorts the result tables, UNION ALL does not. In other words, UNION mixes rows together in the output. UNION ALL outputs all the rows from the first table that meet the search condition. UNION removes duplicate rows, UNION ALL does not.

WHEN TO MERGE TWO TABLES

  • Each column selected from first table must be compatible with the corresponding columns from the second table.
  • A numeric column are compatible with numeric column, a character column to a character column, and so on.
  • Numeric data types need not be the same column length You must select the same number of columns from each tables.

CREATING AND UPDATING TABLES

  • To create a new table you must use the SQL statement CREATE TABLE. You must specify the table name and the names and attributes of each column.

RULES FOR TABLE NAME

  • Maximum 18 characters
  • Permissible characters
    • letters of the alphabet
    • @, #, and $
    • numbers
    • underscore (_)
  • The first character must be alphabetic or @,#, or $

DATA TYPES

Data type depends on the nature of the data itself and will always be one of two types – NUMERIC or CHARACTER

Types of CHARACTER column

CHAR – Allows any values to be entered in the column. All entries are the same length

VARCHAR – Allows any values to be entered in the column. All entries are the varying length

Types of NUMERIC columns

INTEGER - For very large integers up to 2,147,483,647

SMALLINT - For small integer up to 99999 Table Space

DECIMAL - For numbers with a fixed number of decimal places after the decimal point. The number can have a total of 15 digits

FLOAT -For very large numbers with an undetermined number of decimal places after the decimal point

Types of DATE/TIME columns

DATE - Dates are stored in YYYYMMDD format

TIME - Times are stored on a 24-hour clock in HHMMSS format

TIMESTAMP – TIMESTAMP columns contain both date and time information, with the time value accurate to the millisecond.

GRANT & REVOKE

  • Privileges in DB2 to control accessing data
  • To grant access to a table or view you must run a query that specifies three things in the following order
    • The privileges you are granting (SELECT,INSERT,DELETE,UPDATE or ALL)
    • The name of the table or view
    • The user or users to whom you are granting access STOGROUP(Storage group)
  • To grant everything, use GRANT ALL
  • To grant a privilege to EVERYBODY , you grant TO PUBLIC

PRIVILEGES

  • GRANT ALL ON TABLE EMP TO PUBLIC
  • REVOKE ALL ON EMP TO PUBLIC

CAN YOU CHANGE A TABLE OR VIEW AFTER YOU CREATE IT?

  • The only change you can make to a table or view after it has been created is to add an additional column.
  • This is done with the SQL command ALTER
    • ALTER TABLE EMP ADD COMM DECIMAL (7 ,2)

DROP

  • When you no longer need a TABLE, VIEW, or SYNONYM you can DROP them.
    • DROP TABLE SUPPLIER
    • DROP VIEW EMP_VIEW
    • DROP SYNONYM S

INSERTING A TABLE

  • There must be one value for each column
  • Separate values with commas
  • Put character values in single quotes
  • Numeric values do not require quotes
  • If you want enter the values in a different order, there is an alternative way of inserting a row, by listing the columns along with their assigned values
    • INSERT INTO SUPPLIER (S#,STATUS,SNAME) VALUES(‘S1’,30,’VAST’)

UPDATE

  • SQL provides two commands for updating row values – UPDATE and SET.
  • UPDATE tells SQL which table you want to update and SET provides the name of the column to update and the new value for the column.
  • You tell SQL which row to update with a WHERE clause can be very specific, so that a single row gets changed, or it can be very general, so that many rows get updated.
    • <B>UPDATE SUPPLIER SET STATUS = 40 WHERE S# = ‘S1’
    • UPDATE EMP SET SALARY = SALARY + SALARY *.01

DELETE

  • Using the DELETE command you can delete rows and using WHERE clause you can specify which rows you want to delete.
    • DELETE FROM SUPPLIER WHERE S# = ‘S1’

Complex SQL’s

  • One terms a SQL to be complex when data that is to be retrieved comes from more than one table.
  • SQL provides two ways of coding a complex SQL
    • Subqueries and Join

SUB QUERIES

  • Nested select statements.
  • Specified using the IN or NOT IN predicate, equality or non-equality predicate and comparative operator.
  • When using the equality, non-equality or comparative operators the inner query should return only a single value.
  • Nested loop statements gives the user flexibility for querying multiple tables.
  • A specialized form is a correlated sub query. It works on top-bottom-top fashion.
  • Non correlated sub query works in bottom to top fashion.

JOINS

  • OUTER JOIN
  • INNER JOIN

OUTER JOIN

  • For one or more tables being joined both matching and the non-matching rows are returned.
  • Duplicate columns may be eliminated.
  • Non-matching columns will have nulls.

INNER JOIN

  • Here there is a possibility one or more of the rows from either or both tables being joined will not be included in the table that results from the join operation

QMF- Query Management Facility

  • It is an MVS and VM-based query tools
  • Allows end users to enter SQL queries to produce a verity of reports and graphs as a result of this query.
  • QMF queries can be formulated in several ways: by direct SQL statements, by means of relational prompted query interface

SPUFI (SQL Processing Using File Input)

  • Supports the on-line execution of SQL statements from a TSO terminal.
  • Used for developers to check SQL statements or view table details.
  • SPUFI menu contains the input file in which the SQL statements are coded, option for default settings and editing and output file.

Program Preparation

  • Precompile
  • Compile & Link
  • Bind
    • Package
    • Plan

Precompile

  • Searches all the SQL statements and DB2 related INCLUDE members and comments out every SQL statement in the program.
  • The SQL statements are replaced by a CALL to the DB2 runtime interface module along with parameter.
  • All SQL statements are extracted and put in a DBRM.
  • Places a time stamp in the modified source and the DBRM so that these are tied.
  • All DB2 related INCLUDE statements must be placed between EXEC SQL & END-EXEC key words for the precompiler to recognize them.

Compile & Link

  • Modified precompiler COBOL output is complied.
  • Complied source is link edited to an executable load module.

Bind

  • A type of compiler for SQL statement.
  • It reads the SQL statements from the DBRM and produces a mechanism to access data as directed by the SQL statements being bound.
  • Checks syntax, check for correctness of table & column definitions against the catalog information & performs authorization validation.

Package

  • It is a single bound DBRM with optimized access paths.
  • It also contains a location identifier a collection identifier and a package identifier.
  • A package can have multiple versions, each with it’s own version identifier.

Advantages of Package

  • Reduced bind time.
  • Versioning.
  • Provides remote data access.
  • Can specify bind options at the programmer level.

Plan

  • An application plan contains one or both of the following elements
    • A list of package names.
    • The bound form of SQL statements taken from one or more DBRM.
  • Every DB2 application requires an application plan.
  • Plans are created using the DB2 sub commands BIND PLAN.

DB2 Optimizer

  • Analyzes the SQL statements and determines the most efficient way to access data, gives physical data independence.
  • It evaluates the following factors: CPU cost, I/O cost, DB2 catalogue statistics & the SQL statements.
  • It estimate CPU time, cost involved in applying predicates traversing pages & sorting.
  • It estimates the cost of physically retrieving and writing data.

Steps involved in creating a DB2 Application

  • Using Embedded SQL
  • Using Host Variables (DCLGEN)
  • Using SQLCA
  • Precompile
  • Compile & Linkedit the program
  • Bind

Embedding SQL Statements

  • It is like the file I/O
  • Normally the embedded SQL statements contain the host variables coded with the INTO clause of the SELECT statement.
  • They are delimited with EXEC SQL & END-EXEC
  • Example: EXEC SQL
    SELECT eno, ename INTO :h-eno,:h-ename
    FROM employee WHERE eno=1001
    END-EXEC.

Host Variables

  • These are variables defined in the host language to use the predicates of a DB2 table. These are referenced in the SQL statement.
  • A means of moving data from and to DB2 tables.
  • DCLGEN produces host variables the same as the columns of the tables.

DCLGEN

  • Issued for a single table.
  • Prepares the structures of the table in a COBOL copy book.
  • The copy book contains a SQL DECLARE TABLE statement along with a working storage host variable definition for the table.

SQLCA

  • An SQLCA is a structure or collection of variables that is updated after each SQL statement executes.
  • An application program that contains executable SQL statements must provide exactly one SQLCA.

CURSOR

  • Used when more than one row are to be selected.
  • Can be used for modifying data using ‘FOR UPDATE OF’ clause.

The Four Cursor control Statements

  • DECLARE : name assigned for a particular SQL statement.
  • OPEN: Builds the result table.
  • FETCH : Returns data from the results table one row at a time and assign the value to the specified host variables.
  • CLOSE: Releases all resources used by the cursor.

DECLARE:
EXEC SQL
DECLARE empcur CURSOR FOR
SELECT empno, ename, dept, job
FROM emp WHERE dept=‘D11’
FOR UPDATE OF job
END-EXEC.

OPEN: for the OPEN statement
EXEC SQL
OPEN empcur
END-EXEC.

FETCH: for the FETCH statement
EXEC SQL
FETCH empcur
INTO :empno, :ename, :dept, :job

END-EXEC.

WHENEVER: for the WHENEVER clause
EXEC SQL
WHENEVER NOT FOUND
GO TO close-empcur
END-EXEC.

UPDATE: for the update statement using cursors
EXEC SQL
UPDATE emp
SET job=:new-job
WHERE CURRENT OF empcur
END-EXEC.

DELETE: for the delete statement using cursor.
EXEC SQL
DELETE FROM EMP
WHERE CURRENT OF empcur
END-EXEC.

DB2 LOCKING

  • Locking is used to provide multiple user access to the same system.
  • DB2 uses locking services provided by an MVS subsystem called IMS Resource Locking Manager(IRLM).

Explicit Locking Facilities

  • The SQL statement LOCK TABLE
  • The ISOLATION parameter on the BIND PACKAGE command – the two possible values are RR(Repeatable Read) & CS (Cursor Stability).
  • CS is the value specified if the application program is used in an on-line environment.
  • The LOCKSIZE parameter – physically DB2 locks data in terms of pages or tables or table spaces. This parameter is specified in CREATE or ALTER table space option ‘LOCKSIZE’. The options are Tablespace, Table, Page or Any.
  • The ACQUIRE / RELEASE parameters on the BIND PLAN command specifies when table locks to be acquired and release.
    • ACQUIRE USE / ALLOCATE
    • RELESE COMMIT / DEALLOCATE

Catalogue Tables

  • Repository for all DB2 objects – contains 43 tables
  • Each table maintains data about an aspects of the DB2 environment
  • The data refers to information about table space, tables, indexes etc…
  • SYSIBM.SYSTABLES, SYSIBM.SYSINDEXS, SYSIBM.SYSCOLUMNS etc…

DB2 Utilities:-

UTILITIES

  • CHECK
  • COPY
  • MERGECOPY
  • RECOVER
  • LOAD
  • REORG
  • RUNSTATS
  • EXPLAIN

CHECK

  • Checks the integrity of DB2 data structures
  • Checks the referential integrity between two tables and also checks DB2 indexes for consistency.
  • Can delete invalid rows and copies them to a exception table.

COPY

  • Used to create an image copy for the complete table space or a portion of the table space- full image copy or incremental image copy.
  • Every successful exception of COPY utility places in the table, SYSIBM.SYSCOPY, at least one row that indicates the status of the image copy.

MERGECOPY

  • The MERGECOPY utility combines multiple incremented image copy data sets into a new full or incremental image copy dataset.

RECOVER

  • Restore Db2 table spaces and indexes to a specific instance.
  • Data can be recovered for a single page, pages that contain I/O errors, a single partition or an entire tablespace.
  • Standard unit of recovery is table space.

LOAD

  • To accomplish bulk inserts into DB2 table
  • Can replace the current data append to it
  • If a job terminates in any phase of LOAD REPLACE, the utility has to be terminated and rerun

REORG

  • To reorganize DB2 tables and indexes and there by improving their efficiency of access re-clusters data, resets free space to the amount specified in the ‘create DDL’ ,statement and deletes and redefines underlying VSAM datasets for stogroup defined objects.

EXPLAIN

  • Explain can be used to obtain the details about the access paths chosen by the DB2 optimizer for SQL statements.
  • Used specifically for performance monitoring.
  • When EXPLAIN is requested the access paths that the DB2 chooses are put in coded format into the table PLAN_TABLE, which is created in the default database by the user.
  • The other method is specifying EXPLAIN YES with BIND command
  • The PLAN_TABLE is to be queried to get the required information
  • Since the EXPLAIN results are dependent on the DB2 catalogue, it is better to run RUNSTAT before running EXPLAIN.

Db2 FAQ's frame:-

DB2 FAQS

1. How would you find out the total number of rows in a table?

o Use SELECT COUNT(*) ...

2. How do you eliminate duplicate values in SELECT?

o Use SELECT DISTINCT ...

3. How do you select a row using indexes?

o Specify the indexed columns in the WHERE clause.

4. What are aggregate functions?

o Built-in mathematical functions for use in SELECT clause.

5. How do you find the maximum value in a column?

o Use SELECT MAX(...)

6. Can you use MAX on a CHAR column?

o Yes.

7. My SQL statement SELECT AVG(SALARY) FROM EMP yields inaccurate results. Why?

o Because SALARY is not declared to have NULLs and the employees for whom the salary is not known are also counted.

8. How do you retrieve the first 5 characters of FIRSTNAME column of EMP table?

o SELECT SUBSTR(FIRSTNAME,1,5) FROM EMP;

9. How do you concatenate the FIRSTNAME and LASTNAME from EMP table to give a complete name?

o SELECT FIRSTNAME ¦¦ ' ' ¦¦ LASTNAME FROM EMP;

10. What is the use of VALUE function?

o Avoid -ve SQLCODEs by handling nulls and zeroes in computations

o Substitute a numeric value for any nulls used in computation

11. What is UNION,UNION ALL?

o UNION : eliminates duplicates
UNION ALL: retains duplicates
Both these are used to combine the results of different SELECT statements.

12. Suppose I have five SQL SELECT statements connected by UNION/UNION ALL, how many times should I specify UNION to eliminate the duplicate rows?

o Once.

13. What is the restriction on using UNION in embedded SQL?

o It has to be in a CURSOR.

14. In the WHERE clause what is BETWEEN and IN?

o BETWEEN supplies a range of values while IN supplies a list of values.

15. Is BETWEEN inclusive of the range values specified?

o Yes.

16. What is 'LIKE' used for in WHERE clause? What are the wildcard characters?

o LIKE is used for partial string matches. '%' ( for a string of any character ) and '_' (for any single character ) are the two wild card characters.

17. When do you use a LIKE statement?

o To do partial search e.g. to search employee by name, you need not specify the complete name; using LIKE, you can search for partial string matches.

18. What is the meaning of underscore ( '_' ) in the LIKE statement?

o Match for any single character.

19. What do you accomplish by GROUP BY ... HAVING clause?

o GROUP BY partitions the selected rows on the distinct values of the column on which you group by.

o HAVING selects GROUPs which match the criteria specified

20. Consider the employee table with column PROJECT nullable. How can you get a list of employees who are not assigned to any project?

o SELECT EMPNO
FROM EMP
WHERE PROJECT IS NULL;

21. What is the result of this query if no rows are selected:

o SELECT SUM(SALARY)
FROM EMP
WHERE QUAL='MSC';
NULL

22. Why SELECT * is not preferred in embedded SQL programs?

o For three reasons:

1. If the table structure is changed ( a field is added ), the program will have to be modified.

2. Program might retrieve the columns which it might not use, leading on I/O over head.

3. The chance of an index only scan is lost.

23. What are correlated sub queries?

o A sub query in which the inner ( nested ) query refers back to the table in the outer query. Correlated sub queries must be evaluated for each qualified row of the outer query that is referred to.

24. What is a cursor? why should it be used?

o Cursor is a programming device that allows the SELECT to find a set of rows but return them one at a time.

o Cursor should be used because the host language can deal with only one row at a time.

25. How would you retrieve rows from a DB2 table in embedded SQL?

o Either by using the single row SELECT statements, or by using the CURSOR.

26. Apart from cursor, what other ways are available to you to retrieve a row from a table in embedded SQL?

o Single row SELECTs.

27. How do you specify and use a cursor in a COBOL program?

o Use DECLARE CURSOR statement either in working storage or in procedure division (before open cursor), to specify the SELECT statement. Then use OPEN, FETCH rows in a loop and finally CLOSE.

28. What happens when you say OPEN CURSOR?

o If there is an ORDER BY clause, rows are fetched, sorted and made available for the FETCH statement. Other wise simply the cursor is placed on the first row.

29. Is DECLARE CURSOR executable?

o No.

30. Can you have more than one cursor open at any one time in a program?

o Yes.

31. When you COMMIT, is the cursor closed?

o Yes.

32. How do you leave the cursor open after issuing a COMMIT? (for DB2 2.3 or above only )

o Use WITH HOLD option in DECLARE CURSOR statement. But, it has not effect in psuedo-conversational CICS programs.

33. Give the COBOL definition of a VARCHAR field.

o A VARCHAR column REMARKS would be defined as follows:

10 REMARKS.

49 REMARKS-LEN PIC S9(4) USAGE COMP.

49 REMARKS-TEXT PIC X(1920).

34. What is the physical storage length of each of the following DB2 data types: DATE, TIME, TIMESTAMP?

DATE:

4bytes

TIME:

3bytes

TIMESTAMP:

10bytes

35. What is the COBOL picture clause of the following DB2 data types: DATE, TIME, TIMESTAMP?

DATE:

PIC X(10)

TIME :

PIC X(08)

TIMESTAMP:

PIC X(26)

36. What is the COBOL picture clause for a DB2 column defined as DECIMAL(11,2)?

o PIC S9(9)V99 COMP-3.
Note: In DECIMAL(11,2), 11 indicates the size of the data type and 2 indicates the precision.

37. What is DCLGEN?

o DeCLarations GENerator: used to create the host language copybooks for the table definitions. Also creates the DECLARE table.

38. What is JOIN and different types of JOIN.

o The ability to join rows and combine data from two or more tables is one of the most powerful features of relational system. Three types of joins:

1. Equi-join

2. Non-equijoin

3. self-join

39. Can I alter a table (e.g. adding a column) when other user is selecting some Columns or updating some columns from the same table?

o yes possible. until the updation or selection is committed db2 table will not be restructured. new column definition will be there but it will not be included until all the tasks on the table are committed.

40. What are the different methods of accessing db2 from tso?

o There are three ways in establishing tso/db2 connection

1. SPUFI

2. QMF

3. CATALOG VISIBILITY

41. How is the connection established between TSO & DB2?

o A thread between TSO & DB2 is established while attempting to make Connection between tso & db2.

42. What is sqlcode -922?

o Authorization failure

43. How do you do the EXPLAIN of a dynamic SQL statement?

o Use SPUFI or QMF to EXPLAIN the dynamic SQL statement

o Include EXPLAIN command in the embedded dynamic SQL statements

44. How is a typical DB2 batch pgm executed?

o Use DSN utility to run a DB2 batch program from native TSO. An example is shown:

DSN

SYSTEM(DSP3)

RUN

PROGRAM(EDD470BD) PLAN(EDD470BD)

LIB

('EDGS01T.OBJ.LOADLIB')

END

o Use IKJEFT01 utility program to run the above DSN command in a JCL.

45. Is it mandatory to use DCLGEN? If not, why would you use it at all?

o It is not mandatory to use DCLGEN.
Using DCLGEN, helps detect wrongly spelt column names etc. during the
pre-compile stage itself (because of the DECLARE TABLE ).
DCLGEN being a tool, would generate accurate host variable definitions for the table reducing chances of error.

46. Name some fields from SQLCA.

o SQLCODE, SQLERRM, SQLERRD

47. How does DB2 determine what lock-size to use?

o Based on the lock-size given while creating the table space

o Programmer can direct the DB2 what lock-size to use

o If lock-size ANY is specified, DB2 usually choses a lock-size of PAGE

48. What is the difference between CS and RR isolation levels?

o CS: Releases the lock on a page after use

o RR: Retains all locks acquired till end of transaction

49. Where do you specify them?

o ISOLATION LEVEL is a parameter for the bind process.

50. How do you simulate the EXPLAIN of an embedded SQL statement in SPUFI/QMF? Give an example with a host variable in WHERE clause.

o Use question mark in place of a host variable (or an unknown value). e.g.

SELECT EMP_NAME
FROM EMP
WHERE EMP_SALARY > ?

51. What is ACQUIRE/RELEASE in BIND?

o Determine the point at which DB2 acquires or releases locks against table and Table spaces, including intent locks.

52. In SPUFI suppose you want to select max. of 1000 rows, but the select returns only 200 rows. What are the 2 sqlcodes that are returned?

o 100 (for successful completion of the query), 0 (for successful COMMIT if AUTOCOMMIT is set to Yes).

53. How would you print the output of an SQL statement from SPUFI?

o Print the output data set.

54. What does it mean if the null indicator has -1, 0, -2?

o -1 : the field is null

o 0 : the field is not null

o -2 : the field value is truncated

55. How do you retrieve the data from a nullable column?

o Use null indicators. Syntax ... INTO :HOSTVAR:NULLIND

56. What else is there in the PLAN apart from the access path?

o PLAN has the executable code for the SQL statements in the host program

57. What is lock escalation?

o Promoting a PAGE lock-size to table or table space lock-size when a transaction has aquired more locks than specified in NUMLKTS. Locks should be taken on objects in single table space for escalation to occur.

58. When is the access path determined for dynamic SQL?

o At run time, when the PREPARE statement is issued.

59. What are the various locks available?

o SHARE, EXCLUSIVE, UPDATE

60. What is sqlcode -811?

o SELECT statement has resulted in retrieval of more than one row.

61. What are the advantages of using a PACKAGE?

o Avoid having to bind a large number of DBRM members into a plan

o Avoid cost of a large bind

o Avoid the entire transaction being unavailable during bind and automatic rebind of a plan

o Minmize fallback complexities if changes result in an error.

62. What is REORG? When is it used?

o REORG reorganizes data on physical storage to re-cluster rows, positioning oveflowed rows in their proper sequence, to reclaim space, to restore free space. It is used after huge updates, inserts and delete activity and after segments of a segmented table space have become fragmented.

63. How does DB2 store NULL physically?

o as an extra-byte prefix to the column value. physically, the nul prefix is Hex '00' if the value is present and Hex 'FF' if it is not

64. What is CHECK PENDING?

o When a table is LOADed with ENFORCE NO option, then the table is left in CHECKPENDING status. It means that the LOAD utility did not perform constraint checking.

65. When do you specify the isolation level? How?

o During the BIND process. ISOLATION (CS/RR)...

66. What is a DBRM, PLAN?

o DBRM: DataBase Request Module, has the SQL statements extracted from the host language program by the pre-compiler.

o PLAN: A result of the BIND process. It has the executable code for the SQL statements in the DBRM.

67. Is DECLARE TABLE in DCLGEN necessary? Why it used?

o It not necessary to have DECLARE TABLE statement in DCLGEN. This is used by the pre-compiler to validate the table-name, view-name, column name etc., during pre-compile.

68. What do you need to do before you do EXPLAIN?

o Make sure that the PLAN_TABLE is created under the AUTHID.

69. What is a collection?

o a user defined name that is the anchor for packages. It has not physical existence. Main usage is to group packages.

70. How can you quickly find out the # of rows updated after an update statement?

o Check the value stored in SQLERRD(3).

71. What is EXPLAIN?

o EXPLAIN is used to display the access path as determined by the optimizer for a SQL statement. It can be used in SPUFI (for single SQL statement ) or in BIND step (for embedded SQL ).

72. Where is the output of EXPLAIN stored?

o In userid.PLAN_TABLE

73. Suppose I have a program which uses a dynamic SQL and it has been performing well till now. Off late, I find that the performance has deteriorated. What happened?

o Probably RUN STATS is not done and the program is using a wrong index due to incorrect stats. Probably RUNSTATS is done and optimizer has chosen a wrong access path based on the latest statistics.

74. What are the contents of a DCLGEN? - GS

o EXEC SQL DECLARE TABLE statement which gives the layout of the table/view in terms of DB2 data types.

o A host language copy book that gives the host variable definitions for the column names.

75. Will pre-compile of an DB2-COBOL program bomb, if DB2 is down?

o No. Because the pre-compiler does not refer to the DB2 catalogue tables.

76. What are the isolation (also called isolation parameters) levels possible?

o CS: Cursor Stability

o RR: Repeatable Read

77. What is the picture clause of the null indicator variable?

o S9(4) COMP.

78. EXPLAIN has output with MATCHCOLS = 0. What does it mean?

o A non-matching index scan if ACCESSTYPE = I.

79. I use CS and update a page. Will the lock be released after I've done with that page?

o No.

80. What are the various locking levels available?

o PAGE, TABLE, TABLESPACE

81. What is ALTER?

o SQL command used to change the definition of DB2 objects.

82. Can I use LOCK TABLE on a view?

o No. To lock a view, take lock on the underlying tables.

83. What are the disadvantages of PAGE level lock?

o High resource utilization if large updates are to be done

84. What are PACKAGES ?

o They contain executable code for SQL statements for one DBRM.

85. Lot of updates have been done on a table due to which indexes have gone haywire. What do you do?

o Looks like index page split has occured. DO a REORG of the indexes.

86. What is dynamic SQL?

o Dynamic SQL is a SQL statement created at program execution time.

87. What is IMAGECOPY ?

o It is full backup of a DB2 table which can be used in recovery.

88. What is QUIESCE?

o A QUIESCE flushes all DB2 buffers on to the disk. This gives a correct snapshot of the database and should be used before and after any IMAGECOPY to maintain consistency.

89. What does the sqlcode of -818 pertain to? - GS

o This is generated when the consistency tokens in the DBRM and the load module are different.

90. What happens to the PLAN if index used by it is dropped?

o Plan is marked as invalid. The next time the plan is accessed, it is rebound.

91.

o

92. What is FREEPAGE and PCTFREE in TABLESPACE creation?

o PCTFREE: percentage of each page to be left free

o FREEPAGE: Number of pages to be loaded with data between each free page

93. Are views updatable?

o Not all of them. Some views are updatable e.g. single table view with all the fields or mandatory fields. Examples of non-updatable views are views which are joins, views that contain aggregate functions (such as MIN), and views that have GROUP BY clause.

94. What is COPY PENDING status?

o A state in which, an image copy on a table needs to be taken, in this state, the table is available only for queries. You cannot update this table. To remove the COPY PENDING status, you take an image copy or use REPAIR utility.

95. What is an inner join, and an outer join?

o Inner Join: combine information from two or more tables by comparing all values that meet the search criteria in the designated column or columns of one table with all the values in corresponding columns of the other table or tables. This kind of join which involve a match in both columns are called inner joins.

o Outer join is one in which you want both matching and non matching rows to be returned. DB2 has no specific operator for outer joins, it can be simulated by combining a join and a correlated sub query with a UNION.

96.

o

97. What is the difference between primary key & unique index?

o Primary: a relational database constraint. Primary key consists of one or more columns that uniquely identify a row in the table. For a normalized relation, there is one designated primary key.

o Unique index: a physical object that stores only unique values. There can be one or more unique indexes on a table.

98. When do you use the IMAGECOPY?

o To take routine backup of tables.

o After a LOAD with LOG NO. After REORG with LOG ON.

99. What is RUNSTATS?

o A DB2 utility used to collect statistics about the data values in tables which can be used by the optimizer to decide the access path. It also collects statistics used for space management. These statistics are stored in DB2 catalog tables.

100. How do you insert a record with a nullable column?

o To insert a NULL, move -1 to the null indicator

o To insert a valid value, move 0 to the null indicator