
Oracle XE: Data Uploading from a CSV File

This is a step-by-step guide to one of the simplest ways to upload data from a CSV file into an Oracle table. I have uploaded as many as 10,000 records this way without any performance issues; it went very swiftly.

Log in to Oracle XE first and follow the steps below:

Step 1 - Click Utilities

Step 2 - Click Data Load/Unload

Step 3 - Click Load

Step 4 - Follow the screen

Step 5 - Click Load Spreadsheet Data

Steps 6-10 - Follow the remaining wizard screens


Compile full schema in Oracle

Oracle provides two utilities for compiling an entire schema:

  • DBMS_UTILITY.COMPILE_SCHEMA
  • Two procedures in UTL_RECOMP

DBMS_UTILITY.COMPILE_SCHEMA
Prior to Oracle 10g, this procedure recompiled all objects in a schema; from Oracle 10g onwards it can optionally recompile only the invalid objects. The procedure is available to all schemas (no additional privileges are needed).

I am using Oracle 10g (10.2.0.3.0) for the following examples.

Syntax:
DBMS_UTILITY.COMPILE_SCHEMA( schema VARCHAR2, compile_all BOOLEAN, reuse_settings BOOLEAN);

Example (from SQL*Plus):
EXEC DBMS_UTILITY.COMPILE_SCHEMA( schema => 'SCOTT', compile_all => FALSE);

To recompile all objects, irrespective of whether they are VALID or INVALID, simply omit the compile_all parameter (it defaults to TRUE):

EXEC DBMS_UTILITY.COMPILE_SCHEMA( schema => 'SCOTT');

UTL_RECOMP Package
This package was introduced in Oracle 10g Release 2 and is specifically designed for recompiling invalid objects. It recompiles PL/SQL modules, Java procedures, and so on. Only a SYSDBA account can run the subprograms in this package, but a DBA can always grant EXECUTE privilege on it to your schema (or to PUBLIC).

There are two ways in which invalid objects can be recompiled with this package:
  1. Sequentially
  2. In parallel

Use the subprogram RECOMP_SERIAL to compile invalid objects sequentially, one by one. Its syntax is as follows:

UTL_RECOMP.RECOMP_SERIAL( schema VARCHAR2, flags BINARY_INTEGER);

Simply call the program as follows:

EXEC UTL_RECOMP.RECOMP_SERIAL (schema => 'SCOTT');

The subprogram RECOMP_PARALLEL, also in UTL_RECOMP, uses multiple threads to recompile invalid objects. Its syntax is as follows:

UTL_RECOMP.RECOMP_PARALLEL (threads BINARY_INTEGER, schema VARCHAR2, flags BINARY_INTEGER);

Note: Using a larger number of threads can actually slow recompilation down, given the I/O-intensive nature of recompiling invalid objects. There is no universally optimal thread count; as a rule of thumb, use as few threads as possible.

Example of usage of the RECOMP_PARALLEL procedure:
EXEC UTL_RECOMP.RECOMP_PARALLEL (threads => 4, schema => 'SCOTT');
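
To verify the outcome of either procedure, one simple check (a standard data-dictionary query, run as the schema owner) is to list any objects that are still invalid:

SELECT object_name, object_type
FROM   user_objects
WHERE  status = 'INVALID'
ORDER  BY object_type, object_name;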

Coming soon: Article on comparison of these procedures.

Multitable Inserts using INSERT ALL

Multitable inserts allow a single INSERT INTO .. SELECT statement to insert into multiple tables, either conditionally or unconditionally. Compared to earlier releases, this reduces the table scans and PL/SQL code needed to perform multiple conditional inserts. Its main use is in ETL processing for data warehouses, where it can be parallelized and/or used to convert non-relational data into a relational format.
--Unconditional Insert into all tables
INSERT ALL
INTO ap_cust VALUES (customer_id, program_id, delivered_date)
INTO ap_orders VALUES (order_date, program_id)
SELECT program_id, delivered_date, customer_id, order_date FROM airplanes;
-- Pivoting insert to split non-relational data
INSERT ALL
INTO Sales_info VALUES (employee_id,week_id,sales_MON)
INTO Sales_info VALUES (employee_id,week_id,sales_TUE)
INTO Sales_info VALUES (employee_id,week_id,sales_WED)
INTO Sales_info VALUES (employee_id,week_id,sales_THUR)
INTO Sales_info VALUES (employee_id,week_id, sales_FRI)
SELECT EMPLOYEE_ID, week_id, sales_MON, sales_TUE,
sales_WED, sales_THUR,sales_FRI FROM Sales_source_data;
-- Conditionally insert into ALL tables
INSERT ALL
WHEN SAL>10000 THEN INTO sal_history VALUES(EMPID,HIREDATE,SAL)
WHEN MGR>200 THEN INTO mgr_history VALUES(EMPID,MGR,SYSDATE)
SELECT employee_id EMPID, hire_date HIREDATE, salary SAL, manager_id MGR
FROM employees WHERE employee_id > 200;

-- Insert into the FIRST table with a matching condition
INSERT FIRST
WHEN SAL > 25000 THEN INTO special_sal VALUES(DEPTID,SAL)
WHEN HIREDATE like ('%00%') THEN INTO hiredate_history_00 VALUES(DEPTID,HIREDATE)
WHEN HIREDATE like ('%99%') THEN INTO hiredate_history_99 VALUES(DEPTID,HIREDATE)
ELSE INTO hiredate_history VALUES(DEPTID, HIREDATE)
SELECT department_id DEPTID, SUM(salary) SAL, MAX(hire_date) HIREDATE
FROM employees GROUP BY department_id;

Error logging using DBMS_ERRLOG

If your DML statement encounters an exception, it immediately rolls back any changes already made by that statement, and propagates an exception out to the calling block. Sometimes that's just what you want. Sometimes, however, you'd like to continue past such errors, and apply your DML logic to as many rows as possible.

DBMS_ERRLOG, a package introduced in Oracle 10g Release 2, allows you to do precisely that. Here is a quick review of how this package works.

First you create an error log table into which errors will be inserted; then, when issuing your DML statements, you add the LOG ERRORS clause, newly introduced in 10g.

1. Create an error log table for the table against which you will execute DML statements:

BEGIN
  DBMS_ERRLOG.create_error_log (dml_table_name => 'EMP');
END;


Oracle then creates a table named ERR$_EMP that contains error-related columns as well as VARCHAR2 columns for each of your table's columns (where it makes sense).

For example in our case, the EMP table has this structure:

EMPNO    NUMBER(4)
ENAME    VARCHAR2(10)
JOB      VARCHAR2(9)
MGR      NUMBER(4)
HIREDATE DATE
SAL      NUMBER(7,2)
COMM     NUMBER(7,2)
DEPTNO   NUMBER(2)

The ERR$_EMP table has this structure:

ORA_ERR_NUMBER$   NUMBER
ORA_ERR_MESG$     VARCHAR2(2000)
ORA_ERR_ROWID$    UROWID(4000)
ORA_ERR_OPTYP$    VARCHAR2(2)
ORA_ERR_TAG$      VARCHAR2(2000)
EMPNO             VARCHAR2(4000)
ENAME             VARCHAR2(4000)
JOB               VARCHAR2(4000)
MGR               VARCHAR2(4000)
HIREDATE          VARCHAR2(4000)
SAL               VARCHAR2(4000)
COMM              VARCHAR2(4000)
DEPTNO            VARCHAR2(4000)

 

From this we can see that the columns ORA_ERR_NUMBER$, ORA_ERR_MESG$, ORA_ERR_ROWID$, ORA_ERR_OPTYP$, and ORA_ERR_TAG$ are created to hold extra information about each error encountered, the operation type, the data at the time of the error, and so on.

2. Add the LOG ERRORS clause to your DML statement:

BEGIN
   UPDATE emp
      SET sal = sal * 2
      LOG ERRORS REJECT LIMIT UNLIMITED;
END;

3. After running your code, you can check the contents of the ERR$ table to see if any errors occurred. You can then perform row by row recovery, or transfer that error information to your application error log, etc.
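
As a quick illustration, using the ERR$_EMP columns shown above, the logged errors and the offending values can be inspected with a simple query:

SELECT ora_err_number$, ora_err_mesg$, empno, sal
FROM   err$_emp;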

This approach will not only allow you to continue past exceptions; it will also greatly improve performance of DML processing in which lots of exceptions would normally be raised, because exception handling is quite slow compared to writing a row to a log table via an autonomous transaction procedure.

Check out DBMS_ERRLOG. It offers some very powerful alternative ways of executing large numbers of DML statements, each of which might change many rows.

Conclusion
The disadvantage of this approach is maintenance. Users are likely to report problems like "my data is not showing up, and no errors are shown." My advice is to pair this feature with a well-designed error-handling mechanism that processes the logged errors and surfaces meaningful messages to users; that way, whoever is debugging also has the exact data that caused the error. Unless used responsibly, this feature can be a disaster.

Proxy user connect for SQL *Plus

Oracle Tips by Burleson Consulting

Today, systems with thousands of individual Oracle user IDs are not very common. Almost all enterprise applications (ERP packages like SAP and Oracle Financials) use pre-spawned Oracle connections that are created by a single Oracle user ID, with names like AR or SAPR3.

The application server manages the connections to Oracle for us. 

This is called a "proxy," the term meaning an arrangement where a third party does something on our behalf. Unfortunately, when our end users connect anonymously through a proxy, we lose end-user-level security through traditional tools such as granting privileges or using roles, and the only Oracle security in effect is whatever is granted to the ERP "master" user ID.

Who goes there?

From a DBA perspective, all database activity appears to be done by this single user, and a query of the v$session view for an ERP does not identify any specific end user, because they are connected to Oracle via the proxy on the application server.

This is a special nightmare for Oracle auditing, and all database auditing must now be done within the application server, because Oracle is not aware of the "real" end user behind the proxy connection.

For complete information on Oracle auditing for ERPs, see my notes in my bestselling security book "Oracle Privacy Security Auditing," written with Arup Nanda.

The answer is to create a method whereby the ERP can maintain its pre-spawned connection to Oracle while giving Oracle the ability to impose traditional security management (via roles, privileges, VPDs, etc.).

Oracle proxy connect authentication in 10gR2

To address this issue of unknown specific users within pre-spawned Oracle connections, Oracle has introduced a new proxy connection authentication in Oracle 10g release 2.

Under this new architecture, the ERP user ID is given limited privileges (create session only), and individual Oracle user IDs are created for each ERP user, using the traditional "create user" syntax, as sketched below.
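
A minimal sketch of that setup (the user names and passwords here are illustrative, matching the SAPR3/scott example that follows; the key statement is ALTER USER ... GRANT CONNECT THROUGH):

-- The ERP "master" user only needs to create sessions
CREATE USER sapr3 IDENTIFIED BY sapr3pwd;
GRANT CREATE SESSION TO sapr3;

-- A traditional, individually managed end-user account
CREATE USER scott IDENTIFIED BY tiger;
GRANT CREATE SESSION TO scott;

-- Authorize SAPR3 to proxy on behalf of SCOTT
ALTER USER scott GRANT CONNECT THROUGH sapr3;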

Traditionally, you always connected to Oracle by only specifying your user ID and password:

connect scott/tiger

New with 10gR2 proxy identification in SQL*Plus, the "connect" command has been enhanced to allow for a "proxy":

connect sapr3[scott]/tiger

You can also use this syntax to connect to SQL*Plus directly from the OS prompt:

root> sqlplus sapr3[scott]/tiger@mydb

As we see, we have a second, "proxy" user ID in the connect request (scott, in this case). When Oracle establishes the database connection, it will first connect as SAPR3 (which only has connect and resource privileges) and then open a proxy connection as scott.

Now, we have complete Oracle privileges and access control, just as if the scott user was connected, but in reality we have connected with the SAPR3 user, keeping our pre-spawned connection pool happy.

Note: To see the "proxy owner" of any Oracle session, use the SYS_CONTEXT function:

-- Here we see that Oracle has scott as the user ID

select user from dual;

USER
---------------------------------------
SCOTT
 


-- Here we call sys_context to reveal the proxy user:

select sys_context('USERENV','PROXY_USER') from dual;

SYS_CONTEXT('USERENV','PROXY_USER')
---------------------------------------
SAPR3


Article Courtesy:

http://www.dba-oracle.com/t_proxy_connect_authentication.htm

Feature comparison: Oracle, MySQL, and PostgreSQL

There are plenty of high-level features in Oracle, data warehousing features for instance, that you might never need. Moreover, there may be other features, such as ACID-compliant transactions, that you just can't do without. We'll look at the major ones, such as stored procedures, views, snapshots, table datatypes, and transactions, across PostgreSQL, MySQL, and Oracle, and discover which ones do what you need.

 

Database Stored Procedures.

In a stored procedure the data is read, manipulated, and updated in one step. If you did the same in your middle-tier application code, you would have to send that data set over the network, do your manipulations, and send it back. Not only would that make this one task slower, but other transactions vying for the same data could have to wait while it is in transit and being manipulated. Stored code also encapsulates specific requests, which can be invaluable for simplifying your overall application. All three databases support stored procedures and functions. Oracle additionally supports packages (collections of stored procedures) as well as various object-oriented features, which almost no one ever uses. One additional note: a database engine actually context-switches between stored code and the SQL embedded in it. As of 9i, Oracle introduced bulk binding, so you can work on large sets of rows and update them all in one go instead of on each loop iteration; this can improve performance quite dramatically, as sketched below.
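
As an illustration of the bulk-binding pattern (the ORDERS table and its columns here are hypothetical), a minimal BULK COLLECT / FORALL sketch in PL/SQL:

DECLARE
  TYPE t_ids IS TABLE OF orders.order_id%TYPE;
  l_ids t_ids;
BEGIN
  -- Fetch all matching keys in one round trip instead of row by row
  SELECT order_id BULK COLLECT INTO l_ids
  FROM   orders
  WHERE  status = 'NEW';

  -- Apply the update to the whole set with a single switch into the SQL engine
  FORALL i IN 1 .. l_ids.COUNT
    UPDATE orders
    SET    status = 'PROCESSED'
    WHERE  order_id = l_ids(i);

  COMMIT;
END;
/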

 

Database Views.

Views are basically stored queries, and as such are not overly complex to implement. However, when they are used in a query, they necessarily make that query more complex, so subqueries need to be supported before views can become available in a database. Oracle has had views for a long time. As of 5.0, MySQL also supports views and, like Oracle, supports updatable views with some restrictions. Postgres also supports views and updatable views.

 

Database Materialized Views (snapshots).

These are supported well in Oracle by default. As a refresher, a materialized view is a periodically refreshed copy or subset of a table; imagine a view that fills a mirror copy with its query. Until the next refresh, that copy is static and not updated with the master. Usually a trade-off is made between the frequency of refresh and the maintenance of the materialized view log (like an index) that supports it. On paper, MySQL and PostgreSQL do not support materialized views, but there are implementations of the idea on the internet that may fit your needs should you go this route: a stored procedure creates the materialized view and another one refreshes it, in essence a CREATE TABLE my_copy AS SELECT..., as sketched below.
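
A rough sketch of that approach (table and column names are hypothetical; the refresh would normally be wrapped in a stored procedure and run from a scheduler):

-- Build the "materialized view" as an ordinary table
CREATE TABLE sales_summary AS
SELECT region, SUM(amount) AS total_amount
FROM   sales
GROUP  BY region;

-- Periodic refresh: rebuild the copy
TRUNCATE TABLE sales_summary;
INSERT INTO sales_summary (region, total_amount)
SELECT region, SUM(amount)
FROM   sales
GROUP  BY region;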

 

Database Language Integration.

Today, programming web-based applications against these databases is a truly egalitarian world. Almost all of the popular web languages support all of these database types: Java, PHP, Perl, Python, C#/.NET, and so on. The world is your oyster.

 

Database Triggers

MySQL, Oracle, and Postgres all support BEFORE and AFTER event triggers on INSERT, UPDATE, and DELETE. Personally, I prefer to shy away from triggers unless absolutely necessary; they tend to be forgotten about and sometimes come back to bite you. When used sparingly, though, they can be great. A simple example follows.
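
As a minimal illustration in Oracle syntax (the EMP table and its LAST_MODIFIED column are assumed for the example):

CREATE OR REPLACE TRIGGER emp_biu_trg
BEFORE INSERT OR UPDATE ON emp
FOR EACH ROW
BEGIN
  -- Keep an audit column up to date automatically
  :NEW.last_modified := SYSDATE;
END;
/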

 

Database Security.

All three of these databases have vulnerabilities. It is truly the nature of software to have corners with trouble hiding there. Moreover, all three of them have regular patches and updates released. My personal feeling, though, is that open source necessarily means more eyes, and often more critical eyes, on the code. What's more, the pressure of the community is much greater in the open-source world. In the commercial space, the vendor can, and often does, spin its wheels when the repair is more expensive than the perceived cost of waiting on the fix.

On the point of security inside the database, all three support password logins and encryption of various types inside the database. Oracle also has a newer feature called Virtual Private Database, where sections of tables and specific columns can be restricted and hidden from view. This can be very useful for controversial or sensitive data that DBAs and other administrators should not have access to.

 

Conclusion after Comparison.

Our three database platforms obviously have a lot of features, and different solutions for the same problems. In terms of security, triggers, views, materialized views, and stored procedures they provide a lot of the same functionality, though with differences in terms of performance and configurability. In part II we'll talk about some of the ways that the databases really start to differ significantly, in terms of indexing, but probably most importantly in their optimizing engines.

 

Oracle 10g: Exploring Data Pump

Sreeram Surapaneni, svsreeram@yahoo.com

Introduction:

Oracle 10g offers several new features, one of which is Data Pump technology for fast data movement between databases. Most Oracle shops still use their traditional export and import utility scripts rather than this new technology. Data Pump technology is entirely different from the export/import utility, although the two have a similar look and feel. Data Pump runs inside the database as a job, which means jobs are somewhat independent of the process that started the import or export. Another advantage is that other DBAs can log in to the database and check the status of the job. The advantages of Data Pump, along with Oracle's plan to deprecate the traditional import/export utilities down the road, make it a worthwhile topic for discussion.

Oracle claims Data Pump offers transfer of data and metadata at twice the speed of export and twenty to thirty times the speed of the import utility that DBAs have been using for years. Data Pump achieves this speed with multiple parallel streams of data for maximum throughput. Please note that Data Pump does not work with utilities older than 10g Release 1.

Data Pump consists of two components: the Data Pump export utility, "expdp," to export objects from a database, and the Data Pump import utility, "impdp," to load objects into a database. Just as with the traditional export and import utilities, the DBA can control these jobs with several parameters.

For example:

$expdp username/password (other parameters here)
$impdp username/password (other parameters here)

We can get a quick summary of all parameters and commands by simply issuing

$expdp help=y
$impdp help=y

Similar to the traditional export and import utilities, the Data Pump export and import utilities are extremely useful for migrating especially large databases: exporting them from one platform and operating system and importing them into a database running on a different platform and operating system in a short amount of time.

The Oracle-supplied package DBMS_DATAPUMP implements the Data Pump API, through which you can access the Data Pump export and import utilities programmatically. In other words, you can create a much more powerful, custom Data Pump utility using Data Pump technology if you have hundreds of databases to manage.

One interesting point is how Data Pump initiates the export session. In the traditional export utility, the user process requests the data from the server process as part of a regular session and writes the exported data to disk. With Data Pump, the expdp user process launches a server-side process or job that writes data to disks on the server node, and this process runs independently of the session established by the expdp client. However, similar to the traditional export utility, Data Pump writes the data into dump files in an Oracle proprietary format that only the Data Pump import utility can understand.

New Features of Data Pump that improve the performance of data movement:

Below are some of the features that differentiate Data Pump from the traditional export and import utilities. These features not only enhance the speed of the data transfer but also help the DBA assess how a job will run before actually running it.

  • Parallel Threads: The PARALLEL parameter specifies the maximum number of threads of active execution operating on behalf of the export job. This execution set consists of a combination of worker processes and parallel I/O server processes. Because each active worker process or I/O server process works on one active file at a time, the DBA must specify a sufficient number of files; the value specified for this parameter should therefore be less than or equal to the number of files in the dump file set. This important parameter helps the DBA make a trade-off between resource consumption and elapsed time.
  • Ability to restart the job: The ability to restart a job is extremely useful when the DBA is moving large amounts of data. A Data Pump job can be restarted without any data loss or corruption after an unexpected failure, or after the DBA has stopped it with the STOP_JOB command.
  • Ability to detach from and reattach to a job: This allows other DBAs to monitor jobs from multiple locations. A Data Pump export or import client can be attached to only one job at a time, but multiple clients can be attached to the same job.
  • Support for export and import operations over the network: The NETWORK_LINK parameter initiates an export using a database link, meaning that the system to which expdp is connected contacts the source database referenced by the database link, retrieves data from it, and writes the data to a dump file set back on the connected system.
  • Ability to rename the source datafile: The DBA can change the name of the source datafile in all DDL statements where the source datafile is referenced.
  • Support for filtering metadata: The DBA can filter metadata using the EXCLUDE and INCLUDE options. If an object is excluded, all of its dependent objects are also excluded. For example, EXCLUDE=CONSTRAINT excludes all constraints except NOT NULL constraints and those needed for successful table creation, while INCLUDE=TABLE:"IN('EMPLOYEES','DEPARTMENTS')" includes only the named tables.

  • Space Estimate: The DBA can estimate how much space an export job will consume, without actually performing the export.
  • Query Parameter: The DBA can filter data during the export by specifying a clause for a "SELECT" statement.
  • Content Parameter: The DBA can specify what is exported or imported, for example, metadata only, data only, or both. (Sample commands illustrating these three parameters follow this list.)
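
For example, sketches using the same connection and schema as the examples later in this article (ESTIMATE_ONLY, QUERY, and CONTENT are the relevant expdp parameters; the DEPTNO predicate is purely illustrative):

$expdp dpuser/dpuser@TDB10G schemas=dpuser estimate_only=y

$expdp dpuser/dpuser@TDB10G schemas=dpuser query=EMP:\"WHERE deptno = 10\" 
   directory=dpump_dir1 dumpfile=dpuser_q.dmp logfile=dpuser_q.log

$expdp dpuser/dpuser@TDB10G schemas=dpuser content=metadata_only 
   directory=dpump_dir1 dumpfile=dpuser_meta.dmp logfile=dpuser_meta.log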

Init.ora parameters that affect the performance of Data Pump:

Oracle recommends the following settings to improve performance.

Disk_Asynch_io= true

Db_block_checking=false

Db_block_checksum=false

Additionally, the number of processes and sessions allowed for the database must be set sufficiently high to allow for maximum parallelism.

How Data Pump handles loading and unloading of data:

Oracle has provided a direct path for unload (export) operations since Oracle 7.3. This method has been very useful for DBAs who want a quick export of the database, and it has been further enhanced in Data Pump technology. Oracle uses the direct path method for loading (impdp) and unloading (expdp) when the structure of the tables allows it. If a table is part of a cluster, or has a global index on a partitioned table, then Data Pump accesses the data using a different method, called the external table method. Both the direct path and external table methods support the same external data representation, so data unloaded with one method can be loaded with the other, and vice versa.

Getting Started...

As stated earlier, Data Pump is a server-based utility, rather than client-based; dump files, log files, and SQL files are accessed relative to server-based directory paths. Data Pump requires you to specify directory paths as directory objects. A directory object maps a name to a directory path on the file system.

1. The following SQL statements create a user and a directory object named dpump_dir1, and grant the necessary permissions to the user.

sqlplus system/manager@TDB10G as sysdba

SQL> create user dpuser identified by dpuser;

SQL> grant connect, resource to dpuser;

SQL> CREATE DIRECTORY dpump_dir1 AS '/opt/app/oracle';

SQL> grant read, write on directory dpump_dir1 to dpuser;
 

2. Let us see how the INCLUDE and EXCLUDE parameters can be used to limit which objects are loaded or unloaded. When the INCLUDE parameter is used, only the objects specified by it are included in the export. When the EXCLUDE parameter is used, all objects except those specified by it are included in the export. Assume we have EMP, EMP_DETAILS, and DEPT tables owned by dpuser.

$expdp dpuser/dpuser@TDB10G schemas=dpuser 
include=TABLE:\"IN (\'EMP\', \'DEPT\')\" 
  directory=dpump_dir1 dumpfile=dpuser.dmp logfile=dpuser.log
 
$expdp dpuser/dpuser@TDB10G schemas=dpuser 
exclude=TABLE:\"= \'EMP_DETAILS\'\" 
  directory=dpump_dir1 dumpfile=dpuser2.dmp logfile=dpuser.log

As stated earlier, Data Pump performance can be significantly improved by using the PARALLEL parameter. This should be used in conjunction with the "%U" wildcard in the DUMPFILE parameter to allow multiple dump files to be created or read:

$expdp dpuser/dpuser@TDB10G schemas=dpuser 
   directory=dpump_dir1 parallel=4 dumpfile=dpuser_%U.dmp logfile=dpuser.log
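
For the import side, a matching command would look something like this (reusing the dump file names produced by the export above):

$impdp dpuser/dpuser@TDB10G schemas=dpuser 
   directory=dpump_dir1 parallel=4 dumpfile=dpuser_%U.dmp logfile=imp_dpuser.log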

Data Pump API:

The Data Pump API, DBMS_DATAPUMP, provides a high-speed mechanism to move data from one database to another. In fact, the Data Pump export and import utilities are themselves based on this API. The structure used in the client interface of this API is a job handle, which can be created using the OPEN or ATTACH function of the DBMS_DATAPUMP package. Other DBA sessions can attach to a job to monitor and control its progress, so a remote DBA can monitor a job that was scheduled by an on-site DBA.

The following steps list the basic activities involved in using the Data Pump API:

1. Execute the DBMS_DATAPUMP.OPEN procedure to create a job.
2. Define parameters for the job, such as adding files and filters.
3. Start the job.
4. Optionally, monitor the job until it completes.
5. Optionally, detach from the job and attach to it at a later time.
6. Optionally, stop the job.
7. Restart the job, if it was stopped.

Example of the above steps:

 
DECLARE
  p_handle          NUMBER;          -- Data Pump job handle
  p_last_job_state  VARCHAR2(45);    -- To keep track of job state
  p_job_state       VARCHAR2(45);
  p_status          ku$_Status;      -- The status object returned by GET_STATUS
BEGIN
  p_handle := DBMS_DATAPUMP.OPEN ('EXPORT', 'SCHEMA', NULL, 'EXAMPLE', 'LATEST');

  -- Specify a single dump file for the job (using the handle just returned)
  -- and a directory object, which must already be defined and accessible
  -- to the user running this procedure.
  DBMS_DATAPUMP.ADD_FILE (p_handle, 'example.dmp', 'DMPDIR');

  -- A metadata filter is used to specify the schema that will be exported.
  DBMS_DATAPUMP.METADATA_FILTER (p_handle, 'SCHEMA_EXPR', 'IN (''DPUSER'')');

  -- Start the job. An exception will be generated if something is not set up properly.
  DBMS_DATAPUMP.START_JOB (p_handle);

  -- The export job should now be running.
END;
/

The status of the job can be checked by writing a separate procedure that captures the errors and status until the job completes. Overall job status can also be obtained by querying the DBA_DATAPUMP_JOBS view.
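
A quick way to do that (standard DBA_DATAPUMP_JOBS columns; output formatting omitted):

SELECT owner_name, job_name, operation, job_mode, state, degree
FROM   dba_datapump_jobs;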

Conclusion:

Oracle Data Pump is a great tool for the fast movement of data between databases, and much of this performance improvement is derived from the use of parallelism (the PARALLEL parameter). Even when the Transportable Tablespace feature of Oracle is used to move self-contained data between databases, Data Pump is still required for handling the extraction and re-creation of the metadata for that tablespace. Whenever possible, Data Pump performance is further maximized by using the direct-path driver; otherwise, Data Pump accesses the data using an external table access driver. Data Pump also provides flexibility, with parameters such as INCLUDE, EXCLUDE, QUERY, and TRANSFORM that give the DBA more control over the data and objects being loaded and unloaded. With all of these features, Data Pump is a welcome addition to the DBA's toolkit in a world that constantly redefines the size of the "large database".