Wednesday, December 31, 2014

ASM disk group administration-Creating a disk group

In general, once an ASM instance completes the ASM disk discovery operation, you can use the discovered ASM disks either to build a new disk group or to add them to a pre-existing disk group. You need at least one eligible ASM disk (with HEADER_STATUS of
CANDIDATE, FORMER, or PROVISIONED) to be able to build a disk group or to add to a pre-existing disk group. There are various methods to create a disk group, such as DBCA (pre-11g R2), ASMCA (from 11g R2), Grid Control, and the CREATE DISKGROUP SQL statement at the SQL*Plus prompt. Although the easiest and most convenient way to build and manage a disk group is to use GUI tools such as DBCA, ASMCA, or Grid Control, we are going to demonstrate how to create and manage a disk group using a set of SQL statements.

In order to create a disk group on the local ASM instance, you first need to identify the eligible ASM disks discovered by the instance. The following SQL statement is useful for listing the eligible ASM disks:

export ORACLE_SID=+ASM1
export ORACLE_HOME=$ASM_HOME (in 11gR1)
export ORACLE_HOME=$ORA_GRID_HOME (in 11gR2)

SQL> SELECT path,header_status,mode_status,total_mb
FROM v$asm_disk WHERE header_status IN ('CANDIDATE','FORMER','PROVISIONED');

The query lists each eligible disk's path, header status, mode status, and size. Once you have listed the disks, you can use the following SQL statements to create a new disk group with different mirroring levels.

The following SQL statement creates a new disk group named DATA with one ASM disk, /dev/sde1. The EXTERNAL REDUNDANCY clause indicates that you are relying on storage-level (array) mirroring for protection rather than Oracle-provided mirroring:

CREATE DISKGROUP data EXTERNAL REDUNDANCY DISK '/dev/sde1';

The following SQL statement creates a new disk group named DATA that consists of two failure groups, with one disk in each failure group, using the Oracle-provided NORMAL redundancy (two-way mirroring) level:

CREATE DISKGROUP data NORMAL REDUNDANCY FAILGROUP fgp1 DISK '/dev/sde1', FAILGROUP fgp2 DISK '/dev/sdf1';

The following SQL statement creates a new disk group named DATA with three failure groups, one disk in each failure group, using the highest Oracle-provided redundancy level, HIGH (three-way mirroring):

CREATE DISKGROUP data HIGH REDUNDANCY
FAILGROUP fgp1 DISK '/dev/sde1', FAILGROUP fgp2 DISK '/dev/sdf1', FAILGROUP fgp3 DISK '/dev/sdg1';

Note: Failure groups are used to store the redundant copies of every extent. When the mirroring option is omitted, ASM applies the NORMAL REDUNDANCY (two-way mirroring) level by default.

After a disk group is successfully created, it is automatically mounted in the local ASM instance. If a server parameter file (SPFILE) is being used, the name of the disk group is automatically added to the ASM_DISKGROUPS initialization parameter so that the disk group
is mounted automatically on ASM instance restarts. To make the disk group available on the other ASM instances in a cluster, you simply need to mount it there by running the ALTER DISKGROUP data MOUNT statement (ensure the same set of disks is accessible on the other nodes).
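
A minimal sketch of mounting the new DATA disk group from a second node follows; it assumes that node's ASM instance is named +ASM2 and that the same devices are visible there:

export ORACLE_SID=+ASM2
sqlplus / as sysasm
SQL> ALTER DISKGROUP data MOUNT;
SQL> SELECT name, state FROM v$asm_diskgroup;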

Tuesday, December 30, 2014

ASM instance startup/shutdown

Managing an ASM instance is no different from managing a typical RDBMS instance. The ASM instance can be managed either with a set of SQL*Plus commands or with the cluster-aware SRVCTL utility. Nevertheless, it is strongly
recommended that you use the SRVCTL utility to start and stop the ASM instance in a RAC environment.

The ASM instance can be started in NOMOUNT, MOUNT, or RESTRICTED mode with the STARTUP command at the SQL*Plus prompt. When you have planned maintenance on an ASM instance, you can open the instance in RESTRICTED
mode to prevent any connections from the database instances. When the ASM instance starts normally, it first discovers the disks and then mounts all the existing disk groups on the local instance.

To shut down the local ASM instance, you can use the various options available with the SHUTDOWN command at the SQL*Plus prompt. The supported options are NORMAL, IMMEDIATE, TRANSACTIONAL, and ABORT. As mentioned previously, you can also use the SRVCTL utility to bring down the ASM instance. The following examples demonstrate how to start up and shut down an ASM instance using the SQL*Plus and SRVCTL utilities:

srvctl stop asm -n raclinux1 -o normal:immediate:transactional:abort

export ORACLE_SID=+ASM1
export ORACLE_HOME=$ASM_HOME (in 11gR1)
export ORACLE_HOME=$ORA_GRID_HOME (in 11gR2)
sqlplus / as sysasm
SQL> SHUTDOWN NORMAL:IMMEDIATE:TRANSACTIONAL:ABORT

This example stops the ASM instance on the raclinux1 node; the colon-separated list shows the alternative shutdown options, of which you use one.

srvctl start asm -n raclinux1 -o nomount:mount:restrict

export ORACLE_SID=+ASM1
export ORACLE_HOME=$ASM_HOME (in 11gR1)
export ORACLE_HOME=$ORA_GRID_HOME (in 11gR2)
sqlplus / as sysasm
SQL> STARTUP NOMOUNT:MOUNT:RESTRICT


This startup command starts the ASM instance on the raclinux1 node; the colon-separated list again shows the alternative startup modes, of which you use one. When you start the instance in NOMOUNT mode, the ASM instance is started without mounting the existing disk groups. When the instance is started in RESTRICT mode, no database
instance can attach to the ASM instance, and all the disk groups are mounted in restricted mode too.

Ensure that the database instances currently associated with the local ASM instance are stopped prior to shutting down the local ASM instance; otherwise the shutdown fails with an error and the ASM instance remains open. Alternatively, you could use the ABORT option to forcefully shut down the currently connected database instances before shutting down the ASM instance. It is also strongly recommended to dismount any existing ASM Cluster File System (ACFS) filesystems to avoid application I/O errors.
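
For example, a minimal sketch of an orderly sequence on one node, assuming a dependent database named racdb (a hypothetical name) and the instance naming used earlier:

srvctl stop database -d racdb -o immediate
srvctl stop asm -n raclinux1 -o immediate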

Note: If the voting disk and OCR files are placed in a disk group, you will not be able to stop the ASM instance.
In order to stop the ASM instance, you need to stop the cluster.

Friday, December 26, 2014

ASM dynamic views

To manage and monitor the ASM instance and its primary components (disks and disk groups) effectively, you need to know the ASM-specific dynamic views and how to use them to extract useful information about disks, disk groups, and
so on. This section will help you understand the ASM-specific dynamic views available in ASM.

There are about 19 ASM dynamic views available as of 11g R2, and each dynamic view provides different helpful information. All ASM dynamic view names are prefixed with V$ASM_. In the following section, we are going to focus on a handful of these
views.

V$ASM_DISK
When an ASM instance completes the disk discovery operation by reading the disk header information, all disks (used and usable) are listed in the V$ASM_DISK view. Each individual disk has a row in the V$ASM_DISK dynamic view containing very useful information. The PATH column specifies the disk's name and location. The HEADER_STATUS column, in most cases, contains one of the following three values:

• CANDIDATE: Indicates that the unused disks are ready for use.
• FORMER: Indicates that the disk was formerly part of a disk group and is now ready for use.
• MEMBER: Indicates that the disk is currently part of an active disk group.

Apart from the preceding values, the view also contains other useful information about the disks, such as total size, free size, physical reads, redundancy level, and so on. On the flip side, every time you run a query against this view, ASM initiates a disk
discovery operation for new disks, reading all the disk header information. Querying this view can therefore be an expensive operation at times and could impact performance.

The following SQL statement demonstrates a useful query against the view. However, prior to running it, ensure you are connected to the ASM instance through SQL*Plus with the SYSASM privilege:

SELECT path,header_status,total_mb FROM v$asm_disk WHERE header_status in ('CANDIDATE','FORMER','PROVISIONED');

The preceding command displays information about the disks that are eligible to use.

V$ASM_DISKGROUP
After a disk group is successfully created in the local ASM instance, the disk group summary is visible in the V$ASM_DISKGROUP view. Each disk group has a row in the view containing important information, such as disk group number and name, total disk group size, used space, free space, redundancy type, compatibility, mount state, and so on. Every time a query is run against the view, it is likely to have an impact similar to querying the V$ASM_DISK view. The following SQL command extracts each mounted disk group's name, state, total size, and usable free space:

SELECT name,state,total_mb,usable_file_mb FROM v$asm_diskgroup;

V$ASM_OPERATION
V$ASM_OPERATION is one of the more useful views; it displays a row for each long-running operation in the ASM instance. For example, when a disk is dropped from or added to an existing disk group, ASM must initiate and complete a
rebalance operation before releasing the affected disk. The view presents useful details, such as the amount of work already completed and the estimated time (in minutes) required to finish the operation, which helps
you understand how long the operation will take to complete.
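
For example, the following query (a simple sketch using the view's own columns) shows the progress of an ongoing rebalance:

SQL> SELECT group_number, operation, state, power, sofar, est_work, est_minutes
     FROM v$asm_operation;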

V$ASM_DISK_STAT
Although the V$ASM_DISK and V$ASM_DISK_STAT views display nearly identical information, querying the V$ASM_DISK_STAT view is a less expensive operation than querying the V$ASM_DISK view. On the flip side, this view doesn't display details about new disks on the system that the ASM instance has yet to discover. Because querying this view is less expensive, it is strongly recommended that you use it to display information and read/write statistics
about the existing disks.
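
For instance, a quick sketch of pulling read/write statistics for the existing disks:

SQL> SELECT path, total_mb, free_mb, reads, writes, read_time, write_time
     FROM v$asm_disk_stat;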

V$ASM_DISKGROUP_STAT
The V$ASM_DISKGROUP_STAT view displays statistical information about the mounted disk groups in the ASM instance. Unlike a query against the V$ASM_DISKGROUP view, a query against this view doesn't trigger new disk discovery operations and is therefore less expensive in terms of performance. It is recommended to use this view to display existing disk group information and statistics about the disk groups.
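
For example, a lightweight sketch of checking disk group capacity without triggering disk discovery:

SQL> SELECT name, state, type, total_mb, free_mb FROM v$asm_diskgroup_stat;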

V$ASM_CLIENT
When the V$ASM_CLIENT view is queried in the ASM instance, it displays the information about the database instances that are using the disk groups mounted and managed by the ASM instance.
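
For example, the following simple query lists the connected database instances and the disk groups they use:

SQL> SELECT group_number, instance_name, db_name, status FROM v$asm_client;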

ASM background processes

In addition to the typical set of instance background processes, an ASM instance comes with a few additional background processes that assist it in performing its work. Understanding the role played by each of these ASM-specific background processes, and how they help the ASM instance manage and carry out its functionality, is therefore helpful to DBAs. In this section, we cover the most useful ASM-specific background processes
and the part each of them plays:

image

ASM instance configuration and management

Starting with 10g R1, Oracle supports two types of instances: RDBMS and ASM. An ASM instance is a special kind of instance with only a System Global Area (SGA), which typically consists of memory components such as the Shared Pool,
Large Pool, Free Memory, and ASM Cache, and a set of default background processes plus some additional ASM-specific background processes.
The ASM instance doesn't hold any physical database structures, and the name of the instance typically starts with +ASM. The instance is generally named +ASMn (where n represents the instance number) in a cluster environment. The INSTANCE_TYPE initialization parameter controls the role played by these two instance types. The ASM instance configuration
requires only a handful of initialization parameters. These parameters are discussed in the following section.

ASM instances efficiently support both non-RAC and RAC databases. Only one ASM instance is required per node, irrespective of the number of databases/instances running on the node. In a RAC environment, you need to configure one ASM
instance per node, and these ASM instances communicate with each other across the cluster over the interconnect. The database instances interact with the ASM instance to manage the database datafiles. ASM instances
typically register with the default listener on port 1521.

ASM initialization parameters

In this section, we are going to summarize the essential ASM instance initialization parameters, their usage, and recommended values:

image
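
As a rough illustration only (the values below are placeholders, not recommendations), an ASM server parameter file often boils down to a handful of entries such as the following:

*.instance_type='ASM'
*.asm_power_limit=1
*.asm_diskstring='/dev/sd*'
*.asm_diskgroups='DATA','FRA'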

ASM disk group

A disk group is a logical container for one or more ASM disks and is the highest-level data structure in ASM. When a database is configured to use a disk group, the disk group becomes the default location for its datafiles. A disk group
can hold various database file types, such as datafiles, online redo logs, archived logs, RMAN backup sets, OCR and voting files (in 11g R2), and more. ASM also allows a single disk group to be shared by multiple ASM instances
and databases across a cluster.

After a disk group is successfully created and mounted for the first time in the ASM instance, the name of the disk group is automatically added to the ASM_DISKGROUPS initialization parameter so that the disk group is mounted automatically at ASM
instance restarts.

In general, when a datafile is created in a disk group, the datafile extents are striped/distributed evenly across the available disks of the disk group. Optionally, you can also set one of the following mirroring levels on the disk group to protect data
integrity by storing redundant copies of data (extents) in separate failure groups, so that the data survives a disk outage:

• External redundancy: Relies on the STORAGE (RAID)-level mirroring redundancy option to protect the data
• Normal redundancy: Provides a default two-way mirroring option
• High redundancy: Provides a three-way mirroring redundancy option of ASM files

As of 11g R2, the following limits have been imposed on the ASM instance:
• A maximum of 63 disk groups in a storage system
• 1 million files per disk group


The following diagram illustrates the structure of a disk group with three disks assigned to it:
image

ASM disk

A disk is the primary element of an ASM instance. A disk can be built either from a Logical Unit Number (LUN) presented by a storage array, from a disk partition or an entire disk, from a logical volume, or from a network-attached file (NFS). An ASM instance discovers the disks within the paths specified by the ASM_DISKSTRING initialization parameter. Whenever a new disk is discovered by the ASM instance, the header status of the disk is set to CANDIDATE, which makes the disk available to the instance.
After the disks are successfully discovered by the local ASM instance, they appear in the V$ASM_DISK dynamic view of the local ASM instance and are ready to be used to build a new ASM disk group or to be added to any pre-existing
ASM disk group. However, when no particular paths are specified with the ASM_DISKSTRING initialization parameter, ASM by default looks in the following OS-specific paths to discover the disks:

image
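
For example, a sketch of narrowing the discovery string and re-checking discovery (the path pattern is illustrative and assumes an spfile is in use):

SQL> ALTER SYSTEM SET asm_diskstring='/dev/sd*' SCOPE=BOTH;
SQL> SELECT path, header_status FROM v$asm_disk;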

In a cluster environment, ensure that all ASM disks are visible across all nodes and that the disks have the same permissions (typically 660) and ownership (for example, oracle:dba) on every node to avoid running into problems. Oracle also strongly advises using disks of the
same size within a disk group to maintain the disk group balance. On the other hand, you have the flexibility to use a different naming convention for a disk across the nodes in a cluster.
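
For example, on each node (the device name and owner/group are illustrative; many sites use udev rules or ASMLib to make the settings persistent across reboots):

# chown oracle:dba /dev/sde1
# chmod 660 /dev/sde1
# ls -l /dev/sde1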

The following limits, as of 11g R2, have been imposed on ASM disks:
• A maximum of 10,000 disks
• A maximum of 2 TB of storage per ASM disk (4 PB per ASM disk on Exadata)

Wednesday, December 24, 2014

Filesystem versus ASM storage architecture

The following image shows the comparison between a typical filesystem and ASM storage involved when a database is created:

image

Diagram A: In this diagram, an application is being connected to a database that is running on Node 1 and the database datafiles are configured on a typical filesystem storage. As summarized earlier, the filesystem is built from a Logical Volume Group (LVG) and the LVG is prepared on the shared storage. Therefore, to prepare the storage for the database, (filesystem and volume groups), you generally need a third-party tool, such as Volume Manager.

Diagram B: In this diagram, an application is connected to a database that is running on Node 1, and the database datafiles are configured on ASM storage. ASM provides the capabilities of both the filesystem and the volume manager and manages the storage itself. In contrast to a filesystem, ASM storage does not require any third-party tools to manage the storage for the database and eliminates the need to build a volume group and create a filesystem. This bypasses the layers
between the database and the storage, thus improving read performance.

Overview of Automatic Storage Management (ASM)

Automatic Storage Management (ASM) is an optional feature introduced in Oracle 10g that simplifies storage management for all Oracle database file types. It brings the capabilities of a volume manager and a filesystem together into
the Oracle database kernel. It follows the Stripe And Mirror Everything (SAME) methodology: it stripes the data (extents) evenly across the ASM disks of a disk group by default and provides mirroring as an option. The
management and administration of ASM is made easy through a well-known set of SQL statements, such as CREATE, ALTER, and DROP, and through GUI tools.

While ASM was initially intended for managing and maintaining only the Oracle database files and other related files, its functionality has been significantly extended in the 11g R1 and R2 releases to manage other types of data as well. The following are some of the key features and benefits of ASM:

• It simplifies the storage configuration management for Oracle datafiles and other files.
• It eliminates the need for third-party software (for example, a volume manager) to manage the storage for the databases.
• When a datafile is created, it is divided into equally sized (1, 2, 4, 8, 16, 32, or 64 MB) extents that are scattered evenly across the disks of a disk group to provide balanced I/O, improve performance, and prevent hot spots.
• It is built on the Stripe and Mirror Everything (SAME) functionality.
• It supports both non-RAC and RAC databases efficiently.
• It has the ability to add and remove ASM disks online without actually disturbing the ongoing operations.
• It can be managed and administrated using a set of known SQL statements.
• It performs automatic online redistribution for the data whenever a disk is being added or dropped.
• With ASM 11g R2, in addition to all database file types, it can also be used to store non-database file types such as binaries, images, and so on. Beginning with ASM 11g R2, it also provides preferred-read functionality when ASM mirroring is enabled.
• It supports multiple database versions (databases of different Oracle releases can use the same ASM instance).
• It supports a multipathing feature to prevent outages from disk path failures.

Saturday, December 13, 2014

Cassandra - Installing Cassandra

1. First, download the Cassandra installable tarball (I am going to install on a Red Hat Linux machine) from the official Cassandra website, http://cassandra.apache.org/download/, to the location on your machine where you want to install it.

image

2. Cassandra is written in Java and requires Java 7 or above to be installed before you install Cassandra. Ensure that Java 7 or above is installed on your machine; a Java 7 version string looks like 1.7.x.x.
To check which version of Java is installed, simply type the following in a terminal: java -version (as shown below)

image

3. The Cassandra installation is very simple: untarring the tarball effectively installs Cassandra (all the executables and tools become available to get started with Cassandra), and the tarball is only around 22 MB in size.

Once the tarball is downloaded on the machine where you want to install it, make a directory named cassandra in your home directory and place the tarball in it.

To untar the tarball, use the following command:

$ tar -xvzf apache-cassandra-2.1.2-bin.tar.gz

[rvalusa@ol6-11gGG ~]$ cd cassandra/
[rvalusa@ol6-11gGG cassandra]$ ls
apache-cassandra-2.1.2-bin.tar.gz
[rvalusa@ol6-11gGG cassandra]$ tar -xvzf apache-cassandra-2.1.2-bin.tar.gz
apache-cassandra-2.1.2/bin/
apache-cassandra-2.1.2/conf/
apache-cassandra-2.1.2/conf/triggers/
apache-cassandra-2.1.2/interface/
apache-cassandra-2.1.2/javadoc/
apache-cassandra-2.1.2/javadoc/org/
apache-cassandra-2.1.2/javadoc/org/apache/
apache-cassandra-2.1.2/javadoc/org/apache/cassandra/
apache-cassandra-2.1.2/javadoc/org/apache/cassandra/auth/
apache-cassandra-2.1.2/javadoc/org/apache/cassandra/auth/class-use/
apache-cassandra-2.1.2/javadoc/org/apache/cassandra/cache/
apache-cassandra-2.1.2/javadoc/org/apache/cassandra/cache/class-use/
apache-cassandra-2.1.2/javadoc/org/apache/cassandra/cli/
apache-cassandra-2.1.2/javadoc/org/apache/cassandra/cli/class-use/
..........
..........
..........
..........

apache-cassandra-2.1.2/tools/bin/cassandra-stress.bat
apache-cassandra-2.1.2/tools/bin/cassandra-stressd
apache-cassandra-2.1.2/tools/bin/cassandra.in.bat
apache-cassandra-2.1.2/tools/bin/cassandra.in.sh
apache-cassandra-2.1.2/tools/bin/json2sstable
apache-cassandra-2.1.2/tools/bin/json2sstable.bat
apache-cassandra-2.1.2/tools/bin/sstable2json
apache-cassandra-2.1.2/tools/bin/sstable2json.bat
apache-cassandra-2.1.2/tools/bin/sstablelevelreset
apache-cassandra-2.1.2/tools/bin/sstablemetadata
apache-cassandra-2.1.2/tools/bin/sstablemetadata.bat
apache-cassandra-2.1.2/tools/bin/sstablerepairedset
apache-cassandra-2.1.2/tools/bin/sstablesplit
apache-cassandra-2.1.2/tools/bin/sstablesplit.bat
apache-cassandra-2.1.2/tools/bin/token-generator

[rvalusa@ol6-11gGG cassandra]$ ls
apache-cassandra-2.1.2  apache-cassandra-2.1.2-bin.tar.gz
[rvalusa@ol6-11gGG cassandra]$ cd apache-cassandra-2.1.2
[rvalusa@ol6-11gGG apache-cassandra-2.1.2]$ ls
bin          conf       javadoc  LICENSE.txt  NOTICE.txt  tools
CHANGES.txt  interface  lib      NEWS.txt     pylib

[rvalusa@ol6-11gGG apache-cassandra-2.1.2]$


image

Viewing The Main Configuration File:
Cassandra has a main configuration file named cassandra.yaml in the conf directory.
Details on YAML are available at http://en.wikipedia.org/wiki/YAML. YAML is a recursive acronym: YAML Ain't Markup Language.

Below are some of the parameter properties set in the cassandra.yaml config file

[rvalusa@ol6-11gGG conf]$ pwd
/home/rvalusa/cassandra/apache-cassandra-2.1.2/conf
[rvalusa@ol6-11gGG conf]$ cat cassandra.yaml

cluster_name: 'Test Cluster'
num_tokens: 256
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
endpoint_snitch: SimpleSnitch

Providing Permissions to Cassandra directories:
[rvalusa@ol6-11gGG lib]$ su -
Password:
[root@ol6-11gGG ~]# mkdir /var/lib/cassandra <== In here cassandra data files will be residing
[root@ol6-11gGG ~]# mkdir /var/log/cassandra <== In here cassandra system logs will be residing
[root@ol6-11gGG ~]# chown -R rvalusa /var/lib/cassandra
[root@ol6-11gGG ~]# chown -R rvalusa /var/log/cassandra

Cassandra - Getting Started With The Architecture

Understanding That Cassandra Is A Distributed Database
Cassandra is a distributed database; all nodes in a cluster have the same functionality when compared with each other. There are no master or slave nodes, thus eliminating any single point of failure. Data is replicated across the nodes for high availability.

In Cassandra, a cluster can easily be spread across more than one data center, allowing for high availability even if one data center goes down completely.

image

Cassandra Documentation is available at - http://www.datastax.com/docs

Snitch: A snitch is how the nodes in a cluster know about the topology of the cluster (a cassandra.yaml example follows the list below).
Ways to define the snitch:
- Dynamic Snitching: Monitors the performance of reads from the various replicas and chooses the best replica based on this history
- SimpleSnitch: For single-data center deployments only.
- RackInferringSnitch: Determines the location of nodes by rack and data center corresponding to the IP addresses.
- PropertyFileSnitch: Determines the location of nodes by rack and data center.
- GossipingPropertyFileSnitch: Automatically updates all nodes using gossip when adding new nodes.
- EC2Snitch: Use with Amazon EC2 in a single region.
- EC2MultiRegionSnitch: Use with Amazon EC2 in multiple regions.
- GoogleCloudSnitch
- CloudstackSnitch
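
For instance, the snitch is chosen via the endpoint_snitch property in cassandra.yaml; a typical multi-data-center choice (shown only as an example) would be:

endpoint_snitch: GossipingPropertyFileSnitch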


Gossip: Gossip is how the nodes in a cluster communicate with each other.
Every second, each node communicates with up to three other nodes, exchanging information about itself and all the other nodes that it has information about.
Gossip is the internal communication method for nodes in a cluster to talk to each other.

For external communication, such as from an application to a Cassandra database, CQL(Cassandra Query Language) or Thrift are used.

How is data distribution done across the nodes in Cassandra?
Data distribution is done through a consistent hashing algorithm, which strives for an even distribution of data across the nodes in a cluster.
Rather than all of the rows of a table existing on only one node, the rows are distributed across the nodes in the cluster, in an attempt to evenly spread out the load of the table's data.
For example, notice the following rows of data to be inserted into a table within a Cassandra database. (The data will be spread across the nodes based on the hashing algorithm, as illustrated below.)

image

To distribute the rows across the nodes, a Partitioner is used.
The Partitioner uses an algorithm to determine which node a given row of data will go to
The default partitioner in Cassandra is Murmur3

Murmur3 takes the values in the first column of the row (depending on the table definition, more than one column can be used by the Murmur3 partitioner) to generate a number between -2^63 and 2^63.

So, based on the hashing algorithm, the Home_ID column values in the above table turn into the following hashes:
H01033638 –>  -7456322128261119046
H01545551 –>  -2487391024765843411
H00999943 –>  6394005945182357732


Similarly, each node in a cluster has an endpoint (token) value assigned to it (manually, when virtual nodes are not used), which determines which rows of data get distributed to which node. Each node is responsible for the token values between its endpoint and the endpoint of the previous node.

image
Therefore, the row with hash -7456322128261119046 is owned by the node whose endpoint is -4611686018427387904

image

Node token ranges are calculated using the below formula or Murmur3 calculator - http://www.geroba.com/cassandra/cassandra-token-calculator/

image

Replication: A replication factor must be specified whenever a database (keyspace) is defined.
The replication factor specifies how many copies of the data there will be within a given database.
Although 1 can be specified, it is common to specify 2, 3, or more, so that if a node goes down there is at least one other replica of the data, and the data is not lost with the down node.
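
As an illustration, the replication factor is specified when a keyspace is created in CQL; the keyspace name and strategy below are placeholders:

CREATE KEYSPACE home_security
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};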
 

Virtual Nodes: Virtual nodes are an alternative way to assign token ranges to nodes, and are now the default in Cassandra.
With virtual nodes, instead of a node being responsible for just one token range, it is instead responsible for many small token ranges (by default, 256 of them)
Virtual nodes allow for assigning a higher number of ranges to a powerful computer (e.g., 512) and a lower number of ranges (e.g., 128) to a less powerful computer.
Virtual nodes (aka vnodes) were created to make it easier to add new nodes to a cluster while keeping the cluster balanced.
When a new node is added, it receives many small token range slices from the existing nodes, to maintain a balanced cluster.
With the old way of static token ranges, it was common to double the number of nodes, so that the endpoint for each new node could be a value halfway between two existing endpoints.

Cassandra–Introduction to Cassandra

Understanding What Cassandra Is:
Cassandra is a

- Open Source
- A NoSQL (Not Only SQL) database Technology
- A distributed database technology
- A big data technology which provides massive scalability
- Commonly used to create a database that is spread across nodes in more than one data center, for high availability
- Based on Amazon Dynamo and Google Big Table
- Fault Tolerant
- Highly Performant
- Decentralized – No Single point of failure
- Durable – Data is not lost even if one of the data centers goes down
- Elastic – Read/write throughput increases linearly as new nodes are added to the cluster

What Cassandra Is Being Used For:

Use cases of Cassandra are listed in – planetcassandra.org/apache-cassandra-use-cases/

Companies running their applications on Apache Cassandra have realized benefits which have directly improved their business. Cassandra is capable of handling all of the big data challenges that might arise: massive scalability, an always on architecture, high performance, strong security, and ease of management, to name a few

- Product Catalog/Playlist – Coursera, Comcast, Netflix, Hulu, Sky, Soundcloud etc.,
- Recommendation/Personalization - Bazaarvoice, Outbrain, eBay etc.,
- Fraud Detection - Barracuda Networks, F-Secure etc.,
- Messaging -  Accenture, CallFire, eBuddy, The New York Times etc.,
- IOT/Sensor Data - NASA, AppDynamics, Lucid, Aeris etc.,

Saturday, December 6, 2014

Data Pump - Exporting and Importing Directly Across the Network

Sometimes you need to create a testing database and load it with production data. In scenarios like this, the production box is usually located remotely from the development box. Data Pump provides you the ability to take an export and directly import it into your target database without creating any intermediate dump files. This is a fast and efficient way to create new environments from existing environments.

An example will help illustrate how this works. For this example, the production database users are STAR2, CIA_APP, and CIA_SEL. You want to move these users into a testing database and name them STAR_JUL, CIA_APP_JUL, and CIA_SEL_JUL.
This task requires the following high-level steps:

1. Create users in the destination database to be imported into. Here’s a sample script that creates the users in the testing database:

define star_user=star_jul
define star_user_pwd=star_jul_pwd
define cia_app_user=cia_app_jul
define cia_app_user_pwd=cia_app_jul_pwd
define cia_sel_user=cia_sel_jul
define cia_sel_user_pwd=cia_sel_jul_pwd
--
create user &&star_user identified by &&star_user_pwd;
grant connect,resource to &&star_user;
alter user &&star_user default tablespace dim_data;
--
create user &&cia_app_user identified by &&cia_app_user_pwd;
grant connect,resource to &&cia_app_user;
alter user &&cia_app_user default tablespace cia_data;
--
create user &&cia_sel_user identified by &&cia_sel_user_pwd;
grant connect,resource to &&cia_sel_user;
alter user &&cia_sel_user default tablespace cia_data;


2. Create a database link in your testing database that points to your production database. The remote user referenced in the CREATE DATABASE LINK statement has the DBA role granted to it in the production database. Here’s a sample CREATE DATABASE LINK script:

create database link dk connect to darl identified by foobar using 'dwdb1:1522/dwrep1';

3. Create a directory that points to the location where you want your log file to go:

SQL> create or replace directory engdev as '/orahome/oracle/ddl/engdev';

4. Run the import command on the testing box. This command references the remote database via the NETWORK_LINK parameter. This command also instructs Data Pump to map the production database user names to the newly created
users in the testing database.

$ impdp darl/engdev directory=engdev network_link=dk schemas='STAR2,CIA_APP,CIA_SEL' remap_schema=STAR2:STAR_JUL,CIA_APP:CIA_APP_JUL,CIA_SEL:CIA_SEL_JUL parallel=4

This technique allows you to move large amounts of data between disparate databases without having to create or copy any dump files or datafiles. This is a very powerful Data Pump feature that lets you quickly and efficiently transfer data.

If you don’t have Oracle Net connectivity between the two databases, then the steps to accomplish the same task are as follows:

1. Export the production database.
2. Copy the dump file to the testing database.
3. Import the dump file into the testing database.
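
A minimal sketch of those three steps, assuming a directory object named dp_dir and the dump-file name shown here (both illustrative) exist on each side:

$ expdp darl/foo directory=dp_dir schemas=STAR2,CIA_APP,CIA_SEL dumpfile=prod.dmp logfile=prod_exp.log
$ scp /oradump/prod.dmp testbox:/oradump/
$ impdp darl/foo directory=dp_dir dumpfile=prod.dmp logfile=prod_imp.log remap_schema=STAR2:STAR_JUL,CIA_APP:CIA_APP_JUL,CIA_SEL:CIA_SEL_JUL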

Data Pump - Listing the Contents of Dump Files - SQLFILE option

Data Pump has a very robust method of creating a file that contains all the SQL that’s executed when an import job runs. Data Pump uses the DBMS_METADATA package to create the DDL that you can use to recreate objects in the Data Pump dump file.

Use the SQLFILE option of Data Pump import to list the contents of a Data Pump export file. This example creates a file named expfull.sql that contains the SQL statements that the import process calls (the file is placed in the directory defined by the DPUMP_DIR2 directory object):

$ impdp hr/hr DIRECTORY=dpump_dir1 DUMPFILE=expfull.dmp SQLFILE=dpump_dir2:expfull.sql

In the previous command, if you don’t specify a separate directory (like dpump_dir2 in this example), then the SQL file is written to the location specified in the DIRECTORY option.

Tip: You must run the previous command as a user with DBA privileges or as the schema that performed the Data Pump export. Otherwise, you get an empty SQL file without the expected SQL statements in it.

When you use the SQLFILE option with an import, the impdp process doesn’t import any data. It only creates a file that contains the SQL commands that would be run by the import process. It’s sometimes handy to generate a SQL file for the following reasons:

• To preview and verify the SQL statements before running the import
• To manually run the SQL to pre-create database objects
• To create a DDL file that you can later inspect to see if there are differences in users from exports at different points in time

Data Pump - Estimating the Size of Export Jobs

If you’re about to export a large amount of data, you can estimate the size of the file that Data Pump creates before you run the export. You may want to do this because you’re concerned about the amount of space an export job needs.

To estimate the size, use the ESTIMATE_ONLY parameter. This example estimates the size of the export file for an entire database:

$ expdp dbauser/foo estimate_only=y full=y logfile=n

Here’s a snippet of the output:

Estimate in progress using BLOCKS method...
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
. estimated "REP_MV"."REG_QUEUE_REP" 9.606 GB
. estimated "REP_MV"."CWP_USER_PROFILE" 2.589 GB
. estimated "REP_MV"."PRODUCT_INSTANCES_MV" 1.620 GB
. estimated "REP_MV"."ROLES_MV" 1.550 GB
. estimated "STAR2"."F_DOWNLOADS":"DOWN_P_5" 1.450 GB
. estimated "STAR2"."F_DOWNLOADS":"DOWN_P_11" 1.414 GB


Similarly, you can specify a schema name to get an estimate of the size required to export a user:

$ expdp dbauser/foo estimate_only=y schemas=star2 logfile=n

Here’s an example of estimating the size required for two tables:

$ expdp dbauser/foo estimate_only=y tables=star2.f_downloads,star2.f_installations logfile=n

Data Pump - Use a Parameter File

Instead of typing commands on the command line, in many situations it’s preferable to store the commands in a file and then reference the file when executing Data Pump export or import. Using parameter files makes tasks more repeatable and less prone to errors. You can place the commands in a file once and then reference that file multiple times.

Additionally, some of the Data Pump commands (like FLASHBACK_TIME) require the use of quotation marks; in these situations, it’s sometimes hard to predict how the operating system will interpret the quotation marks. Whenever a command requires quotation marks, it’s highly preferable to use a parameter file.

To use a parameter file, first create an operating-system text file that contains the commands you want to use to control the behavior of your job. This example uses the Linux/Unix vi editor to create a text file named imp.par:

$ vi imp.par

Now, place the following commands in the imp.par file:

userid=darl/foo
directory=dp
dumpfile=invp.dmp
logfile=invp.log
tables=f_sales


Next, reference the parameter file via the PARFILE command-line option:

$ impdp parfile=imp.par

Data Pump import processes the parameters in the file as if they were typed on the command line. If you find yourself repeatedly typing the same commands, then consider using a parameter file to increase your efficiency.

Thursday, December 4, 2014

Data Pump - Exporting Data

A small amount of setup is required when you run a Data Pump export job. Here are the steps:

1. Create a database-directory object that points to the operating-system directory that you want Data Pump to write files to (expdp) or read files from (impdp).
2. Grant read, write on the directory to the database user running the export.
3. From the operating-system prompt, run the expdp utility.

Creating a Database Directory
Before you run a Data Pump job, you must create a database-directory object that corresponds to a physical location on disk, which specifies where the dump files and log files are created. Use the CREATE DIRECTORY command to accomplish this. The following creates a directory named dp_dir and specifies that it maps to the /oradump physical location on disk:

SQL> create directory dp_dir as '/oradump';

To view the details of the newly created directory, issue this query:

SQL> select owner, directory_name, directory_path from dba_directories;

By default, when you install Oracle, one default directory object is created named DATA_PUMP_DIR. If you don’t specify the DIRECTORY parameter when exporting or importing, Oracle by default attempts to use the default database-directory object. The default directory associated with DATA_PUMP_DIR can vary depending on the version of Oracle. On some systems, it may be ORACLE_HOME/rdbms/log; on other systems, it may point to ORACLE_BASE/admin/ORACLE_SID/dpdump. You have to inspect DBA_DIRECTORIES to verify the default location for your system.

Granting Access to the Directory
You need to grant permissions on the database-directory object to a user that wants to use Data Pump. Use the GRANT statement to allocate the appropriate privileges. If you want a user to be able to read from and write to the directory, you must grant security access as follows:

SQL> grant read, write on directory dp_dir to darl;

All directory objects are owned by the SYS user. If you’re using a user account that has the DBA role granted to it, then you have the read and write privileges on the directory object.

Taking an Export
When the directory object and grants are in place, you can use Data Pump to export information from a database. DBAs typically use exports for point-in-time backups of data and metadata. You can use these exports to either restore database objects or move data to different database environments. Suppose you recently created a table and populated it with data:

SQL> create table inv(inv_id number);
SQL> insert into inv values (123);

Now, you want to export the table. This example uses the previously created directory named DP_DIR. Data Pump uses the directory path specified by the directory object as the location on disk to write the dump file and log file:

$ expdp darl/foo directory=dp_dir tables=inv dumpfile=exp.dmp logfile=exp.log

The expdp job creates a file named exp.dmp in the /oradump directory that contains the information required to re-create the INV table and populate it with data as of the time the export was taken. In addition, a log file named exp.log is created in the /oradump directory that contains all the logging information associated with this export job.

If you don’t specify a dump-file name, Data Pump creates a file named expdat.dmp. If a file named expdat.dmp already exists in the directory, then Data Pump throws an error. If you don’t specify a log-file name, then Data Pump creates one named export.log. If a file already exists (named export.log), then Data Pump overwrites it.

Data Pump Architecture

Data Pump consists of the following components:

• expdp (Data Pump export utility)
• impdp (Data Pump import utility)
• DBMS_DATAPUMP PL/SQL package (Data Pump API)
• DBMS_METADATA PL/SQL package (Data Pump Metadata API)

The expdp and impdp utilities use the DBMS_DATAPUMP and DBMS_METADATA built-in PL/SQL packages when exporting and importing data and metadata. The DBMS_DATAPUMP package moves entire databases or subsets of data between database environments. The DBMS_METADATA package exports and imports information about database objects.

Note You can call the DBMS_DATAPUMP and DBMS_METADATA packages independently (outside of expdp and impdp) from SQL*Plus. I rarely call these packages directly from SQL*Plus, but you may have a specific scenario where it's desirable to interact directly with them.

When you start a Data Pump export or import job, a master operating-system process is initiated on the database server. This master process name has the format ora_dmNN_<SID>. On Linux/Unix systems, you can view this process from the operating-system prompt using the ps command:

$ ps -ef | grep ora_dm
oracle 14950 717 0 10:59:06 ? 0:10 ora_dm00_STAGE


Depending on the degree of parallelism and the work specified, a number of worker processes are also started. The master process coordinates the work between master and worker processes. The worker process names have the format ora_dwNN_<SID>.

Also, when a user starts an export or import job, a database status table is created (owned by the user who starts the job). This table exists for the duration of the Data Pump job. The name of the status table is dependent on what type of job you’re running. The table is named with the format SYS_<OPERATION>_<JOB_MODE>_NN, where OPERATION is either EXPORT or IMPORT. JOB_MODE can be FULL, SCHEMA, TABLE, TABLESPACE, and so on.

For example, if you’re exporting a schema, a table is created in your account with the name SYS_EXPORT_SCHEMA_NN, where NN is a number that makes the table name unique in the user’s schema. This status table contains information such as the objects exported/imported, start time, elapsed time, rows, error count, and so on. The status table has over 80 columns.
The status table is dropped by Data Pump upon successful completion of an export or import job. If you use the KILL_JOB interactive command, the master table is also dropped. If you stop a job with the STOP_JOB interactive command, the table isn’t removed and is used in the event you restart the job. If your job terminates abnormally, the master table is retained. You can delete the status table if you don’t plan to restart the job.
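
To see which Data Pump jobs (and therefore which status tables) currently exist, you can query the DBA_DATAPUMP_JOBS view, for example:

SQL> SELECT owner_name, job_name, operation, job_mode, state FROM dba_datapump_jobs;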

When Data Pump runs, it uses a database-directory object to determine where to write and read dump files and log files. Usually, you specify which directory object you want Data Pump to use. If you don’t specify a directory object, a default is used. A Data Pump export creates an export file and a log file. The export file contains the objects being exported. The log file contains a record of the job activities.

Below shows the architectural components related to a Data Pump export job. Here’s how you initiate the job from the command line:

$ expdp user/pwd dumpfile=exp.dmp logfile=exp.log directory=dp_dir

In this example, a database-directory object named DP_DIR is defined to reference the /oradump operating-system directory. The output files are defined via the command line to be exp.dmp and exp.log.

image

Below displays the architectural components of a Data Pump import job. Here’s how you initiate the job from the command line:

$ impdp user/pwd dumpfile=exp.dmp logfile=imp.log directory=dp_dir

In the above command, the Data Pump import reads from a dump file named exp.dmp, which is located in the operating-system directory referenced by the database-directory object named DP_DIR. The import job reads the dump file and populates database objects.

image

For each Data Pump job, you must ensure that you have access to a directory object.

Data Pump

Data Pump was introduced in Oracle Database 10g. It replaces the older exp/imp utilities. Data Pump enables you to quickly move data and/or metadata from one environment to another. You can use Data Pump in a variety of ways:

• Point-in-time logical backups of the entire database or subsets of data
• Replicating entire databases or subsets of data for testing or development
• Quickly generating Data Definition Language (DDL) required to re-create objects

Sometimes DBAs hold on to the old exp/imp utilities because they’re familiar with the syntax and these utilities get the job done quickly. Even if those legacy utilities are easy to use, you should consider using Data Pump going forward. Data Pump contains substantial functionality over the old exp/imp utilities:

• Performance with large data sets, allowing you to efficiently export and import gigabytes of data
• Interactive command-line utility, which gives you the ability to disconnect and then later re-attach to active Data Pump jobs
• Ability to export and import large amounts of data from a remote database directly into a local database without creating a dump file
• Ability to make on-the-fly changes to schemas, tablespaces, datafiles, and storage settings from export to import
• Sophisticated filtering of objects and data
• Security controlled via database-directory objects
• Advanced features such as compression and encryption

Saturday, November 22, 2014

Making Backups with RMAN

Backup Sets and Image Copies:
The backup command lets you make two types of RMAN backups: backup sets and image copies. By default, all RMAN backups are in the form of backup sets. Each backup set contains one or more backup pieces, which are files in an RMAN-specific format. Backup sets are the default backup type for both disk- and tape-based backups.

A backup set is a logical structure that consists of a minimum of one backup piece, which is a physical, RMAN-specific format file that actually contains the backed-up data. A backup set can contain data from one or more datafiles, archived redo log files, or control files. By default, a backup set contains just one backup piece. However, you can limit the size of a backup piece by using the maxpiecesize parameter. If you do this and the backup set size is larger than the backup piece size specified by the maxpiecesize parameter, there’ll be multiple backup pieces within that backup set.

Each of the objects you back up with the backup command—database, tablespace, archived redo logs, and so on—will result in at least one backup set if you specify backup set as the backup type. RMAN determines the number of backup sets for a backup according to an internal algorithm. However, you can limit the size of a backup set by specifying the maxsetsize parameter. You can also indirectly control the number of backup sets made by RMAN for each backup by specifying the filesperset parameter, which limits the number of input files (datafiles, archived redo log files, and so on) that can be backed up into a single backup set.

The key difference between an image copy and a backup set is that RMAN can write blocks from many files into the same backup set (known as multiplexing) but can't do so in the case of an image copy; an image copy is identical, byte by byte, to the original datafile, control file, or archived redo log file. An RMAN image copy and a copy you make with an operating system copy command such as dd (which makes image copies) are identical.

Since RMAN image copies are identical to copies made with operating system copy commands, you may use user-made image copies for an RMAN restore and recovery operation after first making the copies “known” to RMAN by using the catalog command. After this point, there’s no difference between those image copies made by you and those made by RMAN. During a restore operation, if you have both image copies and backup sets from the same time period, RMAN prefers to use an image copy over a backup set. This is because there is more overhead involved in sorting through a backup set to get the files to restore. In addition, image copies offer yet another benefit during a restore and recovery operation. If you need to restore a current datafile and happen to have an image copy of that datafile available, you can use the switch command to simply point the database to the replacement file instead of the original datafile. This eliminates the need to restore the datafile, thus speeding up database recovery considerably.
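
For example, hedged sketches of requesting each backup type from the RMAN prompt (the maxpiecesize value is only illustrative):

RMAN> configure channel device type disk maxpiecesize 2g;
RMAN> backup as backupset database;
RMAN> backup as copy database;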

RMAN Backup Modes:
A control file or an archived redo log file is always backed up completely and in a consistent fashion. A datafile, however, may be backed up partly or completely. You can also make consistent or inconsistent backups with datafiles. The various backup types are as follows:

Full vs. incremental backups: A full backup is a backup of a datafile that includes every allocated block in that file. Note that an image copy backup of a datafile always includes every block in that file; a backup of a datafile as a backup set, however, may skip data blocks that aren't in use. An incremental backup can be one of two levels: a level 0 backup, which includes all blocks in the datafile except those skipped because they have never been used, or a level 1 backup, which includes only those blocks that have changed since the parent backup.

Consistent vs. inconsistent backups: A backup taken after a database was shut down gracefully (as opposed to using the shutdown abort command or a shutdown following an abrupt database crash) and restarted in mount state is said to be consistent. A consistent backup doesn’t require recovery after you restore the database. A backup taken while the
database is online or after it was brought into mount state after being shut down abruptly is called an inconsistent backup. An inconsistent backup always needs recovery to make the backup consistent.

If you’re running in archivelog mode, the target database must be mounted or be open before you can issue an RMAN backup command. If you’re running the database in noarchivelog mode, the database must first be shut down cleanly and started up in mount state before you can use RMAN for backups. If the database was abruptly shut down and restarted, RMAN can’t make the backups. You mustn’t back up a database running in noarchivelog mode while the database is open.

By default, all RMAN backups—whole database, tablespace level, and so on—are full backups. That is, all data blocks in the datafiles that were ever used, even if they are currently empty, are included in the backup. You can specify the command backup full database, for example, to start a whole-database backup, but it’s not necessary to do so. Just use the command backup database to do the same thing. However, when you are performing an incremental RMAN backup, you must specify the keyword incremental in your backup commands since it isn’t the default backup type.
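
For instance, the following commands illustrate the full versus incremental syntax (a simple sketch):

RMAN> backup database;
RMAN> backup incremental level 0 database;
RMAN> backup incremental level 1 database;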

Types of Files That RMAN Can Back Up:
RMAN lets you back up all the files you’d need for a database recovery, such as the following:

• Datafiles
• Control files
• Archived redo logs
• Image copies of datafiles and control files, including those made by RMAN
• Backup pieces that contain RMAN backups

The Oracle database uses three types of “live” files during its operation: datafiles, online redo log files, and control files. Of these three types of files, RMAN backs up only the datafiles and the control files. You can’t use RMAN to back up the online redo log files. If you’re operating in noarchivelog mode, then you won’t need the online redo logs, since the database files are always consistent when you back up the database using the only permitted modes of backing up a database in noarchivelog mode, which are closed whole backups. You won’t need the online redo log backups if you’re operating in archivelog mode either, since RMAN is continually backing up all your archived redo logs. However, you must make sure you always multiplex the online redo log so you won’t lose all members of a group and thus all the committed changes as yet unrecorded in the datafiles.

In addition to the previously mentioned types of files, RMAN also can back up the server parameter file, or spfile, which contains the initialization parameter for starting up your database. You can’t, however, back up the following types of files using RMAN:

• External files
• Network configuration files
• Password files
• Any Oracle home-related files

Use normal operating system copy utilities to back up any of these four types of files.
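
As a quick sketch, the supported file types can also be backed up individually from the RMAN prompt:

RMAN> backup spfile;
RMAN> backup current controlfile;
RMAN> backup archivelog all;
RMAN> backup datafile 1;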


Saturday, November 15, 2014

Configuring the RMAN Environment - Showing RMAN Configuration Settings

Problem
You want to see your current RMAN configuration settings. For example, you may be seeing unexpected RMAN behavior, or you may be encountering performance issues because of how you’ve configured RMAN in your environment.

Solution
Use the RMAN command show to view the current value of one or all of RMAN’s configuration settings. The show command will let you view the value of a specified RMAN setting. For example, the following show command displays whether the autobackup of the control file has been enabled:

RMAN> show controlfile autobackup;
RMAN configuration parameters are:
CONFIGURE CONTROLFILE AUTOBACKUP OFF;
RMAN>

The show all command displays both settings that you have configured and any default settings. Any default settings will be displayed with a # default at the end of the line. For example, the following is the output from executing the show all command:

RMAN> connect target /
RMAN> show all;
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUPTYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'ZLIB'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO
'C:\ORACLE\PRODUCT\11.1\DB_1\DATABASE\SNCFORCL.ORA'; # default
RMAN>

How It Works
The show command queries the target database control file to retrieve RMAN configuration settings. Configuration settings are stored in the target database control file regardless of whether you are using a recovery catalog. Once configured, settings persist until you change them again. Because RMAN settings are stored in the control file, your target database must
be mounted or open when issuing the show command.

The show all command reveals the present configuration regarding several important RMAN backup and recovery settings. The following list summarizes the meaning of the most important of these settings, shown by issuing the show all command:

configure retention policy to redundancy 1 means that RMAN retains only one set of backup copies.
configure backup optimization off means that by default RMAN won’t skip the backing up of unchanged data blocks in the datafiles.
configure default device type to disk means that by default RMAN sends backup output to a disk drive.
configure controlfile autobackup off means that by default RMAN doesn’t automatically back up the control files when it performs a backup task.
configure device type disk parallelism 1 backup type to backupset means that the default RMAN backup type is a backup set (and not an image copy) and the degree of parallelism is 1.
configure datafile backup copies for device type disk to 1 means that by default RMAN doesn’t make multiple copies of a backup file.
configure maxsetsize to unlimited means that there’s no limit on the size of a backup set by default.
configure encryption for database off means that by default RMAN backups aren’t encrypted.

Notice that the output of the show all command shows the existing RMAN configuration in the form of RMAN commands to re-create that configuration. Therefore, if you are planning to use the same type of configuration on a different database, just save the output from the show all command to a text file that you can then execute from the RMAN command line after connecting to the target database to which you’re planning to migrate those settings.
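As a hedged illustration of that approach, you can capture the show all output with RMAN's spool log facility and then replay the edited file against the other database (the file name and the second connect string here are only examples):

RMAN> connect target /
RMAN> spool log to '/tmp/rman_config.txt';
RMAN> show all;
RMAN> spool log off;
# edit /tmp/rman_config.txt so that only the CONFIGURE ... ; lines remain,
# then connect to the new target database and run the file:
RMAN> @/tmp/rman_config.txt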

You can view information about RMAN’s persistent configuration settings by querying the V$RMAN_CONFIGURATION view, as shown here:

SQL> select * from v$rman_configuration;
     CONF# NAME                           VALUE
---------- ------------------------------ ------------------------
         1 RETENTION POLICY               TO REDUNDANCY 3
         2 BACKUP OPTIMIZATION            ON
         3 DEFAULT DEVICE TYPE TO         sbt_tape
         4 CONTROLFILE AUTOBACKUP         ON
         5 DEVICE TYPE                    DISK PARALLELISM 2

5 rows selected.

The NAME column in the V$RMAN_CONFIGURATION view shows the type of RMAN configuration, and the VALUE column shows the present configure command setting for that type, for example, configure retention policy to redundancy 3.

Sunday, November 9, 2014

RMAN-Checking the Syntax of RMAN Commands

Problem
You want to check the syntax of your RMAN commands without actually executing the commands.

Solution

To check the syntax of RMAN commands, you must start the RMAN client with the operating system command-line argument checksyntax. You can easily check the syntax of commands prior to their execution either by entering them at the command prompt or by reading in the commands through a command file. Here’s how you check the syntax of a single RMAN command (run {backup database;}) by first starting the RMAN client with the checksyntax argument:

$ rman checksyntax
Recovery Manager: Release 11.1.0.1.0 - Beta on Mon Apr 2 08:31:11 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
RMAN> run {backup database;}
The command has no syntax errors
RMAN>


In this example, there were no errors in the syntax of the simple run block, and RMAN confirms that. You can also use the checksyntax argument to check the syntax of RMAN commands that are part of a command file. Simply specify the checksyntax argument before invoking the command file that consists of the RMAN commands. In the following example,
the file goodcmdfile contains a couple of restore and recovery commands:

$ rman checksyntax @/tmp/goodcmdfile
Recovery Manager: Release 11.1.0.1.0 - Beta on Mon Apr 2 08:31:11 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
RMAN> # file with legal syntax
2> restore database;
3> recover database;
4>
The cmdfile has no syntax errors
Recovery Manager complete.
$


You can also open an RMAN session solely for the purpose of checking the syntax of commands
that you type interactively:

$ rman checksyntax

An important point about the checksyntax argument is that you can't use it after starting RMAN; there is no equivalent command at the RMAN prompt. You must pass checksyntax as an argument to the rman executable when you start the RMAN client, without connecting to any target database or recovery catalog.

How It Works
When you start RMAN with the checksyntax argument, RMAN won't actually execute any commands, whether you pass it a command file or type commands interactively; it only checks and reports on their syntax. If the commands have no errors, RMAN responds with "The command has no syntax errors" (or "The cmdfile has no syntax errors" for a command file). Otherwise, RMAN issues an error, as shown in the following example:

$ rman checksyntax @/tmp/badcmdfile
Recovery Manager: Release 11.1.0.1.0 - Beta on Mon Apr 2 08:31:11 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
RMAN> # file with illegal syntax
RMAN> run (backup database);
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00558: error encountered while parsing input commands
RMAN-01009: syntax error: found "(": expecting one of: "{"
RMAN-01007: at line 1 column 5 file: standard input
RMAN>

The output of this checksyntax run reveals that there is a syntax error in the run block. Starting RMAN with checksyntax is handy for vetting scripts: there's no need for a script to fail unexpectedly because you mangled the syntax of a command, so test your scripts with checksyntax before you schedule them.

RMAN-Creating Dynamic Command Files

Problem
You want to create dynamic command files that can be used for multiple jobs by passing substitution variables.

Solution
You can create dynamic shell scripts by embedding substitution variables in the RMAN command files those scripts invoke and supplying values for the variables through the using clause when calling the command file. Just as in SQL*Plus, the &integer syntax (&1, &2, and so on) indicates which substitution variable each supplied value is assigned to.
Let’s review an example that shows how to create a dynamic backup shell script.

1. Create the RMAN command file that uses two substitution variables:

#backup.cmd
connect target sys/<sys_password>@prod1
run {
backup database
tag &1
format '&2';
}
exit;


The command file shown here will back up the database using two substitution variables (&1 and &2), one for the backup tag and the other for the string value in the format specification.
2. Create the shell script to run the backup command file you created in step 1:

#!/bin/tcsh
# script name: nightly_backup.sh
set tag=$argv[1]
set format=$argv[2]
rman @backup.cmd using $tag $format


3. Now that you have created a dynamic shell script, you can specify the arguments for the tag and format variables on the command line, thus being able to modify them for different jobs. Here’s an example:

$ nightly_backup.sh longterm_backup back0420

The example shows how to execute the shell script nightly_backup.sh with two dynamic parameters, longterm_backup (tag) and back0420 (format string).

How It Works
The ability to use substitution variables in RMAN scripts is new in Oracle Database 11g. The use of substitution variables in RMAN scripts is similar to the way you specify substitution variables in operating system and SQL*Plus scripts. Specifying substitution variables lets you use the same command file by modifying it appropriately for different backup tasks, thus
making the command file dynamic.
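If you'd rather skip the wrapper script, you could pass the same values straight to the RMAN client; this is simply the call the shell script above ends up issuing (the tag and format values are the same illustrative ones used earlier):

$ rman @backup.cmd using longterm_backup back0420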

Saturday, November 8, 2014

RMAN-Executing Operating System Commands from Within RMAN

Problem
You’ve invoked the RMAN client, and now you need to issue some operating system commands.

Solution

Use the RMAN command host to invoke an operating system subshell. You can execute this command in two ways: you can issue it from the RMAN prompt, or you can execute it from inside a run block, which is a group of RMAN commands executed as a single unit. If you issue the host command stand-alone, without any parameters, RMAN will take you to the operating
system command line. Thus, the host command works the same in RMAN as it does from within SQL*Plus. If you issue the command host followed by a valid operating system command as a parameter, then RMAN will execute that operating system command and continue to process the rest of the commands in the run block, if there are any.
In the following example, we use the host command to list all files ending with dbf, after backing up a datafile from the RMAN prompt:

RMAN> shutdown immediate;
RMAN> startup mount;
RMAN> backup datafile '/u01/app/oracle/oradata/targ/system01.dbf' format '/tmp/system01.dbf';
RMAN> host 'ls -l /tmp/*dbf';
RMAN> alter database open;


The following example uses the host command with no parameters to temporarily escape to the operating system level during an interactive RMAN session:

RMAN> backup datafile 3 format '/u01/app/oracle/oradata/targ_db/dbs01.cpy';
RMAN> host;
$ ls $ORACLE_HOME/oradata/dbs01.cpy /net/oracle/oradata/dbs01.cpy
$ exit
RMAN>


How It Works
As you can see in the two examples, you can use the host command with or without an operating system command as a parameter. If you run the host command as part of a series of RMAN commands, RMAN executes the host command and continues with the rest of the commands. When you execute the host command by itself, RMAN displays the operating
system command prompt and resumes after you exit the command-line subshell.
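Because the solution notes that host can also appear inside a run block, here is a minimal, hypothetical sketch (the datafile number and the /tmp paths are only examples) showing RMAN executing an operating system command between backup steps:

run {
backup datafile 3 format '/tmp/df3_%U.bkp';
host 'ls -l /tmp/df3_*';
}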

Creating an Oracle Password File

You can easily create an Oracle password file with the help of the orapwd utility. Just type orapwd at the operating system command line to view the syntax of the command:

$ orapwd
Usage: orapwd file=<fname> password=<password> entries=<users> force=<y/n> ignorecase=<y/n> nosysdba=<y/n>


where
file - name of password file (required),
password - password for SYS (required),
entries - maximum number of distinct DBA (optional),
force - whether to overwrite existing file (optional),
ignorecase - passwords are case-insensitive (optional),
nosysdba - whether to shut out the SYSDBA logon (optional Database Vault only).
There must be no spaces around the equal-to (=) character.

Of the six options for the orapwd utility, only the file and password options are mandatory, although entries is commonly specified as well. You can create a simple Oracle password file using the following syntax:

$ orapwd file=mydb_pwd password=sammyy1 entries=20

This command will create an Oracle password file. The default location for the password file is the $ORACLE_HOME/dbs directory. Once you create the password file, edit your init.ora file or your spfile in the following manner:

remote_login_passwordfile = 'EXCLUSIVE'

Once you restart your database after this, you'll be able to log in as the SYS user using the new password file.
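For instance, assuming a net service name of mydb that points to this database (the service name is purely illustrative) and the SYS password used above, a remote SYSDBA login through the password file might look like this:

$ sqlplus sys/sammyy1@mydb as sysdba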

RMAN - Logging Command-Line RMAN Output

Problem
You want to log the output of RMAN commands you issue in command-line mode.

Solution
If you want RMAN to log all its output when you use RMAN from the operating system command line, just add the keyword log to the command line, and supply the name of the log file to use. For example:

$ rman target / cmdfile commandfile1.rcv log /u01/app/oracle/outfile.txt

In this case, RMAN will write the output of the RMAN commands in the command file named commandfile1.rcv to the log file outfile.txt. If you later want to run another set of RMAN commands and append their log messages to the same log file, use the append option along with the log option. Here's an example:

$ rman target / cmdfile commandfile2.rcv log /u01/app/oracle/outfile.txt append

The previous command will append the output from executing the command file commandfile2.rcv to the text file outfile.txt.

How It Works

The command-line argument log causes RMAN to send all its output to the log file you specify. If you fail to add the keyword append when referring to an already existing log file, RMAN will overwrite that older log file. If you are running RMAN interactively and you want to see output on your terminal screen as well as have it written to a log file, you can take advantage of the Unix/Linux tee command, which sends output both to a text file and to the terminal. Here's how you use the tee command:

$ rman | tee rman.log
RMAN>


All is not lost if you don't specify a log file to capture the RMAN output. The view V$RMAN_OUTPUT returns detailed information about RMAN jobs in progress. For example, if your media manager runs into a problem with a tape drive, RMAN records the associated error messages in V$RMAN_OUTPUT and also outputs the message to the terminal or to a log file. V$RMAN_OUTPUT is a memory-only view, so its contents are cleared when you restart the instance. The V$RMAN_STATUS view contains information about completed RMAN jobs as well as all RMAN jobs in progress.
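As a hedged sketch, the following queries pull the most recent job rows and their message text from these two views (the column choices are just a convenient starting point):

SQL> SELECT operation, status, object_type, start_time
     FROM v$rman_status
     ORDER BY start_time DESC;

SQL> SELECT output
     FROM v$rman_output
     ORDER BY recid;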

Friday, November 7, 2014

RMAN–Using RMAN

You can start using RMAN to back up and recover your databases with very little fanfare. When you install the Oracle server software, you’ll automatically install RMAN as well. You only absolutely need two things to start using RMAN: the database you want to back up (referred to as the target database) and the RMAN client, which is the interface you use to interact with the RMAN server processes that perform the actual backup and recovery tasks.

When you use RMAN to back up and recover your database files and objects, you use the RMAN client to interact with the database. The RMAN client interprets the RMAN commands you issue and starts up the necessary server sessions to process those commands. The term RMAN repository refers to the record of RMAN metadata about all backup and recovery actions on the target database. RMAN relies on this metadata when it performs backup and recovery operations.

By default, RMAN always stores a copy of the RMAN repository in the target database’s control file. Optionally, you can also use a recovery catalog for long-term storage of the RMAN repository. Whenever there is a change in the database structure, archived redo logs, or backups, RMAN updates the recovery catalog with the new information from the target database
control file. This way, you have an alternate source for the all-important RMAN repository data if you lose or can’t access the control file of the target database. In addition, the recovery catalog provides a long-term storage capacity for all RMAN backup and recovery information, whereas such older data is liable to be overwritten in the control file. The recovery catalog exists as a separate database schema, located ideally in a database separate from the target database(s). You can simplify your RMAN administration by using a single recovery catalog for all your Oracle databases.

You start up the RMAN client using the RMAN executable rman, which you’ll find in the $ORACLE_HOME/bin directory. In addition to the rman executable, RMAN also comes with two other internal components: one a set of PL/SQL procedures in the target database and the other a file named recover.bsq. RMAN turns the backup and recovery commands you issue into PL/SQL procedure calls using the recover.bsq file to construct the calls. After you start the
RMAN client, you must log in using either operating system credentials or database authentication. After logging in, you can issue backup and recovery instructions either by entering RMAN commands at the command line or by executing a script file that contains RMAN commands. You can also issue several types of SQL commands from the RMAN command line.
After you finish your backup and recovery session, you exit the RMAN client.

The recover.bsq file is located at ?/rdbms/admin/recover.bsq (where ? stands for $ORACLE_HOME).

In addition to the target database and the RMAN client, the RMAN environment can have other optional elements. If you follow Oracle's backup and recovery recommendations, you may also have a flash recovery area. In addition, if you back up to tape, you must have a media management layer (MML), since RMAN can't work with tape drives directly. RMAN can use either a third-party MML or Oracle's own backup and recovery offering, called Oracle Secure Backup. The MML accesses and controls the tape libraries and manages the loading and unloading of tapes.

Finally, if you plan on working with several databases, it may be a smart idea to use an RMAN catalog database, which is a separate Oracle database dedicated to storing the recovery catalog. Although the recovery catalog isn’t mandatory, it provides two important advantages over using the database control file to store the RMAN metadata relating to backup and recovery
activity: you can store vastly greater amounts of data in the recovery catalog as compared to a control file, and you can store RMAN scripts inside the recovery catalog. By default, all RMAN-related records in the target database’s control file are overwritten after seven days, but you can control the length of retention by setting a higher value for the initialization parameter control_file_record_keep_time.
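For example, a minimal sketch of checking and raising that retention window would be the following (the value of 30 days is purely illustrative, and SCOPE=BOTH assumes the instance uses an spfile):

SQL> show parameter control_file_record_keep_time
SQL> ALTER SYSTEM SET control_file_record_keep_time = 30 SCOPE=BOTH;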

One may argue that since the control file can record all of RMAN's metadata, there is no need to create and manage a separate recovery catalog database to store RMAN metadata. However, consider a situation where you lose all your control file copies at once. You can, of course, rebuild the control file quickly using the output of a recent alter database backup controlfile to trace command. However, when you re-create the control file from that output, the one thing you do not get back is the RMAN metadata that used to be stored in the control file. This, and the fact that Oracle may eventually overwrite even useful RMAN metadata in the control file, means you should seriously consider using the recovery catalog. Oracle recommends using a recovery catalog to provide redundancy for your RMAN metadata. Chapter 6 discusses the recovery catalog in detail.

Wednesday, November 5, 2014

Backup and Recovery Instance Architecture–Undo Redo

The Oracle instance consists of the system global area (SGA), which is the memory allocated to the Oracle instance, and a set of Oracle processes called the background processes. The Oracle processes start when you start the instance and keep running as long as the instance is alive. Each of the Oracle background processes is in charge of a specific activity, such as writing
changed data to the datafiles, cleaning up after disconnected user sessions, and so on.
We’ll briefly review the key Oracle background processes that perform critical backup and recovery–related tasks, which are the checkpoint process, the log writer process, and the archiver process.
The Checkpoint Process
The checkpoint process does three things:
• It signals the database write process (DBWn) at each checkpoint.
• It updates the datafile headers with the checkpoint information.
• It updates the control files with the checkpoint information.
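If you want to confirm what the checkpoint process has recorded, a simple, hedged check is to compare the checkpoint SCN stored in the control file with the one stamped in the datafile headers (any SQL*Plus session with the appropriate privileges will do):

SQL> SELECT checkpoint_change# FROM v$database;
SQL> SELECT file#, checkpoint_change#, checkpoint_time
     FROM v$datafile_header;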
The Log Writer Process
Oracle’s online redo log files record all changes made to the database. Oracle uses a “writeahead” protocol, meaning the logs are written to before the datafiles are. Therefore, it is critical to always protect the online logs against loss by ensuring they are multiplexed. Any changes made to the database are first recorded in the redo log buffer, which is part of the SGA.
Redo log files come into play when a database instance fails or crashes. Upon restart, the instance will read the redo log files looking for any committed changes that need to be applied to the datafiles. Remember, when you commit, Oracle ensures that what you are committing has first been written to the redo log files before these changes are recorded in the actual datafiles. The redo log is the ultimate source of truth for all changes to the data in an Oracle database, since an instance failure before the changes are written to the datafiles means that the changes are only in the redo log files but not in the datafiles.
The log writer (LGWR) process is responsible for transferring the contents of the redo log buffer to the online redo log files. The log writer writes to the online redo files under the following circumstances:
• At each commit
• Every three seconds
• When the redo log buffer is one-third full

The important thing to remember here is that the log writer process writes before the database writer does, because of the write-ahead protocol. Data changes aren’t necessarily written to datafiles when you commit a transaction, but they are always written to the redo log.
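Because losing an online log group can be catastrophic, a quick, hedged check that your redo logs are multiplexed (as recommended above) is to list the members of each log group; any group with a single member deserves attention:

SQL> SELECT group#, members, status FROM v$log;
SQL> SELECT group#, member FROM v$logfile ORDER BY group#;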
The Archiver Process
The archiver (ARCn) is an optional background process and is in charge of archiving the filled online redo log files, before they can be overwritten by new data. The archiver background process is used only if you’re running your database in archivelog mode.
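A quick way to confirm the logging mode of the database (and therefore whether ARCn has any work to do) is either of the following standard commands:

SQL> archive log list
SQL> SELECT log_mode FROM v$database;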

Physical Database Structures Used in Recovering Data
You need to deal with four major physical database structures during a database recovery:
• Datafiles
• Redo logs (archived and online)
• Control files
• Undo records
In a basic database recovery situation, you first restore the datafiles from backups (taken at an earlier point in time, of course). Once the restoration of the datafiles is complete, you issue the recover command, which rolls the database forward by applying all committed changes and thus brings it up-to-date. The database also rolls back any uncommitted changes recorded in the undo segments of the undo tablespace; the database server performs this rollback automatically, using the undo records to back out any uncommitted changes that the redo logs applied to the datafiles during the roll-forward phase. Oracle records all changes made to the database in the online redo log files. Since Oracle writes to the online redo log members in a round-robin fashion, it is critical to save the filled online redo logs before they are overwritten. The process of saving the filled redo log files is called archiving, and the saved redo log files are termed archived redo log files. A media recovery uses both the archived redo log files and the online redo log files.
The control file is essential for the Oracle instance to function, because it contains critical information concerning tablespace and datafile records, checkpoints, redo log threads in the current online redo log, log sequence numbers, and so on.
RMAN lets you back up all the files you need for a database recovery, including datafiles, control files, and archived redo logs. RMAN also lets you make image copies of both datafiles and control files, in addition to the standard RMAN-formatted backup pieces. You should never back up online redo log files; instead, always duplex these files to protect against the loss of an online redo log.
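As a minimal, hedged example of backing up those recovery-relevant files together (assuming the database runs in archivelog mode and RMAN is connected to the target), a single command covers the datafiles and the archived redo logs, and a second command adds the control file explicitly if you haven't enabled autobackup:

RMAN> backup database plus archivelog;
RMAN> backup current controlfile;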
HOW TO DETECT UNDO: Several views show information related to undo activity:
  • V$UNDOSTAT: a histogram-like view that shows undo statistics in 10-minute intervals.
  • V$TRANSACTION: a present-time view providing information on current transactions.
  • V$SESSTAT: individual session statistics, which include one for undo usage.
  • V$UNDOSTAT also hints at who is responsible: its MAXQUERYID column records the longest-running query of each 10-minute interval and can be joined to V$SQL, whose PARSING_USER_ID or PARSING_SCHEMA_NAME columns identify the suspect (see the join sketch at the end of this section).
  • V$TRANSACTION joined with V$SESSION shows the undo blocks currently used by ongoing transactions. This query may help:
SELECT a.sid, a.username, b.used_urec, b.used_ublk
FROM v$session a, v$transaction b
WHERE a.saddr = b.ses_addr
ORDER BY b.used_ublk DESC;


  • V$SESSTAT offers yet another angle, but to avoid getting lost in the maze of Oracle statistics, focus on just one: undo change vector size, which accumulates the bytes of undo generated over the lifetime of a session. The following query is designed to pinpoint the sessions with the highest undo activity:


SELECT a.sid, b.name, a.value
FROM v$sesstat a, v$statname b
WHERE a.statistic# = b.statistic#
AND b.name = 'undo change vector size'  -- filter by name rather than the version-specific statistic# (176 here)
ORDER BY a.value DESC;
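As promised in the list above, the following hedged join sketch ties each 10-minute V$UNDOSTAT interval to the parsing schema and text of its longest-running query (duplicate rows may appear because V$SQL can hold several child cursors per SQL_ID):

SELECT u.begin_time, u.maxqueryid, s.parsing_schema_name, s.sql_text
FROM v$undostat u, v$sql s
WHERE s.sql_id = u.maxqueryid
ORDER BY u.begin_time DESC;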

HOW TO DETECT REDO: To find sessions generating lots of redo, you can use either of the following methods. Both methods examine the amount of undo generated, because a transaction that generates undo automatically generates redo as well. The methods are: 1) Query V$SESS_IO. This view contains the column BLOCK_CHANGES, which indicates how many blocks have been changed by the session. High values indicate a session generating lots of redo.

The query you can use is:


SELECT s.sid, s.serial#, s.username, s.program, i.block_changes
FROM v$session s, v$sess_io i
WHERE s.sid = i.sid
ORDER BY 5 desc, 1, 2, 3, 4;


Run the query multiple times and examine the delta between each occurrence of BLOCK_CHANGES. Large deltas indicate high redo generation by the session.
2) Query V$TRANSACTION. This view contains information about the number of undo blocks and undo records accessed by the transaction (found in the USED_UBLK and USED_UREC columns).

The query you can use is:


SELECT s.sid, s.serial#, s.username, s.program, t.used_ublk, t.used_urec
FROM v$session s, v$transaction t
WHERE s.taddr = t.addr
ORDER BY 5 desc, 6 desc, 1, 2, 3, 4;

Run the query multiple times and examine the delta between each occurrence of USED_UBLK and USED_UREC. Large deltas indicate high redo generation by the session. Use the first query when you need to identify programs generating lots of redo across more than one transaction; use the second query to find out which particular transactions are generating the redo.

Archivelog and Noarchivelog Mode of Operation
You can operate your Oracle database in either archivelog mode or noarchivelog mode. In noarchivelog mode, Oracle will overwrite the filled online redo logs, instead of archiving (saving) the online redo logs. In this mode, you’re protected only from instance failures, such as those caused by a power failure, for example, but not from a media failure. Thus, if there is a media failure, such as a damaged disk drive, the changes that were overwritten are gone forever, and the database won’t be able to access those data modifications to recover the database up to the current point in time. The transactions made since the last backup are lost forever, and you can restore the database only to the point of the last backup you made.

If you are running your database in noarchivelog mode and you happen to lose a datafile, for example, you follow these steps to get back to work again:
1. If the instance isn’t already shut down, first shut it down.
2. Restore the entire database (datafiles and control files) from the backups.
3. Restart the database by using the startup (open mode) command.
4. Users lose any data that was changed or newly entered in the database since the backup you just restored was taken. You can re-enter that data only if you have a source for it; otherwise, you face a permanent data loss.

If you are running a production database—or if you want to make sure that all the data changes made to any database, for that matter, are always protected—you must operate your database in archivelog mode. Only a database running in archivelog mode can recover from both instance and media failures. You can’t perform a media recovery on a database running in noarchivelog mode.

If you’re running the database in noarchivelog mode, remember that you can make a whole-database backup only after first shutting the database down. You can’t make any online tablespace backups in such a database. A database in noarchivelog mode also can’t use the tablespace point-in-time recovery technique. Make sure you take frequent whole-database backups if an important database is running in noarchivelog mode for some reason.