This article introduces the concept of data warehouse backup and its main types. It also briefly explains why a backup, or in other words a storage facility, is needed.
Corporations and other leading companies build data warehouses to hold and safeguard the large volumes of data they need to run their business successfully. In every company, data is collected from various sources such as online transaction processing (OLTP) applications, data mining, decision support systems (DSS) and others. This data is stored in the data warehouse and can be accessed through databases such as an Oracle database, as well as through press material, multimedia, market data, graphs, drawings, order and project data, and links to Internet and intranet websites. A data warehouse is very large, typically from one terabyte to tens of terabytes or more. It has to stay up and running for millions of users at home and in the office, such as IT professionals, system analysts, business analysts and industrialists, as well as the data center community, 24 hours a day, 365 days a year.
Components of Data Warehouse
A data warehouse consists of many components, including the following:
- Global Replication
- Disaster Recovery
- On-the-fly Change
- Backup & Recovery
Let us focus mainly on data warehouse backup.
DATA WAREHOUSE BACKUP
a. Concept of Backup and Recovery
Backup and recovery is one of the most important factors to consider when maintaining a data warehouse. It means implementing strategies, methods and procedures to protect databases, the data center and data mining results against loss and risk, and to recover or reconstruct the data after a failure. A backup is simply another copy of the original data. It can cover any relevant part of a database, a document, a control file or an online transaction processing application. Maintaining backup copies protects data from application or processing errors and acts like a bodyguard against data loss. Database backups are either ‘cold’ or ‘hot’; both back up the databases, related files and links within the whole data warehouse.
b. ‘Cold’ versus ‘Hot’ Database Backup
A ‘cold’ database backup copies the whole data warehouse while the database is shut down. Its disadvantage, for a warehouse that operates nonstop, is that there is not a large enough window in which to take the backup. That is where a ‘hot’ database backup comes in. A ‘hot’ database backup copies the data warehouse, with its databases and related files, while they are being updated. It requires a high-end backup product with ‘hot’ backup capability and a recovery system, for example Oracle 7 or 8 databases. Such a product can back up a huge number of files, links, databases and other data.
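The choice between the two comes down to simple arithmetic: how long a full copy takes compared with how long the warehouse can be offline. The sketch below illustrates that reasoning only. The function names, the throughput figure and the 10 TB size are hypothetical values chosen for illustration, not measurements from any real system.

```python
def backup_hours(size_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy size_tb terabytes at a sustained throughput."""
    size_mb = size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return size_mb / throughput_mb_per_s / 3600


def choose_backup_mode(size_tb: float, throughput_mb_per_s: float,
                       downtime_window_hours: float) -> str:
    """Return 'cold' if a full offline copy fits the window, else 'hot'."""
    needed = backup_hours(size_tb, throughput_mb_per_s)
    return "cold" if needed <= downtime_window_hours else "hot"


if __name__ == "__main__":
    # Assumed figures: a 10 TB warehouse and 200 MB/s to the backup device.
    print(f"{backup_hours(10, 200):.1f} hours for a full copy")
    # A warehouse that runs around the clock has no downtime window at all.
    print(choose_backup_mode(10, 200, downtime_window_hours=0))
```

With these assumed numbers a full copy takes roughly 14 hours, so a warehouse with no downtime window has to rely on a ‘hot’ backup.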
c. Enterprise Software Backup
One such example is Oracle database backup. Oracle databases, which are contained in data warehouses, are widely used by businesses and corporations. An Oracle database stores large amounts of data and safeguards it against possible loss or failure. This is mainly due to its physical structures, which make it possible to back up and recover data. The components of the physical structure are data files, redo logs and the control file. Each backup requires Oracle to scan the data warehouse before storing data.
Veritas NetBackup is another software product that can be used for very fast, full backups. It backs up database and non-database files to cover the whole data warehouse, and then continues to store incremental backups of files and data without scanning them. It is useful for keeping up-to-date backup copies at a local or remote site, which makes it a readily available, instantaneous form of backup, particularly for smaller data warehouses.
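The general idea of backing up only what has changed since the previous backup can be sketched in a few lines. This is a minimal illustration that compares file modification times; it is an assumption made for the example and is not how Veritas NetBackup or Oracle work internally. The function name `incremental_backup` and its parameters are hypothetical, and the destination is assumed to lie outside the source directory.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, dest: Path, last_backup_time: float) -> list[str]:
    """Copy files under source that were modified after last_backup_time.

    last_backup_time is a POSIX timestamp. Returns the relative paths
    of the files that were copied, sorted alphabetically.
    """
    copied = []
    for path in source.rglob("*"):
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            relative = path.relative_to(source)
            target = dest / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(relative.as_posix())
    return sorted(copied)
```

Files that have not changed since the last run are skipped, which is why an incremental pass is usually much smaller and faster than a full backup.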
d. Online and Offline Storage
Data warehouses also store online and offline data. Offline storage holds old files, multimedia, databases and documents that are rarely used but must remain accessible. When a user accesses such a file, it is brought back to online storage. It can be viewed on screen or kept on a file server for online use, but while it is offline only a small ‘stub’ remains there, because the bulk of the file has moved to secondary storage. Hierarchical Storage Management (HSM) provides this ability to move data between online and offline storage.
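The stub mechanism can be illustrated with a toy model. The sketch below is not based on any real HSM product: the class `HsmStore`, its methods and the 90-day idle threshold in the usage note are hypothetical, and days are represented as plain integers to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class HsmStore:
    """Toy two-tier store: full files online, migrated files offline."""
    online: dict = field(default_factory=dict)       # name -> file contents
    offline: dict = field(default_factory=dict)      # name -> migrated contents
    stubs: set = field(default_factory=set)          # names left behind as stubs
    last_access: dict = field(default_factory=dict)  # name -> day of last access

    def write(self, name: str, data: bytes, today: int) -> None:
        self.online[name] = data
        self.last_access[name] = today

    def migrate(self, today: int, max_idle_days: int) -> list:
        """Move files idle for more than max_idle_days offline, leaving a stub."""
        moved = []
        for name in sorted(self.online):
            if today - self.last_access[name] > max_idle_days:
                self.offline[name] = self.online.pop(name)
                self.stubs.add(name)
                moved.append(name)
        return moved

    def read(self, name: str, today: int) -> bytes:
        """Return a file, recalling it from offline storage if only a stub remains."""
        if name in self.stubs:
            self.online[name] = self.offline.pop(name)
            self.stubs.discard(name)
        self.last_access[name] = today
        return self.online[name]
```

For example, after `store.write("old_report.doc", b"...", today=0)`, a call to `store.migrate(today=100, max_idle_days=90)` moves the file offline and leaves only a stub, and a later `store.read("old_report.doc", today=101)` brings it back online.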
Call for Action!
Because data warehouses run 24 hours a day, 365 days a year, they have to be kept in good condition and run smoothly, with as little risk and as few accidents as possible. A data warehouse can take six months or more to build and tune for performance, availability, reliability and other features. But beware: in a matter of minutes, a performance problem or rising pressure from information extraction can bring it crashing down, leading to heavy losses. Backups need proper maintenance and planning, carried out in a timely manner, so that the warehouse can be recovered successfully.
Backup is a critical factor. The performance of a data warehouse needs to be constantly monitored to meet the demands of its users. If the data warehouse, or the data center behind it, fails and there is no backup from which to recover the data, the company faces a huge loss. It can lose millions of records, which could lead to lost projects and business, and therefore lost revenue and probably lost users.
You can also keep up to date with current trends and technology by visiting Data Center Talk where we keep you informed on important changes as they occur.