What is Data Warehouse? – Simple Definition

WHAT IT IS:

A data warehouse is a federated repository for all the data that an enterprise’s various business systems collect. The repository may be physical or logical. Data warehousing is an electronic method of organizing information.

HOW IT WORKS (EXAMPLE):

A data warehouse essentially combines information from several sources into one comprehensive database. For example, in the business world, a data warehouse might incorporate customer information from a company’s point-of-sale systems (the cash registers), its website, its mailing lists and its comment cards. Alternatively, it might incorporate all the information about employees, including time cards, demographic data, salary information, etc.

By combining all of this information in one place, a company can analyze its customers in a more holistic way, ensuring that it has considered all the information available. Data warehousing also makes data mining possible, which is the task of looking for patterns in the data that could lead to higher sales and profits.

There are different ways to establish a data warehouse and many pieces of software that help different systems “upload” their data to a data warehouse for analysis. However, the basic idea is to first extract data from all the individual sources (cash registers, time clocks, office computers), remove redundant data and organize the data into a consistent format that can be queried.

WHY IT MATTERS:

Companies with data warehouses can have an advantage in product development, marketing, pricing strategy, production time, historical analysis, forecasting and customer satisfaction. However, data warehouses also can be very expensive to design and implement, and sometimes their construction makes them unwieldy.

What do I need to know about data warehousing?

Data warehouses are typically used to correlate broad business data to provide greater executive insight into corporate performance.

How is a data warehouse different from a regular database?

Data warehouses use a different design from standard operational databases. The latter are optimized to maintain strict accuracy of data in the moment by rapidly updating real-time data. Data warehouses, by contrast, are designed to give a long-range view of data over time. They trade off transaction volume and instead specialize in data aggregation.

What are data warehouses used for?

Many types of business data are analyzed via data warehouses. The need for a data warehouse often becomes evident when analytic requirements run afoul of the ongoing performance of operational databases. Running a complex query on a database requires the database to enter a temporary fixed state. This is often untenable for transactional databases. A data warehouse is employed to do the analytic work, leaving the transactional database free to focus on transactions.

The other benefits of a data warehouse are the ability to analyze data from multiple sources and to negotiate differences in storage schema using the ETL process.

What are the disadvantages of a data warehouse?

Data warehouses are expensive to scale, and do not excel at handling raw, unstructured, or complex data. However, data warehouses are still an important tool in the big data era.

Resources:

https://www.informatica.com/services-and-training/glossary-of-terms/data-warehousing-definition.html#fbid=85J5EGsXI8z

https://www.coursera.org/specializations/data-warehousing

http://searchsqlserver.techtarget.com/definition/data-warehouse

Leave a Reply

Your email address will not be published. Required fields are marked *