Data Warehousing

1. Course Information

1. A data warehouse is a  

   1) subject-oriented, 
   2) integrated,
   3) non-volatile, and 
   4) time-variant 
   
   collection of data in support 
   of management’s decisions.

2. This data warehousing course introduces the business,
   technology, and managerial issues related to business
   intelligence (BI) and data warehouse (DW) solutions.
   Students will acquire practical skills in collecting
   business requirements and in planning, defining,
   designing, and developing a BI solution.

3. Emphasis is placed on learning how to derive business
   value from BI and DW solutions. Students will gain
   hands-on experience by building a small data warehouse
   and using a variety of BI tools.

4. This course is about data warehousing and its role in
   modern business intelligence: turning data into
   actionable insight that addresses new business needs.
   
5. A data warehouse is the central component of a modern
   data stack (a modern data stack is a combination of
   various software tools used to collect, process, and
   store data on a well-integrated, cloud-based data
   platform).

2. Course Description:

1. This course is about data warehousing and
   its role in modern business intelligence:
   turning data into actionable insight that
   addresses new business needs.

2. What is a data warehouse? A data warehouse
   is the central component of a modern data stack:
   a modern data stack is a combination of various
   software tools used to collect, process, and
   store data on a well-integrated, cloud-based data
   platform.

3. Modern data warehouses are designed to analyze
   massive amounts of structured, semi-structured,
   and unstructured data while remaining
   cost-effective, performant, and easy to use. Note
   that unstructured data (such as images and log
   data) cannot be analyzed directly with SQL.

4. Data warehouses are the foundation for reporting,
   ad hoc analysis, business intelligence, and machine
   learning, and they enable collaboration among a
   diverse range of users and stakeholders across
   organizations of all sizes.

5. This class will provide students with the
   conceptual background and **hands-on** data
   analytics skills needed to use a data
   warehouse effectively. Throughout the course,
   students will work on an end-to-end development
   project, building a working data platform for
   **New York City Transit Data**. Using actual taxi,
   rideshare, bike share, and weather data, students
   will answer real-world analytics questions, such
   as "How do location and time of day affect trip
   length?" and "How does weather affect transit
   preferences?".

6. By the end, students will be empowered with the
   skills, tools and techniques needed to take a
   real-world data project from problem statement
   to prototype to production.
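
As a rough illustration of the kind of analytics question mentioned in this description ("How do location and time of day affect trip length?"), the sketch below loads a few made-up trip rows into an in-memory SQLite database and averages trip distance by pickup zone and hour of day. The table name `trips`, its columns, and every value in `SAMPLE_TRIPS` are hypothetical; the course's actual datasets and warehouse platform are not specified here, so treat this as one possible shape of such a query rather than the course's method.

```python
import sqlite3

# Hypothetical sample rows: (pickup_zone, pickup_time, trip_miles).
# These values are invented for illustration, not real transit data.
SAMPLE_TRIPS = [
    ("Midtown", "2024-01-05 08:15:00", 2.0),
    ("Midtown", "2024-01-05 08:40:00", 4.0),
    ("Midtown", "2024-01-05 23:10:00", 6.0),
    ("JFK Airport", "2024-01-05 08:05:00", 17.0),
    ("JFK Airport", "2024-01-05 23:30:00", 15.0),
]


def avg_trip_miles_by_zone_and_hour(rows):
    """Return (zone, hour, trip_count, avg_miles) tuples, sorted by zone and hour."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trips (pickup_zone TEXT, pickup_time TEXT, trip_miles REAL)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
    query = """
        SELECT pickup_zone,
               CAST(strftime('%H', pickup_time) AS INTEGER) AS pickup_hour,
               COUNT(*) AS trip_count,
               AVG(trip_miles) AS avg_miles
        FROM trips
        GROUP BY pickup_zone, pickup_hour
        ORDER BY pickup_zone, pickup_hour
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for zone, hour, count, miles in avg_trip_miles_by_zone_and_hour(SAMPLE_TRIPS):
    print(f"{zone:12s} hour={hour:02d} trips={count} avg_miles={miles:.1f}")
```

Grouping by both zone and hour keeps the two factors separate in the output, so differences by location can be read apart from differences by time of day.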

3. Learning Outcomes/Objectives:

1. Implement data ingestion techniques using ETL (extract, transform, and load)

2. Write simple ETL programs

3. Write SQL for data analytics, including time series and ranking algorithms

4. Transform data using SQL and Big Data Analytics

5. Compare modern and classic strategies of data modeling, including the star schema

6. Understand data warehouse architecture

7. Maintain data quality

8. Create reports, analyses, and visualizations

9. Write OLAP queries

10. Use join operations for SQL/OLAP queries

11. Implement a small data warehouse and star schema using ETL

12. Provide SQL-based business intelligence from a completed data warehouse
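
Several of the outcomes above (ETL, star schema, data quality, joins, and ranking queries) can be seen together in one small sketch. The example below is a minimal illustration using Python's standard `sqlite3` module, not the course's actual project: the raw records, the table names `dim_date`, `dim_mode`, and `fact_ride`, and the helper functions are all assumptions made for the example. It also assumes an SQLite build with window-function support (version 3.25 or later) for `RANK()`.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
RAW_RIDES = [
    {"date": "2024-01-05", "mode": " Taxi ", "fare": "12.50"},
    {"date": "2024-01-05", "mode": "bike share", "fare": "4.00"},
    {"date": "2024-01-06", "mode": "taxi", "fare": "20.00"},
    {"date": "2024-01-06", "mode": "Rideshare", "fare": "18.25"},
    {"date": "2024-01-06", "mode": "rideshare", "fare": "not available"},  # bad row
]

# A tiny star schema: one fact table referencing two dimension tables.
SCHEMA = """
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT UNIQUE);
CREATE TABLE dim_mode (mode_key INTEGER PRIMARY KEY, mode_name TEXT UNIQUE);
CREATE TABLE fact_ride (
    date_key INTEGER REFERENCES dim_date(date_key),
    mode_key INTEGER REFERENCES dim_mode(mode_key),
    fare REAL
);
"""


def transform(record):
    """Clean one raw record; return None if it fails a basic quality check."""
    try:
        fare = float(record["fare"])
    except ValueError:
        return None
    return record["date"], record["mode"].strip().lower(), fare


def dimension_key(conn, table, key_col, value_col, value):
    """Insert the value if unseen, then return its surrogate key.

    Table and column names come from fixed strings in this script,
    never from user input, so formatting them into the SQL is safe here.
    """
    conn.execute(f"INSERT OR IGNORE INTO {table} ({value_col}) VALUES (?)", (value,))
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE {value_col} = ?", (value,)
    ).fetchone()
    return row[0]


def load(conn, raw_records):
    """Run extract -> transform -> load; return the number of fact rows loaded."""
    loaded = 0
    for record in raw_records:
        cleaned = transform(record)
        if cleaned is None:
            continue  # skip rows that fail the quality check
        full_date, mode_name, fare = cleaned
        date_key = dimension_key(conn, "dim_date", "date_key", "full_date", full_date)
        mode_key = dimension_key(conn, "dim_mode", "mode_key", "mode_name", mode_name)
        conn.execute("INSERT INTO fact_ride VALUES (?, ?, ?)", (date_key, mode_key, fare))
        loaded += 1
    return loaded


def fare_rank_by_mode(conn):
    """OLAP-style query: total fare per mode, ranked from highest to lowest."""
    return conn.execute("""
        WITH mode_totals AS (
            SELECT m.mode_name, SUM(f.fare) AS total_fare
            FROM fact_ride AS f
            JOIN dim_mode AS m ON m.mode_key = f.mode_key
            GROUP BY m.mode_name
        )
        SELECT mode_name,
               total_fare,
               RANK() OVER (ORDER BY total_fare DESC) AS fare_rank
        FROM mode_totals
        ORDER BY fare_rank
    """).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
rows_loaded = load(conn, RAW_RIDES)
ranking = fare_rank_by_mode(conn)
print(rows_loaded, ranking)
```

The transform step normalizes the mode names (so `" Taxi "` and `"taxi"` map to one dimension row) and drops the record whose fare cannot be parsed, which is a very simple form of the data quality work listed above.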