Data Warehouse Layer
The data warehouse layer is comprised of the staging area, which is a temporary storage area utilized in the Extract Load and Transform (ELT) process, and the data warehouse which is the more permanent storage area for data in a denormalized format. The ELT process utilizes the SQL Server Integration Services (SSIS) platform and SSIS Catalog framework for data extraction and for updating the dimension and fact tables in the data warehouse.
There are two sets of SSIS packages:
-
The first set, developed using the Change Data Capture and/or Change Tracking related tasks and components, extracts the initial and incremental change data (flagged as INSERTs, UPDATEs and DELETEs) from the source and stores it in a staging area.
-
The other set transforms the stored data in the staging and lookup tables and updates the dimension and fact tables in the data warehouse.
The staging area and the data warehouse tables reside in the same SQL Server instance and same database. SQL Server Agent jobs are created on this instance to execute the initial and incremental SSIS packages.
Depending on whether Anthology Student or CampusNexus CRM are standalone sources or they are integrated, appropriate steps are added to the jobs. For example, if CampusNexus CRM exists as a single source, then only the steps relevant to the CampusNexus CRM are added to the jobs.
The recurring schedule of the SQL Server Agent job for the incremental updates can be changed as required. An additional SQL Server Agent job is created to purge stale records from the staging tables for performance reasons.
The "Age Refresh" job is scheduled to run at 12.00.00 a.m. on the first day of every month. This job updates all Student, Prospect, CRM Contact, and Lead age data.
Analytics 4.1 adds the “Student Account Aging Snapshot” job, which is scheduled to run every day at 12.00.00 a.m.. This job creates monthly Student Account Aging Snapshots for current and past 1 year. It considers the last day of the month as “As of Date” for creating the monthly snapshots. For example, the January 2021 snapshot is created by considering 31st January 2021 as “As of Date”.
It also creates one snapshot for the current month by considering the yesterday’s date as “As of Date”. On start of a new year, for example, 1st January 2021, the snapshots created for the year 2019 will be deleted.
Analytics 4.2 enables all jobs upon creation except the “StudentAnalytics (<data_warehouse_name>) Process Model”, because processing of semantic model databases is now part of the Incremental Load job. The Process Model job can be used to manually process semantic model databases when needed.
The SQL Server job “StudentAnalytics (Data warehouse database name) Initial Load (Process Model Excluded)” has similar steps as the Initial load job but “Process Model Database” steps are excluded.
The data warehouse schema has a star schema architecture.
Dimension and Fact Tables
The Excel file Dimensions and Facts list_v5.5.0.xlsx shows the list of Dimension and Fact tables that are available in the data warehouse database.
-
Dimension tables contain the textual descriptors of the business and typically contain many columns or attributes. These attributes describe a row in the given table, which is a representation of certain business objects such as Programs, Prospects, and Students.
-
Fact tables contain the measurable facts, trends and quantitative information in relation to the dimensions, such as the count of Prospects who enrolled during a certain period of time, or the time taken for a Prospect to enroll from the time he/she inquired about a Program.
Enterprise Data Warehouse Bus Matrix
The Excel file Analytics EDW Bus Matrix_v5.5.0.xlsx contains the Enterprise Data Warehouse Bus Matrix representing the business analysis/processes and associated dimensionality.
For more information, refer to How to use the Bus Matrix.pptx.
The data warehouse layer is comprised of the staging area, which is a temporary storage area utilized in the Extract Load and Transform (ELT) process, and the data warehouse which is the more permanent storage area for data in a denormalized format. The ELT process utilizes the SQL Server Integration Services (SSIS) platform and SSIS Catalog framework for data extraction and for updating the dimension and fact tables in the data warehouse.
There are two sets of SSIS packages:
-
The first set, developed using the Change Data Capture and/or Change Tracking related tasks and components, extracts the initial and incremental change data (flagged as INSERTs, UPDATEs and DELETEs) from the source and stores it in a staging area.
-
The other set transforms the stored data in the staging and lookup tables and updates the dimension and fact tables in the data warehouse.
The staging area and the data warehouse tables reside in the same SQL Server instance and same database. SQL Server Agent jobs are created on this instance to execute the initial and incremental SSIS packages.
Depending on whether Anthology Student or CampusNexus CRM are standalone sources or they are integrated, appropriate steps are added to the jobs. For example, if CampusNexus CRM exists as a single source, then only the steps relevant to the CampusNexus CRM are added to the jobs.
The recurring schedule of the SQL Server Agent job for the incremental updates can be changed as required. An additional SQL Server Agent job is created to purge stale records from the staging tables for performance reasons.
The "Age Refresh" job is scheduled to run at 12.00.00 a.m. on the first day of every month. This job updates all Student, Prospect, CRM Contact, and Lead age data.
Analytics 4.1 adds the “Student Account Aging Snapshot” job, which is scheduled to run every day at 12.00.00 a.m.. This job creates monthly Student Account Aging Snapshots for current and past 1 year. It considers the last day of the month as “As of Date” for creating the monthly snapshots. For example, the January 2021 snapshot is created by considering 31st January 2021 as “As of Date”.
It also creates one snapshot for the current month by considering the yesterday’s date as “As of Date”. On start of a new year, for example, 1st January 2021, the snapshots created for the year 2019 will be deleted.
Analytics 4.2 enables all jobs upon creation except the “StudentAnalytics (<data_warehouse_name>) Process Model”, because processing of semantic model databases is now part of the Incremental Load job. The Process Model job can be used to manually process semantic model databases when needed.
The SQL Server job “StudentAnalytics (Data warehouse database name) Initial Load (Process Model Excluded)” has similar steps as the Initial load job but “Process Model Database” steps are excluded.
The data warehouse schema has a star schema architecture.
Dimension and Fact Tables
The Excel file Dimensions and Facts list_v5.4.0.xlsx shows the list of Dimension and Fact tables that are available in the data warehouse database.
-
Dimension tables contain the textual descriptors of the business and typically contain many columns or attributes. These attributes describe a row in the given table, which is a representation of certain business objects such as Programs, Prospects, and Students.
-
Fact tables contain the measurable facts, trends and quantitative information in relation to the dimensions, such as the count of Prospects who enrolled during a certain period of time, or the time taken for a Prospect to enroll from the time he/she inquired about a Program.
Enterprise Data Warehouse Bus Matrix
The Excel file Analytics EDW Bus Matrix_v5.4.0.xlsx contains the Enterprise Data Warehouse Bus Matrix representing the business analysis/processes and associated dimensionality.
For more information, refer to How to use the Bus Matrix.pptx.
The data warehouse layer is comprised of the staging area, which is a temporary storage area utilized in the Extract Load and Transform (ELT) process, and the data warehouse which is the more permanent storage area for data in a denormalized format. The ELT process utilizes the SQL Server Integration Services (SSIS) platform and SSIS Catalog framework for data extraction and for updating the dimension and fact tables in the data warehouse.
There are two sets of SSIS packages:
-
The first set, developed using the Change Data Capture and/or Change Tracking related tasks and components, extracts the initial and incremental change data (flagged as INSERTs, UPDATEs and DELETEs) from the source and stores it in a staging area.
-
The other set transforms the stored data in the staging and lookup tables and updates the dimension and fact tables in the data warehouse.
The staging area and the data warehouse tables reside in the same SQL Server instance and same database. SQL Server Agent jobs are created on this instance to execute the initial and incremental SSIS packages.
Depending on whether Anthology Student or CampusNexus CRM are standalone sources or they are integrated, appropriate steps are added to the jobs. For example, if CampusNexus CRM exists as a single source, then only the steps relevant to the CampusNexus CRM are added to the jobs.
The recurring schedule of the SQL Server Agent job for the incremental updates can be changed as required. An additional SQL Server Agent job is created to purge stale records from the staging tables for performance reasons.
The "Age Refresh" job is scheduled to run at 12.00.00 a.m. on the first day of every month. This job updates all Student, Prospect, CRM Contact, and Lead age data.
Analytics 4.1 adds the “Student Account Aging Snapshot” job, which is scheduled to run every day at 12.00.00 a.m.. This job creates monthly Student Account Aging Snapshots for current and past 1 year. It considers the last day of the month as “As of Date” for creating the monthly snapshots. For example, the January 2021 snapshot is created by considering 31st January 2021 as “As of Date”.
It also creates one snapshot for the current month by considering the yesterday’s date as “As of Date”. On start of a new year, for example, 1st January 2021, the snapshots created for the year 2019 will be deleted.
Analytics 4.2 enables all jobs upon creation except the “StudentAnalytics (<data_warehouse_name>) Process Model”, because processing of semantic model databases is now part of the Incremental Load job. The Process Model job can be used to manually process semantic model databases when needed.
The data warehouse schema has a star schema architecture.
Dimension and Fact Tables
The Excel file Dimensions and Facts list_v5.2.0.xlsx shows the list of Dimension and Fact tables that are available in the data warehouse database.
-
Dimension tables contain the textual descriptors of the business and typically contain many columns or attributes. These attributes describe a row in the given table, which is a representation of certain business objects such as Programs, Prospects, and Students.
-
Fact tables contain the measurable facts, trends and quantitative information in relation to the dimensions, such as the count of Prospects who enrolled during a certain period of time, or the time taken for a Prospect to enroll from the time he/she inquired about a Program.