Reporting and Analytics Environment (RAE)
In 2021, the university transitioned to the new Workday system to manage core operational and transactional business functions of the enterprise. With Workday, access to data and reports became much more open and nearly all of a user's data needs can be solved directly within the system. However, there are scenarios where Workday reporting cannot provide the answer because:
- The data needed does not reside in Workday (i.e. Student Life systems data)
- The user requires an output format or functionality that is not possible using Workday (i.e. sophisticated R or Python scripting)
In response to these needs, the Data and Analytics Team has designed and developed a data analytics repository that can be used by analysts across the university to perform more advanced analyses. The Reporting and Analytics Environment (RAE), powered by Amazon Web Services, will hold data needed for extended and detailed analytics. The RAE also provides capabilities for university areas to store & share data as well as allowing data analysts to bring additional datasets into the system for deep, cross-functional analytics.
The RAE will replace the existing Enterprise Data Warehouse (EDW) and Financial Data Warehouse (FinDW).
Historical PeopleSoft data, as well as data from other systems that will not be converted to the new Workday format, will be loaded to the RAE.
Components of the RAE
One or more large flat files (similar to a spreadsheet) into which approximately 75% of Workday data are copied. The Data Lake serves as the staging area for the Enterprise Data Warehouse and as the data exploration and discovery environment for if/then scenarios, new insights, forecasts, etc. Other university data can be included in the Data Lake.
Enterprise Data Warehouse (EDW)
A series of structured tables that contain approximately 25% of Workday data. Data in the EDW are not in raw form as they are in the Data Lake; they have been refined for specific needs like cross-functional strategic scorecards, dashboards and complex reports. Other university data can be included in the EDW.
A robust suite of tools that enable the university to use the data in the Data Lake and the EDW. Tools include reporting and visualization (e.g. Tableau), exploration and mining (e.g. R, Hadoop), movement, federated querying, etc.
Layers of Data within the RAE
Ultimately, the RAE is a repository of data that can be sourced by analysts. Data within the RAE comes from a variety of systems, including Workday, PeopleSoft, SIMS, IDM, and others. As data is loaded into the repository, it will undergo a series of transformations that will increasingly make the data more usable and accessible for analytics needs.
The table below outlines the various layers of data within the RAE and how they are generally classified for use by analysts.
|Load (LD)||This layer is used internally by the RAE in the early stages of the data load process.||Users are not given access to datasets in this layer.|
|History (HST)||Most all datasets that flow through the RAE are run through a process that notes differences between the most current dataset load and the last time the same dataset was loaded. This allows the system to produce "before and after" views of a particular data row, which can be useful in some data analyses.||An analyst can use the History layer view of Employee data to see when a person's marital status has changed over time.|
|Bronze (BRZ)||As data flows into the RAE from various systems, the initial rest area is the Bronze layer. In this layer, the data exists as it was received from the original source system. We use minimal processing to ensure that analysts can access the data in a format that is similar to what would be seen if the data were pulled from the source. Data in the Bronze layer always represents the "most current" version of a particular data row from that dataset.||An analyst can access the SIMS Buildings dataset to see the list of buildings across the university, with their most current set of values and attributes.|
|Silver (SLV)||While the "raw" data from the original source system is highly sought after, analysts also apply common business rules to the base data to build more useful data structures. Also, there are certain data design and modeling techniques that make data much easier to understand for analysts, as well as data query tools. As we apply these business rules and design patterns, we create objects in the RAE in the Silver layer.||To ease the use of the base Journal Lines dataset from Workday, the data was restructured into a standard "fact table" layout in the Workday Silver layer that functions better with tools like Tableau.|
|Gold (GLD)||Gold layer tables are similar to Silver layer tables. They are new structures and datasets created from underlying sources (whether HST, BRZ, or SLV). The primary difference between the Gold and Silver layer structures is that Silver layer datasets exist within one system, where Gold layer datasets are conglomerated from multiple systems. For example, the Workday Silver Fact Journal Lines table is created only using Workday data. Where the Gold Calendar Dimension is created using data from Workday, PeopleSoft, and other systems.||The Gold layer people dimension will be created using data from Workday, PeopleSoft, and IDM, merging all of those data into a cohesive dataset.|
Accessing Data within the RAE
In order to access data within the RAE, analysts should be familiar with relational database, SQL, and the tools that interact with these types of technologies. All data within the RAE is aligned to data domains and the security of the system is managed using the same domain structure. Access to the RAE. RAE Access information is availabe on this site.
Information on how to connect to the RAE and retrieve your database password can be found within the Administrative Resource Center (ARC).