Begin (hopefully not an early end) of DW/BI project:
It’s not a secret that many BI projects are failing, due to incorrect project planning or not the best mix of people. Below we will share with you some tips for increasing the chances of succeeding.
Questions to answer before you start:
– Is the organization ready to proceed? Do I have the sponsorship from the executive business user(s) to proceed further?
– Will the BI project provide the information needed by the business? Is there enough BI desire?
– Are there ‘clean’ data from the sources? Did you take into account the steps needed to make them usable?
– Is the requested data refresh doable? If not, then the user will make decisions based on ‘old’ data
When the answers to the above questions make you happy (enough) then you can start with a project. And you need it for several reasons based on PMBOK guide, but initially to create a scope:[write down all the processes required to ensure that the project includes all the work needed]. Next to that, you need a specific skillset to make that happen. A basic one for a small project should consist of a Project Manager, a senior BI developer, and a smart Business Analyst.
On the Kimball Lifecycle roadmap, after [business requirements definition] there are 3 parallel tasks before the deployment.
-> Technical Architecture Design: Use a diagramming application in order to combine needed applications and connection between them. (Accessing and security should also be a part of that). Timing -Availability – Performance requirements – Identifying new records can be some of the ingredients of the architecture requirement, which is related to the processes of ETL, BI, Metadata, and Infrastructure.
->Dimensional Modeling/Physical Design: Covered on pt1. Additional to that can be Naming conventions (like table, view & schema naming), Indexing & Storage.
->Application Design/Development: This is the layer where the value will be delivered to the end users! Are you going to use Power BI or another software? How are you going to build the security on that? What’s the cost versus value between diff tools? Can you actually find professionals and efficient developers? What’s the maturity of the BI visualization app? All these questions should lead you to create a requirement document.
-> Deployment/Maintenance: Nice you develop everything in your own silo, but now the time came to share it with your colleagues/customer. How are you going to perform a full test? What’s the quality of the delivered data? How your new addition (backend and/or frontend) will affect the performance? How will test the report and what would be the points to check? Who will maintain it and support the users? We are thinking the same word, “Documentation”.
Which top 5 spells to avoid:
1. “I will work on the fly”: No, you need to follow/ processes and write documentation in order to avoid failure. Make a plan, as you do in other aspects of your life.
2. “My BI manager has the experience and the team is ready”: What are the organizational skills of your manager? Did he ever work on a BI project? What are the skills of your team members? You need to find members with a blend of enough character and knowledge, which (hopefully) will help you deliver a nice BI solution.
3. “We are going to use the X best visualization tool”: Are the business users ready for it? Have they been trained? Do they have enough motivation to get engaged? If users do not embrace the BI solution, the value will never be fully utilized.
4. “He/She is best in visualization/backend, that’s enough”: No, you need a combination of frontend and backend or 1 in the back + 1 in the front. Lately, a lot of visualization experts don’t have a computer background. Can they fully debug the solution when an error occurs?
5. “I will follow the Scrum Framework because I am an expert/I have a team of experts”: Once again, It depends on your audience. Can your team members fail fast but learn & adjust? Can the manager/scrum master accept his/her mistakes and change? Maybe a split of assigned responsibilities matches better with the culture of the specific company.
ETL/ ELT
Before you create a nice visualization, make your business users happy and maybe help the company grow you need to put your data somewhere. Connecting straight to a production database isn’t the correct way, as that can cause performance issues/locks to the application of the DB. Before starting to visualize the architecture that you are going to implement, you need beforehand make decisions on
- Data Quality: What is the accepted quality? When the process should fail if the quality isn’t good?
- Backup Strategy: If fails, how fast the restore should take place? How often you should archive?
- Security: How people are going to access the reports? Do we need to comply with a regulation?
- Data Integration: Do you have more than one source? Are you going to establish common dimensions?
- Latency: How quickly the data from the source should be available to the business users?
- Frontend: Which visualization tool are you going to use? You need to match (or not) th; e DWH with the features of the BI app.
- Resources: Does your team support the upcoming architecture? Are the skills available for the implementation or there is a gap?
ETL Process
Below should be the high plan of the ETL procedures.
1. Design the plan, connecting the source(s) with the destination(s).
2. Select an ETL app having in mind: documentation, connectivity to your sources, version control and performance.
3. Build a strategy by selecting how are you going to: stage your source data, do data quality, manage changes to dimensions, audit the flow, meet the availability requirements.
4. Load technique: Historic/incremental load strategy & load frequency.
5. Other: Table design, SCD setup, table partitioning, data profiling, dependencies, precheck to avoid errors, cleanup.