One of the most common items we are tasked with here at MIBAR.net is providing our clients with a solution to provide autonomy when it comes to supplying their Power BI reports with current data. Through some native connections, tweaked with a couple pieces of automation, we are able to completely leverage the Microsoft Power Platform to provide our clients with exactly what they are searching. Below you will read through a quick tutorial on how to set this solution up within the Microsoft Power Platform:
Step 1: Setting Up a SharePoint Folder
More times than not, most of our clients are looking to regularly import Excel files to consume in their Power BI reports. Now, it goes without stating how Microsoft’s Power Query does come with an out-of-the-box connection to Excel Online. However, the issue with this solution is that a user is only able to consume one single sheet from one single Excel file at a time using that connection. So, each time that user would want to import a new file into their set of queries, they would have to establish a new connection to Excel Online, to consume the new sheet from the new file. Not very scalable, right?
So, what we believe to be a better solution for this use-case is to set up a SharePoint Folder. This process can be done by navigating into a user’s local, or shared SharePoint site, and locating the ‘Edit’ option on the left-sided tool bar. Once the user selects that option, the navigation settings will appear, and the user can then elect to create a new folder in their SharePoint site specifically acting as a drop-zone for files that need to end up in Power BI. If a user would like this ‘drop-zone’ to be more generally open to their organization publicly, it is worth noting that they can also leverage the standard ‘Documents’ folder as well. This option will need some require some additional Power Query steps in the future.
Step 2: Creating Tables in Dataverse to Store Back-End Data and the Import Process
The most efficient way to store normalized data coming from SharePoint is to set up a connection to a Microsoft Dataverse back-end via a Dataflow. What this will allow a user to do is efficiently manicure, store, and visualize their data in a space that is both easy to access, and easy to edit if need be.
To start this process, a user will have to first create tables in the Microsoft Dataverse, and fashion all appropriate columns in each table so an accurate data transition can occur. These tables can be set up like any other custom table in the Dataverse. Or, if a user wants to import their SharePoint data into an existing table, such as a standard like the ‘Accounts’ table, they can do so as well. To get these tables to properly match up with the imported SharePoint data, a user will first have to leverage their Dataflow to properly manipulate the inbound data.
Step 3: Configuring the Dataverse Dataflow
Setting up this first initial Dataflow is where a user will connect to the SharePoint folder they established in Step 1 of this process. Once inside of the newly created Dataflow, a user will be able to ‘Get Data’ from several pre-established connections that Microsoft enables to be leveraged in the data import process. For this use-case, a user would select the SharePoint folder connection, and insert their proper SharePoint URL and credentials to create the link between these two sources.
Once the connection has been established, the user will notice that queries come in a specific type of way. Typically, when connecting to data sources, a relationship is established between one table from the data source, and one query in Power Query. However, when using the SharePoint folder connection, the goal is to iterate through multiple files, and import all of those selected into a master query. So, as the user connects to their SharePoint folder, they will see that a set of supporting queries and parameters get created instantly. What this stack processes are meant to do is to provide an opportunity for a user to make any adjustments to the data being imported only one time, and then have those query steps get applied to any subsequent files down the road. Thus, creating an incredibly efficient data mash-up.
Step 4: Configuring the Power BI Dataflow.
Once the data has been successfully imported into the Dataverse, the user will be able to get the data imported right into Power BI. At this stage, the user will have two options, they can either connect directly to a Power BI dataset, or they can source their queries into the cloud with another Dataflow. In my opinion, as a Power BI developer, I believe sourcing data into a Dataflow is almost always the most efficient way to import data into Power BI, especially if the goal of this data is for it to be used is more than one report overall.
Configuring the Power BI Dataflow will be simple overall. Since a user will have already performed most of the manicuring steps, and mash-up of their data coming from SharePoint in the Dataverse Dataflow, the Power BI Dataflow will only require some slight configurations. To connect to the back-end Dataverse data, a user will go into their Power BI Dataflow, and ‘Get Data’ from the Dataverse (aka Common Data Service) connector. They will input their Dataverse site URL and credentials, and will then select the necessary tables they would like to import into their Power BI Dataflow. Once the tables are imported, the user can elect to perform any additional query steps they many need, such as reordering or renaming columns, removing any data they will not need to visualize in their Power BI report, or merge any queries they may need to provide a more unified experience.
Step 5: Connecting the Power BI Dataflow to a Power BI Report
At this stage, all the data will be flowing in the cloud from each data warehouse and should be cleaned up and ready for import. To finally be able to visualize and calculate measures across this data, a user will open a new PBIX file in Power BI Desktop, and finally ‘Get Data’ once more. The user will select the option to import data from a Power BI Dataflow, pass the URL for their instance of the Power BI web service and credentials, and then be able to select which tables from which Dataflow they would like to start to build off of in their report. After their selection and a quick refresh of the data, the Power BI report is ready to go, and your data will have successfully made its way through this entire eco-system.
Now it may be easy to ask, why should I do all these steps? Can’t I just create a connection to a SharePoint folder and connect that directly to my PBIX file? While this may be true, I can assure you that connecting directly from your PBIX file to a SharePoint folder will not be an efficient methodology to implore at all. Once the number of file sin a SharePoint folder exceeds one, a user can expect a PBIX connected directly to that SharePoint folder to take any where from 45 minutes to over an hour and a half to refresh. In terms of Power BI efficiency, this is extremely poor. When leveraging the Dataverse Dataflow, Power BI Dataflow, and PBIX file, even though the data must refresh in all three of these locations, the end-to-end refresh times typically never eclipse 20 minutes from my experience. Obviously, these numbers are subject to the file sizes, data types and other complexities that might exist. However, test this all out for yourself. Power BI and the entire Power Platform is not a one-size fits all model, but this methodology and these steps have prove to be the most efficient process of importing scheduled Excel-based data to be leveraged in one of the leading BI tools on the market, Power BI.