Craig Porteous

Considerations for On-premises data in PowerBI

June 7, 2017 by

Whether you’ve been using PowerBI for a while or you’re just getting started with it, chances are you’ll want to source at least some of your data from on-premises systems. Be that SQL server databases, Oracle, SSAS etc. The same process even applies if you’re sourcing from AWS which I’ll go into in more detail below. All of these data sources will require Microsoft’s On-Premises Data Gateway.

Originally branded the PowerBI Enterprise Gateway (in line with the other option of the “Personal Gateway” you could set up locally when creating a report in PowerBI Desktop). The product was later expanded to encompass several Azure services as shown in the handy diagram below.

The On-Premises Data Gateway is an application which is installed within your network to give encrypted & compressed connectivity between Azure cloud services and your on-premises data.

The application itself can be installed on an existing or separate server/VM. The main performance considerations here are network throughput and uptime. The importance of these factors depends on your environment though.

I.e. The data gateway can be installed on a Windows 7 machine or higher (and further system requirements can be found at the documentation link in the first paragraph) but as you’d expect, as soon as this machine is powered off you lose all connectivity to your on-premises data sources in PowerBI.

I’ll go into more detail on high availability further on though as there’s no built-in solution at this time though the workaround in the Appendix is straight forward.

Licensing restrictions

Although the data gateway application itself is free to download and use, there are licensing considerations when using some functionality, such as Live/DirectQuery, specifically to Analysis Services which I’ve summarised below.

A full list of data sources that can use Live/DirectQuery connections can be found here.

With the recent announcement of PowerBI Premium licensing and changes to the Pro & Free tiers, DirectQuery data from SSAS is now considered a “Pro” feature & will require a Pro or premium license*. More details on the tier restrictions can be found here.

*I’ve added an asterisk here as I think this will change. Microsoft seems to have forgotten small/medium businesses with the new model. There is also talk of a revision of the licensing tiers in PowerBI. Looking at Power BI Ideas, some form of change is at least top of the “Hot Ideas” list!

Authentication

When connecting to on-premises databases you’ll generally connect with your Windows AD account or some kind of SQL/Service account, depending on the actual source. The former gives you the use of AD group security, single sign-on & allowing you to tailor the permissions for different sets of users. I.e. Sales Teams, Finance, Management etc.

Using the latter gives flat access to your data, with no per-user restrictions. That can however, be achieved by multiple service accounts for different business units, teams etc.

With the On-premises data gateway, you provide a static/service account to connect to each data source. This can be an AD account (if using Windows authentication method) or a service account as described above (using Basic authentication)

However, only an SSAS data source will allow “pass through authentication” from PowerBI, similar to Kerberos. All other data sources, (including SQL Server at this point) will also use the provided “service account” for data access.

You could look at implementing row-level security to improve the security model but that has its own performance drawbacks. Another option would be multiple data sources within the data Gateway, pointing to the same database but using different accounts, a unique data source entry point for each group of users. This wouldn’t be the easiest to administer but you could implement a form of “team level” security this way.

There is hope on the horizon though, Adam Saxton (@guyinacube) is aware of “pass through authentication” being worked on for SQL Server & there was a call to users with Kerberos implemented on an Oracle data source in the recent March blog update to help with testing.

We need Volunteers!

If you’re using an Oracle data source with your on-premises data gateway in an environment where you have Kerberos configured, leave a comment here, we’re looking for candidates for a private preview we’re running.

I reference Adam Saxton a lot in this article as he’s been my primary source for new information & techniques to get what I need out of PowerBI & the data gateway. Most of his walkthroughs also detail the relevant PowerShell commands which are great for automation.

Also keep an eye out on Power BI Ideas for user opinions & updates on feature requests. You can also vote on ideas to raise their awareness.

Data Refresh

This is another aspect of PowerBI & the data gateway that is affected by licensing & there are numerous restrictions that affect all data sources (not just on-premises data). It’s worth noting that the licenses described below are required to both consume & create reports.

Requires a PowerBI Pro (Or Premium) License:

  • Data refreshed more than Daily (This is even capped at 8x a day for all license tiers!)

For restrictions specific to On-premises data, the following require a PowerBI Pro (Or Premium) License:

  • ALL On-premises data that has any scheduled refresh will require a PowerBI Pro license.

Installation & Set up

There is a great step by step guide over at the Microsoft PowerBI documentation, on setting up the data gateway in both “Enterprise” and personal modes so I won’t go through the basics here. As the data gateway is connecting to external resources you will likely run into firewall, ACL or configuration issues during set up. Thankfully there’s also a great troubleshooting guide, again over at the PowerBI documentation. I can’t really add anything to what’s described on that page. It actually touches on quite a few of the points I detail below.

Adam Saxton (@Guyinacube) also summarized his troubleshooting steps at SQLBits this year:

  1. Latest version of the gateway
  2. Check Refresh History
  3. Errors in the Event logs
  4. Trace logs for the gateway
  5. Performance counters
  6. Proxy configuration
  7. Fiddler – track HTTPS traffic
  8. Troubleshooting articles

Roadblocks & Pitfalls

There are a few hurdles you’ll come across on the road to publishing PowerBI reports with on-premises datasets. This list is not exhaustive but will hopefully help you avoid hours of fault finding & repeated work. I’m also hoping most of these will be irrelevant with future updates. Fingers crossed!

PowerBI Desktop

  • The first one you’ll come across (if not already) and you won’t receive a warning for, is when you create your first PBIX file (report). You select Get Data and chose the source type. I.e. SQL Server. You will be presented with an option for Data Connectivity Mode

There is an informational message explaining you can’t choose both types for a single PBIX file but what it doesn’t explain is that if you select Import, and then spend hours sourcing data, building out the report you cannot then change the data source to DirectQuery. If you had initially chosen DirectQuery then it is ok to revert to Import. It simply does not allow the change from Import to DirectQuery. This applies to Analysis Services too so plan ahead.

Another restriction in PowerBI desktop crops up when using multiple data sources in a PBIX file, all of these must be added to the Gateway & be working for ANY of them in a PBIX to work.

ODBC Data Source Naming

When dealing with an ODBC Data source type, you MUST ensure that the name of the Gateway data source matches the ODBC connection name EXACTLY. This is also case sensitive.

It’s also worth noting that at this time that an ODBC connection, set up on your Gateway server is the best way to connect to other cloud services such as AWS. I hope we will see direct connectivity in the future though.

Data Gateway Credential encryption

  1. Data sources added to the Gateway will have both the username & password obfuscated (as they are both encrypted) in the Manage Gateway screen so make a note of which account you’ve used. Once you save it you cannot view it again, only update it.

Analysis Services Authentication

Coming back to Authentication, when using an Analysis Services data source which allows pass through authentication, PowerBI will connect to the SSAS instance using the stored “service account” but will authenticate the user using their UPN (User Principal Name). This is where you may run into problems. Your UPN is taken from the account you use to login to PowerBI & this must be the same as your AD login (e.g. Your network login) or the authentication will fail. Here’s a quick flow of what that looks like:

Sometimes these 2 credentials will never match (use of subdomains for example) so Microsoft have created a mapping process that will allow limited substitution/replacement of credentials passed through to Analysis Services & a quick test box to make sure it works.

This setting alters the authentication process as described below, resulting in a match.

Unfortunately there’s no ability to add a blanket regex for all users with the Map Username feature (if, for example your business uses a standard username convention). You can however substitute domains. I.e.

There’s some further documentation on UPN mapping over at Microsoft alongside a great video by Adam Saxton:

Article: Manage your data source – Analysis Services

Video: User Principal Name (UPN) Mapping in Power BI

A more robust solution is to liaise with your Azure Active Directory admins and have them sync your normal AD account up to AAD. This will carry the correct UPN which will usually take the format username@company.com. This means no mapping is required.

Disaster Recovery/High Availability

As per the main PowerBI documentation, this is “in the roadmap” which has been the party line for as long as I recall. Fortunately there is a very simple workaround that I’ve detailed in the Appendix at the end of this post. It’s a manual workaround and assuming your databases are still online or retain their DNS records it should have your data gateway back online in minutes.

Monitoring & Diagnostics

Upon installing the data gateway a collection of perfmon counters are added to the local system, to monitor such metrics as:

  • No. of failed queries
  • No. of query executions/sec
  • No. of connections
  • Etc.

There was a great blog post by Brett Powell linked to in the PowerBI August update last year that goes into detail on how to set up these counters alongside system counters like CPU, memory, network throughput & pull these back into PowerBI. Having implemented it myself I highly recommend checking it out: MONITOR ON-PREMISES DATA GATEWAYS

Beyond the performance metrics in Perfmon the data gateway application itself has a log output. This can be found here:

C:\Users\PBIEgwService\AppData\Local\Microsoft\On-premises data gateway

If your data gateway is running under a domain account you should substitute the PBIEgwService in the above path. The logs can also be accessed from Event Viewer under Application and Services Logs

Opening up the log files you can see a lot of information being logged. There are a lot of undocumented GUIDs and recurring codes in the log files but it does appear to reliably capture data source and user errors if you strip out informational records. You can also increase logging to verbose to include query start and finish entries etc. The troubleshooting article I mention earlier goes into some detail on common Activity types & how to identify query durations from the Gateway log. A typical log error output looks like this:

Looking closer at the opening lines on such an error breaks down into components. Some of these are identified below:

There is a full PowerBI admin portal built into Azure for viewing Usage metrics, audit logs & managing users. The catch is that you need to be an Azure Global Admin or be assigned the PowerBI Service admin role by a global admin to use it. This isn’t always easy to obtain.

Even with that level of access a lot of PowerBI and the data gateway itself is still a black box compared to what you may be used to with on-prem services like SQL & SSRS. This should improve over time but if you’ve used any other cloud services, some control always needs to be relinquished.

Keeping an eye on Adam Saxton’s YouTube channel & the PowerBI blog updates is the best way to keep up to date with new features & techniques.

References

Appendix – Disaster Recovery Workaround

The following steps will demonstrate the “mock failover” of the On-premises data gateway.

  1. Install the data gateway on a server (E.g. cpGATEWAY01) referring to the PowerBI documentation if required.

  2. Login with a PowerBI account & confirm successful connection to PowerBI.

  3. Choose to Register a New Gateway

  4. You will be prompted for a Recovery Key. You MUST make a note of this. It is key to this process.

The Gateway is now setup on a single server.

  1. Install the data gateway on a second server. (E.g. cpGATEWAY_DR)

  2. Login with a PowerBI account & confirm successful connection to PowerBI.

  3. Now choose to Migrate, restore or takeover an existing Gateway

  4. Pick the name of the Gateway from the drop down list. NOTE: The user you’re setting this up with must be an administrator of the Gateway you are restoring.

  5. Enter the Recovery Key.

  6. The Gateway setup should complete successfully and you should see a screen similar to the following:

  7. You can now repeat steps 6-9 on your original Gateway (cpGATEWAY01) installation to move ownership back. It’s that simple!

Craig Porteous

Craig Porteous

Craig is a Microsoft certified BI Developer & Administrator and has worked with the Microsoft SQL stack for over 8 years. From working with cloud technologies like AWS & PowerBI to managing enterprise level Projects & deployments, Craig is passionate about developing his skills. He enjoys contributing to a personal blog to give back to the SQL community. In his spare time Craig is an avid runner

View all posts by Craig Porteous
Craig Porteous
Business Intelligence

About Craig Porteous

Craig is a Microsoft certified BI Developer & Administrator and has worked with the Microsoft SQL stack for over 8 years. From working with cloud technologies like AWS & PowerBI to managing enterprise level Projects & deployments, Craig is passionate about developing his skills. He enjoys contributing to a personal blog to give back to the SQL community. In his spare time Craig is an avid runner

View all posts by Craig Porteous

907 Views