Rajendra Gupta
External table and relational DB table in Azure Data Studio

Enhanced PolyBase SQL 2019 – External tables using t-SQL

October 31, 2018 by

In this article on PolyBase, we will explore more use case scenarios for external tables using T-SQL.

SQL Server 2019 provides Data Virtualization through PolyBase to various data sources such as SQL Server, Oracle, Teradata, and ODBC based data sources etc. In my previous articles, we explored

  • Installation of PolyBase in SQL Server 2019 Preview (CTP 2.0)
  • Installation of Azure Data Studio
  • Installation and setup database and objects in Oracle 11g Express Edition
  • Creating with Azure Data Studio
  • Accessing data

The external table works only if PolyBase is able to connect to the external data source. We might get the below error if there is an issue in the connection.

08:32:21 Started executing query at Line 1

OLE DB provider “SQLNCLI11” for linked server “(null)” returned message “Login timeout expired”.

OLE DB provider “SQLNCLI11” for linked server “(null)” returned message “A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.”

Msg 10061, Level 16, State 1, Line 0 TCP Provider: No connection could be made because the target machine actively refused it.

Total execution time: 00:01:04.003

Error while accessing External Table

It might be due to the following reasons

  • The external database is not running.
  • PolyBase Services are not running. If the External database is up and running but we do not have a PolyBase service in running status, we also get the same error message.

In part 2 of the series, we saw that the external table could be accessed similarly to a relational database table. One more advantage is that we can join them with any relational tables.

Let us see how we can join the external table with the relational DB tables. I have saved the data into a CSV file so we will import the table using my earlier article, SQL Server Data Import using Azure Data Studio. Therefore, you can follow the article in the same way in the Azure Data Studio also. I will just give high-level steps to import data from flat file into Azure Data Studio in this article.

  • Launch Import Wizard in Azure Data Studio

    Import Wizard in Azure Data Studio

  • Specify the input file, table name, and schema name.

    Import Flat File Wizard in Azure Data Studio

  • Preview data

    Preview Data

  • Review the column properties and make changes if required

    Review the column properties

  • Import Data into SQL Server and we can see the successful message below:

    Summary of Import Flat file wizard

We now have two tables in our table as highlighted below:

  • Dbo.Departments: SQL Database relational table
  • Dbo.Employees: External table pointing to Oracle Database

External table and relational DB table in Azure Data Studio

Now let us run the below query in Azure Data Studio to pull up data from both the tables.


Join External table with relational table

We can see here that we can join the external data with a relational database table like a normal join operation only. This provides a good way to use the external tables with the way we want in our queries. In the applications as well we can pull up the required information from external data sources using Polybase without going to each data source separately.

View execution plan of Join operation with external table

PolyBase external tables do not allow the update or insert data as of now. We can only select the data from that table. If we try to update one, we get the below error message:

Msg 46519, Level 16, State 16, Line 1 DML Operations are not supported with external tables.

DML operation on External Table.

Until now, we have used the External Table Wizard to create tables for Oracle DB. Now let us create the table using t-SQL. We will also use the SQL Server Management Studio 18.0 Preview 4.

In this example, we will use the below:

  • SQL Server Database: ExternalTableDemo
  • External DataSource: Oracle Database (Service Name – XE)

Steps to configure using t-SQL

  1. The first step is to create a database master key. We need to specify the password to encrypt the master key in the specified DB. We should use complex password meeting the password policy.

    Use the below query to create a database master key in the ExternalTableDemo database.


    create a database master key

  2. Now we need to create a database-scoped credential. SQL Server uses this credential to access the external data source. We can use the below query to create database scoped credentials. We need to specify the Identity (name of the account ) and SECRET (password) to connect to Oracle Data Source.


    create a database-scoped credential

  3. In this step, we will configure the external data source. We need to specify the connection string for the data source along with the credential created in above step. The connection string should be in format of ‘<vendor>://<server>[:<port>]’.


    configure the external data source

    We can see the Location specified as oracle: //192.168.225.185:1521′ in the format specified above.

    Vendor: Oracle
    Server: 192.168.225.185
    Port: 1521

  4. Now, we have completed setting up the external data source pointing to Oracle DB with the credentials specified. In this step, we need to create an external table in the SQL Server. We need to specify the table columns, data types, properties similar to a relational database table. We also need to specify the table location in the format of <database_name>.<schema_name>.<object_name>.

    In the below query, we can see the location as [XE].[DEMOUSER].[REGIONS] which is in the line of the format specified above.

    Note: If there are any issues in the authentication to Oracle Data Source due to invalid username or password in the credential step, we will get the below error. You can see the error code is ORA-01017 that shows error raised by Oracle DB due to invalid credentials.

    Create the external table

    You can either recreate the credential or alter the credential with the correct password.

    Alter the credentials

    Now let us run the query again to create the external table. We can see below that query is executed successfully now.

    Create External Table

  5. We can view the table now in the database.

    View newly created external table

    Run the select statement to view the data into the table.

    view the records from external table

    To verify the records, let us view this table directly into the Oracle DB. We can see that the records are same in both the Oracle DB and the SQL DB. This also shows that the External Tables do not contains data instead they pull up the data from the external data source.

    view the records in Oracle DB

  6. Create Statistics to improve performance.

    This is an optional step.


    Create Statistics

    We will not see much performance improvement in this example due to only a few records, but you can see the difference as shown.

    Compare execution plan before and after statistics

Conclusion

In this article on PolyBase, we explored the additional use case of the external case along with creating an external table with t-SQL. You can now create them using both the External table Wizard in Azure Data Studio and using t-SQL as well. I will cover creating an external table with SQL Server as Data Source in my next article. Stay tuned!

Table of contents

Enhanced PolyBase SQL 2019 – Installation and basic overview
Enhanced PolyBase SQL 2019 – External tables for Oracle DB
Enhanced PolyBase SQL 2019 – External tables using t-SQL
Enhanced PolyBase SQL 2019 – External tables SQL Server, Catalog view and PushDown
Enhanced PolyBase SQL 2019 – MongoDB and external table
Rajendra Gupta
Latest posts by Rajendra Gupta (see all)
168 Views