With SQL Server 2016 literally just days away from being officially available to the public for download (Get ready, SQL Server 2016 coming on June 1st), I wanted to highlight one important new key feature called Row-Level Security. So what is Row-Level Security? Row-Level Security enables customers to control access to rows in a database table based on the characteristics of the user executing a query (e.g., group membership or execution context).


Row-Level Security (RLS) simplifies the design and coding of security in your application. RLS enables you to implement restrictions on data row access: for example, ensuring that workers can access only those data rows that are pertinent to their department, or restricting a customer’s data access to only the data relevant to their company.

The access restriction logic is located in the database tier, rather than away from the data in another application tier. The database system applies the access restrictions every time data access is attempted, from any tier. This makes the security system more reliable and robust by reducing its surface area.

You implement RLS by using the CREATE SECURITY POLICY T-SQL statement, with predicates created as inline table-valued functions.

To better understand SQL Server 2016 Row-Level Security, let’s go through some code samples:

Create three user accounts that will demonstrate different access capabilities.
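Something along these lines will do (a minimal sketch; WITHOUT LOGIN keeps the demo self-contained, with no server logins required):

    -- Three demo users: one manager and two sales people
    CREATE USER Manager WITHOUT LOGIN;
    CREATE USER Sales1 WITHOUT LOGIN;
    CREATE USER Sales2 WITHOUT LOGIN;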

Create a quick table to hold data.
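For example (a sketch; apart from SalesRep, which will drive the filtering, the column names are just placeholders):

    CREATE TABLE dbo.Orders
    (
        OrderID int,
        SalesRep sysname,
        Product varchar(10),
        Qty int
    );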

Populate the table with 10 rows of data, showing 5 orders for each sales person.
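The sample values below are made up; the only thing that matters is five rows per sales person:

    INSERT INTO dbo.Orders VALUES
    (1, 'Sales1', 'Valve', 5), (2, 'Sales1', 'Wheel', 2),
    (3, 'Sales1', 'Valve', 4), (4, 'Sales1', 'Bracket', 2),
    (5, 'Sales1', 'Wheel', 5), (6, 'Sales2', 'Sensor', 3),
    (7, 'Sales2', 'Valve', 1), (8, 'Sales2', 'Bracket', 2),
    (9, 'Sales2', 'Sensor', 1), (10, 'Sales2', 'Valve', 5);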

View the rows in the table.
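No security policy is in place yet, so as the table owner you see everything:

    SELECT * FROM dbo.Orders;  -- all 10 rows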

Grant read access on the table to each of the users.
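Plain GRANT statements do the job:

    GRANT SELECT ON dbo.Orders TO Manager;
    GRANT SELECT ON dbo.Orders TO Sales1;
    GRANT SELECT ON dbo.Orders TO Sales2;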

Create a new schema and an inline table-valued function. The function returns 1 when the value in the SalesRep column matches the name of the user executing the query (@SalesRep = USER_NAME()), or when the user executing the query is the Manager user (USER_NAME() = 'Manager').
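A sketch of what this looks like (the schema and function names are my own choices):

    CREATE SCHEMA Security;
    GO

    CREATE FUNCTION Security.fn_securitypredicate(@SalesRep AS sysname)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN SELECT 1 AS fn_securitypredicate_result
        WHERE @SalesRep = USER_NAME() OR USER_NAME() = 'Manager';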

Create a security policy adding the function as a filter predicate. The state must be set to ON to enable the policy.
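For example (the policy name is arbitrary):

    CREATE SECURITY POLICY SalesFilter
        ADD FILTER PREDICATE Security.fn_securitypredicate(SalesRep)
        ON dbo.Orders
        WITH (STATE = ON);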

Let’s now test the filtering predicate by selecting from the Orders table as each user. The Manager should see all 10 rows; the Sales1 and Sales2 users should only see their own sales.
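Using EXECUTE AS to impersonate each user in turn:

    EXECUTE AS USER = 'Sales1';
    SELECT * FROM dbo.Orders;  -- 5 rows
    REVERT;

    EXECUTE AS USER = 'Sales2';
    SELECT * FROM dbo.Orders;  -- 5 rows
    REVERT;

    EXECUTE AS USER = 'Manager';
    SELECT * FROM dbo.Orders;  -- all 10 rows
    REVERT;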

Finally, to alter and disable the security policy:
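Setting the state to OFF turns filtering off without dropping the policy:

    ALTER SECURITY POLICY SalesFilter
    WITH (STATE = OFF);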

In general, row-level security will work as expected across features. However, there are a few exceptions: the Cross-Feature Compatibility section documents several notes and caveats for using row-level security with certain other features of SQL Server. More details and information on RLS can be found in the online documentation.

Enjoy

With SQL Server 2016 just around the corner, a series of new features and enhancements will be introduced, most notably some important security upgrades: Always Encrypted, Row-Level Security, and Dynamic Data Masking. This post covers SQL Server 2016 – Dynamic Data Masking (DDM).

Note: DDM is also available in Azure SQL Database see the following link for details -> http://bit.ly/1eG7Que

Dynamic Data Masking

What is Dynamic Data Masking? Dynamic Data Masking enables you to anonymize sensitive data through real-time obfuscation, preventing unauthorized access by controlling how the data appears in the output of database queries. Because it is implemented within the database itself, the logic is centralized and always applies when the sensitive data is queried. Traditionally, obfuscation of sensitive data has been coded in the application layer, handled with views that limit the exposure of sensitive pieces of data, or delegated to third-party tools… Bottom line: it is now quite simple to configure DDM rules on sensitive columns, and this can be done on an existing database without affecting database operations or requiring changes in application code.

In SQL Server 2016 there are four types of masking rules: (details)

  • Default: Full masking – replaces string characters with XXXX, numeric values with 0, and date/time values with 01.01.1900 00:00:00.0000000.
  • Email: Masking method that exposes the first letter of an email address and the constant suffix “.com”, in the form of an email address: aXXX@XXXX.com.
  • Custom String: Partial masking – a method that exposes the first and last letters and adds a custom padding string in the middle: prefix,[padding],suffix.
  • Random: A random masking function for use on any numeric type, which masks the original value with a random value within a specified range.

Creating a Dynamic Data Mask
This example creates a table with three different types of dynamic data masks, inserts data, and queries the table.
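A minimal sketch patterned after the documentation example (the table and column names are my own assumptions): a partial() custom-string mask on the first name, a default() mask on the phone number, and an email() mask on the email address.

    CREATE TABLE dbo.Membership
    (
        MemberID int IDENTITY PRIMARY KEY,
        FirstName varchar(100) MASKED WITH (FUNCTION = 'partial(1, "XXXXXXX", 0)') NULL,
        LastName varchar(100) NOT NULL,
        Phone varchar(12) MASKED WITH (FUNCTION = 'default()') NULL,
        Email varchar(100) MASKED WITH (FUNCTION = 'email()') NULL
    );

    INSERT INTO dbo.Membership (FirstName, LastName, Phone, Email) VALUES
    ('Roberto', 'Tamburello', '555.123.4567', 'RTamburello@contoso.com'),
    ('Janice', 'Galvin', '555.123.4568', 'JGalvin@contoso.com.co'),
    ('Zheng', 'Mu', '555.123.4569', 'ZMu@contoso.net');

    SELECT * FROM dbo.Membership;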

Query output: (screenshot – as the table owner, the query returns the original values in clear text)

Let’s create a new user and grant SELECT permission on the table. Queries executed as testuser will see masked data.
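Again, WITHOUT LOGIN keeps the demo self-contained:

    CREATE USER testuser WITHOUT LOGIN;
    GRANT SELECT ON dbo.Membership TO testuser;

    EXECUTE AS USER = 'testuser';
    SELECT * FROM dbo.Membership;  -- returns masked values
    REVERT;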

As you can see, the testuser database user is unable to see the real data values for any of the columns that have been masked.
Query output: (screenshot – the masked values returned to testuser)

Now let’s grant the UNMASK permission to a specific user, allowing them to see unmasked data on the table.
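UNMASK is a regular database permission, granted and revoked like any other:

    GRANT UNMASK TO testuser;

    EXECUTE AS USER = 'testuser';
    SELECT * FROM dbo.Membership;  -- clear text this time
    REVERT;

    -- and to take it away again:
    REVOKE UNMASK FROM testuser;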

The output shows all columns of data in clear text, even the ones that have been masked by rule definition.
Query output: (screenshot – identical to the first, clear-text output)

Summary
DDM is not a replacement for security access control mechanisms, nor is it a method for encrypting physical data. Dynamic Data Masking is about simplifying the obfuscation of sensitive data by streamlining the logic in the database itself (no external dependencies). Dynamic Data Masking is complementary to other SQL Server security features (auditing, encryption, Row-Level Security, etc.), and it is highly recommended to use it in conjunction with them to better protect the sensitive data in your databases.

Happy New Year! It has been a while my friends… here goes 😉
This post assumes you are familiar with, or have already been introduced to, Azure SQL Data Warehouse. If not, I strongly encourage you to read and follow up on the SQL Data Warehouse Documentation to get you going.

Optionally, if you would like a local on-premises copy of the ContosoRetailDW database – a fictitious retail demo dataset used for presenting Microsoft Business Intelligence products – you can download it here -> https://www.microsoft.com/en-ca/download/details.aspx?id=18279

As much as we love AdventureWorks, it is also worthwhile to explore and work with bigger data volumes to develop and test new concepts! The entire ContosoRetailDW database contains more than 34 million rows of records, so I decided to make it happen… ContosoRetailDW on Azure SQL Data Warehouse.

Before we start: the scripts (code) for this post are available here on GitHub -> https://github.com/sfrechette/contosoretaildw-azuresqldw

First, let’s start by creating a blank SQL Data Warehouse database named ContosoRetailDW (Create a SQL Data Warehouse).

(screenshot: creating the SQL Data Warehouse in the Azure portal)

Next, we need to create a slightly redesigned and modified ContosoRetailDW schema for the data to be imported and used in Azure SQL Data Warehouse. I have removed the following three columns from each table – ETLLoadID, LoadDate, and UpdateDate – as well as the columns of data types geography and geometry, which are not supported in SQL Data Warehouse.

After successfully creating the SQL Data Warehouse database, connect to your newly created blank database with Visual Studio (Connect to SQL Data Warehouse with Visual Studio) and run the following dsql script to create the schema for the ContosoRetailDW database.

For detailed information on table design -> Table design in SQL Data Warehouse

ContosoRetailDW DDL schema script – ContosoRetailDW_ddl.dsql
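To give a flavor of what the redesigned DDL looks like, here is an abridged sketch of one fact table (the full column lists live in the script itself; the distribution and index choices shown here are illustrative, not necessarily the ones in the script):

    CREATE TABLE dbo.FactOnlineSales
    (
        OnlineSalesKey int NOT NULL,
        DateKey datetime NOT NULL,
        StoreKey int NOT NULL,
        ProductKey int NOT NULL,
        SalesQuantity int NOT NULL,
        SalesAmount money NOT NULL
        -- ...remaining columns omitted for brevity
    )
    WITH
    (
        DISTRIBUTION = HASH(ProductKey),
        CLUSTERED COLUMNSTORE INDEX
    );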

Got Data!
We need to load some data! The GZip data files, 25 of them totalling approx. 413MB, are located here on OneDrive -> http://1drv.ms/1J2vZ6Y. You will need to load these files into Azure Blob Storage. James Serra has a great post to help you get started: Getting data into Azure Blob Storage. For the purpose of this exercise I created a blob storage container named ‘contosoretaildw’ with a ‘data’ folder, and imported the GZip files using the AzCopy Command-Line Utility (Transfer data with the AzCopy Command-Line Utility).

(screenshot: the contosoretaildw blob container with its data folder)

We will be loading data from blob storage into SQL Data Warehouse using PolyBase (Load data with PolyBase in SQL Data Warehouse). The following script assumes we have already defined and created the schema for the ContosoRetailDW database. Alternatively, you could load the data using the CTAS (CREATE TABLE AS SELECT) feature if you have not yet defined and created the ContosoRetailDW schema; the script for this scenario, PolybaseLoadCTAS.dsql, is available on GitHub.

Loading data from Azure Blob Storage – PolybaseLoadInsert.dsql
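The PolyBase pattern boils down to a credential, an external data source pointing at the blob container, a GZip-aware file format, and one external table per file. Here is a sketch for a single table (the storage account name, key, field terminator, file name, and ext schema are placeholders – the actual values are in the GitHub script):

    -- One-time setup
    CREATE MASTER KEY;

    CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
    WITH IDENTITY = 'user',
         SECRET = '<your_storage_account_key>';

    CREATE EXTERNAL DATA SOURCE AzureStorage
    WITH (
        TYPE = HADOOP,
        LOCATION = 'wasbs://contosoretaildw@<your_storage_account>.blob.core.windows.net',
        CREDENTIAL = AzureStorageCredential
    );

    CREATE EXTERNAL FILE FORMAT TextFileGzip
    WITH (
        FORMAT_TYPE = DELIMITEDTEXT,
        FORMAT_OPTIONS (FIELD_TERMINATOR = '|'),
        DATA_COMPRESSION = 'org.apache.hadoop.io.compress.GzipCodec'
    );

    -- One external table per data file (assumes CREATE SCHEMA ext has been run)
    CREATE EXTERNAL TABLE ext.DimProduct
    (
        ProductKey int NOT NULL,
        ProductLabel nvarchar(255) NULL
        -- ...remaining columns as in the DDL script
    )
    WITH (
        LOCATION = '/data/DimProduct.txt.gz',
        DATA_SOURCE = AzureStorage,
        FILE_FORMAT = TextFileGzip
    );

    -- Then load into the internal table
    INSERT INTO dbo.DimProduct
    SELECT * FROM ext.DimProduct;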

To conclude, we now need to update statistics on our ContosoRetailDW SQL Data Warehouse. Without proper statistics, you will not get the performance that SQL Data Warehouse is designed to provide. SQL Data Warehouse does not automatically generate statistics on tables and columns, so you need to create them yourself. We will be using a stored procedure named dbo.prc_sqldw_create_stats to create statistics on all columns in the database (Manage statistics in SQL Data Warehouse).

To create statistics on all columns in the database with this procedure, simply call it:
exec prc_sqldw_create_stats;

dbo.prc_sqldw_create_stats
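The procedure itself comes from the Manage statistics documentation; in spirit, it builds one CREATE STATISTICS statement per column and executes them in a loop (SQL Data Warehouse does not support cursors). A stripped-down sketch of the idea:

    CREATE PROCEDURE dbo.prc_sqldw_create_stats
    AS
    BEGIN
        -- Stage one CREATE STATISTICS statement per column of every user table
        CREATE TABLE #stats_ddl
        WITH (DISTRIBUTION = ROUND_ROBIN)
        AS
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS seq,
               'CREATE STATISTICS stat_' + t.name + '_' + c.name +
               ' ON ' + s.name + '.' + t.name + ' (' + c.name + ')' AS ddl
        FROM sys.columns c
        JOIN sys.tables t ON c.object_id = t.object_id
        JOIN sys.schemas s ON t.schema_id = s.schema_id;

        DECLARE @i int = 1;
        DECLARE @n int;
        DECLARE @stmt nvarchar(4000);
        SET @n = (SELECT COUNT(*) FROM #stats_ddl);

        -- Execute each staged statement in turn
        WHILE @i <= @n
        BEGIN
            SET @stmt = (SELECT ddl FROM #stats_ddl WHERE seq = @i);
            EXEC sp_executesql @stmt;
            SET @i = @i + 1;
        END

        DROP TABLE #stats_ddl;
    END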

We are done! Explore, write some queries, scale compute resources up and down, connect with Power BI, and create awesome data visualizations. Just remember to pause your SQL Data Warehouse when not using it.
Enjoy!

A short tale of… data vizes. It’s about Tableau, R, and Shiny: how to rebuild an existing, well-known Tableau visualization with R and provide interactive web analysis. We will be using the raw data from the sample workbook named “Regional” that comes with Tableau Desktop. The image below is a snapshot of the visualization that we will try to reproduce with R. You can also view and interact with this visualization on my Tableau Public profile page -> http://tabsoft.co/22IOMjD

(screenshot: the original Tableau SAT score visualization)

You can download a copy of the dataset in CSV format, available here on Dropbox.

Let’s start by recreating the same visualization (static) with R. Open RStudio, copy and paste the following R code, and run it.
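Here is a minimal sketch of what that script can look like, using ggplot2 as a stand-in (the column names School and SAT.Score and the file name college_admissions.csv are assumptions about the CSV layout – adjust them to the actual headers in the downloaded file):

    # A static dot plot of average SAT score by school,
    # with a dashed line at the overall average
    library(ggplot2)

    scores <- read.csv("college_admissions.csv", stringsAsFactors = FALSE)

    ggplot(scores, aes(x = SAT.Score, y = reorder(School, SAT.Score))) +
        geom_point(colour = "steelblue") +
        geom_vline(xintercept = mean(scores$SAT.Score, na.rm = TRUE),
                   linetype = "dashed") +
        labs(x = "Average SAT score", y = NULL,
             title = "Average SAT scores by school")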

Running it produces the following: a nice, simple static plot… of course with no interactivity!
(screenshot: the static R plot)

Then comes Shiny by RStudio, a web application framework for R that turns your analyses into interactive web applications. Check out the Getting Started Guide for all the details.

Shiny apps have two important components:

  • a user-interface script (ui.R), which controls the layout and appearance of your app
  • a server script (server.R), which contains the instructions needed to build the app

You can actually run the app locally in RStudio by copying the following two R scripts, sketched below.
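A minimal sketch of the pair (the published app uses ggvis; to keep this sketch self-contained I am using ggplot2 here, and the data columns School, State, and SAT.Score are the same assumptions as above):

    # ui.R – layout: a state selector and the plot
    library(shiny)

    shinyUI(fluidPage(
        titlePanel("College admissions – average SAT scores"),
        sidebarLayout(
            sidebarPanel(
                selectInput("state", "State:", choices = c("All", state.name))
            ),
            mainPanel(plotOutput("satPlot"))
        )
    ))

    # server.R – filter by the selected state and redraw the plot
    library(shiny)
    library(ggplot2)

    scores <- read.csv("college_admissions.csv", stringsAsFactors = FALSE)

    shinyServer(function(input, output) {
        output$satPlot <- renderPlot({
            df <- if (input$state == "All") scores
                  else subset(scores, State == input$state)
            ggplot(df, aes(x = SAT.Score, y = reorder(School, SAT.Score))) +
                geom_point(colour = "steelblue") +
                labs(x = "Average SAT score", y = NULL)
        })
    })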

I have published this app on shinyapps.io for you to explore and interact with: https://sfrechette.shinyapps.io/college_admissions

(screenshot: the published Shiny app)

So, from Tableau to R and Shiny, we have almost* the same data visualization and interactivity.

*I am using the ggvis package, and it seems impossible to add a horizontal line to the plot for the average score… that feature is apparently not available yet!