In-memory distributed processing for large datasets… How do you connect to SQL Server using Apache Spark? The Spark documentation covers the basics of the API and DataFrames, but it lacks information and examples on how to actually get this feature to work.


First, what is Apache Spark? Apache Spark is an open source big data processing framework built around speed, ease of use, and sophisticated analytics. It is a fast, general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or in Spark’s standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and newer workloads like streaming, interactive queries, and machine learning.
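As a rough idea of what connecting Spark to SQL Server can look like, here is a minimal PySpark sketch that goes through Spark's generic JDBC data source. It assumes the Microsoft JDBC driver jar is already on the Spark classpath; the host, database, table, and credentials are hypothetical placeholders, and the helper names `sqlserver_jdbc_options` and `read_sqlserver_table` are made up for illustration, not taken from the full post.

```python
def sqlserver_jdbc_options(host, port, database, table, user, password):
    """Build the option map for Spark's generic JDBC data source."""
    return {
        # Connection URL in the format used by the Microsoft JDBC driver.
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_sqlserver_table(spark, options):
    """Load a SQL Server table as a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical connection details, for illustration only.
opts = sqlserver_jdbc_options(
    "localhost", 1433, "ContosoRetailDW", "dbo.DimProduct", "spark_user", "secret"
)
print(opts["url"])
```

With a running `SparkSession` named `spark`, calling `read_sqlserver_table(spark, opts)` would return a DataFrame for the table. Keeping the option map separate from the read makes it easy to check the connection settings without a cluster.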

Read Full Post →

Hope you are all enjoying the last stretch of summer! Just wanted to inform you that the Ottawa SQL Server User Group will be back on Thursday, September 15, 2016 for yet another great season. We are looking forward to sharing and delivering great content about SQL Server 2016, Power BI, Azure Cloud, and many more related topics…

Speakers, if you would like to present at one of our scheduled meetups, please submit your session title and abstract to with “Ottawa SQL Server User Group” in the subject header

To register and confirm your attendance and for more details please visit our Meetup page ->

Read Full Post →

With SQL Server 2016 literally just days away from being officially available to the public for download (Get ready, SQL Server 2016 coming on June 1st), I wanted to highlight one important new key feature called Row-Level Security. So what is Row-Level Security? Row-Level Security enables customers to control access to rows in a database table based on the characteristics of the user executing a query (e.g., group membership or execution context).
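To make the idea concrete, here is a small conceptual model in Python. It is not SQL Server syntax: in SQL Server the predicate is defined in the database and applied by the engine, whereas this sketch only mimics the effect in application code. The `Order` type, the `managers` group, and the access rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    sales_rep: str
    amount: float


def access_predicate(row, user, groups):
    """Hypothetical rule: managers see every row, others see only their own."""
    return "managers" in groups or row.sales_rep == user


def visible_rows(rows, user, groups=frozenset()):
    """Return only the rows the given user is allowed to see."""
    return [row for row in rows if access_predicate(row, user, groups)]


orders = [
    Order(1, "alice", 120.0),
    Order(2, "bob", 75.5),
    Order(3, "alice", 310.0),
]

print([o.order_id for o in visible_rows(orders, "alice")])                # [1, 3]
print([o.order_id for o in visible_rows(orders, "carol", {"managers"})])  # [1, 2, 3]
```

The same query returns different rows depending on who runs it, which is the behaviour Row-Level Security provides without changing the application's queries.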


Read Full Post →

With SQL Server 2016 just around the corner, a series of new features and enhancements will be introduced. In particular, it brings some important security upgrades: Always Encrypted, Row-Level Security, and Dynamic Data Masking. This post covers SQL Server 2016 Dynamic Data Masking (DDM).

Note: DDM is also available in Azure SQL Database; see the following link for details ->

Read Full Post →

Happy New Year! It has been a while, my friends… here goes 😉
This post assumes you are familiar with, or have already been introduced to, Azure SQL Data Warehouse. If not, I strongly encourage you to read and follow up on the SQL Data Warehouse Documentation to get you going.

Optionally, if you would like a local on-premises copy of the ContosoRetailDW database (a fictitious retail demo dataset used for presenting Microsoft Business Intelligence products), you can download it here ->

As much as we love AdventureWorks, it is also worthwhile to explore and work with bigger data volumes to develop and test new concepts! The entire ContosoRetailDW database contains more than 34M rows. So I decided to make it happen… ContosoRetailDW on Azure SQL Data Warehouse.

Before we start: the scripts (code) for this post are available here on GitHub ->

First, let’s create a blank SQL Data Warehouse database named ContosoRetailDW (Create a SQL Data Warehouse).


Read Full Post →

A short tale of… data vizes. It’s about Tableau, R, and Shiny. This post shows how to rebuild an existing, well-known Tableau visualization with R and provide interactive web analysis. We will be using the raw data from the sample workbook named “Regional” that comes with Tableau Desktop. The image below is a snapshot of the visualization that we will try to reproduce with R. You can also view and interact with this visualization on my Tableau Public profile page ->


Read Full Post →