Search for a String in Multiple Files

In support of an earlier post, Fetching NHL Play by Play game data, I was recently asked how one could quickly search for a specific string in multiple JSON files recursively. Well, if you are running macOS or Linux, grep is your best friend! In a nutshell, grep prints lines that contain a match for a pattern. The following is a sample grep and cut command that lists the games (files) that contain the string “Montréal Canadiens”:

[Want to try it out? You can download and extract the sample data, which contains all the play-by-play games from the 2016-2017 season.]
 
 
grep -H -R "Montréal Canadiens" /data/20162017/*.json | cut -d: -f1
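
One thing to keep in mind with the command above: it prints a file name for every matching line, so a game file with several matches shows up several times. Also, because the shell expands the *.json glob first, grep only recurses into directories whose names themselves end in .json. The following is a minimal sketch of an alternative, assuming your grep supports the -l and --include options (both GNU grep and the grep shipped with macOS do). The temporary directory and the game1/game2/game3 file names are made up for illustration and are not part of the sample data.

```shell
# Build a tiny throwaway tree so the sketch is self-contained.
# (Hypothetical file names and contents, not the real play-by-play data.)
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/20162017/playoffs"
printf '{"team": "Montréal Canadiens"}\n{"team": "Montréal Canadiens"}\n' > "$tmpdir/20162017/game1.json"
printf '{"team": "Ottawa Senators"}\n' > "$tmpdir/20162017/game2.json"
printf '{"team": "Montréal Canadiens"}\n' > "$tmpdir/20162017/playoffs/game3.json"

# -r          recurse into subdirectories
# -l          print each matching file name once, instead of every matching line
# --include   only look at files whose names match *.json
matches=$(grep -rl --include='*.json' "Montréal Canadiens" "$tmpdir/20162017" | sort)
printf '%s\n' "$matches"

# Clean up the throwaway tree.
rm -rf "$tmpdir"
```

With -l there is no need for cut, since grep prints only the file names. Against the real data you would point the last argument at /data/20162017 instead of the temporary directory.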

Read Full Post →

First some housekeeping… A time series database is a software system that is optimized for handling time series data, arrays of numbers indexed by time (a datetime or a datetime range). In some fields these time series are called profiles, curves, or traces.


Lately I discovered TimescaleDB, an open source time-series database built on PostgreSQL and packaged as an extension. It is optimized for fast ingest and complex queries. It’s scalable, reliable and easy to use! Please read How it works to understand how they made it happen and, most importantly, the following paper: TimescaleDB: SQL made scalable for time-series data. I would also recommend you read the following blog post: What the heck is time-series data (and why do I need a time-series database)?

Read Full Post →


I was recently asked to provide a simple demo/proof-of-concept on how to quickly create a real-time streaming dashboard with Power BI. Did you know that you can stream data and update dashboards in real-time? Any visual or dashboard that can be created in Power BI can also be created to display and update real-time data and visuals. The devices and sources of streaming data can be factory sensors, social media sources, service usage metrics, and anything else from which time-sensitive data can be collected or transmitted.
 
 

There are three types of real-time datasets which are designed for displaying visuals on real-time dashboards:

  • Push dataset
  • Streaming dataset
  • PubNub streaming dataset

This post is about the PubNub streaming dataset. With a PubNub streaming dataset, the Power BI web client uses the PubNub SDK to read an existing PubNub data stream, and no data is stored by the Power BI service. For more information on the other types and their capabilities, read the following documentation -> Real-time streaming in Power BI

Read Full Post →

As I mentioned several times before… Graphs are everywhere! SQL Server 2017 introduces graph features, offering graph database capabilities to model many-to-many relationships. The graph relationships are integrated into Transact-SQL and receive the benefits of using SQL Server as the foundational database management system.

What is a graph database?

In context, a graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data. A key concept of the system is the graph (or edge or relationship), which directly relates data items in the store. The relationships allow data in the store to be linked together directly…

What is a SQL Graph in SQL Server 2017?

A collection of node and edge tables. Node or edge tables can be created under any schema in the database, but they all belong to one logical graph. Only one graph can be created per database.

Read Full Post →