THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Davide Mauri

A place for my thoughts and experiences on SQL Server, Business Intelligence and .NET

PowerQuery, ExpandTableColumn and the parent data

I’ve been playing with PowerQuery in the last days and I’ve come across the need of accessing “parent” object data when using the ExpandTableColumn function. The solution is more tricky then one could imagine and I have to say a BIG “THANK YOU” to Miguel Llopis for showing it me.

Now, let’s say you have a bunch of files in a directory. All files has the same structure (csv, json…anything you want) and you need to import all their content in one PowerPivot table.

image

Of course the PowerQuery “From Folder” option is exactly what you need

image

And there’s the result

image

If you then click on the “Content” header  you will have the content of the files merged in just one resulting table.

image

Cool, isn’t it? Now, let’s say they you also want to have the information about the source of your data. Like, for example, the name of the source file, so that you can know from which file the data comes from. Now things become much more complex since there is no object that allows you to access the “parent object” information. In order to solve this, you have to start to use PowerQuery functions (http://office.microsoft.com/en-us/excel-help/learn-about-power-query-formulas-HA104003958.aspx page 11).

First of all you need to create a function that will do the same operation you just did for all files in a folder, but limiting it only to a specific file, and turn that file in the parameter of such function. Here’s an example to do that:

(filepath) =>
let
    Source = Table.FromColumns({Lines.FromBinary(File.Contents(filepath))}),
    SplitColumnDelimiter = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByDelimiter(","),2)
in
    SplitColumnDelimiter

the function will take a file, whose path will be passed in the filepath parameter, turn the file into a table and split the values using comma as a delimiter. Since it is a function, you don’t need to have it visible in an Excel sheet, so you can deselect the “Load To Worksheet” option. Let’s give a name to our function and call it DoStuff.

Now that our function is in place we just have to

  1. Load the folder content again
  2. Create a new column with the file path that will be passed to the freshly created function
  3. Remove all unneeded columns
  4. Create an additional column where we’ll call the created function for each file in the folder
  5. Expand the loaded content into columns

Translated into PowerQuery language it becomes:

let
    Source = Folder.Files("D:\Temp\PowerQuery"),
    InsertedCustom = Table.AddColumn(Source, "File Path", each [Folder Path] & [Name]),
    RemovedOtherColumns = Table.SelectColumns(InsertedCustom,{"File Path"}),
    InsertedCustom1 = Table.AddColumn(RemovedOtherColumns, "Custom", each DoStuff([File Path])),
    #"Expand Custom" = Table.ExpandTableColumn(InsertedCustom1, "Custom", {"Column1.1", "Column1.2"}, {"Custom.Column1.1", "Custom.Column1.2"})
in
    #"Expand Custom"

and here’s the result:

image

Enjoy!

Published Wednesday, December 04, 2013 2:56 PM by Davide Mauri
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

About Davide Mauri

Davide Mauri - MCP, MCAD, MCDBA, MCT, MVP on SQL Server - has worked with SQL Server since version 6.5, and his interests cover the whole platform, from the Relational Engine to Analysis Services, from architecture definition to performance tuning. He also has a strong knowledge of XML, .NET and the Object Oriented Design principles, which allows him to have the correct vision and experience to handle development of complex business intelligence solutions. Having worked as a Microsoft Certified Teacher for many years, Davide is able to pass all his knowledge to his co-workers, allowing his team to deliver high-quality solutions. He currently works as a Mentor for SolidQ and can be found speaking in many Italian and internationals events.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement