fncCombineSheets
Beyond Excel's (BXL's) Function:
Combine Worksheets
Name: fncCombineSheets()
Use:
Combine all sheets in a workbook into one table.
Description:
It is a common practices to have workbooks where each worksheet contains data for a customer or month. For analysis and reporting the modeler may want each worksheet's data placed into one table. This function does that.
Assumptions:
All worksheets in the workbook have the same format except those sheets excluded (see ExcludeSheets below)
Each worksheet has sheet title information in the first column (See ExcludeTitles below)
The sheet name identifies the customer, month, or what-have-you and must be included in the data
Parameters
This function has four parameters: sWorkbookPath, lColumnCount, ExcludeSheets, and ExcludeTitles .
WorkbookPath is the Excel workbook's full name including path. ex. "C:\temp\MyWorkbook.xlsx"
ColumnCount is the number of columns in each worksheet to include, thus, if each worksheet has 5 columns of data, enter 5 here.
ExcludeSheets are the names of worksheets not to combine separated by commas. Ex. "Sheet1,Sheet2"
ExcludeTitles are the first characters of each the title rows for each worksheet separated by commas. Ex. "Trial Balance, For Month"
Example:
In the sample workbook is a query called "Invoked Function" which looks something like below. It demonstrates how to call this function.
let
Source = fncCombineSheets("C:\Customers.xlsx", 5, "Customer:")
in
Source
Code:
Copy this code and paste it into PQ's advanced editor.
/* Name: fncCombineSheets
Description: Combine all sheets in a workbook into one table.
Requisites: The workbook must be saved
All worksheets in the workbook have identical formats
The worksheet name identifies what the data pertains to (customer, month, etc.)
Inputs: WorkbookPath: Workbook's full name with path
ColumnCount: Number of columns containing data in each worksheet
ExcludeSheets: Worksheets to exclude from combining
ExcludeTitles: First few characters of worksheet titles to exclude from the data
Outputs: A single table containing all data from all worksheets
Example: Source = fncCombineSheets("C:\MyCustomers.xlsx", 5, "Sheet1,Sheet2", "Customer:")
Date Ini Description
02/01/23 CWH Original Development
02/02/23 CWH Added ExcludeSheets
02/04/23 CWH Added ExcludeTitles
*/
(WorkbookPath as text, ColumnCount as number, ExcludeSheets as text, ExcludeTitles as text) =>
let
// Open Excel Workbook
Source = Excel.Workbook(File.Contents(sWorkbookPath), null, true),
// Select Sheets
RmvSheetSpaces = Text.Replace(Text.Replace(ExcludeSheets, " ,", ","), ", ", ","),
SheetList = Text.Split(RmvSheetSpaces, ","), /* Convert CSV to List */
SelSheets = Table.SelectRows(Source,
each not(List.Contains(SheetList,[Name])) and [Kind]="Sheet"),
// Only keep "name" and "data" columns. Name contains sheet name, Data contains worksheet data
RmvCols = Table.SelectColumns(SelSheets,{"Name", "Data"}),
// Create a list of column names ("Column1", "Column2", etc.)
ColHdrs = List.Generate(() => 1,
each _ <= lColumnCount,
each _ + 1,
each "Column" & Number.ToText(_)),
// Expand the data (worksheet columns)
ExpData = Table.ExpandTableColumn(RmvCols, "Data", ColHdrs),
// Remove excluded Title rows
// fnBeginsWith credit: Sergei Baklan - This is like List.Contains but only looking at the first characters
// https://techcommunity.microsoft.com/t5/excel/power-query-filter-table-based-on-list-with-wildcards/m-p/2418638
fnBeginsWith = (list, str) => List.Contains(
List.Transform(list, each Text.StartsWith(str, _)),
true),
RmvTitleSpaces = Text.Replace(Text.Replace(ExcludeTitles, " ,", ","), ", ", ","),
TitleList = Text.Split(RmvTitleSpaces, ","), /* Convert CSV to List */
RmvTitles = Table.SelectRows(ExpData,
each not fnBeginsWith(TitleList, [Column1])),
// Remove Empty Rows
RmvNulls = Table.SelectRows(RmvTitles,
each List.NonNullCount(Record.ToList(_)) <> 1),
// Promote column headings
PromoteColHdrs = Table.PromoteHeaders(RmvTabHdrs, [PromoteAllScalars=true]),
// Rename first column to "Sheet"
FirstCol = Table.ColumnNames(PromoteColHdrs){0},
RnmCols = Table.RenameColumns(PromoteColHdrs,{{FirstCol, "Sheet"}}),
// Remove repeated column headings from other worksheets
SecondCol = Table.ColumnNames(RnmCols){1},
RmvDupColHdrs = Table.SelectRows(RnmCols,
each not(List.Contains({SecondCol}, Record.Field(_, SecondCol))))
in
RmvDupColHdrs
Download:
The workbook at right contains sample data and this function. To download it:
Click it
Click Pop-out
Click Download
Move to c:\Temp
Open and enable content when asked
Use the menu option Data > Queries & Connections
Right click on Invoked Function and select Edit
You should see the results it produces.