PAW - Process & Analytics Workbench

Processes and Steps

A core concept in PAW is the process, which is made up of processing steps. PAW builds the process by capturing each step as the user manipulates the data to fit their needs. The process can then be re-run automatically.

What is a process?

A process results in a dataset. Each time the process is run, the resulting dataset is overwritten with the new results. Processes are often chained together, so that one process uses the results of another. This is common when the same dataset needs to be processed in multiple ways.
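The overwrite and chaining behaviour described above can be sketched in a few lines of Python. This is only an illustration, not PAW's actual API: the `datasets` store, the `run_process` function, and the dataset names are all hypothetical.

```python
# Hypothetical sketch: none of these names come from PAW itself.
datasets = {}  # dataset name -> list of row dicts


def run_process(name, source, steps):
    """Run a process and overwrite the dataset stored under `name`."""
    rows = source()
    for step in steps:
        rows = step(rows)
    datasets[name] = rows  # previous results are replaced, not appended
    return rows


# First process: keeps only the rows with hours logged.
run_process(
    "hours",
    source=lambda: [{"project": "A", "hours": 3}, {"project": "B", "hours": 0}],
    steps=[lambda rows: [r for r in rows if r["hours"] > 0]],
)

# Second process: chained, it sources its data from the first process's result.
run_process(
    "total_hours",
    source=lambda: datasets["hours"],
    steps=[lambda rows: [{"total_hours": sum(r["hours"] for r in rows)}]],
)
```

Running the first process again would replace the "hours" dataset, and the second process would pick up the new rows the next time it runs.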


For example, Timesheet Summary is a process that consists of the following steps: import data from an Excel file, merge it, filter it by project, and then summarize it.
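As a rough sketch, the Timesheet Summary steps could look like the following in plain Python. Several details are assumptions made for illustration: the Excel import is replaced by hard-coded rows, the merge is assumed to be a lookup against a hypothetical employee dataset, and all column names, project names, and function names are invented.

```python
from collections import defaultdict

# Stand-in for the Excel import: hard-coded rows instead of reading a file.
timesheet = [
    {"employee_id": 1, "project": "Apollo", "hours": 5.0},
    {"employee_id": 2, "project": "Apollo", "hours": 3.0},
    {"employee_id": 1, "project": "Zeus", "hours": 2.0},
]
employees = {1: "Ana", 2: "Ben"}  # hypothetical second dataset to merge with


def merge(rows):
    # Add the employee name to each timesheet row.
    return [{**r, "employee": employees[r["employee_id"]]} for r in rows]


def filter_by_project(rows, project="Apollo"):
    # Keep only the rows for one project.
    return [r for r in rows if r["project"] == project]


def summarize(rows):
    # Total hours per employee.
    totals = defaultdict(float)
    for r in rows:
        totals[r["employee"]] += r["hours"]
    return dict(totals)


# The process is the ordered list of steps applied to the imported data.
result = timesheet
for step in (merge, filter_by_project, summarize):
    result = step(result)
```

Each function takes the rows produced by the previous step, which is what lets the same sequence be replayed on fresh data.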

Iterative and Data Centric

As a user builds the process, the resulting dataset is displayed immediately after each step. As you import data from somewhere (e.g. a database), the result of the query is shown. Then, as you join it with another dataset, the joined result is shown. In this way, you see exactly how your data will look after every step. You end up focusing on the data rather than on the process, which lets you stay within the business context.
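One way to picture this step-by-step feedback is a runner that hands back the intermediate dataset after every step. The sketch below is an assumption about how such a preview could be modelled, not a description of PAW's internals; `run_with_preview` and the example steps are hypothetical.

```python
def run_with_preview(rows, steps):
    """Yield (step name, resulting rows) for the source and after every step."""
    yield "source", rows
    for step in steps:
        rows = step(rows)
        yield step.__name__, rows


def keep_billable(rows):
    return [r for r in rows if r["billable"]]


def add_cost(rows, rate=100):
    return [{**r, "cost": r["hours"] * rate} for r in rows]


source_rows = [
    {"hours": 2, "billable": True},
    {"hours": 1, "billable": False},
]

previews = list(run_with_preview(source_rows, [keep_billable, add_cost]))
for name, rows in previews:
    print(f"after {name}: {rows}")
```

Because every intermediate result is available, a mistake in one step shows up in the data right away rather than at the end of the run.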

Types of Steps

A process is made up of three types of steps.

  • Data sourcing step - each process starts with one step that sources the data. The source can be either external, such as a website, or internal, such as another dataset.
  • Data processing step - a process can contain one or more processing steps, such as filtering, summarizing, joining, or calculating fields. The data passes through a chain of these steps, with each step adding to or changing the sourced data.
  • Data output step - a process can have an optional data output step that sends the data out to a file. Often, data output can instead be treated as a data processing step; loading data into a database is one example.
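The three step types can be summarized in a small data structure. The sketch below is hypothetical and only mirrors the description above: the `Process` class, its field names, and the `write_csv` helper are not part of PAW, and an in-memory buffer stands in for the output file.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable, Optional

Rows = list[dict]


@dataclass
class Process:
    source: Callable[[], Rows]  # data sourcing step (exactly one)
    steps: list[Callable[[Rows], Rows]] = field(default_factory=list)  # processing steps
    output: Optional[Callable[[Rows], None]] = None  # optional data output step

    def run(self) -> Rows:
        rows = self.source()
        for step in self.steps:
            rows = step(rows)
        if self.output is not None:
            self.output(rows)
        return rows


buffer = io.StringIO()  # stands in for a file on disk


def write_csv(rows: Rows) -> None:
    if not rows:
        return  # nothing to write
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


process = Process(
    source=lambda: [{"item": "a", "qty": 2}, {"item": "b", "qty": 0}],
    steps=[lambda rows: [r for r in rows if r["qty"] > 0]],
    output=write_csv,
)
final_rows = process.run()
```

Keeping the output step optional and separate from the processing chain matches the description above; a database load could equally be modelled as one more entry in `steps`.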