Run Different Processes in Parallel and Gather the Results

With a Scatter-Gather card, you can send a message through more than one set of cards simultaneously, collect the results from each set, and pass those results (along with the original message) to the next card in a flow.

Figure 1. This schematic diagram shows a Scatter-Gather card that includes the minimum of two routes.

1	Any card can pass messages to a Scatter-Gather card. The card that passes messages can be the trigger of the current flow or any other card in the flow.
2	The Scatter-Gather card is displayed in a larger rectangle that is called the scope of the card. Within this scope are two or more sets of cards, each set called a route. The Scatter-Gather card processes all routes at the same time, and the routes are all independent of each other. Each route can have one or more cards.
3	When all routes are finished processing data, the results from the routes are included in the payload of the original message. The revised message is passed to the next card in the flow.

Messages that move through a flow have this general structure:

payload
attributes
vars

After a message leaves a Scatter-Gather card that contains two routes, the message has a new subpayload from each route:

payload
  0
    payload
  1
    payload
attributes
vars

Example of Aggregating Contacts from Two Salesforce Organizations

In this example, an HTTP Listener triggers the flow when it receives a request. The Scatter-Gather card initiates two routes that retrieve the lists of contacts for two Salesforce organizations. The Scatter-Gather card combines the payload from each instance of Anypoint Connector for Salesforce (Salesforce Connector) into a single payload. This payload is stored in a variable that concatenates the results of the two queries. Finally, the fields in the concatenated results are mapped to a data type that creates a CSV file.

Figure 2. This is the flow that creates the CSV file aggregating the contacts from the two Salesforce organizations.

1	An HTTP Listener card for triggering the flow.
2	A Scatter-Gather card with two routes, each route consisting of an instance of Salesforce Connector. The instances query different Salesforce organizations.
3	A Set Variable card for storing the concatenated query results from the Scatter-Gather card in a variable.
4	A Transform card for mapping the fields in the variable to the fields in a CSV file.

First, you create the custom data type like this:

The dialog for creating the custom data type

Figure 3. The Edit Custom Type dialog lets you create the custom data type.

1	The Format field is set to CSV.
2	The Type field is set to From schema.
3	The schema is this:

#%RAML 1.0 DataType
type: array
items:
  properties:
    Name:
      type: string
    ID:
      type: string
    Email:
      type: string

Next, you add the HTTP Listener card, which listens at the path that you specify in the Path field. The name of the path is given to the name of the CSV file that is downloaded to your computer:

Figure 4. The Path field is in the General settings.

Although this example uses an HTTP Listener card, you can send data to a Scatter-Gather card from any card.

Next, you add the Scatter-Gather card and place an instance of Salesforce Connector in each of two routes. The instances of Salesforce Connector use different configurations because they connect to different Salesforce organizations and therefore require different connection information.

You specify this query in the Salesforce Query field in each instance:

Query and location of the Salesforce Query field

Figure 5. The Salesforce Query field is in the General settings.

The message that leaves the Scatter-Gather card has these parts:

payload
  0
    payload
      Name
      ID
      Email
  1
    payload
      Name
      ID
      Email
attributes
vars

The payload from the Scatter-Gather card consists of two subpayloads, one payload for each route in the card. The subpayload under 0 is the payload from the first route in the card, and the subpayload under 1 is the payload from the second route.

At this point in the flow, you can combine the two subpayloads into a CSV file with a DataWeave script. However, if you would rather not write a DataWeave script, you can concatenate the two subpayloads into a single payload that you store in a variable, and then map the fields in the single payload to the fields in a CSV file.

The Set Variable card includes 1) a name for the variable and 2) an expression in the Value field. The expression (at 3) for concatenating the two subpayloads is this:

payload["0"]["payload"] ++ (payload["1"]["payload"])

You can create this expression by dragging the Custom Expression pill from the right side of the card into the Value field, clicking the arrow in the pill, and clicking Expand. Then, drag the first Payload pill into the text box, add the two plus signs, and then drag the second Payload pill into the text field.

Figure 6. The Set Variable card looks like this.

1	The name specified for the variable
2	The pill created by dragging Custom Expression into the Value field
3	The expression that concatenates the query results

The message that the Transform card receives from the Set Variable card contains these parts:

payload
  0
    payload
      Name
      ID
      Email
  1
    payload
      Name
      ID
      Email
attributes
vars
  SG_payload
    Name
    ID
    Email

The fields that are in the variable SG_payload can be mapped to the fields of a custom data type that uses the CSV format:

Mapping the fields of the variable to the output

Figure 7. You can map the fields of the variable to the fields of the custom data type.

1	The location of the variable in the Input pane
2	The mappings from the fields in the variable to the fields in the custom data type

You drag the Name field from the input pane to the Name field in the output pane. You do the same for the ID and Email fields.

The result looks like this, with the actual content differing according to the Salesforce organizations that are queried. The first twelve contacts are from organization A, their IDs all beginning with 0036g000009C. The remaining contacts are from organization B, their IDs beginning with 0028t000001J:

Name,ID,Email
Rose Gonzalez,0036g000009C4WwAAK,rose@edge.com
Sean Forbes,0036g000009C4WxAAK,sean@edge.com
Lauren Boyle,0036g000009C4X4AAK,lboyle@uog.com
Babara Levy,0036g000009C4X5AAK,b.levy@expressl.net
Josh Davis,0036g000009C4X6AAK,j.davis@expressl.net
Jane Grey,0036g000009C4X7AAK,jane_gray@uoa.edu
Arthur Song,0036g000009C4X8AAK,asong@uog.com
Ashley James,0036g000009C4X9AAK,ajames@uog.com
Tom Ripley,0036g000009C4XAAA0,tripley@uog.com
Liz D'Cruz,0036g000009C4XBAA0,ldcruz@uog.com
Edna Frank,0036g000009C4XCAA0,efrank@genepoint.com
Avi Green,0036g000009C4XDAA0,agreen@uog.com
Alden Macfarlane,0028t000001J3BAB2,a.macfarlane@trundle.org
Tiffani Davalos,0028t000001J3FAH1,t.davalos@trundle.org
Elly Fenley,0028t000001J7JUp9,efenley@tripple.com
Richie Fore,0028t000001J3HKJ0,rfore@tripple.com
Ricarda Cieslak,0028t000001J3WLl9,rcieslak@tripple.com
Rosaria Jorgensen,0028t000001J3BNM0,rosaria.jorgensen@huttrick.com
Treasa Barbeau,0028t000001J3BNk3,treasa.barbeau@huttrick.com
Dwayne Sgro,0028t000001J3EPO2,dsgro@sgroflo.com
Maude Wilber,0028t000001J4UUy4,mwilber@zhipe.com
Lashell Rouleau,0028t000001J3NIK6,lrouleau@zhipe.com
Agnes Diehm,0028t000001J3TCX8,agnes.diehm@aipail.com
Pamella Fite,0028t000001J1LOQ3,pamella.fite@aipail.com
Kennith June,0028t000001J3POI0,kennith.june@aipail.com
Keiko Woods,0028t000001J3SIL1,k.woods@fiephdom.org
Natashia Horvat,0028t000001J3KSV0,n.horvat@fiephdom.org
Margarita Ptak,0028t000001J3VVe5,ptakm@greelo.com
Delena Hodgdon,0028t000001J3OSQ2,hodgdond@greelo.com
Britt Lafferty,0028t000001J6HUT7,britt.lafferty@redooler.com
Irving Bates,0028t000001J3IUe6,irving.bates@redooler.com
Merle Cater,0028t000001J3PPl0,merle.cater@redooler.com

Saving Message Data into Variables

If you want a route to act on only part of a message, you can store what you need for the route in a variable before the message reaches the Scatter-Gather card or at the beginning of the route.

The following configuration is useful if your variables need to be used in more than one route:

Saving data for use by more than one route

Figure 8. You can save data in one or more variables that one or more routes can use.

In this case, the first card in each route receives the same message, which might have these parts if you create two variables before the Scatter-Gather card:

payload
attributes
vars
  var1
  var2

You can use both variables in both routes, if you need to.

The following configuration is useful if your variables need to be used only in single routes:

Figure 9. You can save data in a variable that is used by only one route.

Suppose that you create var1 at the start of the first route, and you create var2 at the start of the second route. Then, the second card in the first route receives a message that has these parts:

payload
attributes
vars
  var1

And the second card in the second route receives a message that has these parts:

payload
attributes
vars
  var2

In both of the two previous diagrams, the final card could be a Transform card, which might require you to know how to write scripts in DataWeave, or another Set Variable card that, as in the example about aggregating contacts, combines somehow the results from the routes in the Scatter-Gather card and then passes the variable to another card, such as a Transform card.

Differences Between Parallel For-Each, For-Each, and Scatter-Gather

For-Each iterates through a collection of elements, processing one element at a time through a single route.

Parallel For-Each iterates through a collection of elements n times (where n is less than or equal to the number of elements in the collection) with a single process. For example, if there are 10 elements and the maximum number of processes is set to 2, a Parallel For-Each card processes two elements at a time, five times.

Scatter Gather runs two or more parallel processes, called routes. When all routes are finished, the results from each route are gathered into a new payload.

Fields in a Scatter-Gather Card

Timeout: The maximum length of time to wait, in milliseconds, for the results from each of the parallel processes to be gathered into a new payload and sent to the next card. A value of 0 or lower means that there is no timeout. The default value is 9,223,372,036,854,776,000 milliseconds, which is the maximum value for a 64-bit float.
Max concurrency: The maximum number of routes to process in parallel. By default, all of the routes that you create in an instance of the Scatter-Gather card run in parallel.

The default value that appears in the card is 2,147,483,647. You do not need to worry about why this is the default. Just understand that, if you don’t set the value to the number of routes in the scope of your Scatter-Gather card, the effective value is in fact the number of routes in the scope.

If you set this value to 1, Scatter Gather takes the first element in the payload, evaluates it against the condition set up for each route, and sends the element through the route for which it meets the condition. When a route is done processing an element, Scatter Gather evaluates the next element against the routing conditions, continuing until the elements in the payload have all been processed.

When you configure a Scatter-Gather card, you may, but are not required to, specify values in these fields:

Target: Specifies a variable to use for storing the gathered results. By default, the output is saved in the flow’s payload.
Target Value: Specifies an expression to evaluate against the operation’s output value. The outcome of this expression is stored in the target variable. By default, this is the same as the operation’s output value.

Run Different Processes in Parallel and Gather the Results

Example of Aggregating Contacts from Two Salesforce Organizations

Saving Message Data into Variables

Differences Between Parallel For-Each, For-Each, and Scatter-Gather

Fields in a Scatter-Gather Card

See Also