GROUP

GROUP( recordset [, breakcriteria [, ALL ] ] [, LOCAL ] [, UNORDERED | ORDERED( bool ) ] [, STABLE | UNSTABLE ] [, PARALLEL [ ( numthreads ) ] ] [, ALGORITHM( name ) ] )

recordset      The set of records to fragment.
breakcriteria  Optional. A comma-delimited list of expressions or key fields in the recordset that specifies how to fragment the recordset. You may use the keyword RECORD to indicate all fields in the recordset, and/or you may use the keyword EXCEPT to list non-group fields in the structure. You may also use the keyword ROW to indicate each record in the recordset is a separate group. If omitted, the recordset is ungrouped from any previous grouping.
ALL            Optional. Indicates the breakcriteria is applied without regard to any previous order. If omitted, GROUP assumes the recordset is already sorted in breakcriteria order.
LOCAL          Optional. Specifies the operation is performed on each supercomputer node independently, without requiring interaction with all other nodes to acquire data; the operation maintains the distribution of any previous DISTRIBUTE.
UNORDERED      Optional. Specifies the output record order is not significant.
ORDERED        Specifies the significance of the output record order.
bool           When False, specifies the output record order is not significant. When True, specifies the default output record order.
STABLE         Optional. Specifies the input record order is significant.
UNSTABLE       Optional. Specifies the input record order is not significant.
PARALLEL       Optional. Try to evaluate this activity in parallel.
numthreads     Optional. Try to evaluate this activity using numthreads threads.
ALGORITHM      Optional. Override the algorithm used for this activity.
name           The algorithm to use for this activity. Must be from the list of supported algorithms for the SORT function's STABLE and UNSTABLE options.
Return:        GROUP returns a record set.
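
The breakcriteria forms described above can be sketched as follows. This is a minimal illustration, not a definitive usage pattern: the record layout, the data, and all definition names (rec, ds, byStateCity, byAllButUid, byRow) are assumptions made for the example. ALL is used so that no prior SORT is needed.

```ecl
// Hypothetical layout and data, invented for this sketch.
rec := RECORD
  STRING2   state;
  STRING20  city;
  UNSIGNED4 uid;
END;
ds := DATASET([{'FL','Miami',1},
               {'FL','Miami',2},
               {'FL','Tampa',3},
               {'MA','Boston',4}], rec);

// Expression list: one group per distinct state/city pair.
byStateCity := GROUP(ds, state, city, ALL);

// RECORD with EXCEPT: group on every field other than uid,
// which for this layout means grouping on state and city.
byAllButUid := GROUP(ds, RECORD, EXCEPT uid, ALL);

// ROW: each record forms its own group.
byRow := GROUP(ds, ROW);

OUTPUT(byStateCity);
OUTPUT(byAllButUid);
OUTPUT(byRow);
```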

The GROUP function fragments a recordset into a set of sets. This allows aggregations and other operations (such as ITERATE, DEDUP, ROLLUP, SORT, and others) to occur within defined subsets of the data; the operation executes on each subset individually. As a result, for functions that take a TRANSFORM, the boundary-condition code written in that TRANSFORM differs from what it would be for a recordset that has simply been SORTed.
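
To illustrate how an operation restarts at each group boundary, the sketch below runs the same ITERATE over a grouped recordset and over one that is only sorted. The layout, the data, and the names (rec, ds, numberRows, sortedDs, groupedDs, perGroup, overall) are assumptions for this example only.

```ecl
// Hypothetical layout and data, invented for this sketch.
rec := RECORD
  STRING10  last_name;
  STRING10  first_name;
  UNSIGNED4 seq;
END;
ds := DATASET([{'Jones','James',0},
               {'Jones','Sally',0},
               {'Gomez','Jose',0}], rec);

// Number each record from the previous record's seq.
// For the first record processed, LEFT is a blank record, so L.seq is 0.
rec numberRows(rec L, rec R) := TRANSFORM
  SELF.seq := L.seq + 1;
  SELF := R;
END;

sortedDs  := SORT(ds, last_name, first_name);
groupedDs := GROUP(sortedDs, last_name);

// Grouped input: ITERATE runs on each group separately,
// so the numbering starts again at 1 in every last_name group.
perGroup := ITERATE(groupedDs, numberRows(LEFT, RIGHT));

// Sorted-only input: the numbering runs across the whole recordset.
overall := ITERATE(sortedDs, numberRows(LEFT, RIGHT));

OUTPUT(perGroup);
OUTPUT(overall);
```

With sorted-only input, restarting the count per last name would need an explicit comparison of L.last_name and R.last_name inside the TRANSFORM; grouping makes that boundary test unnecessary.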

The recordset must be sorted by the same elements as the breakcriteria if the ALL option is not specified.
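
A minimal sketch of this requirement, using an assumed layout and data (rec, ds, viaSort, viaAll, and adjacentOnly are names invented for the example):

```ecl
// Hypothetical layout and deliberately unsorted data.
rec := RECORD
  STRING10 last_name;
  STRING10 first_name;
END;
ds := DATASET([{'Jones','James'},
               {'Gomez','Jose'},
               {'Jones','Sally'}], rec);

// Without ALL: sort first so equal last_name values are adjacent.
viaSort := GROUP(SORT(ds, last_name), last_name);

// With ALL: the breakcriteria is applied regardless of incoming order.
viaAll := GROUP(ds, last_name, ALL);

// Without ALL on unsorted input, GROUP assumes the order is already
// correct, so the two Jones records, which are not adjacent here,
// should not be expected to end up in the same group.
adjacentOnly := GROUP(ds, last_name);

OUTPUT(viaSort);
OUTPUT(viaAll);
OUTPUT(adjacentOnly);
```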

A grouped recordset is 'ungrouped' when it is used in a TABLE function, by the JOIN function in some circumstances (see JOIN), by UNGROUP, or by another GROUP function with the second parameter omitted.
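
The two explicit ways of removing a grouping can be sketched as below. The layout, the data, and the definition names are assumptions for illustration.

```ecl
// Hypothetical layout and data, invented for this sketch.
rec := RECORD
  STRING10 last_name;
  STRING10 first_name;
END;
ds := DATASET([{'Jones','Sally'},
               {'Gomez','Jose'},
               {'Jones','James'}], rec);

groupedDs := GROUP(ds, last_name, ALL);

// Either form returns a recordset with no grouping.
flatA := UNGROUP(groupedDs);
flatB := GROUP(groupedDs);   // second parameter omitted

// A SORT on the grouped set orders records within each group;
// the same SORT on the ungrouped set orders the whole recordset.
withinGroups := SORT(groupedDs, first_name);
acrossAll    := SORT(flatA, first_name);

OUTPUT(flatB);
OUTPUT(withinGroups);
OUTPUT(acrossAll);
```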

Example:

personRecord := RECORD
  STRING UID;
  STRING first_name;
  STRING last_name;
  STRING address;
  STRING city;
  STRING state;
  STRING zip;
END;
person := DATASET([{'923','James','Jones','123 Elm Street','Hollywood','FL','33022'},
                   {'924','Sally','Jones','22 Main Street','Tampa','FL','33604'},
                   {'925','Jose','Gomez','111 Biscaya Lane','Miami','FL','33101'},
                   {'926','Adam','Wesson','77 Sunset Blvd','Boston','MA','02108'},
                   {'927','Evelyn','Murray','740 SW 10th Street','Boston','MA','02116'},
                   {'928','Joe','Yung','7511 Simson Avenue','Chicago','IL','60131'}], personRecord);

SortedSet := SORT(person,last_name); //sort by last name
GroupedSet := GROUP(SortedSet,last_name);  //then group them
OUTPUT(GroupedSet);

SecondSort := SORT(GroupedSet,first_name);
   //sorts by first name within each last name group
   // this is a "sort within group"
OUTPUT(SecondSort);

UnGroupedSet := GROUP(GroupedSet); //ungroup the dataset
OUTPUT(UnGroupedSet);
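
The LOCAL option is commonly combined with DISTRIBUTE so that each node groups only its own records. The following is a sketch under that assumption; the layout, the data, and the names (rec, ds, distDs, localSort, localGroup) are invented for the example, and HASH32 is used here simply as one way to place equal last_name values on the same node.

```ecl
// Hypothetical layout and data, invented for this sketch.
rec := RECORD
  STRING10 last_name;
  STRING10 first_name;
END;
ds := DATASET([{'Jones','James'},
               {'Gomez','Jose'},
               {'Jones','Sally'},
               {'Yung','Joe'}], rec);

// Place all records with the same last_name on the same node.
distDs := DISTRIBUTE(ds, HASH32(last_name));

// Sort and group on each node independently; because of the
// distribution above, every last_name group is complete on its node.
localSort  := SORT(distDs, last_name, LOCAL);
localGroup := GROUP(localSort, last_name, LOCAL);

OUTPUT(localGroup);
```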

See Also: REGROUP, COMBINE, UNGROUP, EXCEPT