aggregateByKey

Class: matlab.compiler.mlspark.RDD
Namespace: matlab.compiler.mlspark

Aggregate the values of each key, using the given combine functions and a neutral “zero value”.

Syntax

result = aggregateByKey(obj,zeroValue,seqFunc,combFunc,numPartitions)

Description

result = aggregateByKey(obj,zeroValue,seqFunc,combFunc,numPartitions) aggregates the values of each key in obj. seqFunc folds each value of a key into a running result that starts from the neutral “zero value” specified by zeroValue, and combFunc merges the results produced by seqFunc. The input argument numPartitions is optional.
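The roles of zeroValue, seqFunc, and combFunc may be easier to see in a small local sketch that does not use Spark at all. The code below is not how the RDD class is implemented. It only mimics the usual Apache Spark behavior, under the assumption that this method follows it: within each partition every key starts from zeroValue and is updated with seqFunc, and the per-partition results are then merged with combFunc. The variables partitions, partial, and merged are illustrative names, not part of the API.

```matlab
% Local, Spark-free sketch of the aggregateByKey semantics (assumed, not the
% actual implementation). Each cell of "partitions" stands in for one RDD
% partition holding {key,value} pairs.
partitions = { {{'a',1},{'b',2}}, {{'a',3}} };
zeroValue  = 10;
seqFunc    = @(acc,v) acc + v;   % folds one value into a per-partition result
combFunc   = @(a,b) a + b;       % merges results from different partitions

merged = containers.Map('KeyType','char','ValueType','any');
for p = 1:numel(partitions)
    % Step 1: within one partition, start each key at zeroValue, apply seqFunc.
    partial = containers.Map('KeyType','char','ValueType','any');
    part = partitions{p};
    for k = 1:numel(part)
        key = part{k}{1};
        val = part{k}{2};
        if ~isKey(partial,key)
            partial(key) = zeroValue;
        end
        partial(key) = seqFunc(partial(key),val);
    end
    % Step 2: merge this partition's results into the overall result.
    keysInPart = keys(partial);
    for k = 1:numel(keysInPart)
        key = keysInPart{k};
        if isKey(merged,key)
            merged(key) = combFunc(merged(key),partial(key));
        else
            merged(key) = partial(key);
        end
    end
end

disp(merged('a'))   % 24: (10+1) from partition 1 merged with (10+3) from partition 2
disp(merged('b'))   % 12: (10+2) from partition 1 only
```

In this sketch the zero value is added once per key per partition, so a zeroValue that is not neutral for seqFunc (such as 10 for addition) makes the result depend on how the data is partitioned.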

Input Arguments

`obj` — Input RDD
`RDD` object

An input RDD, specified as an RDD object.

`zeroValue` — Neutral “zero value”
cell array of numbers

A neutral “zero value”, specified as a cell array of numbers.

Data Types: cell

`seqFunc` — Function to aggregate the values of each key
function handle

Function that aggregates the values of each key, specified as a function handle.

Data Types: function_handle

`combFunc` — Function to aggregate results of seqFunc
function handle

Function to aggregate results of seqFunc, specified as a function handle.

Data Types: function_handle

`numPartitions` — Number of partitions to create
scalar value

Number of partitions to create, specified as a scalar value. This argument is optional.

Data Types: double

Output Arguments

`result` — RDD containing elements aggregated by key
`RDD` object

An RDD containing elements aggregated by key, returned as an RDD object.

Examples

Aggregate the Values of Each Key

%% Connect to Spark
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

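%% aggregateByKey
x = sc.parallelize({'a','b','c','d'},4);   % distribute four keys over 4 partitions
y = x.map(@(x)({x,1}));                    % build {key,value} pairs: {'a',1}, {'b',1}, ...
% zeroValue is 10; seqFunc and combFunc both add their two inputs.
% Each key occurs once, so every result is 10 + 1 = 11.
z = y.aggregateByKey(10,@(x,y)(x+y),@(x,y)(x+y));
viewRes = z.collect()  % { {'d',11},{'a',11},{'b',11},{'c',11}}, element order can vary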

Version History

Introduced in R2016b
