To cross-validate over different algorithms, create a Pipeline and build a ParamGrid over its stages parameter, with each algorithm as an alternative stage. Here is a simple example:


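import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.{DecisionTreeClassifier, LogisticRegression}
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Candidate algorithms; both read the same label and feature columns.
val dt = new DecisionTreeClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")

val lr = new LogisticRegression()
  .setLabelCol("label")
  .setFeaturesCol("features")

// Stages are left unset here; the grid below supplies them.
val pipeline = new Pipeline()

// Each grid value is a complete stages array, so each algorithm becomes one candidate.
val paramGrid = new ParamGridBuilder()
  .addGrid(pipeline.stages, Array(Array[PipelineStage](dt), Array[PipelineStage](lr)))
  .build() // setEstimatorParamMaps expects Array[ParamMap], not the builder

// The evaluator choice is an assumption (the original snippet had none);
// CrossValidator needs one to score the candidates.
val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEstimatorParamMaps(paramGrid)
  .setEvaluator(new MulticlassClassificationEvaluator().setLabelCol("label"))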
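
To see why a grid over the stages parameter gives one candidate per algorithm, it may help to look at a toy model of grid expansion. The sketch below is plain Scala and is not Spark's implementation: GridSketch, expand, and hypotheticalParam are made-up names, and strings stand in for real stages and parameter values. It only assumes that the grid is expanded as a cartesian product of the values added to it.

```scala
// Toy model of param-grid expansion; plain Scala, not Spark's implementation.
object GridSketch {
  // A candidate is one concrete value per parameter name.
  type Candidate = Map[String, String]

  // Cartesian product: every combination of values becomes one candidate.
  def expand(grid: Seq[(String, Seq[String])]): Seq[Candidate] =
    grid.foldLeft(Seq(Map.empty[String, String])) { case (acc, (name, values)) =>
      for (candidate <- acc; value <- values) yield candidate + (name -> value)
    }

  // Only a "stages" axis, as in the example above: one candidate per algorithm.
  val byAlgorithm: Seq[Candidate] =
    expand(Seq("stages" -> Seq("dt", "lr")))

  // A second (hypothetical) axis multiplies the candidates: 2 x 3 combinations.
  val withExtra: Seq[Candidate] =
    expand(Seq(
      "stages" -> Seq("dt", "lr"),
      "hypotheticalParam" -> Seq("a", "b", "c")
    ))
}
```

In this toy model every extra axis is paired with every stages value, so an axis meant for one algorithm also multiplies the candidates for the other. That is worth keeping in mind before adding per-algorithm tuning values to the same grid.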

More info at https://issues.apache.org/jira/browse/SPARK-19357
Thank you, Bryan Cutler.
https://bryancutler.github.io/cv-pipelines/
