About how to install the latest version of Cassandra on Ubuntu 16.04

Open a terminal and paste these commands. As of this writing, the latest version of Cassandra is 3.9. Check the DataStax page to see which version is the latest, or go to this link and look for the latest release_X.Y version. In this case I can see a release_3.9.conf file, so the latest version is 3.9. echo…

Playing with Word2Vec, my CV, Spark and Scala

Hi, I just have to play with and learn how to use the algorithm provided by spark-ml to do some feature extraction from text using Google's Word2Vec algorithm. I mean, why not use my actual CV? Before that, you will probably have to convert the PDF file to a text file. Actually I am working with…
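To give an idea of what this kind of feature extraction can look like, here is a minimal sketch in spark-shell style. It is not the code from the post: the three sample sentences are a made-up stand-in for a CV already converted to text, and the column names, vector size and seed are my own assumptions.

```scala
import org.apache.spark.ml.feature.{RegexTokenizer, Word2Vec}
import org.apache.spark.sql.SparkSession

// Local session for experimenting; in spark-shell this reuses the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("word2vec-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the lines of a CV converted from PDF to text.
val cvLines = Seq(
  "Scala developer working with Spark and Cassandra",
  "Built machine learning pipelines with Spark ML",
  "Experience with Scala, Spark and distributed systems"
).toDF("text")

// Split each line into lowercase words; Word2Vec expects an array-of-strings column.
val tokenizer = new RegexTokenizer()
  .setInputCol("text")
  .setOutputCol("words")
  .setPattern("\\W+")
val words = tokenizer.transform(cvLines)

// Learn one vector per word; each line's feature is the average of its word vectors.
val word2Vec = new Word2Vec()
  .setInputCol("words")
  .setOutputCol("features")
  .setVectorSize(16) // assumed size, small because the corpus is tiny
  .setMinCount(1)    // keep every word, even those seen only once
  .setSeed(42L)
val model = word2Vec.fit(words)

val features = model.transform(words)
features.select("text", "features").show(false)

// Words whose vectors are closest to "spark" in this toy vocabulary.
model.findSynonyms("spark", 3).show()
```

With a corpus this small the vectors are not meaningful; the point is only the shape of the API: tokenize, fit, transform.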

About how to parallelize multiple machine learning algorithms using a pipeline with Spark.

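You basically need to make a Pipeline and build a ParamGrid with different algorithms as stages. Here is a simple example:

    import org.apache.spark.ml.{Pipeline, PipelineStage}
    import org.apache.spark.ml.classification.{DecisionTreeClassifier, LogisticRegression}
    import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

    // The two candidate algorithms, reading the same columns
    val dt = new DecisionTreeClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
    val lr = new LogisticRegression()
      .setLabelCol("label")
      .setFeaturesCol("features")

    // Empty pipeline: its stages are supplied by the grid below
    val pipeline = new Pipeline()

    // One grid entry per algorithm; build() returns the Array[ParamMap] that CrossValidator expects
    val paramGrid = new ParamGridBuilder()
      .addGrid(pipeline.stages, Array(Array[PipelineStage](dt), Array[PipelineStage](lr)))
      .build()

    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEstimatorParamMaps(paramGrid)…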
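Since the excerpt above is cut off, here is a rough end-to-end sketch of the same idea in spark-shell style. The toy dataset, the BinaryClassificationEvaluator, the number of folds and the setParallelism(2) call (available from Spark 2.3) are my assumptions, not something taken from the post.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.{DecisionTreeClassifier, LogisticRegression}
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("model-selection-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical toy data: the label is 1.0 for the upper half of the first feature.
val training = (0 until 100).map { i =>
  (if (i >= 50) 1.0 else 0.0, Vectors.dense(i.toDouble, (i % 3).toDouble))
}.toDF("label", "features")

val dt = new DecisionTreeClassifier().setLabelCol("label").setFeaturesCol("features")
val lr = new LogisticRegression().setLabelCol("label").setFeaturesCol("features")

// The pipeline has no fixed stages; each grid entry plugs in one algorithm.
val pipeline = new Pipeline()
val paramGrid = new ParamGridBuilder()
  .addGrid(pipeline.stages, Array(Array[PipelineStage](dt), Array[PipelineStage](lr)))
  .build()

val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEstimatorParamMaps(paramGrid)
  .setEvaluator(new BinaryClassificationEvaluator()) // assumed metric: area under ROC
  .setNumFolds(3)
  .setParallelism(2) // assumed: evaluate up to two candidates at the same time

val cvModel = cv.fit(training)

// avgMetrics is aligned with paramGrid, so the two can be zipped to compare algorithms.
val names = paramGrid.map(pm => pm(pipeline.stages).head.getClass.getSimpleName)
names.zip(cvModel.avgMetrics).foreach { case (name, metric) =>
  println(s"$name: $metric")
}
```

After fitting, cvModel.bestModel holds the pipeline refit on the whole dataset with the winning algorithm.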