A proposal, with a script, to defend ourselves against modified SSH public keys.
About capturing log data in a distributed system
Recently I took a test for a company that asked for two apparently very simple exercises, which turn out to contain a great deal of complexity once you look at them closely. So I'm going to analyse how I saw the problem and how I would deal with it. I give…
About how to parallelize multiple machine learning algorithms using a pipeline with Spark.
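You basically need to make a Pipeline and build a ParamGrid with the different algorithms as stages. Here is a simple example:

    import org.apache.spark.ml.{Pipeline, PipelineStage}
    import org.apache.spark.ml.classification.{DecisionTreeClassifier, LogisticRegression}
    import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

    // The two candidate algorithms, both reading the same columns
    val dt = new DecisionTreeClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
    val lr = new LogisticRegression()
      .setLabelCol("label")
      .setFeaturesCol("features")

    // The stages are left unset here; the grid supplies them
    val pipeline = new Pipeline()

    // Each grid entry is a complete stage list, so every algorithm becomes one candidate
    val paramGrid = new ParamGridBuilder()
      .addGrid(pipeline.stages, Array(Array[PipelineStage](dt), Array[PipelineStage](lr)))
      .build() // setEstimatorParamMaps expects an Array[ParamMap]

    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEstimatorParamMaps(paramGrid)…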