Hi, over the last few days I have been working on a solution that combines MongoDB, Twitter4j, Spark Streaming and machine learning (k-means) using Scala. The project is built with sbt and continues the previous project on Cassandra, Spark Streaming and machine learning with Scala, so if you want sbt test to pass, you will need both a Cassandra server and a MongoDB server running on your local machine.
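If you have not used k-means before, here is a tiny sketch of the idea in plain Scala, with no Spark involved. It is not the code from the project, just an illustration: the object name KMeansSketch, the helper names and the sample points are all made up for this example, and the initial centroids are passed in by hand instead of being chosen randomly.

```scala
// Toy illustration of k-means (Lloyd's algorithm) using only the Scala
// standard library. All names here are hypothetical, not the project's code.
object KMeansSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the centroid closest to the given point.
  def closest(p: Point, centroids: Vector[Point]): Int =
    centroids.indices.minBy(i => squaredDistance(p, centroids(i)))

  // Component-wise mean of a non-empty group of points.
  def mean(points: Seq[Point]): Point =
    points.transpose.map(_.sum / points.size).toVector

  // Repeats the assign/update steps a fixed number of times.
  // A centroid that receives no points keeps its previous position.
  def kmeans(points: Vector[Point], initial: Vector[Point], iterations: Int): Vector[Point] =
    (1 to iterations).foldLeft(initial) { (centroids, _) =>
      val groups = points.groupBy(p => closest(p, centroids))
      centroids.indices
        .map(i => groups.get(i).map(mean).getOrElse(centroids(i)))
        .toVector
    }

  def main(args: Array[String]): Unit = {
    // Two obvious groups of 2D points (made-up sample data).
    val points = Vector(
      Vector(1.0, 1.0), Vector(1.0, 3.0),
      Vector(7.0, 7.0), Vector(9.0, 7.0)
    )
    val initial = Vector(Vector(0.0, 0.0), Vector(10.0, 10.0))
    // The centroids settle at (1.0, 2.0) and (8.0, 7.0).
    println(kmeans(points, initial, 5))
  }
}
```

In a streaming setting the points would be feature vectors computed from each tweet instead of hand-written coordinates, but the assign and update steps stay the same.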

The project is based on the Databricks reference app and the Stratio Spark-MongoDB library; basically I adapted only what was necessary to store JSON tweets in a Mongo instance using Stratio's library. I started with the Casbah library but found it unclear to use; the Stratio library is much easier. I also found that Stratio provides a Cassandra connector. It looks promising, so I will use it in the near future.

The next step is to integrate this project with a Kafka broker…

Have fun and be nice to people.



I have just gotten over a stomach flu, so forgive me if this post is not clear. I think the project is self-explanatory and you will be able to change the sources to fit your needs without a problem.


