Simplify Machine Learning on Spark with Databricks

As many data scientists and engineers can attest, most of their time is spent not on the models themselves but on the supporting infrastructure.  Key issues include the ability to easily visualize, share, deploy, and schedule jobs.  More disconcerting is the need for data engineers to re-implement, for production, the models that data scientists have developed.  With Databricks, data scientists and engineers can simplify these logistical issues and spend more of their time focusing on their data problems.

Simplify Visualization

Data scientists and engineers need to be able to quickly visualize both the data and the model generated from it.  For example, a common task when working with linear regression is to determine the model’s goodness of fit.  While statistical measures such as Mean Squared Error are fundamental, the ability to view the data scatterplot in relation to the regression line is just as important.
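
To make the goodness-of-fit idea concrete, here is a minimal sketch in plain Python. It is not Databricks or Spark code, and the function names and the five data points are invented for illustration. It fits a least-squares line, then computes the Mean Squared Error and the coefficient of determination (R squared) alongside the per-point residuals, which are what a scatterplot with the regression line drawn over it would show visually.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_squared_error(ys, preds):
    """Average of the squared differences between observed and predicted."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def r_squared(ys, preds):
    """Fraction of the variance in ys that the fitted line explains."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical sample data, made up for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
residuals = [y - p for y, p in zip(ys, preds)]

mse = mean_squared_error(ys, preds)
r2 = r_squared(ys, preds)

print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"MSE={mse:.4f} R^2={r2:.4f}")
for x, y, r in zip(xs, ys, residuals):
    print(f"x={x:.1f} y={y:.1f} residual={r:+.3f}")
```

A single number such as MSE summarizes the overall error, but it cannot show whether the residuals follow a pattern (for example, curving away from the line at the extremes). That is why plotting the points against the fitted line matters as much as the statistic.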

Click further to continue reading Simplifying Machine Learning on Spark with Databricks.
