Presentation: Jump Start into Apache® Spark™ and Databricks

These are the slides from the Jump Start into Apache Spark and Databricks webinar on February 10th, 2016. Apache Spark is a fast, easy-to-use, unified engine that lets you solve many data science and big data (and many not-so-big data) scenarios easily. Spark comes packaged with higher-level libraries, including support for SQL queries, streaming data, machine learning, and graph processing. We will leverage Databricks to quickly and easily demonstrate, visualize, and debug our code samples; the notebooks will be available for you to download. You can view the on-demand webinar Jump Start into Apache® Spark™…


Data Exploration with Databricks

Today, it was also featured on InsideBigData: Data Exploration with Databricks. Awesome! This Data Exploration on Databricks jump start video will show you how to go from data source to visualization in a few easy steps. Specifically, we will take semi-structured logs, extract and transform them, and then analyze and visualize the data using Spark SQL so we can quickly understand our data. For more information and other Spark notebooks, check out Selected Notebooks > Databricks Jump Start.
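To give a rough feel for the log-to-Spark SQL flow described above, here is a minimal sketch. It is not the notebook from the video: the log format (Apache Common Log Format), the sample lines, the column names, and the helper names `parse_log_line` and `status_counts_with_spark` are all assumptions made for illustration. The Spark part uses the `SparkSession` API from Spark 2.0 and later rather than the Spark 1.6 `SQLContext` of the time, and it only runs where `pyspark` is installed.

```python
import re
from typing import Optional

# Assumed log format: Apache Common Log Format (an illustrative choice,
# not necessarily the format used in the original notebook).
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+|-)$'
)

# Hypothetical sample data, including one malformed line.
SAMPLE_LINES = [
    '10.0.0.1 - - [10/Feb/2016:10:00:00 -0800] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [10/Feb/2016:10:00:01 -0800] "GET /missing HTTP/1.1" 404 -',
    'not a log line',
]


def parse_log_line(line: str) -> Optional[dict]:
    """Extract fields from one log line; return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    host, ts, method, path, status, size = match.groups()
    return {
        "host": host,
        "ts": ts,
        "method": method,
        "path": path,
        "status": int(status),
        # A "-" size means no body was returned; store it as 0.
        "size_bytes": 0 if size == "-" else int(size),
    }


def status_counts_with_spark(lines):
    """Load parsed lines into a DataFrame and count hits per status code.

    Requires pyspark; imported here so the parsing code above
    can be used without a Spark installation.
    """
    from pyspark.sql import Row, SparkSession

    spark = SparkSession.builder.appName("log-exploration-sketch").getOrCreate()
    parsed = (parse_log_line(line) for line in lines)
    rows = [Row(**record) for record in parsed if record is not None]
    df = spark.createDataFrame(rows)
    # Register the DataFrame so it can be queried with Spark SQL.
    df.createOrReplaceTempView("logs")
    return spark.sql(
        "SELECT status, COUNT(*) AS hits FROM logs GROUP BY status ORDER BY status"
    ).collect()
```

In a Databricks notebook a `spark` session is normally already available, and the result of a query like this can be charted directly from the cell output, which is where the visualization step comes in.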


Simplify Machine Learning on Spark with Databricks

As many data scientists and engineers can attest, the majority of the time is spent not on the models themselves but on the supporting infrastructure. Key issues include the ability to easily visualize, share, deploy, and schedule jobs. More disconcerting is the need for data engineers to re-implement the models developed by data scientists for production. With Databricks, data scientists and engineers can simplify these logistical issues and spend more of their time focusing on their data problems. Simplify Visualization: An important perspective for data scientists and engineers is the ability to quickly visualize the data and the model…


Notebook Gallery

Here are some of the notebooks created to showcase various Apache Spark use cases. These all use Databricks Community Edition, which you can get at Try Databricks. You can also access the source at https://github.com/dennyglee/databricks. Notebooks: JSON Support, GLM in SparkR, Window Functions, Random Forests, DataFrame API, ML Operations, Decision Trees, Statistical Functions, Data Import, Data Exploration, Quick Start Python, Quick Start Scala, Ad-Tech Example, Flight Delays, Genomics, Mobile Sample, Pop vs. Price LR, Pop vs. Price DF, Salesforce Leads, Spark 1.6 (Multiple), Spark 1.6.
