Developing Apache Spark Applications

Training for developers who want to build Big Data Applications using Apache Spark.


At a glance

General information

3-day practical training

Target group

Software Architects, Software Developers

Application examples

– Developing Big Data Applications based on Apache Spark

– Loading and analysing records from various sources and formats

Description

The training sessions are usually held in German. Please contact us if you are interested in training sessions in English.

This training course teaches the knowledge required to develop Big Data applications based on Apache Spark. Participants first learn how to use the Spark Shell to load and interactively analyse records from various sources and formats. Building on this, they develop a stand-alone Spark application that processes data as datasets and data frames, either locally or on a computing cluster. The training concludes with an introduction to Spark Streaming for processing data streams, GraphFrames for analysing graphs, and the MLlib machine learning library.

Prerequisites:

  • Basic Hadoop skills
  • Basic Linux skills (including command-line commands such as ls, cd, cp and su)
  • Good Java or Scala skills
  • Good SQL skills

Agenda

1. Apache Spark Basics (DEV 360)

  • Apache Spark features
  • Spark framework components
  • Case studies

 

2. Creating datasets

  • Defining data sources, structures and schemas
  • Working with datasets and data frames
  • Converting data frames to datasets

       Practice:

  • Loading data and creating datasets using reflection
  • Simple case study: word count with datasets (optional)

 

3. Operations for datasets

  • Basic operations on datasets
  • Caching datasets
  • User-defined functions (UDFs)
  • Partitioning datasets

       Practice:

  • Analysing SFPD data
  • Creating and applying UDFs
  • Analysing data using UDFs and queries

 

4. Developing a simple Apache Spark Application (DEV 361)

  • Spark Application lifecycle
  • Using SparkSession
  • Starting Spark Applications

       Practice:

  • Importing and configuring Application files
  • Building, deploying and starting Applications

 

5. Monitoring Apache Spark Applications

  • Logical and physical Spark execution plans
  • Spark Web UI for monitoring Spark Applications
  • Debugging and tuning Spark Applications

       Practice:

  • Using Spark UI
  • Interpreting Spark system properties

 

6. Creating Apache Spark streaming Applications (DEV 362)

  • Introduction to the Spark streaming architecture
  • Developing Spark Structured Streaming Applications
  • Applying operations to streaming data frames
  • Developing your own window functions

       Practice:

  • Loading and analysing data using the Spark Shell
  • Spark streaming in the Spark Shell
  • Building and running a streaming application with SQL
  • Building and running a streaming application with window functions and SQL

 

7. Using Apache Spark GraphFrames

  • Introduction to GraphFrames
  • Defining regular, directed and property graphs
  • Creating property graphs
  • Performing operations on graphs

       Practice:

  • Graph analysis with GraphFrames

 

8. Using Apache Spark MLlib

  • Introduction to Apache Spark MLlib (Machine Learning Library)
  • Collaborative filtering for predicting user preferences

       Practice:

  • Data analysis using the Spark Shell
  • Developing a Spark application for film recommendations
  • Analysing simple flight data with decision trees

Typical questions we answer:

  • What is Spark and for what purposes is it suitable?
  • How do I implement an ETL pipeline based on Spark?
  • How can I debug a Spark job and identify performance bottlenecks?
  • How can I reduce the runtime of my Spark job?
  • How do I develop streaming applications with Spark Structured Streaming?
  • How do I develop machine learning applications based on Spark ML?

€2,100.00 (per person, plus VAT)

  • Signed certificate
  • In-house training
  • Customization possible (agenda, tech stack, language, etc.)
  • Small training groups

Why inovex Academy?

Our offer

The inovex Academy has set itself the task of passing on knowledge about methods and technologies that we already use successfully in our projects.

Curated content

Our trainers create a customized training offer based on your requirements.

Customizable tech stack

In exclusive trainings, we can tailor the training content to your tech stack.

Individual assistance

If needed, we can tailor the training to a specific use case of your company and work directly based on your data.

Trainers

Our trainers are field-tested experts in their areas of expertise. Through their project work, they expand their knowledge day by day and pass on this know-how in their trainings in an application-oriented, practical way.

Marcel Spitzer

Google Cloud Certified Data Engineer Badge
Databricks Machine Learning Practitioner Associate Certificate
Databricks Developer Associate Certification
Marcel Spitzer is a Data Engineer at inovex. He develops streaming and batch pipelines for data processing in distributed systems and uses machine learning to make data products smart.

Our training approach

From the needs analysis to the awarding of certificates, we offer customized training courses, flexibly designed and carried out according to your requirements.

If you are interested in in-house training, we will start by identifying your needs and discussing your objectives. This discussion forms the basis for an initial offer.

As soon as the key details have been clarified, our trainers start adapting the training content. Many of our training courses have a modular structure, so the agenda can be designed flexibly. Training courses that prepare for certifications are less flexible, but even here you can set the content focus according to your wishes.

You will receive all relevant information in advance of the training. The training will then take place in the room of your choice and at the agreed time. Our trainers will adapt to your requirements.

After completing the training, all participants receive a certificate confirming their participation. You will also have the opportunity to give us feedback on the content and the course. We are always happy to receive praise and suggestions for improvement.

Frequently Asked Questions

Will I receive a certification as a result of the training?
All participants will receive a certificate of participation from the inovex Academy after the training.

When does the training start?
Our trainings start at 09:00 Central European Time.

Will I get an invitation, and when?
The trainer sends out the invitations about one week before the start of the training. In addition to the agenda and the schedule, the invitation points out any required preparations (software installation, etc.) again.
Collin Rogowski
Head of inovex Academy

I look forward to your inquiry.

Collin Rogowski

We are your partner for successful training

We would be happy to talk to you personally about your concerns. Get in touch now!

  • Customized training courses for your company
  • Over 25 years of experience