Introduction to Python for Data Analysis

A 4-Day Remote Seminar Taught by Jason Anastasopoulos, Ph.D.

 

NOTE: this course is designed for those who have no previous experience with Python. If you have some previous experience with Python, you may want to consider Statistical Computing with Python.

Python is a premier language for modern data science and data analysis. It is a free, open-source language that has a simple, easy-to-understand syntax and an incredible range of data analysis and visualization libraries. In four days, this seminar provides a comprehensive introduction to Python. The goal is to get participants to fully understand many of the basic elements of Python and immediately apply them to practical data analysis and data collection problems.

Starting May 10, we are offering this seminar as a 4-day synchronous*, remote workshop. Each day will consist of a 3-hour live lecture held via the free video-conferencing software Zoom. Participants are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if they are unable to attend at the scheduled time.

Each day will include a hands-on exercise to be completed on your own after the lecture session. Additional lab sessions will be held on Tuesday and Thursday afternoons, where you can review the exercise results with the instructor and ask questions.

*We understand that scheduling is difficult during this unpredictable time. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.

Closed captioning is available for all live and recorded sessions.

More Details About the Course Content

Python is rapidly becoming the preferred language of data scientists in both industry and academia. It’s used by Google, Facebook and other tech giants to perform data analysis and run machine learning algorithms that can handle hundreds of thousands of terabytes of data per day.

Python can be used for:

  • Storing and analyzing large and small datasets.
  • Web scraping and data collection using APIs.
  • Beautiful data visualization.
  • Natural language processing and text analysis.
  • General machine learning.
  • Deep learning.
  • Image analysis and much, much more…

How you will benefit from this seminar:

This seminar is a foundational course in Python. The goal is to get participants to fully understand many of the basic elements of Python and immediately apply them to practical data analysis problems.

By the end of this seminar you will be able to:

  • Program using Python (Jupyter) notebooks and IDEs.
  • Understand and use basic data analysis and visualization libraries such as NumPy, Pandas, Matplotlib, Seaborn and statsmodels, among others.
  • Use the basic building blocks needed for data analysis: variables, lists, loops, dictionaries, Boolean operators and functions.
  • Perform data analysis and basic statistical inference: GLMs, ANOVA, hypothesis testing.
  • Produce beautiful data visualizations.

Computing

This is a hands-on class that will involve at least two hours of structured and supervised assignments. To ensure that you are prepared, you must do the following BEFORE the first class:

➔ Download and Install Anaconda Python 3.7+ Individual Edition for your operating system: https://www.anaconda.com/products/individual

➔ Familiarize yourself with Google Colaboratory Python Notebooks: https://colab.research.google.com/notebooks/intro.ipynb

You should also know how to access the command prompt (Windows users) or the terminal (Mac users). We will briefly review how to access these in class, but it will save you time and effort if you come already knowing these basics. Many free online resources can help you get started with the Windows Command Prompt or the Mac Terminal.

Materials

Participants receive access to a private repository containing all of the lecture notes, code and data needed for the class.

Participants interested in getting a jump start on some of the material should consider reading the free book “Python for Everybody” by Dr. Charles R. Severance. This book is not required but is recommended as optional reading and as a useful reference.

Who Should Register?

This seminar is designed for anyone who wants to quickly and efficiently obtain a solid foundation in the Python language that will allow them to begin using the language for their research, data analysis or visualization needs.

This seminar does not assume any previous programming experience. However, those at an intermediate or advanced level in other packages or languages can also benefit greatly from this course.

Seminar Outline

Introduction to Python

1. Getting started with Python:

  • Why Python?
  • Introduction to Anaconda Python.
  • Introduction to Python (Jupyter) notebooks.
  • Overview of basic libraries used: NumPy, Pandas, Matplotlib, SciPy, statsmodels.

2. Python basics and data structures:

  • Variables: numbers, string values, using variables.
  • Lists and loops: list basics, simple loops, Pythonic loops.
  • Logical statements in Python.
  • Using and creating dictionaries.
  • Creating functions.

3. Python basics assignment solutions and review.
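
As a rough preview of the topics in this part of the outline, here is a minimal sketch that touches variables, lists, loops, logical statements, dictionaries and functions. The names and values (scores, grades, letter_grade) are made up for illustration and are not taken from the course materials.

```python
# Variables: a number and a string (hypothetical example values).
n_days = 4
course = "Introduction to Python"

# A list and a simple loop that accumulates a total.
scores = [72, 88, 95, 61]
total = 0
for s in scores:
    total += s

# More Pythonic versions: built-in functions and a list comprehension.
mean_score = sum(scores) / len(scores)
passed = [s for s in scores if s >= 70]  # logical test inside the comprehension

# A dictionary maps keys (names) to values (scores).
grades = {"Ana": 88, "Ben": 61}


def letter_grade(score):
    """Return 'pass' for scores of 70 or above, otherwise 'fail'."""
    return "pass" if score >= 70 else "fail"


# Apply the function to every entry in the dictionary.
results = {name: letter_grade(score) for name, score in grades.items()}
print(results)
```

The explicit loop and the call to sum() compute the same total; the second form is what is usually meant by a "Pythonic" loop.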

Data Analysis and Manipulation

1. Data analysis and statistical inference:

  • Handling arrays with Pandas and NumPy.
  • Basic data analysis:
    A. Summary statistics: mean, median, mode, variance and standard deviation.
    B. Hypothesis testing: t-tests, confidence intervals.

2. Data analysis and manipulation assignment solutions and review.
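
To give a sense of the summary statistics and t-tests listed above, the sketch below uses only Python's built-in statistics module so that it runs without extra installs. This is an assumption made for portability: the seminar itself works with NumPy and Pandas, and the data here are invented for illustration.

```python
import math
import statistics

# Hypothetical measurements for two groups (illustrative only).
group_a = [2, 4, 4, 5, 7]
group_b = [1, 2, 2, 3, 4]


def summarize(values):
    """Return common summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "sd": statistics.stdev(values),           # sample standard deviation
    }


def welch_t(x, y):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se


summary_a = summarize(group_a)
t_stat = welch_t(group_a, group_b)
print(summary_a)
print(t_stat)
```

Turning the t statistic into a p-value or a confidence interval requires the t distribution, which the standard library does not provide. With SciPy installed, scipy.stats.ttest_ind(group_a, group_b, equal_var=False) performs the same Welch test and reports the p-value.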

Statistical Inference and Data Visualization

1. Statistical inference:

  • Linear regression, logistic regression, generalized linear models.

2. Data visualization:

  • Basic plots: scatterplots, line plots, heatmaps.
  • Distributions: densities, box plots, histograms.

3. Statistical inference and visualization assignment review.
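
The simplest case of the regression models listed above, a straight line fitted by ordinary least squares, can be sketched by hand as follows. This is a teaching sketch under stated assumptions, not the course's own code: the seminar uses statsmodels for regression, and the fit_line helper and the data below are hypothetical.

```python
import statistics


def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    # Sum of cross-products and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

intercept, slope = fit_line(hours, score)
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

A library such as statsmodels fits the same line and also reports standard errors, p-values and fit diagnostics, which is why it is preferred for real analyses.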

Reviews of Introduction to Python for Data Analysis

“A very nice interactive course that provided me with a good introduction to Python.”
  Tahereh Dehdarirad, Chalmers University of Technology

“This class is the perfect way to jump right into doing data analysis with Python. I am certain that taking this class has saved me weeks of slowly grinding through teaching myself Python! A huge time saver!”
  Anonymous

“Great applications. Very approachable, but also not too simplistic where I could easily learn it on my own. Instructor was very inclusive, and offered great responses.”
  Anonymous

Seminar information

Tuesday, May 10, 2022 – Friday, May 13, 2022

Each day will follow this schedule:

11:00am-2:00pm ET (New York time): Live lecture via Zoom

4:00pm-5:00pm ET: Live lab session via Zoom (Tuesday and Thursday only)

Payment Information

The fee of $895 includes all course materials.

PayPal and all major credit cards are accepted.

Our Tax ID number is 26-4576270.