Articles

Pandas

Selecting and Indexing

These articles cover how to index and select data in pandas Series and DataFrames. Read in order, each article builds on the ones before it.

Indexing and Selecting in Pandas (part 1) – the basics, start here

Indexing and Selecting in Pandas – slicing – pandas slicing can differ from regular Python slicing

Boolean Indexing in Pandas – a core part of pandas usage

Indexing and Selecting in Pandas by Callable

Selecting in Pandas using where and mask – useful for updating data based on the values already in it

Selection in pandas using query – a useful way to select data that can also help performance

General pandas usage

Views, Copies, and that annoying SettingWithCopyWarning

Overview of I/O tools in Pandas

Converting types in Pandas

Removing duplicate data in Pandas

Iterating over rows in a DataFrame (and should you even do that?)

Basic pandas

Basic Pandas: Moving a DataFrame column

Basic Pandas: Renaming a DataFrame column

Basic Pandas: How to add a column to a DataFrame

Python

Profiling and performance

Profiling Python with cProfile, and a speedup tip – the basic profiler is covered here

Profiling Python code with line_profiler – using line_profiler to diagnose performance issues within individual functions

Profiling Python code with py-spy – using the py-spy sampling profiler against running code

Profiling Python code with memory_profiler – for profiling memory usage in Python code, including line by line

Tools

Using pyenv to manage multiple versions of Python

Managing virtual environments with pyenv

Jupyter

Connecting to your notebook kernel using Jupyter console

Unit testing in Jupyter notebooks

Using the %autoreload magic to make IPython and Jupyter development easier

How to view all your variables in a Jupyter notebook

How to use ipywidgets to make your Jupyter notebooks interactive

4 ways to run Jupyter notebooks

Finance

Data

How to connect to Interactive Brokers using Python – the basics

How to get historical market data from Interactive Brokers using Python – pulling down historical bars

3 ways to get historical market data from IEX Cloud

Backtesting

Backtrader, a Python backtesting and trading framework

Zipline, an open-source backtesting library from Quantopian