It is possible to unit test Python code that lives in a Jupyter notebook. This article looks at three reasonable choices.
You can use py-spy to profile already running Python code without restarting your process or modifying the source code.
Removing one or more columns from a pandas DataFrame is a pretty common task, but it turns out there are several ways to do it. I found that this StackOverflow question, along with the solutions and discussion in it, raised a number of interesting topics. It is worth digging in a little bit to the … Continue reading How to remove a column from a DataFrame, with some extra detail
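As a rough illustration of the "several ways" mentioned above, here is a minimal sketch of three common approaches. The DataFrame and its column names are made up for the example, and the full post may cover different or additional options.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# drop returns a new DataFrame and leaves the original untouched
dropped = df.drop(columns=["b"])

# del removes the column in place
in_place = df.copy()
del in_place["c"]

# pop removes the column in place and returns it as a Series
popped_from = df.copy()
popped = popped_from.pop("a")

print(list(dropped.columns), list(in_place.columns), list(popped_from.columns))
```

The main practical difference is whether the original object is modified: `drop` hands back a new DataFrame by default, while `del` and `pop` change the DataFrame they are called on.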
Once we have debugged, working, readable (and hopefully testable) code, it may become important to examine it more closely and try to improve its performance. Before we can tell whether our changes are an improvement, we need to measure the current performance and see where the code is spending its time. … Continue reading Profiling Python code with line_profiler
We would love for our Python programs to run as fast as possible, but figuring out how to speed things up requires both information about the current state of our code and knowledge of optimization techniques. First and foremost, we need to know where our program is spending its time, and what is … Continue reading Profiling Python with cProfile, and a speedup tip
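To make "where our program is spending its time" concrete, here is a minimal sketch using the standard library's `cProfile` and `pstats`. The function `slow_sum` is a hypothetical stand-in for real code, and this does not attempt to reproduce the speedup tip from the full post.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares computed in a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Collect profiling data only around the call we care about
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Format the five most expensive entries, sorted by cumulative time
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the high-level functions that dominate a run, which is usually a sensible place to start looking.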
If you've done any work in pandas, you've surely seen the SettingWithCopyWarning. This is an explanation of what's happening and how to fix it.
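The sketch below shows one commonly recommended way to avoid the situation the warning describes: assign through a single `.loc` call, or take an explicit copy when a separate subset is wanted. The data is made up, and the full post may explain the fix differently.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Chained indexing such as df[df["a"] > 1]["b"] = 0 assigns to an
# intermediate object, which is the pattern the warning is about.
# A single .loc call selects rows and column in one step on df itself.
df.loc[df["a"] > 1, "b"] = 0

# When an independent subset is the goal, an explicit copy says so
subset = df[df["a"] > 1].copy()
subset["b"] = 99

print(df)
print(subset)
```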
The query method in pandas DataFrames provides some flexibility in code, and potential speedups using numexpr.
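As a small sketch of the flexibility mentioned above, the same filter can be written as a boolean mask or as a `query` expression string. The data is invented for the example, and no speedup is measured or claimed here.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Conventional boolean mask
by_mask = df[(df["a"] > 2) & (df["b"] < 40)]

# The same condition as a query string, using column names directly
by_query = df.query("a > 2 and b < 40")

print(by_query)
```

Because the condition is an ordinary string, it can be assembled or stored at runtime, which is one source of the flexibility.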
This is the fifth post in a series on indexing and selecting in pandas. If you are jumping in the middle and want to get caught up, here's what has been discussed so far: Basic indexing, selecting by label and location; Slicing in pandas; Selecting by boolean indexing; Selecting by callable. Once the basics were covered in the … Continue reading Selecting in Pandas using where and mask
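For readers who have not met `where` and `mask` yet, here is a minimal sketch of how they behave on a made-up Series. Unlike boolean indexing, both keep the original shape and replace the values that do not qualify.

```python
import pandas as pd

# Hypothetical data, only for illustration
s = pd.Series([1, 2, 3, 4])

# where keeps values where the condition is True, NaN elsewhere
kept = s.where(s > 2)

# mask is the inverse: values where the condition is True become NaN
masked = s.mask(s > 2)

# A replacement value can be supplied instead of NaN
filled = s.where(s > 2, 0)

print(kept.tolist(), masked.tolist(), filled.tolist())
```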
In pandas, you can use callables where indexers are accepted. It turns out that can be handy for a pretty common use case.
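The excerpt does not name the use case, so the sketch below assumes it is method chaining, where a callable lets `.loc` filter on a column that was only just created. The data and the `double` column are hypothetical.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"a": [1, 2, 3, 4]})

# The lambda receives the intermediate DataFrame, so the filter can
# refer to the new column without a temporary variable
result = (
    df.assign(double=lambda d: d["a"] * 2)
      .loc[lambda d: d["double"] > 4]
)

print(result)
```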
This is the third post in the series on indexing and selecting data in pandas. If you haven't read the others yet, see the first post, which covers the basics of selecting based on index or relative numerical indexing, and the second post, which talks about slicing. In this post, I'm going to talk about boolean … Continue reading Boolean Indexing in Pandas
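As a quick sketch of the topic, boolean indexing selects rows with a Series of True/False values. The names and scores below are invented for the example.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"name": ["x", "y", "z"], "score": [70, 85, 90]})

# A comparison produces a boolean Series aligned with the rows
mask = df["score"] >= 80
high = df[mask]

# Conditions combine with & and |, each wrapped in parentheses
combined = df[(df["score"] >= 80) & (df["name"] != "z")]

# ~ inverts a mask
low = df[~mask]

print(high, combined, low, sep="\n\n")
```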