Once we have debugged, working, readable (and hopefully testable) code, it may become important to examine it more closely and try to improve its performance. Before we can determine whether our changes are an improvement, we need to measure the current performance and see where the code is spending its time. … Continue reading Profiling Python code with line_profiler
We would love for our Python programs to run as fast as possible, but figuring out how to speed things up requires gathering information about the current state of our code and knowing techniques to speed things up. First and foremost, we need to know where our program is spending its time, and what is … Continue reading Profiling Python with cProfile, and a speedup tip
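As a minimal sketch of finding where a program spends its time, the standard library's cProfile and pstats modules can be used as follows. The functions `slow_sum` and `main` are made-up stand-ins for real work and do not come from the post itself.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical workload: a deliberately plain Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    # Call the workload several times so it dominates the profile.
    return [slow_sum(10_000) for _ in range(20)]


# Collect timing data only for the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Write the report to a string, sorted by cumulative time, top 5 entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the high-level calls that are worth drilling into first; sorting by "tottime" instead highlights the functions that are slow on their own.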
If you've done any work in pandas, you've surely seen the SettingWithCopyWarning. This is an explanation of what's happening and how to fix it.
The query method in pandas DataFrames provides some flexibility in code, and potential speedups using numexpr.
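A small sketch of what that looks like, assuming a made-up DataFrame with columns `a` and `b` and a local variable `limit` (none of these names come from the post). As far as I know, pandas only uses numexpr for query when it is installed and otherwise falls back to plain Python evaluation, so any speedup depends on the environment and the size of the data.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})
limit = 25

# query takes the condition as a string; @limit refers to the local variable.
by_query = df.query("a > 1 and b < @limit")

# The equivalent selection written with a boolean mask.
by_mask = df[(df["a"] > 1) & (df["b"] < limit)]

print(by_query)
```

Both selections return the same rows; the query form mainly saves repeating the DataFrame name inside the condition.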
This is the fifth post in a series on indexing and selecting in pandas. If you are jumping in the middle and want to get caught up, here's what has been discussed so far: Basic indexing, selecting by label and location; Slicing in pandas; Selecting by boolean indexing; Selecting by callable. Once the basics were covered in the … Continue reading Selecting in Pandas using where and mask
In pandas, you can use callables where indexers are accepted. It turns out that can be handy for a pretty common use case.
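The excerpt does not say which use case it has in mind, so the following is only a guess at one place where callables help: filtering in the middle of a method chain, where the intermediate DataFrame has no name to refer to. The data and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame(
    {"city": ["A", "B", "C"], "temp_f": [50.0, 86.0, 104.0]}
)

hot = (
    df
    # Add a derived column; the lambda receives the DataFrame being built.
    .assign(temp_c=lambda d: (d["temp_f"] - 32) * 5 / 9)
    # .loc accepts a callable, which is called with the current DataFrame
    # and must return a valid indexer (here, a boolean Series).
    .loc[lambda d: d["temp_c"] > 25]
)

print(hot)
```

Without the callable, the chain would have to be broken up so that the intermediate result with `temp_c` could be assigned to a variable and then indexed.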
This is the third post in the series on indexing and selecting data in pandas. If you haven't read the others yet, see the first post, which covers the basics of selecting based on index or relative numerical indexing, and the second post, which talks about slicing. In this post, I'm going to talk about boolean … Continue reading Boolean Indexing in Pandas
Slicing data in pandas. This is the second post in the series on indexing and selecting data in pandas. If you haven't read it yet, see the first post, which covers the basics of selecting based on index or relative numerical indexing. In this post, I'm going to review slicing, which is a core Python topic, but has … Continue reading Indexing and Selecting in Pandas – slicing
Indexing and selecting data is core to using pandas, but it can be quite confusing. One reason is that pandas has grown organically over the years based on user requests, so there are multiple ways to select data out of a pandas DataFrame or Series. Reading through the documentation can be … Continue reading Indexing and Selecting in Pandas (part 1)
Jupyter notebooks are a great way to explore data using Python (and other languages as well). Having a visual representation of your code and output, along with documentation and formatting in one view can be extremely helpful. However, there are some things that are just much better to do in a console session. In this … Continue reading Connecting to your notebook kernel using Jupyter console