Bring up the subject of Jupyter notebooks around Python developers and you’ll likely get a variety of opinions about them. Many developers think that using notebooks can promote some bad habits, cause confusion, and result in ugly code. A very common problem raised is the idea of hidden state in a notebook. This hidden state can show up in a few ways, but one common way is by executing notebook cells out of order. This often happens during development and exploration. It can be common to modify a cell, execute it multiple times, and even delete it. Once a cell is deleted or modified and re-executed, the hidden state from that cell remains in the current session. Variables, functions, classes, and any other code will continue to exist and possibly affect code in other cells.
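To make the idea concrete, here is a minimal sketch of how stale names survive an edited cell. It assumes a plain dict can stand in for the kernel's session namespace; `session`, `run_cell`, and the variable names are hypothetical and only for illustration.

```python
# A plain dict stands in for the kernel's session namespace (an assumption
# for illustration; a real kernel manages this for you).
session = {}


def run_cell(source):
    """Execute a 'cell' of source code in the shared session namespace."""
    exec(source, session)


run_cell("threshold = 10")          # the original cell
run_cell("limit = 10")              # the same cell after renaming the variable
run_cell("result = threshold * 2")  # a later cell that still uses the old name

# The old name is still defined, so the later cell ran without complaint.
leftover = sorted(name for name in session if not name.startswith("__"))
print(leftover)  # ['limit', 'result', 'threshold']
```

The notebook on disk no longer mentions `threshold` in any assignment, yet the session still holds it, which is exactly the mismatch described above.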
This causes some obvious problems, first for the current session of the notebook, and second for any future invocations of the notebook. In order for a notebook to reflect reality, it should contain valid code that can be executed in order to produce consistent results. Practically, you can work towards this goal in a couple of ways.
If your notebook is small and runs quickly, you can always restart your kernel and run all the code again. This mimics the more typical development workflow of running unit tests or scripts from the command line (or through an IDE integration). If you just run a new Python instance with the saved code, no hidden state can exist and the output will be consistent. This makes sense for small notebooks where you can quickly visualize all the code and verify it on inspection.
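As a rough sketch of the "fresh interpreter" idea, the snippet below runs some saved code in a brand new Python process using the standard library. The contents of `saved_code` are hypothetical: they imitate a notebook whose defining cell for `threshold` was deleted.

```python
import subprocess
import sys

# Hypothetical notebook contents saved as a script. The cell that assigned
# `threshold` has been deleted, so the saved code never defines it.
saved_code = "result = threshold * 2\nprint(result)\n"

# Run the code in a new interpreter, where no hidden state can exist.
completed = subprocess.run(
    [sys.executable, "-c", saved_code],
    capture_output=True,
    text=True,
)

failed = completed.returncode != 0
print(failed)                            # True
print("NameError" in completed.stderr)   # True
```

Code that only worked because of leftover session state fails immediately in the new process, which is the same check that restarting the kernel and running all cells gives you.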
But this may not be practical for all cases.
If a developer doesn’t want to continually restart their interpreter, they can also view what the current state is. Let’s walk through a few ways to do this, from the simple to more complex. Note that this code example uses Jupyter Notebook 6.1.5 with IPython 7.19.0 as the kernel.
First, let’s make some data.
    import numpy as np

    def a_function():
        pass

    class MyClass:
        def __init__(self, name):
            self.name = name

    var = "a variable"
    var2 = "another variable"

    x = np.ones(20)
Once a cell with the above Python code has been executed, I can inspect the state of my current session either by executing a cell with one of the variables in it or by using the IPython display function. A cell will display the value of its last line (unless you append a ; at the end of the line). If you are using the default interpreter, display is not available, but executing any variable will show you its value (based on its repr).
    display(a_function)
    display(var2)
    display(MyClass)
    display(x)
    var

    <function __main__.a_function()>
    'another variable'
    __main__.MyClass
    array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
    'a variable'
But what if the code is gone?
OK, the method above is obvious: we can view items that we know exist. But how do we find objects that we don’t know exist? Maybe we deleted the cell that created the values, or, if we’re using an IPython command line, the history for that code is no longer visible. Or maybe we edited the cell a few times, re-executed it, and changed some variable names.
One function to consider is the dir builtin. When you invoke this function with no arguments, it returns a list of the names in the current local scope. If you supply a module or class, it lists the attributes of the module or the class (and, for a class, the attributes of its base classes).
When we call dir() with no arguments in our session, we can see that our variables are all present. Note that this is available in standard Python, not just IPython.
['In', 'MyClass', 'Out', '_', '_2', '__', '___', '__builtin__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '_dh', '_i', '_i1', '_i2', '_i3', '_ih', '_ii', '_iii', '_oh', 'a_function', 'exit', 'get_ipython', 'np', 'quit', 'var', 'var2', 'x']
Whoa, there’s also a lot of other stuff in there. Most of the variables are added by IPython and relate to command history, so if you run this sample with the default interpreter, there won’t be quite as many variables present. Also, some functions load up at startup (and you can configure IPython to load others as well). Other objects exist because Python places them in the global scope.
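One way to cut through that noise is to filter the names that dir() returns. The sketch below is an assumption about what counts as "yours": it drops names that start with an underscore plus a hand-picked set of names that IPython adds. The helper `likely_user_names` and the `ipython_extras` set are hypothetical, and the sample list is a shortened copy of the output above.

```python
def likely_user_names(names, session_extras=frozenset()):
    """Keep names that look user-defined: no leading underscore, not a known extra."""
    return [
        name
        for name in names
        if not name.startswith("_") and name not in session_extras
    ]


# A shortened copy of the dir() output shown above.
names_from_dir = [
    "In", "MyClass", "Out", "_", "_2", "__", "__name__", "_dh", "_i", "_i1",
    "_ih", "a_function", "exit", "get_ipython", "np", "quit", "var", "var2", "x",
]

# Names assumed to come from IPython rather than from our own cells.
ipython_extras = {"In", "Out", "exit", "quit", "get_ipython"}

user_names = likely_user_names(names_from_dir, ipython_extras)
print(user_names)  # ['MyClass', 'a_function', 'np', 'var', 'var2', 'x']
```

In a live session you would pass dir() itself instead of the copied list. The filter is only a heuristic, since your own code can also define names with a leading underscore.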
Note that the special variable _ holds the output of the last executed cell (or line).
There are two other functions that are helpful: locals and globals. These return the symbol table, a dictionary keyed by variable name and containing the values. For globals, this is the table for the current module (when invoked in a function or method, the module is the one where the function was defined, not the one it was called from).
locals is the same as globals when invoked at the module level, but when invoked in a function block it returns that function’s local variables, including any free variables.
Note: don’t modify these tables. Changes to globals() affect the running interpreter, and changes to locals() may not take effect at all.
    locals()          # get the full dictionary
    globals()['var']  # grab out a single value
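The difference between the two is easier to see inside a nested function. This is a small sketch with hypothetical names (`outer`, `inner`, `captured`, `own`, `label`) showing that locals() in a function block reports both the function's own variables and the free variables it uses from the enclosing function.

```python
def outer():
    captured = "free variable"

    def inner():
        own = "local variable"
        # Using `captured` here makes it a free variable of inner().
        label = captured.upper()
        return sorted(locals())

    return inner()


names_in_inner = outer()
print(names_in_inner)  # ['captured', 'label', 'own']

# None of these names leak into the module-level symbol table.
print("own" in globals())  # False
```

At the module level, which is where notebook cells run, locals() and globals() describe the same set of names, so either one works for inspecting a session.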
Can I see something a little nicer?
Working with a big dictionary that has some extra values added by IPython might not be the easiest way to inspect your variables. You could build a function to beautify the symbol table, but luckily there are already some nice magics for this. (Magics are special functions in IPython.)
Jupyter/IPython provide three helpful magics for inspecting variables. First, there is %who. With no arguments it prints all the interactive variables with minimal formatting. You can supply types to only show variables matching the type given.
    %who
    MyClass  a_function  np  var  var2  x

    # just functions
    %who function
    a_function
The %who_ls magic does the same thing, but returns the variables as a list. It can also limit what you see by type.
    %who_ls
    ['MyClass', 'a_function', 'np', 'var', 'var2', 'x']

    %who_ls str function
    ['a_function', 'var', 'var2']
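If you are in the default interpreter, where magics are not available, a similar filter can be approximated in plain Python. This is only a sketch and not how IPython implements %who_ls. The helper `names_by_type` is hypothetical, and it matches on the type's `__name__`.

```python
def names_by_type(namespace, *type_names):
    """Return sorted public names, optionally limited to the given type names."""
    return sorted(
        name
        for name, value in namespace.items()
        if not name.startswith("_")
        and (not type_names or type(value).__name__ in type_names)
    )


def a_function():
    pass


# A small stand-in for the session's symbol table.
example_namespace = {
    "a_function": a_function,
    "var": "a variable",
    "var2": "another variable",
    "count": 3,
    "_private": None,
}

print(names_by_type(example_namespace, "str", "function"))
# ['a_function', 'var', 'var2']
```

In a real session you could pass globals() as the namespace. Reading from that dictionary is fine; it is only modifying it that should be avoided.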
The last magic is %whos. It provides a nicely formatted table that shows each variable, its type, and a string representation. It includes helpful information about NumPy and pandas data structures.
    %whos

    Variable     Type        Data/Info
    ----------------------------------
    MyClass      type        <class '__main__.MyClass'>
    a_function   function    <function a_function at 0x10ca51e50>
    np           module      <module 'numpy' from '/Us<...>kages/numpy/__init__.py'>
    var          str         a variable
    var2         str         another variable
    x            ndarray     20: 20 elems, type `float64`, 160 bytes
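For comparison, here is what a hand-rolled "beautify the symbol table" function might look like in plain Python. It is a rough sketch and far less capable than %whos. The `summarize` helper, the 40-character cutoff, and the example namespace are all assumptions made for illustration.

```python
def summarize(namespace, max_width=40):
    """Return (name, type name, truncated repr) rows for public names."""
    rows = []
    for name in sorted(namespace):
        if name.startswith("_"):
            continue  # skip private and interpreter-added names
        value = namespace[name]
        rows.append((name, type(value).__name__, repr(value)[:max_width]))
    return rows


# A small stand-in for the session's symbol table.
example_namespace = {"var": "a variable", "count": 3, "_hidden": None}

table = summarize(example_namespace)
for name, type_name, info in table:
    # Left-align each column to get a simple fixed-width table.
    print(f"{name:<12}{type_name:<10}{info}")
```

Unlike %whos, this sketch knows nothing special about arrays or data frames. It simply truncates whatever repr returns.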
Now if you want to get fancy, Jupyter has an extension available through nbextensions. The Variable Inspector extension will give you a nice option for viewing variables in an output similar to the %whos output above. For developers used to an IDE with an automatically updating variable inspector, this extension may prove useful and worth checking out.
After looking at the variables defined in your local scope, you may want to remove some of them. For example, if you deleted a cell and want the objects created by that cell to be removed, just del them. Verify they are gone with any of the methods above.
    del var2
    %whos

    Variable     Type        Data/Info
    ----------------------------------
    MyClass      type        <class '__main__.MyClass'>
    a_function   function    <function a_function at 0x10ca51e50>
    np           module      <module 'numpy' from '/Us<...>kages/numpy/__init__.py'>
    var          str         a variable
    x            ndarray     20: 20 elems, type `float64`, 160 bytes
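One detail worth knowing is that del raises a NameError if the name is already gone. The sketch below illustrates this outside of a notebook, assuming a plain dict (the hypothetical `session`) stands in for the session namespace.

```python
# A plain dict stands in for the session namespace (an assumption for
# illustration only).
session = {}
exec("var = 'a variable'\nvar2 = 'another variable'", session)

# Remove one name, as the del statement in a cell would.
exec("del var2", session)
remaining = sorted(name for name in session if not name.startswith("__"))
print(remaining)  # ['var']

# Deleting the same name a second time fails, because it no longer exists.
try:
    exec("del var2", session)
    second_delete_failed = False
except NameError:
    second_delete_failed = True
print(second_delete_failed)  # True
```

So if a cell containing del is re-executed, expect an error the second time. That error is itself a useful signal that the name really has been removed.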
Now you know of a few tools that you can use to look for variables in your current Python session. Use them to better understand the code you’ve already executed and maybe save yourself a little bit of time.