Reading date arguments to a Python script using argparse

I bet I’ve written hundreds of Python scripts during my career that take a date parameter on the command line. I often need short utility scripts that can run on a subset of data limited by dates. My preferred way to parse command line options in Python is the standard library’s argparse module. But argparse doesn’t have built-in support for date handling. The first time I used it, I did all the date parsing after the argument parsing was done. In this article, I’ll show you two simple ways to add date parsing directly to your command line parsing.

First, let’s do a quick review of argparse and how it works. Then we’ll add a simple date parser, followed by a slightly more complex one.

argparse basics

The argparse module has a good tutorial that I’d recommend as a starting point if you’ve never written a command line script in Python before, or haven’t used argparse. I’m going to assume that you at least know how to create an ArgumentParser and have it parse some basic arguments of different types. Let’s look at an example. Note that since this example code is running inside a Jupyter notebook, I’ll always pass in my own arguments here, but in your command line script, you’ll just call parser.parse_args() to parse the arguments from the command line.

import argparse

parser = argparse.ArgumentParser()
# since this example is running in Jupyter, I'll always pass in the arguments 
try:
    parser.parse_args(["-h"])
except SystemExit:  # printing help raises SystemExit; catch it so the notebook keeps running
    pass
usage: ipykernel_launcher.py [-h]

optional arguments:
  -h, --help  show this help message and exit

OK, let’s add a few arguments of different types.

parser = argparse.ArgumentParser()
parser.add_argument('-n', '--number', type=int, help='Number of times to run')
parser.add_argument('-x', '--extension', type=str, help='File extension to search for')
parser.add_argument('-d', '--debug', action='store_true', help='Turn on debug logging')

parser.parse_args(["-n", "10", "--extension", ".xls", "-d"])
Namespace(debug=True, extension='.xls', number=10)
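
For comparison, here is a minimal sketch of how the same parser might sit in a standalone script. The build_parser and main names are my own choices for illustration, not something argparse requires. Leaving argv as None makes argparse read the real command line, while passing a list keeps the function easy to call from a notebook or a test.

```python
import argparse


def build_parser():
    """Build the same three-option parser used in the example above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-n', '--number', type=int, help='Number of times to run')
    parser.add_argument('-x', '--extension', type=str, help='File extension to search for')
    parser.add_argument('-d', '--debug', action='store_true', help='Turn on debug logging')
    return parser


def main(argv=None):
    # With argv=None, parse_args() falls back to sys.argv[1:],
    # which is what you want in a real command line script.
    return build_parser().parse_args(argv)


# In a script you would normally finish with:
#     if __name__ == "__main__":
#         main()
# Here the arguments are passed explicitly instead.
args = main(["-n", "10", "--extension", ".xls", "-d"])
print(args.number, args.extension, args.debug)
```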

Now let’s try it with some sort of date argument. Let’s do the naive thing and just see if we could set the type to a date. Will that just work?

import datetime

parser = argparse.ArgumentParser()
parser.add_argument('-s', '--start', type=datetime.date, help='Set a start date')

try:
    parser.parse_args(["-s", "2022-01-01"])
except SystemExit:  # argparse reports the error and exits; catch only that
    pass
usage: ipykernel_launcher.py [-h] [-s START]
ipykernel_launcher.py: error: argument -s/--start: invalid date value: '2022-01-01'

Alas, it’s not quite that easy. I didn’t show you the full output (I put this in a try/except), but it doesn’t work because datetime.date doesn’t take a single string argument in its constructor. The type argument in parser.add_argument can be any callable that takes a single string, and in this scenario argparse just passes that string to the date constructor, which expects three int arguments (year, month, and day) instead.
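
You can see the underlying failure without argparse at all. This small sketch calls the constructor both ways. The exact wording of the TypeError varies between Python versions, so it only prints the exception type.

```python
import datetime

# The constructor wants three integers: year, month, day.
good = datetime.date(2022, 1, 1)
print(good)

# A single string, which is what argparse passes when type=datetime.date,
# raises a TypeError. argparse catches it and reports "invalid date value".
error = None
try:
    datetime.date("2022-01-01")
except TypeError as ex:
    error = ex
    print(type(ex).__name__)
```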

So let’s do some basic date parsing like this:

parser = argparse.ArgumentParser()
parser.add_argument('-s', '--start',
                    type=lambda d: datetime.datetime.strptime(d, '%Y-%m-%d').date(),
                    help='Set a start date')
parser.parse_args(["-s", "2022-01-01"])
Namespace(start=datetime.date(2022, 1, 1))

I’ll break that down a bit. The lambda takes one parameter. It receives the command line string token (which is 2022-01-01 in this case), passes it to datetime.datetime.strptime with the right format, and then calls date() on the result to return just the date portion.
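
If you’d rather not use a lambda, a named function works the same way and lets you control the error message. This is only a sketch: iso_date is a name I made up, and the message text is arbitrary. argparse.ArgumentTypeError is the exception argparse provides for type callables that want to supply their own message.

```python
import argparse
import datetime


def iso_date(text):
    """Convert a YYYY-MM-DD string to a datetime.date (hypothetical helper)."""
    try:
        return datetime.datetime.strptime(text, '%Y-%m-%d').date()
    except ValueError:
        # argparse prints this message along with the usage line, then exits.
        raise argparse.ArgumentTypeError(f"expected a date like 2022-01-31, got {text!r}")


parser = argparse.ArgumentParser()
parser.add_argument('-s', '--start', type=iso_date, help='Set a start date (YYYY-MM-DD)')

args = parser.parse_args(["-s", "2022-01-01"])
print(args.start)
```

With the lambda, a bad value is reported as an invalid <lambda> value, so the named function also gives the user a friendlier hint.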

More complex date parsing

But what if you want to accept dates in multiple formats?

You could create a more complex lambda or a separate function to do this. But it turns out that someone has of course already created this, and it’s called dateutil. If it’s not already in your environment, you can install it with pip install python-dateutil. Its parse function does a pretty good job of getting a valid date out of a string.
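
For reference, the separate function approach mentioned above might look roughly like this. The list of formats and the multi_format_date name are assumptions I picked for the sketch. Note that it reads 1/1/2022 month first, and that %b matches month abbreviations in the current locale.

```python
import argparse
import datetime

# Assumed set of accepted formats; adjust to what your users actually type.
DATE_FORMATS = ('%Y-%m-%d', '%m/%d/%Y', '%d-%b-%Y')


def multi_format_date(text):
    """Try each known format in turn and return the first match as a date."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format did not match, try the next one
    raise argparse.ArgumentTypeError(f"unrecognized date: {text!r}")


parser = argparse.ArgumentParser()
parser.add_argument('-s', '--start', type=multi_format_date, help='Set a start date')
print(parser.parse_args(["-s", "1/1/2022"]).start)
```

A hand-written list like this only covers the formats you thought of, which is why the rest of this article uses dateutil instead.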

To wire this parsing up to our argument parsing, the argparse API can be extended via custom Action classes. The action class only needs to implement the __call__ method to take the correct arguments and deal with the values passed to it properly. The function signature includes the argparse.Namespace object and it’s recommended that you store the results of your processing in the namespace by the correct name, using setattr. We connect it to the argument parser by specifying it as the action for our argument.

from dateutil.parser import parse, ParserError

class DateParser(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, parse(values).date())

parser = argparse.ArgumentParser()
parser.add_argument('-s', '--start',
                    action=DateParser,
                    help='Set a start date')
parser.parse_args(["-s", "1/1/2022"])
Namespace(start=datetime.date(2022, 1, 1))

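This works with a pretty wide variety of date formats. It will also raise a ValueError (actually dateutil’s ParserError, which is a subclass of it) if there’s a parsing error. Because the exception is raised inside the action rather than in a type callable, argparse doesn’t turn it into a usage message. It propagates to your code, so you can catch it yourself.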

assert parser.parse_args(["-s", "1/1/2022"]).start == datetime.date(2022,1,1)
assert parser.parse_args(["-s", "2022-1-1"]).start == datetime.date(2022,1,1)
assert parser.parse_args(["-s", "Jan1,2022"]).start == datetime.date(2022,1,1)
assert parser.parse_args(["-s", "1-Jan-2022"]).start == datetime.date(2022,1,1)
assert parser.parse_args(["-s", "1-Jan-22"]).start == datetime.date(2022,1,1)
assert parser.parse_args(["-s", "January, 1 2022"]).start == datetime.date(2022,1,1)
assert parser.parse_args(["-s", "January 1st, 2022"]).start == datetime.date(2022,1,1)
try:
    parser.parse_args(["-s", "tomorrow"])
except ValueError as ex:
    print(ex)
Unknown string format: tomorrow
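
If you’d prefer the usual argparse behavior (print the usage line and exit) over an exception reaching your code, one option is to convert the ValueError into an argparse.ArgumentError inside the action. The sketch below is a variation on the class above, not a replacement for it. StrictDateParser and to_date are hypothetical names, and to_date uses datetime.date.fromisoformat as a stand-in so the example runs without dateutil. With dateutil installed you would return parse(text).date() there instead.

```python
import argparse
import datetime


def to_date(text):
    # Stand-in parser so this sketch has no third-party dependency.
    # With dateutil you would return parse(text).date() instead.
    return datetime.date.fromisoformat(text)


class StrictDateParser(argparse.Action):  # hypothetical name
    def __call__(self, parser, namespace, values, option_string=None):
        try:
            value = to_date(values)
        except ValueError as ex:
            # argparse handles ArgumentError itself: it prints the usage
            # line and the message, then exits with status 2.
            raise argparse.ArgumentError(self, str(ex))
        setattr(namespace, self.dest, value)


parser = argparse.ArgumentParser()
parser.add_argument('-s', '--start', action=StrictDateParser, help='Set a start date')

exit_code = None
try:
    parser.parse_args(["-s", "tomorrow"])
except SystemExit as ex:  # only caught here so the example keeps running
    exit_code = ex.code
print(exit_code)
```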

There you go, you can now easily add flexible date parsing of command line options to your Python scripts. I have to do this so often that I usually copy this code out of an existing script. I hope you find this article useful.
