pd.to_datetime: The Complete Guide to Parsing Dates in Python


If you work with data in Python, you will hit a moment where your dates are stored as strings and nothing works the way it should. Filtering by date fails. Sorting produces nonsense. Grouping by month returns an error. The fix almost always starts in the same place: pd.to_datetime. This function is Pandas’ built-in tool for converting strings, integers, and other date-like objects into proper datetime objects that Python and Pandas can actually work with. This guide covers how the function works, how to write pd.to_datetime format strings correctly, what format codes like yyyy mean in practice, and the edge cases that trip people up most often.



What Is pd.to_datetime and Why Does It Matter?

pd.to_datetime is a Pandas function that parses date and time information from a variety of input types and converts it into Pandas’ datetime types: a Timestamp for a single value, a DatetimeIndex for a list, and a Series with a datetime64 dtype for a column. These objects support comparison, arithmetic, filtering, grouping, and resampling operations that plain strings do not.

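A simplified view of the signature, showing the parameters you will use most often (the real function accepts a few more and lists them in a different order, so pass everything after arg by keyword):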

python
import pandas as pd

pd.to_datetime(arg, format=None, errors='raise', utc=False, unit=None, dayfirst=False, yearfirst=False)

The most important parameters for everyday use are arg (what you’re converting), format (the pattern of the date string), and errors (what happens when parsing fails).

Without this conversion, a column like "2024-01-15" is just text. With it, Python knows that value is a point in time, and you can do things like extract the month, filter for dates after a certain point, or calculate the number of days between two events.
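
To make that difference concrete, here is a minimal sketch. The column name and sample values are made up for illustration. It sorts the same column twice, once as text and once as datetimes, and then computes a day count that only works after conversion.

```python
import pandas as pd

# Hypothetical sample data: order dates stored as DD/MM/YYYY text.
orders = pd.DataFrame({"order_date": ["15/01/2024", "02/03/2024", "20/12/2023"]})

# Sorting the raw strings compares characters, not dates.
sorted_as_text = orders["order_date"].sort_values().tolist()

# After conversion, the same sort is chronological.
orders["order_date_dt"] = pd.to_datetime(orders["order_date"], format="%d/%m/%Y")
sorted_as_dates = (
    orders["order_date_dt"].sort_values().dt.strftime("%d/%m/%Y").tolist()
)

# Date arithmetic becomes available: days between the first and last order.
span_days = (orders["order_date_dt"].max() - orders["order_date_dt"].min()).days

print(sorted_as_text)   # ['02/03/2024', '15/01/2024', '20/12/2023']
print(sorted_as_dates)  # ['20/12/2023', '15/01/2024', '02/03/2024']
print(span_days)        # 73
```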


Python String to Datetime: The Basic Cases

The simplest use of pd.to_datetime is passing it a string or a list of strings:

python
import pandas as pd

# Single string
date = pd.to_datetime("2024-01-15")
print(date)
# 2024-01-15 00:00:00

# List of strings
dates = pd.to_datetime(["2024-01-15", "2024-02-20", "2024-03-10"])
print(dates)
# DatetimeIndex(['2024-01-15', '2024-02-20', '2024-03-10'], dtype='datetime64[ns]', freq=None)

# Series column in a DataFrame
df = pd.DataFrame({"date_str": ["2024-01-15", "2024-02-20"]})
df["date"] = pd.to_datetime(df["date_str"])
print(df.dtypes)
# date_str    object
# date        datetime64[ns]

When the string follows the ISO 8601 format (YYYY-MM-DD), Pandas parses it without any additional instruction. This is the format Python and Pandas expect by default, and it is the format worth standardizing to wherever you have control over the data source.

String-to-datetime conversion works without a format because pd.to_datetime infers one from the data. This is convenient but can produce wrong results with ambiguous date formats such as 01/02/2024, which is exactly why the format parameter exists.
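
As a quick illustration of that ambiguity, the sketch below parses the same string three ways. The sample string is an assumption chosen because both readings are valid dates.

```python
import pandas as pd

ambiguous = "01/02/2024"

# Default inference treats the first number as the month.
month_first = pd.to_datetime(ambiguous)

# dayfirst=True flips the interpretation.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

# An explicit format removes the guesswork entirely.
explicit = pd.to_datetime(ambiguous, format="%d/%m/%Y")

print(month_first.date())  # 2024-01-02
print(day_first.date())    # 2024-02-01
print(explicit.date())     # 2024-02-01
```

Neither reading is wrong in general. Which one is right depends on where the data came from, so it is worth stating the format explicitly.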


pd.to_datetime Format: How to Write Format Strings

The format parameter tells Pandas the exact structure of your date string so it does not have to guess. Format strings use directive codes borrowed from Python’s datetime module. Here are the most common ones:

Directive   Meaning                         Example
%Y          4-digit year                    2024
%y          2-digit year                    24
%m          Month as number, zero-padded    01, 12
%d          Day as number, zero-padded      05, 31
%H          Hour (24-hour clock)            00, 23
%M          Minute, zero-padded             00, 59
%S          Second, zero-padded             00, 59
%B          Full month name                 January
%b          Abbreviated month name          Jan
%A          Full weekday name               Monday
%p          AM or PM                        AM, PM
%I          Hour (12-hour clock)            01, 12

Using the format parameter looks like this:

python
import pandas as pd

# Date in DD/MM/YYYY format
date = pd.to_datetime("15/01/2024", format="%d/%m/%Y")
print(date)
# 2024-01-15 00:00:00

# Date with full month name
date = pd.to_datetime("January 15, 2024", format="%B %d, %Y")
print(date)
# 2024-01-15 00:00:00

# Date and time together
dt = pd.to_datetime("2024-01-15 14:30:00", format="%Y-%m-%d %H:%M:%S")
print(dt)
# 2024-01-15 14:30:00

The format string must match the structure of the input string, including separators. If the input is "15-01-2024" and you write format="%d/%m/%Y", the function raises a ValueError because it expects slashes but finds hyphens.
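
The mismatch case can be reproduced with a short sketch. The variable names are illustrative, and the exact error message varies between Pandas versions, so the example only relies on the exception type.

```python
import pandas as pd

raw = "15-01-2024"  # hyphens, not slashes

# Wrong separators in the format: parsing fails with a ValueError.
try:
    pd.to_datetime(raw, format="%d/%m/%Y")
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# Matching separators: parsing succeeds.
parsed = pd.to_datetime(raw, format="%d-%m-%Y")

print(type(mismatch_error).__name__)
print(parsed)
```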


What Does yyyy Mean in Date Format Codes?

The yyyy notation appears in date format documentation outside of Python, particularly in Excel, SQL, Java, and many reporting tools. It represents a four-digit year in those environments.

In Python’s datetime formatting system, the equivalent is %Y (capital Y). Here is the direct mapping:

Other tools     Python / Pandas
yyyy            %Y
yy              %y
MM              %m
dd              %d
HH              %H
mm (minutes)    %M
ss              %S

So when you see yyyy-MM-dd in Excel or SQL documentation, the Python equivalent is %Y-%m-%d. This is a common point of confusion for people moving between tools. The yyyy meaning is always “four-digit year,” but the way you write it in Python requires the percent-sign directive format rather than the repeated letter format.
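
If you translate patterns like this often, a small helper can do the substitution. The function below is a hypothetical sketch, not part of Pandas. It covers only the tokens in the table above and assumes the case-sensitive convention where MM means month and mm means minutes, which not every tool follows.

```python
import re

import pandas as pd

# Hypothetical mapping for illustration; limited to the tokens in the table.
TOKEN_TO_DIRECTIVE = {
    "yyyy": "%Y",
    "yy": "%y",
    "MM": "%m",
    "dd": "%d",
    "HH": "%H",
    "mm": "%M",
    "ss": "%S",
}

# Longest tokens first so "yyyy" is not consumed as two "yy" tokens.
_TOKEN_PATTERN = re.compile(
    "|".join(sorted(TOKEN_TO_DIRECTIVE, key=len, reverse=True))
)


def to_strftime(pattern: str) -> str:
    """Translate a yyyy-MM-dd style pattern into strftime directives."""
    return _TOKEN_PATTERN.sub(lambda m: TOKEN_TO_DIRECTIVE[m.group(0)], pattern)


fmt = to_strftime("yyyy-MM-dd HH:mm:ss")
print(fmt)  # %Y-%m-%d %H:%M:%S

parsed = pd.to_datetime("2024-01-15 14:30:00", format=fmt)
print(parsed)  # 2024-01-15 14:30:00
```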


datetime python: The Standard Library vs Pandas

Python gives you two main ways to parse dates: the standard library’s datetime module and Pandas’ pd.to_datetime function. They serve different purposes.

The datetime module works on individual datetime objects:

python
from datetime import datetime

# Parse a single string
dt = datetime.strptime("2024-01-15", "%Y-%m-%d")
print(dt)
# 2024-01-15 00:00:00

# Get current datetime
now = datetime.now()
print(now)
# 2025-05-09 14:22:00.000000 (example)

pd.to_datetime is built for working with Pandas Series and DataFrames, handling entire columns at once:

python
import pandas as pd

df = pd.DataFrame({"date": ["2024-01-15", "2024-02-20", "2024-03-10"]})
df["date"] = pd.to_datetime(df["date"])

# Now you can do datetime operations across the whole column
df["month"] = df["date"].dt.month
df["year"] = df["date"].dt.year
df["day_of_week"] = df["date"].dt.day_name()
print(df)

For single conversions, datetime.strptime from the standard library is fine. For columns in a DataFrame, pd.to_datetime is the right tool because it is vectorized and handles the entire column in one operation.
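
The two approaches produce the same values. The sketch below parses the same list once with a strptime loop and once with a single pd.to_datetime call, then checks that the results agree. The sample values are illustrative.

```python
from datetime import datetime

import pandas as pd

raw = ["2024-01-15", "2024-02-20", "2024-03-10"]

# Standard library: one strptime call per element.
looped = [datetime.strptime(s, "%Y-%m-%d") for s in raw]

# Pandas: one call for the whole column.
vectorized = pd.to_datetime(pd.Series(raw), format="%Y-%m-%d")

# Convert each Timestamp back to a plain datetime for comparison.
same = [ts.to_pydatetime() for ts in vectorized] == looped
print(same)  # True
```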


The errors Parameter: What Happens When Parsing Fails

Real data has messy dates. Some rows might have invalid dates, missing values, or unexpected formats mixed in. The errors parameter controls what pd.to_datetime does when it encounters a value it cannot parse.

python
import pandas as pd

messy_dates = ["2024-01-15", "not a date", "2024-03-10"]

# errors='raise' (default): raises an error on any invalid value
# pd.to_datetime(messy_dates, errors='raise')  # raises ValueError

# errors='coerce': converts invalid values to NaT (Not a Time)
result = pd.to_datetime(messy_dates, errors='coerce')
print(result)
# DatetimeIndex(['2024-01-15', 'NaT', '2024-03-10'], dtype='datetime64[ns]', freq=None)

# errors='ignore': returns the original input unchanged if parsing fails.
# Deprecated since pandas 2.2 (emits a FutureWarning) and slated for removal;
# on newer versions, catch the ValueError from errors='raise' instead.
result = pd.to_datetime(messy_dates, errors='ignore')
print(result)
# Index(['2024-01-15', 'not a date', '2024-03-10'], dtype='object')

errors='coerce' is the most useful setting for real-world data. It lets the conversion run on the whole column, marking unparseable values as NaT (which is Pandas’ equivalent of NaN for datetime data). You can then filter or fill those values separately.
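
One way to follow up after coercion is sketched below. The column names are made up for illustration. The idea is to separate rows that failed to parse from rows that were empty to begin with, so bad input does not disappear silently.

```python
import pandas as pd

# Hypothetical input with one bad value and one missing value.
df = pd.DataFrame({"raw": ["2024-01-15", "not a date", "2024-03-10", None]})
df["parsed"] = pd.to_datetime(df["raw"], errors="coerce")

# Rows where the input was present but could not be parsed.
failed = df[df["parsed"].isna() & df["raw"].notna()]

# Rows that are safe to use in date operations.
clean = df.dropna(subset=["parsed"])

print(failed["raw"].tolist())  # ['not a date']
print(len(clean))              # 2
```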


Common Patterns and Real Examples

Here are the format patterns that come up most often in actual data work:

python
import pandas as pd

# ISO 8601 (default, no format needed)
pd.to_datetime("2024-01-15")

# US format: MM/DD/YYYY
pd.to_datetime("01/15/2024", format="%m/%d/%Y")

# European format: DD.MM.YYYY
pd.to_datetime("15.01.2024", format="%d.%m.%Y")

# With abbreviated month: 15-Jan-2024
pd.to_datetime("15-Jan-2024", format="%d-%b-%Y")

# With full month and year only: January 2024
pd.to_datetime("January 2024", format="%B %Y")

# Unix timestamp (seconds since 1970-01-01)
pd.to_datetime(1705276800, unit='s')
# 2024-01-15 00:00:00

# Datetime with timezone
pd.to_datetime("2024-01-15T14:30:00+00:00", utc=True)
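
The last two patterns are often combined. As a sketch with illustrative epoch values, and assuming time zone data is available on the system, a column of Unix timestamps can be parsed as UTC and then shifted to a local zone:

```python
import pandas as pd

# Illustrative values: midnight UTC on 2024-01-15 and 2024-01-16.
epoch_seconds = pd.Series([1705276800, 1705363200])

# unit='s' interprets the integers as seconds; utc=True makes the result tz-aware.
as_utc = pd.to_datetime(epoch_seconds, unit="s", utc=True)

# Convert the same instants to another time zone for display.
as_new_york = as_utc.dt.tz_convert("America/New_York")

print(as_utc.iloc[0])       # 2024-01-15 00:00:00+00:00
print(as_new_york.iloc[0])  # 2024-01-14 19:00:00-05:00
```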

Handling Mixed Formats in a Column

Sometimes a column contains dates in different formats from different data sources. The format='mixed' option (available from Pandas 2.0+) handles this:

python
import pandas as pd

mixed = ["2024-01-15", "15/01/2024", "January 15, 2024"]

# format='mixed' infers the format of each element individually;
# dayfirst=True tells the parser to read 15/01/2024 as day/month/year
result = pd.to_datetime(mixed, format='mixed', dayfirst=True)
print(result)

For anything more complex, the cleanest approach is to standardize the date strings before passing them to pd.to_datetime, using a preprocessing step that normalizes the format across the column.
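
One possible shape for that preprocessing step is sketched below. It assumes you know which formats occur in the column. The known_formats list and the sample values are illustrative. Each format is tried in turn with errors='coerce', and only the gaps left by earlier attempts are filled.

```python
import pandas as pd

raw = pd.Series(["2024-01-15", "15/01/2024", "January 15, 2024", "garbage"])

# Assumed list of formats known to occur in the data, most trusted first.
known_formats = ["%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"]

parsed = None
for fmt in known_formats:
    # Values that do not match this format become NaT.
    attempt = pd.to_datetime(raw, format=fmt, errors="coerce")
    # Keep earlier successes and fill only the remaining gaps.
    parsed = attempt if parsed is None else parsed.fillna(attempt)

print(parsed)
```

Values that match none of the listed formats stay as NaT, so they can be reviewed afterwards instead of being guessed at.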


Extracting Date Components After Conversion

Once you have a proper datetime column, the .dt accessor unlocks a set of properties that make time-based analysis straightforward:

python
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["2024-01-15", "2024-06-20", "2024-11-05"])})

df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day"] = df["date"].dt.day
df["day_name"] = df["date"].dt.day_name()
df["week"] = df["date"].dt.isocalendar().week
df["quarter"] = df["date"].dt.quarter

print(df)

This is where the datetime conversion pays off. None of these operations are possible on a string column. Once you convert, you can group by month, filter for weekdays, calculate age in days, or resample time series data by week or quarter.
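
A few of those follow-on operations are sketched below on a small, made-up sales table. The column names and amounts are illustrative.

```python
import pandas as pd

# Hypothetical sales data for illustration.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-15", "2024-01-20", "2024-02-03", "2024-02-05"]),
    "amount": [100, 50, 70, 30],
})

# Group by month number and sum the amounts.
monthly_total = sales.groupby(sales["date"].dt.month)["amount"].sum()

# Keep weekdays only (Monday=0 ... Sunday=6).
weekdays_only = sales[sales["date"].dt.dayofweek < 5]

# Days elapsed since the earliest date in the column.
sales["days_since_first"] = (sales["date"] - sales["date"].min()).dt.days

print(monthly_total.to_dict())              # {1: 150, 2: 100}
print(weekdays_only["amount"].tolist())     # [100, 30]
print(sales["days_since_first"].tolist())   # [0, 5, 19, 21]
```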


Key Takeaways

  • pd.to_datetime converts strings, lists, and DataFrame columns from text or numeric formats into proper datetime objects that support date arithmetic and filtering.
  • Use the pd.to_datetime format parameter whenever your date strings are not in ISO 8601 (YYYY-MM-DD) format. The format string must match the input exactly, including separators.
  • yyyy in other tools maps to %Y in Python. The core mapping: yyyy = %Y, MM = %m, dd = %d.
  • String-to-datetime conversion works automatically for standard ISO formats, but ambiguous formats (like 01/02/2024, which could be January 2nd or February 1st) need an explicit format string.
  • The standard library’s datetime.strptime() handles single values. Pandas’ pd.to_datetime handles columns and is the right choice for DataFrame work.
  • Use errors='coerce' for messy real-world data. It converts unparseable values to NaT instead of crashing on the first bad row.
  • After conversion, the .dt accessor gives you year, month, day, day name, week, and quarter as simple column operations.