David Hewitt

@davidhewitt

Hello world, and the future for PyO3

Hello everyone! 👋

I've been thinking about starting a blog for some time. Working on PyO3 creates ideas that I think many people would find interesting to read about. Finally I have found the time to get started and to choose a platform to publish on.

I really like Polar's vision to empower Open Source developers to create a sustainable ecosystem with their community. Accordingly, I've decided to join as a creator on this platform, both to support Birk and the rest of the team at Polar and to pursue my own aspiration to establish deeper connections with the PyO3 community. Birk has very generously agreed to personally be my sponsor until June, which makes a significant contribution to funding my choice to work part time at 80% and devote the remainder of my week to PyO3 and its ecosystem.

I plan to publish a mixture of tutorial material and my thoughts related to PyO3, Rust, and Python software development. If that interests you, please consider joining me as a subscriber.

Today, we'll begin with a high-level tour: who am I, a look at PyO3's growth, and where I want PyO3 to be going.

Introduction

First, a quick introduction. I'm David, and I've been working for the past four-and-a-bit years on PyO3. I think it's fair to call PyO3 the de-facto way to write software that is implemented in a mixture of Rust and Python, whether that's Rust-embeds-Python or Python-embeds-Rust. The most well-known examples of Python packages that have a PyO3 core are:

  • pydantic, where I have worked as part of the team since the middle of last year,
  • polars, which is having immense growth as a replacement for pandas, and
  • cryptography, which is a long-established cornerstone of Python's security ecosystem.

While I was not PyO3's original creator, at this point I am the longest-serving active contributor and the largest by volume too.

[Image: graph of contributions to PyO3 over time]

(The graph extends back to 2015, which is when rust-cpython was first created. PyO3 was created as a fork in 2017 for a mixture of maintenance and design reasons. I am not part of that original history.)

I still haven't worked out how to describe myself in this role. PyO3 is a project owned by and for the community. I don't think "lead maintainer" is strictly correct to describe me, as I haven't been voted into any such lead position. "Representative", maybe? "Evangelist"? "Key typewriter"? "Enthusiast"? I can't decide. If you have a good suggestion, I'll gladly hear it!

The point is, if you're using Python software with a Rust component, or Rust software with a Python component, it's probably built on top of code I've been involved in writing. 😊

PyO3's growth since 2019

When I first started contributing to PyO3 in late 2019, PyO3 was available on "nightly" Rust only. The impression I got is that it was used mostly by the scientific community for research pieces that wanted a Rust core for Python code. I can't recall download numbers of the time, but I am sure they were nothing like today.

My own interest in PyO3 came from having used C++ and Python as a combination professionally, using pybind11. I had been a hobbyist Rust user for a couple of years and decided that I could contribute to the ecosystem by helping develop PyO3 based on what I'd used previously.

Independent of the effort which has gone into PyO3, Rust has really taken off in the past few years. You probably don't need me to show you the examples of Rust adoption at major companies like Microsoft or Amazon, nor to tell you that the Linux kernel is now accepting Rust code. It's no surprise that PyO3 has grown alongside Rust, yet this would not have happened without dedication from myself and my fellow PyO3 maintainers.

Let's do a quick summary of some high-level stats of PyO3 today (Feb 2024):

Just over 10 thousand stars on GitHub

60 thousand direct downloads per day on crates.io

This works out at 1.3 million downloads per month, according to lib.rs, which also lists PyO3 as the #1 crate in the FFI category for the Rust ecosystem

(FFI is short for "Foreign Function Interface", which is a term used to describe software which calls into software written in another language. In other words, this suggests that Python is the most popular language to pair with Rust!)

Finally, Python packages containing PyO3 code are downloaded billions of times every year! 🤯

How do I get to that count? Well, as a very crude measure, we can just sum the monthly downloads of cryptography (213 million), pydantic-core (60 million), and polars (3 million) according to PyPI download stats (credit hugovk), and multiply by 12 to get to an annual figure:

(213 + 60 + 3) million ✕ 12 = 3.31 billion annual downloads
(Numbers correct as of Feb 2024)
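
As a minimal sketch, the same sum can be reproduced in a few lines of Rust. The constant and function names here are made up for illustration; only the monthly figures come from the calculation above.

```rust
/// Monthly PyPI downloads in millions, as used in the sum above (Feb 2024).
const MONTHLY_DOWNLOADS_MILLIONS: [(&str, u64); 3] = [
    ("cryptography", 213),
    ("pydantic-core", 60),
    ("polars", 3),
];

/// Sum the monthly figures and scale them to a year.
/// The result is expressed in millions of downloads.
fn annual_downloads_millions(monthly: &[(&str, u64)]) -> u64 {
    let monthly_total: u64 = monthly.iter().map(|(_, n)| n).sum();
    monthly_total * 12
}

fn main() {
    let annual = annual_downloads_millions(&MONTHLY_DOWNLOADS_MILLIONS);
    // 3312 million, printed as "3.31 billion annual downloads".
    println!("{:.2} billion annual downloads", annual as f64 / 1000.0);
}
```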

This doesn't even begin to try to count the long tail of other software out there using PyO3, nor the distribution that doesn't go through PyPI.

I am absolutely blown away to be working on something that is used so widely, whether the vast majority of users are aware of it or not.

Ok, time to talk about the future.

Short term

Looking ahead to what I would immediately like to see for PyO3, there are two big milestones for the year:

  1. First we need to release PyO3 0.21, which I am equally excited and nervous to send downstream.
  2. Second, we need to be ready for Python 3.13's release later this year.

Let's talk a little bit more about each of these.

PyO3 0.21

The main feature coming in PyO3 0.21 is what we're now calling the new "Bound" API, to replace the to-be-deprecated "GIL Refs" API. This is something that I've been thinking about on-and-off as a must-do thing since I started on PyO3. It solves a fundamental inefficiency in PyO3's current "GIL Refs" API in a way that makes the migration for users as gentle as possible. I believe this needed to be done before we could consider a PyO3 1.0.

I could write more than a post about the "Bound" and "GIL Refs" APIs (and I probably will at least once), so we won't go into detail about these APIs right now. If you're interested you can read more on GitHub and see the tracking issue which shows what work we're yet to finish.

Perhaps the most emphatic image I can present to showcase the potential of this new API is of the branch in pydantic-core that I've been building against a near-complete prototype of this "Bound" API. It shows performance wins of roughly 15% or more across a wide range of benchmarks, and has encouraged a bunch of refactors that should make it easier for us to optimize pydantic-core further in the future:

[Image: continuous benchmark results for the pydantic-core branch]

(By the way, I've been really enjoying codspeed.io for continuous benchmarking on GitHub Actions. Go check it out!)

To enable the ecosystem to migrate to the new API, PyO3 0.21 will contain a lot of backwards-compatibility code. When we remove this in the future we should be able to unlock a few smaller further performance gains. I'm optimistic that this new API is a great step in efficiency for the whole ecosystem!

Python 3.13

Python 3.13 is scheduled to be released in October this year, and one of the major features that is expected to arrive is the "freethreaded" variant of Python, aka "nogil" or PEP 703.

This variant removes the Global Interpreter Lock (GIL) from CPython, making it possible for several Python threads to execute code simultaneously. This brings huge potential for Python code to step up massively in performance, at the cost of increased complexity. This complexity is expected to be felt most intensely by the people maintaining "native" extension modules (i.e. those typically implemented in C, C++, or Rust). This is because introducing such a change breaks many assumptions about the order in which code executes, leading to data races that create incorrect results, security bugs, and fatal crashes.

Rust is well known to be great at protecting developers from introducing data races. I think there is a huge potential for PyO3 and the Rust ecosystem to offer Python extension module maintainers a way forward through the additional complexity. First, however, we PyO3 maintainers need to design new APIs suitable for this "nogil" mode. This is a sizeable research task, as we are well aware of many parts of PyO3 that need to be redesigned to account for these changes.
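
To make the data-race point a little more concrete, here is a small sketch using only Rust's standard library. It does not use any PyO3 API, and the function name `count_in_parallel` is hypothetical. Several threads increment one shared counter, and the type system requires the sharing to go through a synchronization primitive (here `Arc<Mutex<_>>`).

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment a shared counter from several threads and return the final value.
fn count_in_parallel(threads: usize, increments_per_thread: usize) -> usize {
    // The Mutex makes every access to the counter exclusive; the Arc gives
    // each thread its own reference-counted handle to the same Mutex.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    // Lock, increment, and release the lock at the end of the statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for every thread to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 10_000));
}
```

Safe Rust offers no way to mutate the counter from several threads without some form of synchronization; an attempt to share a plain `&mut usize` between the spawned threads is rejected at compile time. That is the kind of guarantee extension authors lose the equivalent of when the GIL no longer serializes their code.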

Long term

Looking further ahead, I have come to think that the most useful thing I can do for PyO3 is build its community (yes, it's not the first time I've chosen to use "community" in this post). I work on PyO3 because I think Python and Rust are a great pair of languages, and PyO3 is a tool that should be available for everyone to use to work with this pair. Ideally, everyone has a good experience working in this space. Let me try to explain why that means a community is what I want to see most.

I began contributing to PyO3 when it was a much smaller project with a lot of potential. What PyO3 needed most was to have its rough edges rounded out and its missing functionality implemented. I love writing Rust code; there are few things I enjoy more than sitting down with a fresh coffee and working on PyO3's codebase. I've done a fair bit of that now, and I plan to do more yet in the years ahead. I want to see PyO3 reach 1.0, maybe even this year or the year after (and perhaps 2.0 and beyond someday too).

Somewhere along the way, what PyO3 needs most has changed (and I have been slow to realise this). In the past year, I have spoken to many people who tell me how they enjoy using PyO3 in their projects. There are blog posts, videos, and conference tutorials teaching use of PyO3. It's used in production worldwide. It's really, truly, heartwarming to see.

All this usage adds up to many unique experiences people have had with PyO3. I think there is a need for a community space for people to connect.

  • Where people can share their feedback, positive or negative, to help us build a better PyO3.
  • Where people can share projects built with Rust and Python and bounce ideas off others who're doing similar things.
  • Where people can find encouragement to contribute to PyO3, whether that be helping its community, maintaining its codebase, or coordinating offshoot projects which make Rust and Python better.
  • Where Python programmers new to Rust (or vice versa) can ask for help from likeminded individuals.

We have a Gitter channel, but it's nearly dead at the time of writing. I've been thinking about setting up a Discord server for PyO3 users. Or maybe something like Zulip would fit better; I haven't made a final decision, and would be glad to hear your opinion.

For now, I'm trying to be more connected on social media. I managed to almost completely stay away from social media for the first 30 years of my life. I'm definitely not the sort of person to post continuous updates about my life; even if I had cats I wouldn't be filling your feeds with images of them. Despite this, the interactions I've had on these sites in the past few months have been invaluable. You can find me on LinkedIn, X, and Mastodon.

A month ago I started streaming PyO3 development on YouTube. The goal is to do a regular Friday morning slot at 10am UTC. I want it to be a channel to show people what it's like to work on the boundary of these two languages; I think there's a lot which people would be interested to see. Doing this has already welcomed new people into the PyO3 community and strengthened my connections with Python core developers.

I'm doing all this to expand the ways I'm connected to PyO3's community. I want to hear from you and welcome your feedback and ideas. If you're interested in cultivating our community, or becoming a fellow PyO3 contributor, please take that leap! There are more possibilities for Rust and Python than my fellow maintainers and I can find time to build, and it's so rewarding to offer a useful tool to so many people. My wife and I are expecting our second child in the summer; my time is only going to be stretched further in future.

Finally, I am also a user of Python and Rust. I'm having experiences with these languages and have my own feedback. I should have sent this feedback upstream from the beginning (again I was late to realise this; I used to think I would first see PyO3 "done" and then start contributing to the languages to build an even better future PyO3). Starting this blog is one of the ways I can send that feedback. As well as social media, I'm also trying to do better at checking in on the Rust Zulip and the Python Discourse.

I think there's a great future ahead for PyO3, and the wider Rust and Python crossover ecosystem. Building a community is a necessary step to realising that future.

Hopefully you'll join me in growing this community! Start by watching this space.