Improving Python Ecosystem Security
In Collaboration with Trail of Bits
Python is a programming language that’s widely used for developing websites and software, task automation, data analysis, and data visualization. It often ranks second in surveys of the most widely used programming languages, and first among languages developers want to learn. Most Python developers rely on FOSS packages to build their applications more safely and efficiently by reusing tested, specialized code. PyPI (the Python Package Index) is one of the most used repositories for obtaining those packages.
STF is investing in critical engineering, maintenance, and support for the PyPI package repository, as well as for some of its most widely used packages that developers rely upon to secure their applications. To supplement this work, the project will also support the Sigstore code-signing community and improve integrations with the Python, Ruby, and Rust ecosystems. This helps developers rely upon package repositories like PyPI without falling victim to supply chain and other dependency-related attacks.
Why is this important?
This work targets some of the most highly depended upon Python packages, relied upon by the private sector, academia, and civil society. Critical improvements to the infrastructure that delivers these components to developers help ensure their safe development and future maintainability, leading to long-term security for these components.
Finally, the Sigstore integrations will help protect the Python, Ruby, and Rust ecosystems from supply chain attacks. The German Federal Office for Information Security (BSI) named such attacks one of the top three threats to the economy in its 2022 report on the state of IT security in Germany.
What are we funding?
The following projects will be affected by this work:
- PyCA Cryptography is a top-20 Python package, providing modern and high-quality APIs for both high-level and low-level cryptographic operations. A notable limitation of PyCA Cryptography’s current APIs is the lack of X.509 Certificate Validation. Successful completion of this work would have a significant and positive impact on the overall security posture of cryptographic code written in Python: PyCA Cryptography is the single most popular cryptography library for Python, with nearly 160 million PyPI downloads per month. The availability of a mature and misuse-resistant X.509 Certificate Validation API would allow thousands of critical Python packages to migrate away from ad-hoc and bug-prone solutions built on top of pyOpenSSL and other legacy packages, improving the overall security posture of the PKI ecosystems that secure TLS and other critical components of the Internet.
- BoringSSL is a lightweight and modern fork of OpenSSL, and is one of PyCA Cryptography’s supported backends. However, the BoringSSL backend is not at feature parity with the OpenSSL or LibreSSL backends. Improving feature parity between the BoringSSL and OpenSSL backends would make PyCA Cryptography more applicable and usable across platforms and deployments where OpenSSL is not an acceptable or usable backend (for security, policy, or other reasons). It would also improve the overall security of PyCA Cryptography and all downstream projects by reducing unnecessary dependencies on OpenSSL.
- pyOpenSSL is a top-50 package on PyPI with approximately 62 million downloads per month, but is also considered a legacy package: many of its APIs are misuse-prone and functionally duplicate safer, easier-to-use APIs in PyCA Cryptography. As part of discouraging unnecessary usage of pyOpenSSL and encouraging users to migrate to PyCA Cryptography, the pyOpenSSL maintainers have identified a variety of obsolete APIs that are suitable for deprecation and removal. Another objective is to update the pyOpenSSL test suite, which helps the OpenSSL ecosystem catch serious bugs and defects before they reach end users. Updating it would also improve the overall maintainability and health of pyOpenSSL itself.
- M2Crypto is a historic low-level Python wrapper for OpenSSL, with many legacy users. Its public APIs are direct interfaces to OpenSSL’s low-level and unsafe APIs, meaning that use of M2Crypto can lead to unsafe implementations. M2Crypto’s maintainer considers the project to be in a “legacy” state, with compatibility and bug fixes dominating maintenance time. This project will carry out compatibility engineering aimed at migrating many of M2Crypto’s APIs to safe shims built on PyCA Cryptography. Additionally, M2Crypto’s dependents will be surveyed to identify APIs that can be safely deprecated (and eventually removed completely) due to lack of use. This work would allow a significant portion of the Python cryptographic ecosystem to transparently switch to a safer and more actively maintained dependency for cryptographic primitives and building blocks.
- PyPI is the official third-party software repository for Python. Warehouse is a rewrite of the original PyPI codebase, and inherited significant legacy behavior from its predecessor. Much of that behavior no longer serves a significant purpose in the current codebase and should be removed or refactored to improve overall maintainability and development velocity, but this has not happened due to a lack of maintainer time. This work would improve Warehouse’s overall maintainability by removing or refactoring components that have become burdensome for maintainers. It would also help simplify security-critical components of Warehouse (the AuthN/Z layers and API tokens), making them easier to audit and comprehensively test.
- The Python programming language includes the ssl module, which exposes an SSL/TLS API. This API is widely adopted in the Python ecosystem due to its default availability, but also presents significant usability, security, and maintainability risks. This effort will help revive the Unified TLS standardization effort, incorporate much-needed updates to the API from the last six years, and deprecate APIs that have replacements. This would have a substantial impact on the overall health, usability, and maintainability of the Python standard library.
- Sigstore is an open source project that allows developers and users to sign and verify code. This work would align Warehouse with the broader package management ecosystem’s shift towards Sigstore and away from ad-hoc PGP signatures, further securing the software supply chain for downstream users. Furthermore, the development of a unified packaging client policy across ecosystems, together with improvements to the Rust and Ruby Sigstore clients, will bring more value to the Sigstore ecosystem and encourage adoption of this new effort among FOSS developers.
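As a small illustration of the ssl module item above, the following sketch shows how a client TLS context is typically created with the standard library today. It is not part of the funded work; the helper names `make_client_context` and `describe` are hypothetical, and the TLS 1.2 floor is an illustrative policy choice rather than a recommendation from this project.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context from the standard library's defaults.

    Hypothetical helper for illustration only.
    """
    # create_default_context() targets server authentication by default and
    # enables certificate verification and hostname checking.
    ctx = ssl.create_default_context()
    # Illustrative policy choice: refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def describe(ctx: ssl.SSLContext) -> dict:
    """Summarize the security-relevant settings of a context (hypothetical helper)."""
    return {
        "verifies_certificates": ctx.verify_mode == ssl.CERT_REQUIRED,
        "checks_hostname": ctx.check_hostname,
        "minimum_version": ctx.minimum_version.name,
    }


if __name__ == "__main__":
    print(describe(make_client_context()))
```

Code that builds contexts by hand, or that switches off `check_hostname` or `verify_mode` to work around errors, loses these protections, which is one example of the usability and security risk described above.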