Vulnerability in Legacy Document Deletion on PyPI¶
An exploitable vulnerability in the mechanisms for deleting legacy documentation hosting deployment tooling on PyPI was discovered by a security researcher, which would allow an attacker to remove documentation for projects not under their control.
- Disclosure date: 2021-07-25 (Reported via security policy on pypi.org)
- Disclosed by: RyotaK
- Bounty awarded to discloser: $1,000 USD for multiple reports in 2021-07
Summary¶
At one point PyPI supported uploading documentation in addition to distribution files. This functionality was under-utilized and slowly deprecated/removed starting in 2016 and was not included in the 2018 re-write of PyPI.
Instead, for projects that had previously hosted documentation on PyPI, the new PyPI presented them with the ability to remove/destroy the existing documentation on PyPI in favor of using an external service.
To quote the discloser:
This feature is added a few years ago by this pull request: https://github.com/pypa/warehouse/pull/3413 As you can see from the pull request above, there is an endpoint located at
manage/project/{project_name}/delete_project_docs/
that deletes the legacy documentation. And this endpoint calls thedestroy_docs
function which passesproject.name
intoremove_documentation
function.Then,
remove_documentation
passesproject_name
into theremove_by_prefix
function ofS3DocsStorage
.Since
remove_by_prefix
uses list_objects_v2 with the prefix, all files that start with the specified project name will be returned. (e.g. Ifp
is specified in the prefix, it will returnpypi
,pip
,python
… etc.)As far as I can see from these codes, there is no suffix in the project name (e.g.
/
).This means that if there is a project called
examp
, and their owner decides to delete the legacy documentation, documentation for projects that have a name starting withexamp
will be deleted. (e.g.example
)
Analysis¶
Many projects implement “psuedonamespaces” on PyPI, for discoverability and
organizational purposes, particularly those which implement plugin or extension
frameworks. In our analysis, the only impact of this vulnerability appears to
have been accidental, in which maintainers for a top-level project (e.g.
framework
) intentionally initiated documentation deletion for their
project, which then cascaded to plugin/extension projects which shared the
prefix (e.g. framework.foo
, framework-bar
).
Mitigation¶
This vulnerability was fixed in https://github.com/pypa/warehouse/pull/9839 via
https://github.com/pypa/warehouse/pull/9839/commits/3afcac795619b0b06007d0fb179d3ca137ed43b7
by adding a trailing slash to the project name used with remove_by_prefix
.
Audit¶
A dump of Project name
and has_docs
flags from the database, Journal
and Project Event records implemented by PyPI, along with a full listing of the
documentation hosting S3 bucket were collected for audit and analysis.
By comparing the has_docs
flag for each Project with the status of matching
documentation in the S3 bucket listing, we were able to identify 96 Projects
out of 3,632 for which the flag in the database was incorrect.
This delta represents projects for which documentation on the legacy hosting service is “missing”.
77 of the missing Project documents were identified as being accidentally deleted due to the extension/plugin concern discussed in the Analysis section.
The remaining 19 missing Project documents are not explainable via the
vulnerability discussed here, as no docdestroy
events are recorded which
share the prefix for their name. The legacy document hosting service
administration has varied over the years, and it is very likely that these
documents were directly removed by administrators or lost during migrations and
recovery attempts.
Timeline¶
- 2018-03-25: “Destroy documentation” feature added in (PR #3413)
- 2021-07-25: Issue reported by RyotaK following guidelines in security policy on pypi.org)
- 2021-07-26 (+1days): Fix is implemented and deployed in commit 036fdc