Prevent DoS by large str-int conversions

Warning

This resource is maintained for historical reference and does not contain the latest vulnerability info for Python.

The canonical database for vulnerabilities affecting Python is available on GitHub in the Open Source Vulnerability (OSV) format. This vulnerability can be viewed online at the Open Source Vulnerability Database.

A Denial of Service (DoS) issue was identified in CPython because we use binary bignums for our int implementation. Converting a huge integer to or from a base 10 (decimal) string with a large number of digits always consumes a near-quadratic amount of CPU time. No efficient algorithm exists to do otherwise.

It is quite common for Python code implementing network protocols and data serialization to call int(untrusted_string_or_bytes_value) on input to get a numeric value without first limiting the input length. It is just as common to call log("processing thing id %s", unknowingly_huge_integer), or otherwise convert an int to a string, without first checking its magnitude. Examples include http, json, xmlrpc, logging, loading large values into an integer via linear-time conversions such as hexadecimal stored in yaml, or anything computing larger values based on user-controlled inputs that later wind up being output as decimal. All of these can suffer a CPU-consuming DoS in the face of untrusted data.
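The kind of length guard this paragraph refers to can be sketched as follows. This is only an illustration: the helper name parse_untrusted_int and the 32-digit cap are hypothetical and not part of CPython or of the fix.

```python
# Hypothetical cap for this sketch; choose one that fits the protocol in use.
MAX_ID_DIGITS = 32


def parse_untrusted_int(raw, max_digits=MAX_ID_DIGITS):
    """Parse an untrusted str or bytes value as a decimal int, rejecting long inputs."""
    if isinstance(raw, (bytes, bytearray)):
        text = raw.decode("ascii")
    else:
        text = raw
    text = text.strip()
    # Allow one extra character for an optional leading sign.
    if len(text) > max_digits + 1:
        raise ValueError(f"integer input too long: {len(text)} characters")
    return int(text)


print(parse_untrusted_int(b"12345"))
```

The length check runs in time proportional to the input length at worst, so the costly decimal conversion is never attempted on oversized input.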

It is not feasible for everyone to audit all existing code for this, add length guards, and maintain that practice everywhere, nor is it what we believe the vast majority of our users want to do.

This issue has been reported to the Python Security Response Team multiple times by a few different people since early 2020, most recently a few weeks ago while I was in the middle of polishing up the PR so it’d be ready before 3.11.0rc2.

After discussion on the Python Security Response Team mailing list, the conclusion was that we needed to limit the size of integer to string conversions for non-linear time conversions (anything not a power-of-2 base) by default, and offer the ability to configure or disable this limit.

The fix adds the PYTHONINTMAXSTRDIGITS=digits environment variable, the -X int_max_str_digits=digits command line option, and the sys.set_int_max_str_digits(digits) function to configure the new limit. Use a limit of 0 digits to disable the limit. The fix also adds the sys.get_int_max_str_digits() function and the sys.int_info.default_max_str_digits (compiled-in default limit) and sys.int_info.str_digits_check_threshold (lowest accepted value for the limit) variables.

The json.load() denial of service was first reported as a public pydantic issue in May 2020. It was then reported to the Python Security Response Team by multiple people:

  • Larry Yuan (May 5, 2020)
  • Tom Christie (May 6, 2020) via Sebastián Ramírez
  • Mike Gagnon (August 3, 2022)

Dates:

  • Disclosure date: 2022-08-08 (Python issue gh-95778 reported)
  • Reported at: 2020-05-05 (PSRT email)
  • Reported by: Larry Yuan

Fixed In

Python issue

CVE-2020-10735: Prevent DoS by large int<->str conversions.

  • Python issue: gh-95778
  • Creation date: 2022-08-08
  • Reporter: gpshead

CVE-2020-10735

A flaw was found in Python. In algorithms with quadratic time complexity using non-binary bases, when using int("text"), a system could take 50ms to parse an int string with 100,000 digits and 5s for 1,000,000 digits (float, decimal, int.from_bytes(), and int() for binary bases 2, 4, 8, 16, and 32 are not affected). The highest threat from this vulnerability is to system availability.

Timeline

Timeline using the disclosure date 2022-08-08 as reference: