Beware: Overpowered plug-ins like PyPI can burn it all down

Here's why your software team needs to think twice before using a powerful third-party plug-in.

Developers are becoming very popular targets for malicious actors, as noted in one of our previous blog posts. They are being targeted in different ways, both direct and indirect. One widespread method is smuggling a malicious payload into developers' information systems by hiding it in publicly hosted package repositories. Malicious actors are abusing developers’ lack of awareness about the risks related to third-party software and the high level of trust they place in the code shared in such repositories. That blog post also demonstrated how ReversingLabs Secure Software Platform can help developers protect against such threats and stop them before they can cause any real damage.

But malicious software components are not the only way to introduce unwanted security risks into your software solutions. When developers decide to use a software library, they often don’t think about all of the functionalities and complexity those libraries can bring. They just check if a software component implements their functionality, and don’t care if it is capable of doing a thousand other unnecessary things.

In most cases, when libraries are statically or dynamically linked into the solution, this doesn’t make a difference. However, during our research on public package repositories, we came across two very interesting third-party plug-ins. I would dare to call them “superhero” plug-ins because of the variety of functionalities they provide.

These specific plug-ins are dangerous because they get registered as COM (Component Object Model) objects. Once registered, these COM objects can be used by any other process running in the same environment. Because access to them is uncontrolled, the various powerful functionalities these plug-ins export can be abused by malicious programs to achieve their objectives while also evading threat detection.

Recognizing and eliminating unneeded functionality from the products you publish reduces software security risks for you and your users. Here's how ReversingLabs Secure Software Platform helps developers detect and handle unneeded or unwanted complexity and behaviors in third-party software components.

Spotting unwanted code

For the purposes of this blog, we processed the Python Package Index (PyPI) repository containing almost 3,000,000 releases of over 300,000 different PyPI packages.

Software packages consist of metadata files containing information about the package itself (package name, version, author…) and a whole pile of files used to implement the software functionality. In the case of the PyPI repository, the packages are expected to contain mainly Python source code files. But quite often it happens that the actual functionality is implemented in a compiled binary executable and then accessed through Python wrapper code. Examining compiled code can be quite challenging when compared to examining Python source code. It requires a different type of knowledge and tools, so it isn’t usually performed during a PyPI security audit.

Packages hosted in the repository were processed with ReversingLabs Titanium Platform in order to extract embedded files and metadata for further analysis. Most of the extracted files were, as expected, textual files, but there were also 12,164,633 executable files present across 12,152 different PyPI packages. Extracted executable files included 6.7 million Linux ELF files and almost 2 million Windows PE files.

Figure 1: Number of files extracted from PyPI packages grouped by file type
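
As a rough illustration of how executable files can be spotted inside a package, here is a minimal Python sketch that classifies archive members by their leading "magic" bytes. It is not how Titanium Platform works internally; the function names are our own, and it only handles ZIP-based packages such as wheels (source distributions packed as tar archives would need the tarfile module instead).

```python
import zipfile

# Well-known leading "magic" bytes for the two executable formats discussed above.
MAGIC_SIGNATURES = {
    b"\x7fELF": "ELF",  # Linux executables and shared objects
    b"MZ": "PE",        # Windows executables and DLLs (DOS header)
}


def classify_bytes(data: bytes) -> str:
    """Return 'ELF', 'PE', or 'other' based on the leading bytes only."""
    for magic, label in MAGIC_SIGNATURES.items():
        if data.startswith(magic):
            return label
    return "other"


def find_executables(archive) -> dict:
    """Map member name -> detected type for executable-looking files.

    `archive` is a path or file-like object pointing at a ZIP-based package.
    """
    found = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Only the first four bytes are needed for this heuristic.
            with zf.open(info) as member:
                label = classify_bytes(member.read(4))
            if label != "other":
                found[info.filename] = label
    return found
```

A magic-byte check only tells you that a binary is present. It says nothing about what the binary does, which is why the further inspection described below is needed.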

Packages containing executable files were further inspected for potential security risks. When talking about security risks, we need to move on from the traditional understanding of security risk, which focuses on malicious artifacts. Instead, we should think about the complexity and functionalities third-party dependencies can introduce into our systems, and how they could be misused by threat actors. Various security tools can be used to automate detection of malicious artifacts. For example, the Titanium Platform’s Explainable Machine Learning can do this. However, deciding if the complexity of some third-party dependency is acceptable is a more challenging task.

Several factors need to be looked at when estimating the potential risk of an executable file (besides the standard malware behavior indicators), the first being digital signature information. If your dependency tree contains components with invalid signatures or a suspicious publisher, they should be examined thoroughly. Secondly, you need to pay attention to an executable file’s behavior capabilities. There are several ways to determine what an executable is capable of, but the most basic is to look at the list of functions imported from system libraries and the list of functions the executable itself exports. The Titanium Platform gives this insight by extracting behavior indicators: human-readable descriptions of detected behavior patterns.

Another factor to look for is signs of packing or obfuscation. There can be many legitimate reasons to pack an executable file, but there are also some packers known to be used primarily for malicious purposes. Whether the executable was packed for legitimate or malicious reasons, the packed content needs further examination to get complete insight into the executable’s capabilities. The Titanium Platform provides identification and automated extraction of packed content for a wide range of commercial and custom-made packers and protectors.

Make sure that the actual capabilities of your software dependencies are in line with your expectations. If you come across a component with an excessive set of functionalities which is at the same time packed, or you can’t verify the reputation of its publisher, you should rethink if you really want to include it into your software product.
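
The rule of thumb above can be written down as a small check. This is only a sketch of our reading of that advice (excessive functionality combined with either packing or an unverifiable publisher); the class and field names are hypothetical, and the capability sets would have to come from whatever analysis tooling you use.

```python
from dataclasses import dataclass


@dataclass
class ComponentFacts:
    """Facts gathered about one binary dependency (all field names are illustrative)."""
    name: str
    expected_capabilities: set   # what you adopted the component for
    observed_capabilities: set   # what analysis says it can actually do
    is_packed: bool
    publisher_verified: bool


def unexpected_capabilities(component: ComponentFacts) -> set:
    """Capabilities the component has that nobody asked for."""
    return component.observed_capabilities - component.expected_capabilities


def should_reconsider(component: ComponentFacts) -> bool:
    """Flag components with excess functionality that are packed or lack a verifiable publisher."""
    excessive = bool(unexpected_capabilities(component))
    return excessive and (component.is_packed or not component.publisher_verified)
```

For example, a component adopted only for mouse and keyboard automation that also turns out to support process injection and file downloads, and that is UPX-packed, would be flagged by this check.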

We’ve covered some basic problems you should look out for. If you want to detect these and countless other software quality issues in an automated way, ReversingLabs’ Secure Software Platform can do that for you.

The Damo plug-in

Analyzing the PyPI packages containing Windows executable files revealed that a few of them contained an interesting library named “dm.dll”. Titanium Platform’s analysis shows that this UPX-packed DLL has a broad set of detected behavior indicators, including screenshot taking, keystroke monitoring, file system operations, and much more.

Figure 2: Behavior indicators extracted from dm.dll using Titanium Platform

Detailed inspection also shows that it is in fact a COM object, and looking at the Type library file extracted from its resources shows that it exports 402 different functions through its interface. If such a high number of exported functions doesn’t make you question whether all of them are necessary, looking at their names will surely raise eyebrows. These are just some of the names of functions exported by this COM object: GetOsType, ExitOs, DownCpu, CheckUAC, SetUAC, DownloadFile, RunApp, SendCommand, ExecuteCmd, DisableScreenSave, DisablePowerSave, VirtualAllocEx, GetProcessInfo, TerminateProcess. In order for packages to use the exported functionalities, the dm.dll library has to be registered on the host system with the regsvr32 command-line utility. The registration is, if possible, performed with administrator privileges so that it can support some of the privileged functionalities. Once registered, the COM object interface can be accessed from other processes running in the host environment.
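
To show how a list of exported names like the one above could be triaged automatically, here is a minimal Python sketch that matches names against a keyword watchlist. The categories and keywords are our own illustrative assumptions, not the behavior indicators produced by Titanium Platform, and name matching is only a first hint about what a function really does.

```python
# Illustrative watchlist: category -> lowercase substrings to look for.
# Both the categories and the keywords are assumptions made for this example.
WATCHLIST = {
    "command execution": ("runapp", "execute", "command"),
    "process manipulation": ("process", "virtualalloc"),
    "privilege settings": ("uac",),
    "network download": ("download",),
}


def flag_exports(names):
    """Group exported function names by the watchlist categories they match."""
    flagged = {}
    for name in names:
        lowered = name.lower()
        for category, needles in WATCHLIST.items():
            if any(needle in lowered for needle in needles):
                flagged.setdefault(category, []).append(name)
    return flagged


# The function names quoted in the paragraph above.
DM_EXPORT_SAMPLE = [
    "GetOsType", "ExitOs", "DownCpu", "CheckUAC", "SetUAC", "DownloadFile",
    "RunApp", "SendCommand", "ExecuteCmd", "DisableScreenSave",
    "DisablePowerSave", "VirtualAllocEx", "GetProcessInfo", "TerminateProcess",
]

if __name__ == "__main__":
    for category, matches in flag_exports(DM_EXPORT_SAMPLE).items():
        print(f"{category}: {', '.join(matches)}")
```

Even this crude check flags nine of the fourteen sample names, which is a reasonable prompt to ask why a mouse and keyboard automation library needs them.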

Searching through the Python code responsible for registering the COM object, we found the plug-in's original name in one of the comments written in Chinese. Translating “大漠插件” to English shows that this plug-in is in fact called “Damo”. Googling for the Damo plug-in in English didn’t yield much information, but searching for the plug-in name in Chinese uncovers the origin of the plug-in. The first Google hit is the www.dmwebsite.net website. The website provides a list of supported features together with a few disclaimers regarding potential illegal use to develop various game hacks (which could be a hint about the community where this plug-in could be popular). Unfortunately, the download link wasn’t working at the time of writing this blog.

Several popular repositories including GitHub, NuGet, NPM and PyPI were also searched for the “dm.dll” keyword, and in each of them at least one package encapsulating Damo plug-in functionalities was found. They include Java, JavaScript, Python, C# and Go wrappers intended to simplify the Damo plug-in usage for the software developed using these programming languages.

One of the discovered GitHub projects, DMProject, proved quite useful because of its README.md file which describes the exported functions in detail. Since the README file missed some of the functions seen in the interface extracted from the Type library file, we tried to find their descriptions in the linked help document. Unfortunately, the provided link is not accessible anymore, so we had to find it in some other way. After a bit of searching through Chinese-speaking forums, we found a downloadable ZIP file with an older version of the Damo plug-in that contained the help document. All of the descriptions in the document are written in Chinese, but everything can be more or less successfully translated to English using Google Translate. This way you can get a true sense for how powerful some of the exported functions are. For example:

DisableCloseDisplayAndSleep - Set the current power settings, prohibit turning off the display, prohibit turning off the hard disk, prohibit sleep, prohibit standby. Does not support XP.

AsmAdd - Add the specified MASM assembly instructions.

AsmCall - Execute the instructions added to the buffer with AsmAdd.

GitHub also hosts several projects that seem to use this plug-in to automate gameplay and simulate keyboard and mouse events for popular video games, such as League of Legends. Gamers are probably the perfect example of a community that would be willing to trade security for convenience, and would be willing to ignore security risks in order to install something that could help them during their gameplay. The above-mentioned GitHub projects have several forks, and it is hard to believe that anyone who forked the project thought about the security risks related to the plug-in for mouse and keyboard automation.

The Lewan plug-in

Besides dm.dll, another similarly powerful executable file named “lw.dll” (aka “Lewan”) was found during our research. This is also a COM object which, again, gets registered to the system using the regsvr32 command-line utility.

As with the Damo plug-in, the Lewan plug-in also has several wrappers for languages like Python, Go and C# hosted in public repositories including GitHub, PyPI and NuGet. Judging by the origin of plug-ins, they are obviously more popular in the Chinese-speaking developer community.

Like the Damo dm.dll, lw.dll is a self-modifying executable or, simply put, packed. In this case, it is packed with a custom packer rather than UPX. Opening it with a hex editor quickly shows that there are 12 more structures that look like PE file headers inside its data section. Titanium Platform analysis shows that the data section has high entropy, above 7.0, which is a strong indicator that the file contains packed data.
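
The manual hex editor search for embedded PE headers can be approximated in a few lines of Python. This is a heuristic sketch with a function name of our own choosing: it looks for an "MZ" marker whose e_lfanew field (the 4-byte value at offset 0x3C of the DOS header) points at a "PE\0\0" signature. It only finds headers that are stored intact, so compressed or encrypted payloads can be missed.

```python
import struct


def find_embedded_pe_offsets(blob: bytes) -> list:
    """Return offsets in `blob` where a plausible PE header starts."""
    offsets = []
    start = 0
    while True:
        pos = blob.find(b"MZ", start)
        if pos == -1:
            break
        start = pos + 1
        # The DOS header is 0x40 bytes long; stop if it cannot fit anymore.
        if pos + 0x40 > len(blob):
            break
        # e_lfanew: little-endian offset from the "MZ" marker to the PE signature.
        (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
        signature_at = pos + e_lfanew
        if blob[signature_at:signature_at + 4] == b"PE\x00\x00":
            offsets.append(pos)
    return offsets
```

Running such a scan over the raw bytes of a section gives a quick count of header-like structures before reaching for a debugger.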

Figure 3: Section metadata Titanium Platform extracted from lw.dll
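
For readers unfamiliar with the entropy figure, the following Python sketch computes Shannon entropy in bits per byte, which ranges from 0.0 (a single repeated byte value) to 8.0 (all byte values equally likely). The 7.0 threshold mirrors the value mentioned above; the function names are our own, and this is not the exact calculation Titanium Platform performs.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(section_bytes: bytes, threshold: float = 7.0) -> bool:
    """Heuristic: compressed or encrypted data tends to have entropy close to 8."""
    return shannon_entropy(section_bytes) > threshold
```

High entropy alone does not prove packing, since embedded images or archives look the same, but it is a cheap signal for deciding which sections deserve a closer look.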

Looking at the file in the debugger shows that these are indeed Windows executable files, and they are compressed with Microsoft's implementation of LZ compression.

Figure 4 shows the part of code responsible for decompression of the embedded PE files. When analyzing the extracted files in ReversingLabs cloud, none of them got classified as malicious by any of our threat detection partners, and it seems they all are legitimate components used to provide functionalities exposed by the registered COM object.

Figure 4: Code responsible for decompression using imported RtlDecompressBuffer function

Examining the exposed interface of this COM object shows that it implements all functionalities seen in the Damo plug-in, plus a wide range of functions used for interaction with mobile devices. This includes functions starting with the “ADB” prefix, which are probably just wrapper functions delegated to the adb.exe tool found in one of the extracted PE files.

Figure 5: Functions exposed by Lewan plug-in used for interaction with mobile phones

Searching for the origin of this plug-in again forced us to use Google Translate and perform searches in Chinese, because searching for terms written in English didn’t provide many relevant sources. Among the first search results are links to Chinese developer forums where it states that the only official Lewan plug-in versions are the ones distributed in those forums. One of those descriptions, together with a bunch of disclaimers against illegal use, is provided at this location bbs.anjian.com/showtopic-641096-1.aspx. From the discussions on this and similar forums, we can again notice that the most popular application of such plug-ins is automating screen operations, and users aren’t aware of all the other functionalities they expose through the use of such COM objects.

Another significant thing noticed while exploring these forums is that both the Damo and the Lewan plug-ins, at some point, decided to charge their users for the plug-in. Naturally, this resulted in the emergence of several pirate versions of the plug-ins. What’s better than using overpowered plug-ins from untrusted sources? How about using pirate versions of overpowered plug-ins from even less trusted sources?

Of course, Damo and Lewan aren’t the only plug-ins of this type; there are more plug-ins mentioned in other forum discussions with similar functions.

Ultimate LOLBin

In case you still don’t see the problem or recognize the threat that comes with installing such powerful plug-ins, we created a little proof of concept to demonstrate how easy it is for a malicious actor to take advantage of the registered COM object.

Figure 6 shows that fewer than twenty lines of code are enough to inject any kind of shellcode into a Notepad process running in the targeted operating system. Only four calls to the plug-in functions are needed to perform the process injection. The code doesn’t make any direct calls to the Win32 APIs that threat detection systems would find suspicious, such as the ones used for process enumeration.

Figure 6: Shellcode injection using Damo plug-in exported functions

The code hooks Notepad’s “Save as” functionality and waits for the user to save a document, which triggers execution of the shellcode that spawns a reverse shell connecting to a remote listener.

And this example uses only four of the more than 400 functions exported by the registered plug-in. Such plug-ins are precious Living off the Land binary (LOLBin) gems for malicious actors. They provide an unbelievable range of functionalities and aren’t classified as malicious because their main purpose is to be used in development. Still, they lack any kind of access controls or restrictions. Once again, if you are a developer, think twice before using a powerful plug-in and carefully check what you install on your system. The benefits aren’t worth the potential risk.

Powerful plug-ins, and hard to detect

Even though malicious behavior wasn’t discovered in the powerful plug-ins described in this blog post, it is obvious that the interface these plug-ins expose could easily be abused by malware. Furthermore, such malware could perform all of the required functionalities without direct access to the common operating system APIs, making it extremely hard for security solutions to detect it before its execution.

A quick look at the behavior indicators extracted using Titanium Platform (Figure 2) is enough to make you think twice before deciding to use such plug-ins. To prevent mistakes like these, you need to include this kind of analysis into your development process in a standardized way.

At ReversingLabs, we are continuously improving our security products. Recently, we published a product called Secure Software Platform. The goal of this platform is to provide a solution that can detect security and licensing issues introduced through the use of third-party code early in the development lifecycle. In doing so, we help verify the integrity of the final software products before they get published.

Based on the analysis of the target software, Secure Software Platform is able to produce a report containing a Software Bill of Materials (SBOM) with a detailed explanation for each of the analyzed components. The report contains a comprehensive list of detected problems related to digital signatures, licensing issues, exposure of sensitive data, presence of known vulnerabilities, vulnerability mitigations, and unexpected or unwanted software behavior. All of the analyzed components are assigned a grade describing the severity of the detected issue and the level of effort needed to resolve it. Our platform also provides a set of configurable policy controls that clients can adapt to their internal development policies and rules.

The ultimate goal is to allow clients to integrate Secure Software Platform into their CI/CD pipelines to automatically prevent publishing software with severe threat issues, like the presence of vulnerable components or software compromised by malware. This enables developers to verify not only the third-party components used as an input to the development process, but also the final software product they intend to publish as the output of the development process. To make your software secure, you must verify the integrity of the components you use and the product you are releasing. Secure Software Platform can do that for you.