Deserialization of Untrusted Data in pandas Module

PY511
deserialization_of_untrusted_data
CWE-502
⚠️ Warning
🔒 Professional Plan

The Python pandas module is a data analysis and manipulation tool. It contains a function, read_pickle, that reads serialized data in the pickle format. Pickle is not safe to use on untrusted input because unpickling can execute arbitrary code embedded in the data. For example, an attacker could craft a pickle file containing malicious code and trick a user into loading it. The code would run as soon as the file is deserialized.
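To make the risk concrete, here is a minimal and deliberately harmless sketch using only the standard pickle module, which pandas relies on when reading pickle data. The class name Payload is hypothetical, and the built-in len stands in for whatever callable an attacker would choose.

```python
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # pickle records this callable and its arguments, and the loader
        # calls it during deserialization. An attacker would name a
        # dangerous callable here; len keeps the demonstration harmless.
        return (len, ("abc",))


blob = pickle.dumps(Payload())

# Loading the bytes calls len("abc") instead of rebuilding a Payload
result = pickle.loads(blob)
```

The loaded value is whatever the embedded callable returned, which shows that the author of the pickle data, not the code that loads it, decides what runs.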

Example


import io
import pickle

import pandas as pd


df = pd.DataFrame(
    {
        "col_A": [1, 2]
    }
)
pick = pickle.dumps(df)

# read_pickle expects a path or file-like object, so wrap the bytes
pd.read_pickle(io.BytesIO(pick))

Remediation



If you must use pickle, consider signing the serialized data with hmac and verifying the signature before loading it, so that tampered data is rejected before it is ever deserialized.
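The signing approach could look roughly like the sketch below. The helper names sign and verify, the tag-then-payload layout, and the choice of SHA-256 are assumptions made for illustration. The key shown is a placeholder and would normally come from a secret store.

```python
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(payload: bytes, key: bytes) -> bytes:
    """Return the payload prefixed with its HMAC-SHA256 tag."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify(blob: bytes, key: bytes) -> bytes:
    """Return the payload if its tag matches, else raise ValueError."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest compares in constant time to avoid timing leaks
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch: refusing to unpickle")
    return payload


key = b"placeholder-key-load-from-a-secret-store"  # assumption: shared secret
signed = sign(b"pickled bytes go here", key)
payload = verify(signed, key)
```

Once verify returns, the bytes can be passed to pandas as pd.read_pickle(io.BytesIO(payload)). A valid signature only shows that the data was produced by someone holding the key, so the key itself must stay private.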

Alternatively, if you need to exchange data with untrusted parties, you could use a data-only serialization format such as JSON or XML. These formats describe data rather than objects and code, so parsing them does not execute embedded code the way unpickling can.
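As a rough illustration of the JSON route, the sketch below round-trips the same small DataFrame through pandas' own JSON support. The orient="split" layout is just one reasonable choice, not a requirement.

```python
import io

import pandas as pd

df = pd.DataFrame({"col_A": [1, 2]})

# Serialize to JSON text; "split" stores columns, index, and data separately
text = df.to_json(orient="split")

# read_json parses data only. The string is wrapped in StringIO because
# recent pandas versions deprecate passing literal JSON strings directly.
restored = pd.read_json(io.StringIO(text), orient="split")
```

JSON does not preserve every pandas dtype, so some columns may need explicit conversion after loading.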

False Positives


In the case of a false positive, the rule can be suppressed. Add a trailing or preceding comment line containing either the rule ID (PY511) or the rule category name (deserialization_of_untrusted_data).

import io
import pickle

import pandas as pd


df = pd.DataFrame(
    {
        "col_A": [1, 2]
    }
)
pick = pickle.dumps(df)
# suppress: PY511
pd.read_pickle(io.BytesIO(pick))

See also