Critical Apache Parquet Java Flaw Exposes Big Data Systems to Remote Code Execution

A newly identified vulnerability in Apache Parquet Java (CVE-2025-46762) allows remote attackers to execute arbitrary code by exploiting Avro schema handling in the parquet-avro module. The flaw affects all versions up to 1.15.1 and poses a significant threat to big data platforms like Apache Spark and Hadoop. Users are urged to upgrade to version 1.15.2 or apply recommended mitigations immediately.

New Apache Parquet Java Vulnerability Allows Remote Code Execution Through Malicious File Schemas

A newly discovered security flaw in Apache Parquet Java could allow attackers to remotely execute arbitrary code on systems processing compromised data files. Tracked as CVE-2025-46762, the vulnerability affects all versions of Apache Parquet Java up to and including 1.15.1, and is rated as critical.

Background and Impact

Apache Parquet is a widely used columnar data storage format, especially popular within big data frameworks such as Apache Spark, Flink, and Hadoop. The vulnerability resides in the parquet-avro module, which handles Avro schema deserialization from file metadata. If exploited, it lets an attacker plant a malicious payload in the metadata of a Parquet file, and that payload is then executed when the application parses the schema.

This security issue is only exploitable in systems configured to use Avro's “specific” or “reflect” data models. Systems using the “generic” model remain unaffected.

According to the advisory, while version 1.15.1 attempted to introduce safeguards by restricting untrusted Java packages, it still allows execution from certain pre-approved (“trusted”) packages by default. This loophole can be abused to run harmful classes if a malicious file is ingested.
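
To illustrate why a default list of trusted packages leaves a gap, here is a minimal, hypothetical sketch of a package allow-list check. It is not the parquet-avro implementation: the class name `PackageAllowList`, the method `isTrusted`, and the `com.example.*` package names are all assumptions made for illustration. It only shows the general idea that a non-empty default list still admits classes, while an empty list admits none.

```java
import java.util.List;

// Hypothetical illustration only; this is NOT how parquet-avro is implemented.
final class PackageAllowList {
    private final List<String> trustedPackages;

    PackageAllowList(List<String> trustedPackages) {
        this.trustedPackages = List.copyOf(trustedPackages);
    }

    /** Returns true when className lives in a trusted package or one of its subpackages. */
    boolean isTrusted(String className) {
        int lastDot = className.lastIndexOf('.');
        String pkg = lastDot < 0 ? "" : className.substring(0, lastDot);
        for (String trusted : trustedPackages) {
            if (pkg.equals(trusted) || pkg.startsWith(trusted + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed example package names, not the real defaults.
        PackageAllowList defaults = new PackageAllowList(List.of("com.example.trusted"));
        PackageAllowList empty = new PackageAllowList(List.of());

        String candidate = "com.example.trusted.Gadget";
        System.out.println("default list admits candidate: " + defaults.isTrusted(candidate));
        System.out.println("empty list admits candidate:   " + empty.isTrusted(candidate));
    }
}
```

In this sketch, any class that happens to sit inside a pre-approved package passes the check, which mirrors the loophole the advisory describes; emptying the list closes it at the cost of rejecting every class.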

Technical Prerequisites for Exploitation

To exploit CVE-2025-46762, an attacker must:

  • Deliver a maliciously crafted Parquet file containing a dangerous Avro schema.

  • Target a system running Apache Parquet Java version ≤ 1.15.1.

  • Target an application that reads the file through the vulnerable parquet-avro module using the “specific” or “reflect” data model.
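
The version condition above can be checked mechanically. The following is a small sketch, assuming the deployed parquet-avro version is available as a plain `major.minor.patch` string; the class `ParquetVersionCheck` and its methods are hypothetical helpers, not part of any Parquet API. Qualifiers such as `-SNAPSHOT` are ignored, so pre-release builds should be reviewed by hand.

```java
import java.util.Arrays;

// Hypothetical helper; not part of Apache Parquet.
final class ParquetVersionCheck {
    // First release containing the fix, per the advisory summarized in this article.
    private static final int[] FIXED = {1, 15, 2};

    private ParquetVersionCheck() {}

    /** Parses the leading numeric "major.minor.patch" part and ignores suffixes like "-SNAPSHOT". */
    static int[] parse(String version) {
        String core = version.trim().split("[-+]", 2)[0];
        String[] parts = core.split("\\.");
        int[] out = new int[3]; // missing components default to 0
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** True when the version is older than the fixed release, i.e. 1.15.1 or earlier. */
    static boolean isAffected(String version) {
        return Arrays.compare(parse(version), FIXED) < 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.13.1", "1.15.1", "1.15.2"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```

A version in the affected range is only one of the three prerequisites; whether a deployment is actually exploitable also depends on the data model in use.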

Mitigation Steps

Users of affected systems are strongly advised to:

  1. Upgrade to Apache Parquet Java version 1.15.2, which includes a comprehensive fix that tightens the security restrictions on executable classes.

  2. If upgrading isn’t feasible, users running version 1.15.1 can mitigate the issue by disabling deserialization from trusted packages. This is done by setting the JVM system property: