Misconfigured Apache Airflow servers expose thousands of credentials


While investigating a misconfiguration in Apache Airflow, we found that many instances published on the web were leaking sensitive information, including credentials of well-known tech companies.

Apache Airflow is a popular open source workflow management platform for organizing and managing tasks.

Nicole Fishbein and Ryan Robinson, researchers at security firm Intezer, have released details of how they identified a misconfiguration across Apache Airflow servers run by a major tech company.

The misconfiguration exposed sensitive data, including thousands of credentials for popular platforms and services such as Slack, PayPal and Amazon Web Services (AWS), the researchers commented.

Apache Airflow is the #1 most popular open source workflow application on GitHub

Workflow management platforms are essential tools for automating business and IT tasks. These platforms make it easy to create, schedule, and monitor workflows. Typically, they are hosted in the cloud for accessibility and scalability. However, if they are misconfigured and left accessible from the Internet, they are more likely to be exploited by attackers.

While investigating a misconfiguration of Apache Airflow, a popular workflow platform, we discovered a number of unprotected instances. These unprotected instances expose sensitive information for companies in industries such as media, finance, manufacturing, information technology (IT), biotechnology, e-commerce, health, energy, cybersecurity, and transportation.

Vulnerable Airflow instances were found to expose credentials for common platforms and services such as Slack, PayPal, and AWS.

Key findings

A large number of misconfigured Airflow instances expose credentials for popular services such as cloud hosting providers, payment processing, and social media platforms.

Disclosure of secrets, such as user credentials, can cause a data breach or provide an attacker with the ability to spread further within the system.

Customer data exposed through such a breach may violate data protection laws and result in legal action.

Attackers could also execute malicious code and launch malware on the exposed production environment, and even on Apache Airflow itself.

Leaking credentials from services and platforms

We have identified leaks of credentials from commonly used services and platforms.

These unsecured instances have exposed sensitive information for companies in the media, finance, manufacturing, information technology (IT), biotechnology, e-commerce, health, energy, cybersecurity, and transportation industries.

The most common cause of credential compromise seen on Airflow servers was insecure coding practices.

For example, various production instances with hard-coded passwords in Python DAG code have been discovered.

Production environment credentials compromised

“Passwords should not be hard-coded, and the long names of images and dependencies should be used. Even if you believe your application is firewalled off from the Internet, you will not be protected if you use poor coding practices,” Fishbein and Robinson warn.
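As an illustration of how hard-coded passwords in Python DAG code might be spotted before deployment, here is a minimal sketch that walks a file's syntax tree and flags string literals assigned to secret-looking names. The function name find_hardcoded_secrets and the SUSPICIOUS list are hypothetical, not part of Airflow or of Intezer's tooling, and a name-based heuristic like this will miss some cases and flag others wrongly.

```python
import ast
from typing import List, Tuple

# Hypothetical name fragments that suggest a secret; adjust for your codebase.
SUSPICIOUS = ("password", "passwd", "secret", "token", "api_key")


def _looks_secret(name: str) -> bool:
    """True if the identifier contains one of the suspicious fragments."""
    lowered = name.lower()
    return any(fragment in lowered for fragment in SUSPICIOUS)


def _is_nonempty_str(node: ast.AST) -> bool:
    """True if the node is a non-empty string literal."""
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, str)
        and node.value != ""
    )


def find_hardcoded_secrets(source: str) -> List[Tuple[int, str]]:
    """Return sorted (line, name) pairs where a secret-looking name
    is given a string literal directly in the source code."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Plain assignments, e.g. password = "..."
        if isinstance(node, ast.Assign) and _is_nonempty_str(node.value):
            for target in node.targets:
                if isinstance(target, ast.Name) and _looks_secret(target.id):
                    findings.append((node.lineno, target.id))
        # Keyword arguments, e.g. connect(password="...")
        elif isinstance(node, ast.keyword) and node.arg:
            if _looks_secret(node.arg) and _is_nonempty_str(node.value):
                findings.append((node.value.lineno, node.arg))
    return sorted(findings)
```

Values read at runtime, for example from os.environ or a secrets backend, are not string literals in the source and so are not flagged by this check.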

Also, as an example of misconfiguration, we have seen Airflow servers with public configuration files.

The configuration file (airflow.cfg) is created when Airflow is first started and contains the Airflow settings, which can be modified. This file contains secrets such as passwords and keys.

But if you accidentally set the expose_config option in this file to ‘True’, then anyone can access the configuration via the web server and see these secrets.
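A quick audit for this setting can be sketched with Python's standard configparser module. The function name config_is_exposed is hypothetical, and the sketch assumes expose_config lives in the [webserver] section of airflow.cfg and that a missing option means the configuration is not exposed. It inspects the file only, so a value overridden elsewhere, for example through an environment variable, would not be detected.

```python
import configparser


def config_is_exposed(cfg_text: str) -> bool:
    """Return True if [webserver] expose_config is set to true
    in the given airflow.cfg text."""
    # Interpolation is disabled because airflow.cfg values may contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # Assumption: a missing section or option means "not exposed".
    value = parser.get("webserver", "expose_config", fallback="False")
    # Only a literal true (any casing) counts as fully exposed here.
    return value.strip().lower() == "true"


if __name__ == "__main__":
    sample = "[webserver]\nexpose_config = True\n"
    print("exposed" if config_is_exposed(sample) else "not exposed")
```

In practice the text would come from reading the real airflow.cfg file rather than from a sample string.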

Other examples include sensitive data stored in Airflow’s “Variables”, which any unauthorized user could edit to inject malicious code, and improper use of the “Connections” feature, where credentials were stored as JSON blobs in the unencrypted “Extra” field for anyone to see.
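To review what has been placed in an “Extra” field, one option is to parse the JSON blob and list the keys whose names hint at a credential. The sketch below is illustrative only: secret_like_keys and SECRET_HINTS are hypothetical names, the sample blob is invented, and matching on key names is a heuristic rather than a reliable detector.

```python
import json
from typing import List

# Hypothetical fragments that suggest a key holds a credential.
SECRET_HINTS = ("password", "secret", "token", "key")


def secret_like_keys(extra_json: str) -> List[str]:
    """Return sorted key paths in an Extra JSON blob whose names
    hint at a credential."""
    found = []

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                child_path = f"{path}.{key}" if path else str(key)
                if any(hint in str(key).lower() for hint in SECRET_HINTS):
                    found.append(child_path)
                walk(child, child_path)
        elif isinstance(value, list):
            for index, item in enumerate(value):
                walk(item, f"{path}[{index}]")

    walk(json.loads(extra_json), "")
    return sorted(found)


if __name__ == "__main__":
    # Invented sample blob for illustration.
    blob = '{"region_name": "us-east-1", "aws_secret_access_key": "x"}'
    print(secret_like_keys(blob))
```

Any key this reports is a candidate for moving out of the unencrypted field and into a protected store.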

In addition to identifying improperly configured Airflow assets, this study also focused on the risks of delaying software updates.

According to Intezer, the majority of these flaws were identified on servers running Airflow v1.x, which dates from 2015 and is still in use by organizations in various sectors.

Version 2 of Airflow introduces many new security features, including a REST API that requires authentication for all operations. Version 2 also does not store sensitive information in the logs, and it requires the administrator to explicitly confirm configuration options rather than rely on the defaults.

If customer records and sensitive data are exposed through security flaws that result from delays in upgrading and patching, this may violate data protection laws such as GDPR.

“Interfering with a customer’s operations through inadequate cybersecurity measures can lead to legal action, including class action lawsuits,” the researchers advise.

Intezer says it notified the organizations and entities that were leaking sensitive data via vulnerable Airflow instances before publishing the results of this investigation.

“In light of the major changes in version 2, we strongly recommend that you update all Airflow instances to the latest version. Also, please ensure that only authorized users can connect,” warned the Intezer researchers.
