| Function | Open-source Tools | Cloud Services |
|---|---|---|
| Orchestration | Apache Airflow, Luigi, Argo Workflows | AWS Step Functions, Azure Logic Apps, Google Cloud Composer |
| ETL (Extract, Transform, Load) | Apache NiFi, Apache Beam, Apache Spark, Talend Open Studio (QL) | Azure Data Factory, AWS Glue, Google Cloud Dataflow, IBM DataStage, Matillion |
| Data Warehousing | Apache Hive, Apache HBase, ClickHouse, Greenplum, Druid | Azure Synapse Analytics, Amazon Redshift, Google BigQuery, Snowflake, Teradata Vantage |
| Data Transformation | Apache Spark, Apache Flink, Apache Beam, dbt, Presto (Trino) | Databricks, Google Cloud Dataprep, AWS Glue, DataRobot |
| Data Monitoring | Prometheus, Grafana, Nagios, Zabbix, New Relic | Azure Monitor, AWS CloudWatch, Google Cloud Operations Suite |
| Data Governance | Apache Atlas, OpenLineage, DataHub | Azure Purview, AWS Lake Formation, Google Cloud Data Catalog |
| Data Storage | HDFS, Apache Cassandra, MongoDB, MinIO, Ceph | Azure Blob Storage, AWS S3, Google Cloud Storage, Snowflake |
| Data Integration | Apache Kafka, RabbitMQ, Pulsar, NATS | Azure Event Grid, AWS SNS/SQS, Google Cloud Pub/Sub |
| Data Security | Apache Ranger, Vault, Keycloak | Azure Key Vault, AWS KMS, Google Cloud Secret Manager |
| Data Visualization | Apache Superset, Metabase, Redash | Power BI, Tableau, Google Data Studio, AWS QuickSight |
| Data Lineage | Marquez, OpenLineage, Apache Atlas | Azure Purview, Alation, AWS Glue Data Catalog |
| Batch Data Processing | Apache Hadoop, Apache Spark, Apache Flink, Apache Beam | AWS Batch, Google Cloud Dataproc, Azure HDInsight, Databricks |
| Real-Time Data Processing | Apache Kafka, Apache Flink, Apache Pulsar | AWS Kinesis, Azure Stream Analytics, Google Cloud Dataflow |
| Machine Learning | TensorFlow, PyTorch, Scikit-learn, XGBoost, H2O.ai | Azure Machine Learning, AWS SageMaker, Google AI Platform, IBM Watson |
| Data Backup & Recovery | Bacula, Duplicity, Restic | Azure Backup, AWS Backup, Google Cloud Backup & DR |
| Data Pipeline Testing | Great Expectations, dbt, pytest | Azure Data Factory Monitoring, AWS Glue Monitoring |
| Data Cleansing | Trifacta, DataWrangler | Google Cloud Dataprep, AWS Glue DataBrew |
| Data Streaming | Apache Kafka, Apache Pulsar, Apache Flink, Spark Streaming | Azure Event Hubs, AWS Kinesis, Google Cloud Pub/Sub |
| Job Scheduling | Cron, Celery, Rundeck | Azure Scheduler, AWS Batch, Google Cloud Scheduler |
| Data Cataloging | Apache Atlas, Amundsen, DataHub | Azure Purview, AWS Glue Data Catalog, Google Cloud Data Catalog |

More:

Criteria: