There are so many tools out there, but I found a YouTube video that does a good job of breaking down the essentials.
Here's a quick list of the tools mentioned:
- Basics: SQL, Python, Linux, Bash scripting, networking fundamentals
- Technical Basics: Git, SFTP, PGP
- Databases: PostgreSQL, MySQL, MongoDB
- Data Platforms: Snowflake, Databricks, BigQuery, Redshift, Azure Synapse Analytics
- Orchestration, ETL & Data Pipelines: Airflow, Dagster, SSIS, Azure Data Factory
- Cloud: AWS, Azure, GCP
- Others: Docker, Kubernetes, Terraform
If you had to advise someone on where to start, what would you recommend? What are your favorite tools?