Tool/Library | Purpose |
---|---|
Pandas | Data manipulation and transformation |
SQLAlchemy | Database connectivity and ORM |
Apache Airflow | Workflow automation and orchestration |
PySpark | Big data processing and distributed computing |
Boto3 | AWS services interaction |
Requests | HTTP requests to access APIs |
PyMongo | MongoDB integration |
psycopg2 | PostgreSQL connectivity |
Matplotlib / Seaborn / Plotly | Data visualization tools |
Jupyter Notebook | Prototyping, visualization, and reporting |
Fastparquet / PyArrow | Efficient columnar file format handling (Parquet) |
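
To make the first row of the table concrete, here is a minimal sketch of the kind of cleaning and aggregation step Pandas is typically used for. The column names, the sample records, and the helper functions `clean_orders` and `total_by_region` are hypothetical, chosen only for illustration; a real pipeline would read its input from a database, an API, or a file rather than an inline frame.

```python
import pandas as pd

# Hypothetical raw records, standing in for data pulled from an API or database.
raw = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "region": ["east", "west", "east", None],
        "amount": ["10.50", "20.00", "bad", "5.25"],
    }
)


def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce amounts to numbers and drop rows with missing fields."""
    out = df.copy()
    # Unparseable strings such as "bad" become NaN instead of raising.
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce")
    return out.dropna(subset=["region", "amount"]).reset_index(drop=True)


def total_by_region(df: pd.DataFrame) -> pd.DataFrame:
    """Sum the cleaned amounts per region."""
    return df.groupby("region", as_index=False)["amount"].sum()


cleaned = clean_orders(raw)
summary = total_by_region(cleaned)
print(summary)
```

Coercing with `errors="coerce"` and then dropping incomplete rows is one possible policy; depending on the data, it may be preferable to log or quarantine the rejected rows instead of discarding them.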