What is Shakudo
Shakudo is an end-to-end data platform designed for flexibility in data tooling. On Shakudo, data teams can mix and match best-of-breed tools and try out emerging ones without the DevOps overhead. The workflow is built around the following Shakudo components:
Sessions is the unified development environment. It provides a pre-configured environment, mounted credentials, network connections, and database connections so you can start building right away.
Jobs is the batch job deployment and orchestration component. You can use any Git repository, whether it was developed and pushed from a session or anywhere else. You can also deploy pre-built Docker images. Jobs can be triggered on a schedule or with any KEDA scaler.
Services is the service deployment and orchestration component. As with Jobs, you can use Git repositories or pre-built Docker images. A service exposes an endpoint, which can be a dashboard, a website, or an API.
Shakudo Stack Components is a pre-configured, fully connected, ever-evolving data stack that supports end-to-end use cases for data and machine learning applications.
Shakudo adds new integrations every day. Visit our integration page to see the latest list. If you can't find the tool you are looking for, please send us an integration request.
Data Warehouse
- Snowflake
- Google BigQuery
- Amazon Redshift
- Dremio
- Apache Hudi
- SingleStore
Blob Storage
- Azure Blob Storage
- AWS S3
- Google Cloud Storage
- Oracle blob storage
- Cloudflare R2
- Wasabi
Data Ingestion and Streaming
- Airbyte
- Amazon EventBridge
- Apache Kafka
IDE
- Jupyter notebooks
- VS Code
- code-server
- PyCharm
Data Transformation
- dbt
- DuckDB
- Trino
Pipeline Orchestration
- Airflow
- Prefect
- Dagster
- Jenkins
Distributed computing
- Apache Spark
- Dask
- Ray
- Fugue
Data Visualization
- Apache Superset
- Cube
- Streamlit
- Metabase
- Power BI
- QuickSight
- Looker
Data Catalog
- DataHub
- Amundsen
Model training
- Transformers
- PyTorch
- TensorFlow
- JAX
- MXNet
- NVIDIA RAPIDS
- Ray Tune
- PostgresML
Model and application serving
- Triton
- TensorFlow Serving
- TorchServe
- Django
- FastAPI
- Flask
Model monitoring and governance
- MLflow
- whylogs
- Weights & Biases
- Evidently
- Great Expectations
Monitoring and Alerting
- Prometheus
- Grafana
- PagerDuty
- Slack
Data source
- OpenBB
Geospatial
- xclim
- Xarray
- CDO
- GeoPandas
- GDAL
- ESMF
- Zarr
When to use Shakudo
- Data engineering, including data transformation development and deployment
- Distributed computing for data larger than memory
- Data analytics and visualization
- Deployment of batch jobs
- Serving data applications and pipelines
- Machine learning model training
- Machine learning model serving
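
To make the serving use case more concrete, here is a minimal sketch of the kind of application a service could expose as an API endpoint. It uses only the Python standard library and is not Shakudo-specific: the `/health` route, the `HealthHandler` class, and the `start_server` and `check_health` helpers are all hypothetical names chosen for illustration, and the platform's own deployment configuration is not shown.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers GET /health with a small JSON body."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404, "Not found")

    def log_message(self, format, *args):
        # Silence per-request logging to keep the example output clean.
        pass


def start_server(port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def check_health(server):
    """Call the /health endpoint of a running server and decode the JSON reply."""
    host, port = server.server_address
    with urlopen(f"http://{host}:{port}/health") as response:
        return json.loads(response.read())


if __name__ == "__main__":
    server = start_server()
    try:
        print(check_health(server))
    finally:
        server.shutdown()
        server.server_close()
```

In a real deployment the application would more likely be built with one of the serving frameworks listed above, such as FastAPI or Flask, and would listen on whatever host and port the platform expects. The standard-library version is used here only to keep the sketch self-contained.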