175 Dataiku Reviews
End-to-end platform: From data ingestion and preparation to model deployment and monitoring, Dataiku covers the entire lifecycle of a data project. This eliminates the need for disparate tools and streamlines the entire workflow.
Collaborative environment: The platform fosters seamless collaboration through shared projects, commenting, and version control. This ensures everyone is on the same page and contributes their unique expertise. My coworkers share their projects with each other and work together on some projects.
Extensive integrations: Dataiku integrates seamlessly with a wide range of databases, cloud platforms, and machine learning libraries. This flexibility allows us to leverage our existing infrastructure and resources. I just learned that a new R library was added recently that could make data manipulation easier for us.
Robust model management: The platform provides comprehensive tools for tracking model performance, managing versions, and ensuring compliance. This is critical for maintaining the accuracy and reliability of our data-driven decisions. Review collected by and hosted on G2.com.
The execution engine can be tricky: no single engine can run everything. Sometimes I have to try different engines to make a job work.

Dataiku stands out for its ease of use: the drag-and-drop interface combined with the option to code when necessary makes it accessible to a wide range of users. Implementation was smooth and well-supported by the customer success team. I use Dataiku frequently because it offers a vast number of features, from data preparation to model deployment. Integration with other tools like databases, cloud services, and APIs is straightforward, making the platform even more powerful for end-to-end projects.
While overall the platform is excellent, some advanced settings and options could be a bit overwhelming for new users. Additionally, the pricing structure can be a limiting factor for smaller companies or teams.
The plethora of data plug-in sources and the tons of options for ready-made recipes (for EDAs, data engineering, and quick-and-dirty analyses). Along with that, we have code-based recipes to write our own code. And the cherry on top is the collaborative environment, without us having to explicitly handle any of that!
Wish there was more customization available to some of the visual recipes. Another thing is version control - although Dataiku does handle version control, it is very non-intuitive and difficult to go back to a previous version, or even understand the changes made between different versions. We need to have commit comments and other git-like features for that to work better.
Dataiku has been a game-changer in democratizing data workflows. I love how intuitive it is for cross-functional teams to build pipelines, transform and massage datasets, and collaborate, all within a visual flow. It's the kind of tool that makes data manipulation effortless, experimentation easy, and sharing work frictionless.
As much as I appreciate how approachable and feature-rich Dataiku is, there are moments where its flexibility feels like a double-edged sword. Some use cases become surprisingly complex due to the recipe logic, and handling parameterized or reusable workflows can feel clunky.
As a cloud-based platform, it can also be slow when multiple users are editing a single workflow. It gets the job done, but sometimes you just wish it felt faster, cleaner, and more tactile.

I love the platform; it's intuitive and very useful. The LLM recipes are especially useful. Overall I think it's a great platform: it looks great, it makes sense, and it definitely allows me to do my work quicker.
The actual support hasn't always been the best. I've often reached out for support and wasted a lot of time going back and forth without resolving a problem, only to be told that the person trying to help me doesn't know as much about the cloud version of Dataiku. The documentation is never cloud-specific either, so it's a little confusing. The process through which Dataiku has been working out a use case for us has also had some difficulties.

It's hard to highlight a single feature so I will have to mention several:
- The ability to serve different personas, such as "coders" and "clickers", means that Dataiku is well received by people who are not data scientists. Yet those who prefer to code can do so as well.
- The ability to integrate with so many technologies and compute and storage engines both for ingestion and parallel compute means there is no job too big to be done in Dataiku when you use the right technology.
- The Flow makes complex data pipelines simple to understand and design. It also makes it very easy to use.
- The integration of Jupyter Notebooks, built-in Git versioning and Python code environment management makes the creation of new projects and project management very easy.
- And finally I would like to specifically mention their incredible Support team. In my IT career I have dealt with a myriad of enterprise software vendors including all the large ones and I can honestly say that Dataiku Support is the best one I have dealt with by miles. Response speed is amazing even at weekends or out of hours. It's clear they run a 24x7 operation across the globe. The quality and quantity of the responses from Support is exceptional. Even when asking for code snippets to use the Dataiku API, which most vendors will normally charge for under professional services, we have been surprised by their willingness to help and have always achieved an outcome.
The GUI is inconsistent at times in how certain actions need to be done. While we found Dataiku Support to be exceptional, we had less luck in getting new features implemented. Bug fixing has also been slow in our view, even though Dataiku has a good release schedule (they usually release a patch release every 2 weeks!). In our view core features and bug fixing should take more priority than LLM features and other new features.
It needs more work to improve MLOps. For instance, model drift is available only via an additional plugin, and only on certain algorithms. This should be a core capability. Collaboration could also be improved, as there are some concurrency issues that need to be fixed.

What I like most about Dataiku is how easy it is to use for creating and managing data-driven solutions. The platform has a very friendly interface, so even if you are not an expert, you can start to work on your data projects without much trouble. It is simple to make data pipelines, do analytics, and even create machine learning models, all in one place. Also, I really like that Dataiku can connect very easily with different cloud services and data sources. This makes my work much faster and more efficient because I do not need to spend much time on integration. Overall, Dataiku helps a lot to move quickly from raw data to useful results.
There are not many things I dislike about Dataiku, but sometimes, if you want to use more advanced features, you need to have deeper technical knowledge. For someone who is just starting, this can be a bit difficult. Also, because Dataiku always adds new features, sometimes it is not easy to keep up and find the best way to use everything. Sometimes, when integrating with some cloud systems, there can be small technical problems, but usually there is good support and documentation to help.

🔄 Smart Data Preparation
Transform raw data into structured, ready-to-use assets using intuitive tools enhanced by AI-driven suggestions, auto-schema detection, and intelligent type recognition.
🧪 Continuous Development
Support agile analytics with a CI/CD-style environment where data flows, scripts, and models evolve continuously, promoting rapid iteration and improvement.
⚙️ Ease of Implementation
Minimize setup complexity with modular components, drag-and-drop interfaces, and seamless integration with existing data ecosystems (cloud, on-prem, hybrid).
✅ Robust Data Validation
Ensure data quality through built-in validation checks, profiling dashboards, and the flexibility to implement custom Python logic for complex or domain-specific rules.
🧠 Scenario Building
Model and simulate different business or analytical scenarios using parameterized workflows, branching logic, and reusable components to support what-if analyses.
🌀 Flow Zones
Organize and manage data processes in "Flow Zones" — clearly defined stages (e.g., Ingest → Transform → Validate → Output) that make pipeline orchestration transparent and scalable.
📚 Integrated WIKI Page
Empower collaboration and knowledge sharing with an embedded WIKI page. Document logic, share best practices, track changes, and onboard new users effortlessly.
While DSS offers a powerful visual interface and flexibility, working with large datasets often introduces significant friction, particularly during scenario execution and debugging.
🚧 Key Pain Points:
Performance Bottlenecks:
Executing complex scenarios on large datasets directly in the DSS engine is slow and resource-intensive, often making it impractical for time-sensitive analytics.
Dependence on External Engines:
To achieve acceptable performance, teams must offload processing to SQL or Spark engines, requiring:
- Additional infrastructure setup (clusters, permissions, connections)
- Advanced SQL or PySpark expertise, which can be a barrier for data analysts or citizen data scientists.
Debugging Overhead:
Troubleshooting large workflows is cumbersome due to:
- Limited transparency into underlying code execution
- Multi-layered architecture (visual flow → Spark/SQL translation → execution engine)
- Slower iteration cycles, especially with Spark

Dataiku makes it really easy to organize and develop a data pipeline. Especially if your team works on the same pipelines together, it's really easy to co-work. I love how modular I'm able to make my flow and that I can alternate between SQL/R/Python seamlessly.
Sometimes the error messages are really confusing and not helpful, especially if you're running a query downstream.
My first real analytics role began with me using Dataiku. I knew how to write SQL, but the platform gave me multiple options in terms of how to execute queries. Visual recipes were easy, and I of course had the option to write the actual code too. Being able to track my transformations through the data flow made it easy for me to understand what was going on!
I wish there were capabilities to do further Excel-like formatting within Dataiku. I typically have to export the data and then work on the file to make it look pretty.