Data Integration Tools Comparison


In recent years, open source Big Data technologies such as Hadoop, Spark, and NoSQL databases have given organizations more choice in where they store data and run applications on clusters of hardware. They provide additional storage options for all kinds of structured and unstructured data, along with enough processing power to handle a very large number of simultaneous tasks. In addition, data lakes have gained prominence for their ability to continuously collect data and store it in a lightly structured repository.

This is a positive development because it helps deliver data to stakeholders, business processes and applications quickly and easily. Hadoop and related frameworks have thrived because of the demands that real-time and streaming data place on organizations. It is no longer enough to store data in traditional relational databases, especially considering the sheer volume of data that now comes from embedded sensors, computer applications, social media activity, and mobile devices. As these types of data sources become more prevalent, open source storage frameworks should continue to grow in popularity. With the Internet of Things expected to add data capture to virtually every consumer product, this trend looks set to continue for the long haul, making integration solutions that much more vital. Features and functions are of course the main consideration for buyers of integration software, but data integration is not so much a product as it is a process, and there are a variety of ways to move the needle.



Vendor Strengths


Based in Redwood City, California, Actian offers data integration capabilities via Actian DataConnect and Actian DataCloud. Actian’s customer base for data integration tools is estimated to be approximately 7,000 organizations.

Key Features

  • Vortex – Provides elastic data preparation for rapid processing through analytic engines, ingestion technology, and KNIME user interfaces. Also provides SQL and predictive analytics to detect trends and patterns with hyper-parallelized operators on Hadoop.
  • Matrix – A parallel, columnar, compressed, compiled approach that improves the performance of algorithms and in-database analytics, increases loading capacity, and accelerates processing for faster delivery of MPP analytics.
  • Vector – An SMP analytics database that lets many concurrent business users explore data faster and connects to any application through standard SQL.



Based in Chicago, Adeptia offers the Adeptia Integration Suite (AIS) and Adeptia Connect. Adeptia’s customer base for this product set is estimated at 583 organizations.

  • The integrated tool suite covers the basics of batch, message and synchronization. Adeptia supports the core requirements of bulk/batch data delivery and granular data capture and propagation. The combination of its data integration capability, application integration, enterprise service bus (ESB), B2B integration and trading partner management within a single product suite is appreciated by its reference customers for a faster time to value. It reduces the complexity for buyers through its orchestration engine and workflow support, which includes automated handling of errors and exceptions for complex process automation use cases. Adeptia considers simplification of end-to-end B2B data integration as its priority in this market.

  • Attractive pricing and flexibility. Reference customers view Adeptia as attractively priced relative to its competitors and value its flexible subscription licensing options. Adeptia’s ability to interoperate data integration functionalities with capabilities for ESB and business process management (BPM) is greatly appreciated by clients.

  • Performance, usability and integration platform as a service (iPaaS) support. Adeptia offers iPaaS capabilities, which enable B2B integration and interenterprise data sharing use cases. It supports integration of on-premises endpoints, cloud endpoints and a combination of the two for all integration patterns for “pervasive integration.” Customers also cite ease of use, good performance and throughput, which is particularly relevant to what Gartner calls “business moments.”


Based in Burlington, Massachusetts, Attunity offers Attunity Replicate, Attunity Compose and Attunity Visibility. Attunity’s customer base for this product set is estimated to number approximately 2,000 organizations globally.

  • Altova MapForce – Map data graphically with support for XML, databases, EDI, JSON, flat files, Excel/OOXML, XBRL, and web services

  • Altova DiffDog – Allows users to compare and merge text, XML, MS Office (OOXML) files, directories, XML Schemas and database schemas

  • Altova StyleVision – Integrate, combine, and publish XML, database, and XBRL data with Altova’s visual design tool. Customers can drag and drop data to design reports and publish them in HTML, PDF, Word, and e-Forms


Based in Palo Alto, California, Denodo offers the Denodo Platform. Denodo’s customer base for its data integration product is estimated at 300 midsize to large companies. The Denodo Platform provides data virtualization for joining multistructured data sources from DBMSs, websites, documents and a variety of enterprise, cloud and big data repositories.

  • Added functionality. In 2015, Denodo introduced dynamic query optimization features that support cost-based optimization for big data and cloud environments. These include a Resource Manager for policy-based workload management; predicate optimization techniques, including full and partial aggregation pushdown below joins; partitioned unions for logical data warehouse and data lake requirements; and performance optimization. Denodo also delivers a self-service data preparation capability with data, entity-level, relationship and search-style browsing capabilities.

  • Broad connectivity support. Denodo’s connectivity support includes relational databases, prerelational legacy data, flat files, XML, packaged applications and emergent data types, including Hadoop and cloud-based data sources (such as Apache Spark and Amazon Redshift), alongside traditional technology. Examples include introspection of stored procedures and compatibility with IBM, Microsoft, Teradata and SAP Hana connectors. It also addresses Java Database Connectivity (JDBC), Open Database Connectivity (ODBC), Java Message Service (JMS)-compliant message queues, REST and SOAP web services, OData, JavaScript Object Notation (JSON), XML, portlets and SharePoint web parts.

  • Leveraging implementation and technology partners. Denodo’s partner network encompasses global and regional system integrators and software vendors, including Cloudera, Hortonworks, SAP, Tableau Software, Infosys, Accenture, Cognizant and HCL Technologies. Software vendors also license or bundle Denodo’s functionality as part of their products for analytics, big data and operational use cases. Denodo is also available on the Amazon Web Services (AWS) Marketplace.
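
The "aggregation pushdown below joins" mentioned above can be illustrated with a minimal sketch. It is not Denodo's optimizer; the sample data and the function names (join_then_aggregate, aggregate_then_join) are hypothetical. It only shows why pre-aggregating a large source by the join key before joining yields the same totals while moving fewer rows.

```python
from collections import defaultdict

# Hypothetical sample data: a larger "sales" source and a small "customers" source.
sales = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 50.0},
    {"customer_id": 3, "amount": 200.0},
    {"customer_id": 3, "amount": 25.0},
]
customers = [
    {"customer_id": 1, "region": "EMEA"},
    {"customer_id": 2, "region": "EMEA"},
    {"customer_id": 3, "region": "APAC"},
]


def join_then_aggregate(sales, customers):
    """Naive plan: join every sales row to its customer, then sum by region."""
    region_of = {c["customer_id"]: c["region"] for c in customers}
    totals = defaultdict(float)
    for row in sales:
        region = region_of.get(row["customer_id"])
        if region is not None:  # inner join: drop sales without a customer
            totals[region] += row["amount"]
    return dict(totals)


def aggregate_then_join(sales, customers):
    """Pushdown plan: sum sales per join key first, then join the smaller result.

    Returns the regional totals and the number of rows that reached the join.
    """
    # Partial aggregation, which a source system could perform before shipping rows.
    partial = defaultdict(float)
    for row in sales:
        partial[row["customer_id"]] += row["amount"]

    # Join the pre-aggregated rows and finish the aggregation by region.
    region_of = {c["customer_id"]: c["region"] for c in customers}
    totals = defaultdict(float)
    for customer_id, subtotal in partial.items():
        region = region_of.get(customer_id)
        if region is not None:
            totals[region] += subtotal
    return dict(totals), len(partial)
```

In this sketch five sales rows shrink to three before the join. The rewrite is valid here because SUM can be computed in stages; an optimizer has to check that property for each aggregate function.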


Based in Armonk, New York, IBM offers the following data integration products: IBM InfoSphere Information Server Enterprise Edition (including InfoSphere Information Server for Data Integration, and InfoSphere Information Governance Catalog), InfoSphere Federation Server, InfoSphere Data Replication, InfoSphere Information Server Enterprise Hypervisor Edition, WebSphere Cast Iron Live, BigInsights BigIntegrate and DataWorks. IBM’s customer base for this product set is estimated at more than 11,000 organizations.

  • Breadth of functionality and diversity of usage. Customer usage includes bulk/batch data movement, CDC and propagation, data replication, data virtualization and message-oriented movement. Reference customers routinely cite as key strengths the breadth of functionality, integration via open, extensible metadata exchange, and scalability. IBM further expanded its capabilities with simplified SaaS-based self-service data preparation and data integration for analytics (using data virtualization, DataWorks and Data Click) and native integration into Hadoop environments.

  • Mind share and capitalization on market trends. IBM is frequently mentioned by Gartner clients in competitive evaluations. IBM continues to gain traction as an enterprise standard for data integration infrastructure. The tight linkage and embedding of DataWorks (for self-service data preparation) in Watson Analytics, dashDB and Cloudant facilitate a cloud analytics ecosystem, allowing customers to integrate hybrid on-premises and cloud architectures.

  • Alignment to information infrastructure and EIM. IBM continues to innovate to support deep alignment between data integration and EIM goals, including data governance, data quality, MDM, big data integration, support for citizen integrators and iPaaS. IBM’s focus on extending data integration use cases to line-of-business users through self-service data preparation capabilities and open governance and metadata management is well-received by reference customers.
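
Several vendor profiles contrast bulk/batch movement with change data capture (CDC) and propagation. The following minimal sketch shows the idea by diffing two snapshots of a table keyed by primary key and applying only the differences to a target. The function names (capture_changes, apply_changes) are hypothetical, and snapshot comparison is an assumption made for brevity; production CDC tools commonly read the database transaction log instead.

```python
def capture_changes(previous, current):
    """Compare two snapshots (dicts keyed by primary key) and return change events.

    Each event is a tuple: (operation, key, row). Deletes carry row=None.
    """
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key in previous:
        if key not in current:
            changes.append(("delete", key, None))
    return changes


def apply_changes(target, changes):
    """Propagate captured changes to a target copy instead of reloading it in bulk."""
    for operation, key, row in changes:
        if operation == "delete":
            target.pop(key, None)
        else:  # inserts and updates both write the latest row
            target[key] = row
    return target
```

A bulk/batch job would rewrite the whole target on every run; the CDC path touches only rows that were inserted, updated or deleted since the previous snapshot.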


Based in Redwood City, California, Informatica offers the following data integration products: Informatica Platform (including PowerCenter, PowerExchange, Data Services, Data Replication, Advanced Data Transformation, Ultra Messaging, B2B Data Transformation and Data Integration Hub), Vibe Data Stream, Informatica Cloud Integration, IronCloud Platform, Big Data Management, Intelligent Data Lake, Intelligent Streaming, Enterprise Information Catalog and Informatica Data Preparation. Informatica’s customer base for this product set is estimated at more than 7,000 organizations.

  • Rapidly adapts to market functionality demands. Informatica’s product development and roadmap address diverse capabilities including batch ETL, real-time integration (message-oriented data movement, web services, stream processing and change data capture), publish/subscribe data sharing, and data virtualization. Strong interoperability and synergies between its data integration tools and other Informatica technologies encourage usage as an enterprise standard. An emphasis on supporting digital IoT data integration, iPaaS, self-service data preparation, big data, data governance and data security opportunities capitalizes on demand trends. Informatica is starting to appear in “best-fit engineering” scenarios as part of enabling hybrid integration platform (HIP) capabilities.

  • Strong appeal to data management and nontechnical roles and personas. Informatica continues to expand its business-user-facing functionality across hybrid architectures. It now supports self-service data preparation across on-premises, cloud and big data environments through its Data Preparation (Rev) offering and its Data Wizard app. Its emphasis on collaborative, business-user-oriented functionality and on the agility of data integration infrastructure as an enterprise standard resonates with diverse types and sizes of organizations.

  • Broad market presence and dedicated focus on innovation around metadata management, big data, data lakes and hybrid. Informatica exhibits the highest frequency of “appearances” on Gartner contract review calls and other competitive situations. Its global network of partners, resellers, major system integrators and external service providers delivers extensive implementation support. Informatica’s metadata hub, Live Data Map (released in version 10), advances metadata-rich capabilities to enable machine learning, introspective modeling, metadata analysis and metadata-driven applications. Informatica’s Intelligent Data Lake and Big Data Management aim at ensuring that foundational principles of data integration, data governance and data security are upheld in big data deployments.



Based in Woodcliff Lake, New Jersey, Syncsort offers DMX, DMX-h and Ironstream. Syncsort’s customer base for this product set is estimated at 1,500 organizations.

  • Robust and high-performance functionality. Strong performance for ETL and extraction, loading and transformation (ELT) processing, lower TCO compared with the market leaders, and time to value are all cited as key attractions by references selecting Syncsort. An expanding focus on deploying workloads on big data platforms (including Hadoop and Spark), together with enhanced metadata and lineage support, strengthens Syncsort’s relevance in digital business.

  • Capitalizing on a distributed data ecosystem. Syncsort’s customers appreciate the toolset’s scalability and throughput in working with the Hadoop ecosystem, populating data lakes, ingesting streaming data, and offloading ETL or ELT workloads from data warehouses and mainframes to Hadoop. Through offerings for AWS and the ability to operate with Docker, ETL processing is deployable across physical, virtual and cloud environments.

  • Customer experience. Customers identify Syncsort’s technical support and their overall relationship with Syncsort as positives. Evolving technology partnering for extended functionality (for example, Striim for CDC, Trillium Software for data quality, and Waterline Data for data preparation) allows Syncsort to position itself to satisfy broader demands.
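
The ETL and ELT processing styles referred to above differ mainly in where the transformation runs. The sketch below is a minimal illustration, not Syncsort's product: it uses Python's standard sqlite3 module as a stand-in target, and the table names, sample rows and function names (etl, elt) are hypothetical.

```python
import sqlite3

# Hypothetical raw extract: (customer, amount) with amounts as untidy text.
raw_rows = [("alice", " 10 "), ("bob", "25"), ("alice", "5")]


def etl(conn, rows):
    """ETL: clean and aggregate in the integration layer, then load the result."""
    totals = {}
    for name, amount in rows:
        totals[name] = totals.get(name, 0) + int(amount.strip())
    conn.execute("CREATE TABLE etl_totals (name TEXT PRIMARY KEY, total INTEGER)")
    conn.executemany("INSERT INTO etl_totals VALUES (?, ?)", sorted(totals.items()))
    return conn.execute("SELECT name, total FROM etl_totals ORDER BY name").fetchall()


def elt(conn, rows):
    """ELT: load the raw rows first, then transform inside the target engine with SQL."""
    conn.execute("CREATE TABLE staging (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE elt_totals AS "
        "SELECT name, SUM(CAST(TRIM(amount) AS INTEGER)) AS total "
        "FROM staging GROUP BY name"
    )
    return conn.execute("SELECT name, total FROM elt_totals ORDER BY name").fetchall()
```

Both functions produce the same totals. The practical difference is which system spends the compute: the integration tool in the ETL case, the target platform in the ELT case.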


Based in Redwood City, California, Talend offers Talend Open Studio, Talend Data Fabric, Talend Data Management Platform, Talend Big Data Platform, Talend Data Services, Talend Integration Cloud and Talend Data Preparation. Talend’s paying customer base for this product portfolio is estimated at more than 1,300 organizations. 

  • Cost model and flexibility. Reference customers appreciate Talend’s linear and scalable licensing model (composed of a data integration platform fee and an additional developer license fee), which allows customers to start with small, core data integration projects and then grow their portfolio for more advanced data integration projects (such as integration with Hadoop data stores).

  • Integrated product portfolio for data integration and for operating with complementary technologies. Talend possesses a comprehensive portfolio of data integration and related technology (including data quality, MDM, ESB, application integration and metadata management), and has recently added iPaaS and data preparation capabilities. Customers value this robust product set, which allows them to build and execute end-to-end data management projects and use cases and to capitalize on data integration use cases that require synergy with their related technologies.

  • Strength in core data integration capabilities and commitment to evolving trends. Reference customers and prospects are still drawn to Talend’s robust core data integration capabilities, including bulk/batch movement of data, which continue to attract a significant proportion of its buyer base. Talend also has products catering to current and evolving market needs, including its recent offering of iPaaS (now including AWS integration) and data preparation; significant investment in data integration operations running natively on Hadoop and evolving operational use cases (Apache Storm and Apache Spark environments); planned features for data lake governance; and partnerships with Cloudera Navigator and Hortonworks Atlas.



Based in Redmond, Washington, Microsoft offers data integration capabilities via SQL Server Integration Services (SSIS), which is included in the SQL Server DBMS license. Vast worldwide deployments of Microsoft SQL Server involve usage of SSIS for data integration, although Microsoft does not report a specific customer count for SSIS.

  • Relevant capabilities and TCO. Customers cite low overall TCO, speed of implementation, ease of use, and the ability to integrate with other Microsoft SQL Server capabilities as the main reasons for choosing SSIS over the alternatives. The expected focus on data delivery to the Cortana Intelligence Suite seeks to establish synergy between data integration, cloud and cognitive computing capability.

  • Synergistic support of data-, process- and user-driven integration. Connectivity to diverse data types and added support for Hadoop Distributed File System (HDFS) enable SSIS to prepare data and extract results for Hadoop. Azure Data Factory provides the same capabilities, but also provides monitoring and management of data processing (“data pipelines”). Azure Data Catalog provides a metadata catalog for enabling easy registration and discovery of datasets to facilitate self-service data preparation.

  • Extensive market presence and skills. The broad use of SSIS has produced widely available community support, training and third-party documentation on implementation practices and problem resolution approaches.


Based in Redwood Shores, California, Oracle offers the following data integration products: Oracle Data Integrator (ODI), Oracle GoldenGate and Oracle Data Service Integrator. Oracle’s customer base for this product set can only be estimated — at more than 10,000 organizations for all these products worldwide — because Oracle does not break out revenue or customer count by products.

  • Feature function evolution in 2015. In 2015, Oracle released GoldenGate for Big Data; included push-down to Spark (via pyspark.sql) in ODI for Big Data; and introduced self-service integration capability in Data Prep (including support for structured data sources). Also in 2015, natural-language processing combined with graph — to support machine learning capabilities — was added to Big Data Preparation. In parallel, GoldenGate added support for multiple data sources and platforms.

  • Leveraging modern integration trends. Oracle’s data virtualization capability has improved. ODI manages message-based, bulk/batch and data virtualization from an integrated administration console, and even permits bulk/batch and messaging (which leverages the Apache Kafka open-source messaging system) on the same host. Oracle also offers all of these capabilities as iPaaS solutions.

  • Well-established synergy with Oracle technologies. Recognition of Oracle’s diverse portfolio for addressing data integration and other data and application-oriented requirements (spanning data quality tools, MDM solutions, ESB, analytic appliances and enterprise applications) continues to fuel its appeal in deployment scenarios.