Trifacta - Quantrum LLC

Trifacta accelerates data cleaning & preparation with a
modern platform for cloud data lakes & warehouses

Ensure the success of your analytics, ML & data onboarding initiatives across any cloud, hybrid and multi-cloud environment

What is Data Wrangling?

Successful analysis relies on accurate, well-structured data formatted for the specific needs of the task at hand. Yet today’s data is bigger and more complex than ever before, and wrangling it into a format suitable for analysis is time-consuming and technically challenging. Data wrangling is the process of transforming raw source data into prepared outputs that can be used for analysis and other business purposes.

An Intelligent Platform that Interoperates with Your Data Investments

Trifacta sits between the data storage and processing environments and the visualization, statistical or machine learning tools used downstream. The platform is architected to be open and adaptable, so that as the technologies upstream and downstream change, the investments and logic created in Trifacta can take advantage of those innovations.

Governance

Collaborative Data Governance refers to features within Trifacta that provide extensive support for open source and vendor-specific security, metadata management and governance frameworks. This approach gives organizations visibility into, and administrative control over, the data wrangling that users are performing. Trifacta supports user hierarchies across roles that determine data access and user functionality within the application. Administrators and data stewards can manage platform authentication and security at various levels of the user hierarchy.

Data Security & Administration

Trifacta provides end-to-end secure data access and clear auditability that comply with the stringent requirements of enterprise IT. The platform provides support for encryption, authentication, access control and masking. Trifacta’s differentiated approach to security focuses on providing enterprise functionality (such as SSO, impersonation, roles and permissions) while balancing extensive security framework integration with existing policies. Customers can integrate Trifacta into what’s already working for them without having to support a separate security policy.

Collaboration

Within Trifacta, users can share reusable data preparation logic and dataset relationships, which lets them build upon each other’s efforts. Multiple users can contribute to a single project, which parallelizes workflows, allows different degrees of participation, and shortens time to completion. Datasets and data preparation steps can also be integrated with third-party applications through Trifacta’s API. Additionally, preparation steps can be exported and shared outside Trifacta.

Operationalization

Trifacta’s operationalization features let data analysts schedule and monitor workflows that run jobs at scale in production, while still providing traceability and access control for IT. Every data preparation recipe, or set of steps, created in Trifacta can be turned into a repeatable pipeline that runs on an hourly, daily or weekly schedule, or at an interval defined by the user. Individual recipes can make up broader pipelines that span multiple datasets and recipes.
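To make the idea of a repeatable schedule concrete, here is a minimal sketch in plain Python. It is not Trifacta's scheduler or API; the names `INTERVALS` and `next_runs` are hypothetical, and the example only shows how hourly, daily and weekly intervals translate into upcoming run times.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from schedule names to intervals (illustration only,
# not Trifacta's scheduling configuration).
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}

def next_runs(start: datetime, schedule: str, count: int) -> list:
    """Return the next `count` run times after `start` for a named schedule."""
    step = INTERVALS[schedule]
    return [start + step * i for i in range(1, count + 1)]

# Example: the next two daily runs after midnight on 1 January 2024.
runs = next_runs(datetime(2024, 1, 1), "daily", 2)
for run in runs:
    print(run.isoformat())
```

A user-defined interval could be modelled the same way by passing any `timedelta` instead of a named schedule.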

Connectivity Framework

Trifacta maintains a robust connectivity and API framework enabling users to access live data without requiring them to pre-load or create a copy of the data separate from the source data system. This framework includes connecting to various Hadoop sources, Cloud services, Files (CSV, TXT, JSON, XML, etc.) and relational databases. All of these connectors support governance and security features – roles and permissions, SSL, Kerberos Auth (SSO) and impersonation.

Metadata Management

Trifacta supports enriching data with geographic, demographic, census and other common types of reference data. Common taxonomies and ontologies, such as geographic and time-based content, are automatically recognized, as are data format taxonomies for nested data structures like JSON and XML. The platform is also open and extensible through APIs, giving customers and partners the ability to integrate additional data sources and targets.

Any Scale Data Processing

Photon | Spark | …

Using Trifacta’s Intelligent Execution Engine, every transformation step defined in the user interface automatically compiles down to the best-fit processing framework based on data scale. Trifacta can transform the data on the fly in the application or compile down to Spark, Google Dataflow, or our in-memory engine, Photon. The platform natively supports all major on-premises and cloud Hadoop platforms. With this model, Trifacta can handle any scale.
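The idea of picking an engine by data scale can be sketched as follows. This is an illustration only: the source does not say how the choice is made, so the 1 GiB threshold, the `Job` type and the `choose_engine` function are all assumptions rather than Trifacta's actual logic.

```python
from dataclasses import dataclass

# Assumed cut-off for illustration: inputs up to 1 GiB run in memory.
# The real selection criteria are not described in the source.
IN_MEMORY_LIMIT_BYTES = 1 * 1024**3

@dataclass
class Job:
    name: str
    input_bytes: int

def choose_engine(job: Job) -> str:
    """Pick an in-memory engine for small inputs, a distributed one otherwise."""
    if job.input_bytes <= IN_MEMORY_LIMIT_BYTES:
        return "photon"
    return "spark"

small = Job("sample_preview", 50 * 1024**2)   # 50 MiB
large = Job("full_history", 200 * 1024**3)    # 200 GiB
print(choose_engine(small), choose_engine(large))
```

The point of the sketch is that the transformation logic stays the same; only the execution target changes with the size of the input.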

Intelligence & Context

Machine Learning | Transparent Lineage | Smart Cleaning

Trifacta learns from the data registered in the platform and from how users interact with it. Common tasks are automated, and users are prompted with suggestions that speed up their wrangling. The platform supports fuzzy matching, enabling end users to join data sets on attributes that do not match exactly. Data registered in Trifacta is analyzed to infer formats, data elements, schemas, relationships and metadata. The platform provides visibility into the context and lineage of data, both inside and outside of Trifacta.
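As a rough illustration of joining on non-exact attributes, the sketch below uses Python's standard `difflib.SequenceMatcher` to pair records whose keys are similar but not identical. It stands in for the general technique only; the matching method Trifacta uses is not described in the source, and `fuzzy_join`, the 0.8 threshold and the sample records are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def fuzzy_join(left, right, key, threshold=0.8):
    """Join each left row to its most similar right row, if similar enough."""
    joined = []
    for l_row in left:
        best, best_score = None, 0.0
        for r_row in right:
            score = similarity(l_row[key], r_row[key])
            if score > best_score:
                best, best_score = r_row, score
        if best is not None and best_score >= threshold:
            # Left-side values win when both rows share a column name.
            joined.append({**best, **l_row})
    return joined

# Made-up sample records for illustration.
customers = [{"company": "Acme Corp.", "region": "West"}]
accounts = [
    {"company": "ACME Corp", "owner": "Dana"},
    {"company": "Globex", "owner": "Lee"},
]
result = fuzzy_join(customers, accounts, "company")
print(result)
```

Raising the threshold trades recall for precision: fewer rows are joined, but the matches that remain are closer.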

Wrangle Language

Recipes | User Defined Functions | Macros

Core to Trifacta’s differentiation is Wrangle, the platform’s domain-specific language, which lets users separate the data wrangling logic they create in the application from the underlying processing of that logic. Advanced users can build more complex wrangling tasks, such as window functions and user-defined functions. Every step defined in the Wrangle language becomes part of a data preparation recipe, a set of steps that can be turned into a repeatable pipeline.
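The separation between recipe logic and execution can be pictured with a small Python sketch. This is not Wrangle syntax; the step builders `drop_empty` and `uppercase` and the runner `run_recipe` are hypothetical names used to show a recipe as an ordered list of steps that a separate runner applies.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]

def drop_empty(column: str) -> Step:
    """Build a step that removes rows where `column` is blank or missing."""
    def step(rows: List[Row]) -> List[Row]:
        return [r for r in rows if r.get(column, "").strip()]
    return step

def uppercase(column: str) -> Step:
    """Build a step that upper-cases the values of `column`."""
    def step(rows: List[Row]) -> List[Row]:
        return [{**r, column: r[column].upper()} for r in rows]
    return step

def run_recipe(recipe: List[Step], rows: List[Row]) -> List[Row]:
    """Apply each step in order; the recipe itself knows nothing about execution."""
    for step in recipe:
        rows = step(rows)
    return rows

# The recipe is plain data: an ordered list of steps that can be reused.
recipe = [drop_empty("state"), uppercase("state")]
raw = [{"state": "ca"}, {"state": ""}, {"state": "ny"}]
print(run_recipe(recipe, raw))
```

Because the recipe is only a description of steps, the same list could in principle be handed to a different runner without changing the steps themselves.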

Core Data Wrangling User Experience

Profile | Structure | Clean | Enrich | Validate

Trifacta leverages the latest techniques in data visualization, machine learning and human-computer interaction to guide users through the process of exploring and preparing data. Active Profiling presents guided visualizations of the data, choosing the most compelling profile based on its content. Predictive Transformation converts every click or selection within Trifacta into a prediction. Smart Cleaning empowers users to resolve common data quality issues like mismatched formats and unstandardized values.
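A simple way to picture column profiling for mismatched formats is to count how many values fit an expected pattern. The sketch below is an assumption-laden illustration, not Trifacta's profiling logic: `profile_column` and the ISO date pattern are hypothetical choices made for the example.

```python
import re
from collections import Counter

# Assumed target format for the example: ISO dates such as 2021-01-05.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def profile_column(values):
    """Count values that are valid, mismatched or missing for the date format."""
    counts = Counter()
    for value in values:
        if value is None or value.strip() == "":
            counts["missing"] += 1
        elif DATE_PATTERN.match(value):
            counts["valid"] += 1
        else:
            counts["mismatched"] += 1
    return dict(counts)

# Made-up sample column with one value in a different date format.
dates = ["2021-01-05", "01/05/2021", "", "2021-02-10"]
print(profile_column(dates))
```

Counts like these are what a profile view would summarize, pointing the user at the values that need standardizing.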

Publishing & Access Framework

Trifacta maintains a robust Publishing and Access framework. Outputs of wrangling jobs can be published to a variety of downstream file systems, databases and analytical tools, in a range of file and compression formats. Trifacta offers deep API integration and bi-directional metadata sharing with a variety of analytics, data catalog and data governance applications. Through this native integration, users can share context and work between Trifacta and the external applications they use.

Contact us to learn more about Trifacta

Get Your Data Ready for Machine Learning
& Analytics in the Cloud
