
Revitalizing your Business Intelligence Process with Matillion ETL

What is ETL?

ETL is short for Extract, Transform, and Load. It describes the steps analysts and developers take to collect and distill data in order to inform a business expert's decision making. Extract is how you gather data from multiple sources. Transform is the process of converting and manipulating the gathered data. Load means writing that data into a database that is maintained separately from the source data. Separating the data gives us the opportunity to restructure it and make it more useful to data analysts.
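To make the three steps concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two CSV strings stand in for real source systems, SQLite stands in for the reporting database, and the function names are hypothetical. It is not how Matillion implements these steps.

```python
import csv
import io
import sqlite3

# Hypothetical source data: two CSV exports standing in for real source systems.
CRM_CSV = "customer_id,name\n1,Acme\n2,Globex\n"
ORDERS_CSV = "order_id,customer_id,amount\n10,1,250.00\n11,1,100.00\n12,2,75.50\n"


def extract(text):
    """Extract: read rows from a source (here, CSV text) into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(customers, orders):
    """Transform: join orders to customers and total the amount per customer."""
    names = {row["customer_id"]: row["name"] for row in customers}
    totals = {}
    for order in orders:
        name = names[order["customer_id"]]
        totals[name] = totals.get(name, 0.0) + float(order["amount"])
    return sorted(totals.items())


def load(rows, conn):
    """Load: write the transformed rows into a separate reporting database."""
    conn.execute("CREATE TABLE customer_totals (name TEXT, total REAL)")
    conn.executemany("INSERT INTO customer_totals VALUES (?, ?)", rows)
    conn.commit()


def run_etl():
    """Run all three steps against an in-memory SQLite database (an assumption)."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(CRM_CSV), extract(ORDERS_CSV)), conn)
    return conn.execute(
        "SELECT name, total FROM customer_totals ORDER BY name"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

Note that the transform step runs in application code before anything reaches the database, so only the finished, restructured rows are loaded.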

 

ETL vs ELT

Calling Matillion an ETL platform is something of a misnomer. Matillion is really an ELT platform (Extract, Load, and Transform). Both approaches ultimately accomplish the same objective, so what is the difference?

A Traditional ETL Process

 


ETL has three distinct areas where processing occurs, and each step of the process happens in a different logical environment with its own resource constraints. Each phase involves transmitting data from one layer to another. Because every record has to pass through a dedicated transformation tool or engine, that tool often becomes a bottleneck when it comes time to scale your solution up.


 

ELT

ELT is more condensed: you extract the data, load it into your data warehouse, and transform it there. The order is important because loading before transforming eliminates the staging layer from the process.
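The load-then-transform order can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Matillion's implementation: SQLite stands in for a warehouse such as Redshift or Snowflake, and the table names and sample records are hypothetical. The point is that the raw data lands unchanged and the warehouse's own SQL engine performs the transformation.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
RAW_ORDERS = [
    (10, "Acme", "250.00"),
    (11, "Acme", "100.00"),
    (12, "Globex", "75.50"),
]


def run_elt():
    # SQLite stands in for the warehouse here; that is an assumption for illustration.
    warehouse = sqlite3.connect(":memory:")

    # Extract + Load: land the data unchanged in a raw table.
    warehouse.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount TEXT)"
    )
    warehouse.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ORDERS)

    # Transform: the warehouse itself does the work, expressed as SQL.
    warehouse.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT customer, SUM(CAST(amount AS REAL)) AS total
        FROM raw_orders
        GROUP BY customer
        """
    )
    return warehouse.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_elt())
```

Because the transformation is just a query, it scales with the warehouse's compute rather than with a separate application sitting in the middle of the pipeline.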

So, what does this mean for your process? It's cheaper! By leveraging the storage and compute capabilities of warehouses like Redshift and Snowflake, you reduce the overall cost of your solution. If a separate transformation application becomes a performance bottleneck as your data warehouse grows, the cost of scaling that application could be significant. Matillion ETL is designed with scalability in mind, allowing you to focus on scaling your data and not your toolset.

 

What is Matillion?

Matillion is a scalable cloud-based ELT platform for Amazon Redshift, Snowflake, and Google BigQuery. It offers a browser-based UI with drag-and-drop components for creating process flows, so you can build your analytics pipeline and develop and maintain your data warehouse. Matillion comes out of the box with a large variety of data sources available as inputs. You can integrate Customer Relationship Management (CRM) platforms such as Salesforce; eCommerce platforms such as Shopify and Magento; and marketing analytics platforms such as Google Adwords, Google Analytics, and Marketo. Matillion can even gather data from social media platforms like Facebook and Snapchat. If there is no native Matillion component for your data source, Matillion can speak to any REST-based API to achieve the same result. If all else fails, you can export the data as a CSV file and Matillion will be able to read it!


 

Matillion not only gathers data from a wide variety of sources; using transformation components, it can also filter data, join tables, and perform calculations across data sets. Using the built-in Python script editor, you can call on Amazon Web Services such as S3 and Lambda to help store data and to break down complex problems without significant development work on your Matillion pipeline. Combined with Matillion's scheduling functionality, this lets you regularly perform calculations across large data sets and fully automate ELT tasks from ingestion all the way to final reporting.

 


 

Get off Premise and Into the Cloud

Matillion truly shines when it is time to update your legacy data warehousing solution and move it into the cloud. It contains all the components and tools needed to migrate your legacy dataset and transform the result into something suitable for your future needs. By leveraging AWS Redshift snapshots, you can rest easy knowing that your historical data has trailing backups. If you do not have control over the quality of the data being delivered (e.g. it comes from a third party), this is a valuable way to quickly restore your data. Finally, because the solution lives in the cloud, AWS Redshift allows you to expand or reduce your resources as your data usage needs change.

 

So Why Matillion?

Many solutions offer similar data transformation services, but where Matillion wins out is in ease of use, compatibility, and cost. Matillion is also a modern tool with a team that is keen on providing top-notch support and frequent updates. Its thoughtfully crafted user interface lets you spin up solutions quickly and break complex data treatment into compartmentalized, easily manageable jobs. Jobs can draw on a large variety of out-of-the-box integrations, allowing Matillion to process data from a vast array of sources. Matillion also runs as an EC2 instance in your existing AWS environment, meaning that you have full control over the instance and its software. This makes it exceptionally fast to start up and gives you the ability to start, stop, or scale up the EC2 instance at any time. Finally, with Matillion there are no required contracts or subscriptions: you pay for what you use, and there is never a minimum usage. With competitive rates, the rewards for starting up a new data analytics pipeline have never been greater!
