How to migrate your data from Teradata to Snowflake

Avinash Shahdadpuri
Avinash Shahdadpuri


Teradata has been a popular on-premise database system for decades, catering to enterprises who want to manage and analyze large volumes of data. But the advent of Snowflake, a modern cloud computing data warehouse, has opened up modern data analytics capabilities. For example, Snowflake’s multi-cluster cloud infrastructure that separates compute from storage enables enterprises to automatically and instantly scale their infrastructure.

It is no surprise that a lot of enterprises have been evaluating using Snowflake in their modern applications. However, the transition from Teradata to Snowflake is not an easy one click switch. Invariably an enterprise would have built ETL pipelines to push data to Teradata, customized visualization tools to pull data out of Teradata, designed client applications dependant closely on data from Teradata, and built tooling around Teradata. With this whole ecosystem to consider, migration from Teradata to Snowflake might seem a daunting task.

But that’s where Nexla comes in! Follow along this step-by-step migration guide to execute a successful migration strategy, including special tips for automation using Nexla’s Teradata Exporter.

Migration journey – steps to migrate from Teradata to Snowflake

Migrations are usually complicated and can be a mess if not planned and executed properly. We recommend a carefully thought out step by step approach to migrating your data from Teradata to Snowflake.

Step 1: Analyze Current Teradata Usage

Start with analyzing your current Teradata implementation to get a full 360 view of your Teradata ecosystem. This involves finding answers to the following questions:

  1. Which databases & tables are to be migrated?
  2. Which users, roles, and applications have access to these databases and tables?
  3. How is data getting loaded into these tables – which scripts and applications are responsible for the data load?
  4. How is data in these tables getting used – which scripts and applications are pulling data from these tables?

Document your answers and use this list to evaluate the level of support for these inputs and outputs in Snowflake. Additionally, use this list as an input to the migration QA process.

Step 2: Design Integration Plan

Now that you are done with understanding and documenting your current Teradata usage, let’s lay out an execution plan for the migration.

Instead of an all-in-one-go approach to migration, we recommend a phased approach. Migrate low impact tables, databases, and applications first, get some success under the belt, and then move-on to the complex migration tasks.

No matter which approach you take, by the end of this step you will have a full plan of action for the migration in place.

  1. Take the output from the previous Analysis step and divvy up the tables, databases and migrations into logical phases. Phase 1 should typically involve tables that need minimal changes and are low impact to your business needs.
  2. It is best to plan a full vertical slice migration – i.e. end-to-end ingestion, migration, and consumption together. This helps isolate issues at every stage early in the process.
  3. Identify tools that can help speed up the migration process. It is generally recommend to not hand-code.
  4. Try out DataOps tools like Nexla for executing the migration. They can reduce time to market substantially by automating a significant portion of the migration and re-tooling. This becomes especially beneficial for repeatable steps in the phased migration approach.

Step 3: Setup Snowflake Account

With your plan-of-action in hand, let’s get started executing the migration. First up – you want to set up your Snowflake account to meet your needs.

Use the Snowflake UI / CLI to configure the following on Snowflake:

  1. Create databases and warehouses on Snowflake
  2. Create users and accounts on Snowflake.

Step 4: Migrate Data Definition Language (DDL)

Now that you have primed Snowflake with accounts and databases, you can start creating the database structure. This typically involves using DDL exports from Teradata, converting them to Snowflake compatible DDL and executing it. The table below provides a mapping between Teradata and Snowflake column types.

However, an intelligent DataOps platform like Nexla can handle this automatically, saving you a tedious and potentially error prone process. With Nexla you just need to select which Teradata tables and data-fields need to be migrated to Snowflake. Nexla will automatically create Snowflake DDL compliant Snowflake tables and handle the DDL migration based on the data coming from your Teradata database. Additionally, Nexla will automatically skip Teradata system tables like DBC, TD_SYSGPL, TD_SYSXML, etc during migration.

Teradata to Snowflake DataType conversion chart
Teradata TypeSnowflake TypeComments
A1ARRAY
ANARRAY
ATTIME
BFBINARY
BOBINARYBLOB data limited to 8 MB on Snowflake
BVBINARY
CFVARCHAR
COVARCHARCLOB data can be stored as varchar. Limited to 16 MB on Snowflake
CVVARCHAR
DNUMBER
DADATE
INTERVAL TYPES (DH, DM, DS,DY,HM,HR,HS,MI,MO,MS,YM,YR)N/ANot supported in Snowflake. Can be supported as a number and calculations can be performed using date functions.
DTN/ANot supported in Snowflake
FFLOAT
INTEGER TYPES (I, I1,I2,I8)NUMBER
JNVARIANT
LF, LVN/ANot supported in Snowflake
NNUMBER
PDN/ASeparate into two date fields
PMN/ASeparate into two fields (TIMESTAMP_TZ)
PSN/ASeparate into two timestamp fields (TIMESTAMP_NTZ)
PTN/ASeparate into two time fields
PZN/ASeparate into two time fields
SZTIMESTAMP_TZ
TSTIMESTAMP_NTZ
UF/UVN/ANot Supported in Snowflake
UDTN/ANot Supported in Snowflake
XMVARIANT

Step 5: Load Data (One Time) from Teradata to Snowflake

After Step 4 your Snowflake table should be ready to handle data from Teradata. Now you can load data from Teradata into Snowflake!

If you are running Teradata is running on prem, it is advisable to push data first to a cloud storage like AWS S3 or Azure DataBox before loading into Snowflake.

This step is another reason for considering an intelligent DataOps platform like Nexla. Instead of writing scripts for performing this one time data load, with Nexla all you need to do is trigger a job for Nexla’s custom Teradata Exporter. The exporter will automatically extract data from Teradata, load it in into an intermediate cloud storage if needed, and subsequently load the data into Snowflake.

Step 6: Load Data (Ongoing) from Data Sources to Snowflake

In Step 5, we replicated all historical data from Teradata into Snowflake. In order to decommission Teradata completely, we now need to connect Snowflake directly with the sources of data that are writing into Teradata.

We recommend connecting each source of data to Snowflake while it is still writing to Teradata. This will allow you to test applications in parallel and ensure seamless transition once you disconnect the source to Teradata link.

Snowflake has JDBC drivers which can be used to load data programmatically into Snowflake. But instead of writing custom connection scripts between your sources and Snowflake, we recommend trying out Nexla. You can use Nexla to load data into Snowflake from virtually any source. Nexla supports no code real-time integrations from files, databases, API’s, message buses, emails, into Snowflake and can significantly reduce time to market for ongoing data loads into Snowflake.

Check out all the benefits of using Nexla for writing Snowflake here: https://www.nexla.com/snowflake/

Snowflake Data Write
Sending data to Snowflake

Step 7: Update Applications to consume data from Snowflake

Once you have ensured that all your data is now being written into Snowflake, start tuning all your applications that were dependant on Teradata to read data from Snowflake. We recommend running two parallel versions of each application during the migration – one which continues consuming data from Teradata and another that is based on data from Snowflake.

Customizing scripts and applications can be complicated and might require significant re-engineering. Yet another reason for considering a DataOps platform like Nexla for building automated pipelines! Nexla enables self-serve, no-code, real-time data extraction from Snowflake to virtually any destination type – file, databases, APIs, message buses, etc.

Snowflake consumption patterns using Nexla look like the diagram below. With Nexla’s unique data flow branching model data from your Snowflake tables need only be read once even if multiple applications need different slices of data from those tables.

Snowflake Data Read
Snowflake as a Data Source

Step 8: Validation – Run QA with parallel stacks

Once you have configured parallel end-to-end stacks incorporating data flowing in and out of Snowflake and Teradata respectively, a comprehensive QA cycle should be run to compare functionality across two systems. In addition to validating the applications, you should compare row counts, sums of numeric fields, and hashes of string fields to identify discrepancies.

This step can be easily automated with Nexla. You can configure Nexla data flows which write data based on query results from Teradata and Snowflake into a file. This file can serve as a comparator between two systems.

We recommend doing validations in two phases:

Validation Phase 1: Teradata – Primary, Snowflake – Secondary

In the initial phase you continue with the Teradata stack as primary and Snowflake as secondary. Run this phase for a short period of time. During this period run the comparator as a scheduled process to monitor for differences. If no differences are found, you can move on to the next phase of validations.

Validation Phase 2: Snowflake – Primary, Teradata – Secondary

In this phase we flip over the stacks and make Snowflake as primary data system and Teradata as secondary. Just like the previous validation phase, run the comparator as a scheduled process to monitor for differences. If no differences are found Teradata system can be decommissioned.

If you chose the phased migration approach in Step 2, repeat steps 3-8 for each phase of your migration plan before moving on to Step 9.

Step 9: Decommission Teradata

Finally we are done with migration! Now that all the applications are running on Snowflake, inform all your Teradata users about their new Snowflake accounts and other changes. Turn off all Teradata load processes and revoke access to Teradata.

Conclusion

That was a comprehensive step-by-step plan for migrating from Teradata to Snowflake designed to avoid any hiccups pre or post migration. As we saw, a big chunk of the migration process can be achieved more quickly and reliably with an intelligent, modern DataOps platform like Nexla.

Fast-track your migration and avoid common pitfalls – signup at https://www.nexla.com/start-free-trial/ or contact us at support@nexla.com for more information!