Netezza has been a popular on-premise database system for years, catering to enterprises who want to manage and analyze large volumes of data. But the advent of Snowflake, a modern cloud computing data warehouse, has opened up modern data analytics capabilities. For example, Snowflake’s multi-cluster cloud infrastructure that separates compute from storage enables enterprises to automatically and instantly scale their infrastructure.
It is no surprise that a lot of enterprises have been evaluating using Snowflake in their modern applications. However, the transition from Netezza to Snowflake is not an easy one click switch. Invariably an enterprise would have built ETL pipelines to push data to Netezza, customized visualization tools to pull data out of Netezza, designed client applications dependant closely on data from Netezza, and built tooling around Netezza. With this whole ecosystem to consider, migration from Netezza to Snowflake might seem a daunting task.
But that’s where Nexla comes in! Follow along this step-by-step migration guide to execute a successful migration strategy, including special tips for automation using Nexla’s Netezza Exporter.
Migration journey – steps to migrate from Netezza to Snowflake
Migrations are usually complicated and can be a mess if not planned and executed properly. We recommend a carefully thought out step by step approach to migrating your data from Netezza to Snowflake.
Step 1: Analyze Current Netezza Usage
Start with analyzing your current Netezza implementation to get a full 360 view of your Netezza ecosystem. This involves finding answers to the following questions:
- Which databases & tables are to be migrated?
- Which users, roles, and applications have access to these databases and tables?
- How is data getting loaded into these tables – which scripts and applications are responsible for the data load?
- How is data in these tables getting used – which scripts and applications are pulling data from these tables?
Document your answers and use this list to evaluate the level of support for these inputs and outputs in Snowflake. Additionally, use this list as an input to the migration QA process.
Step 2: Design Integration Plan
Now that you are done with understanding and documenting your current Netezza usage, let’s lay out an execution plan for the migration.
Instead of an all-in-one-go approach to migration, we recommend a phased approach. Migrate low impact tables, databases, and applications first, get some success under the belt, and then move-on to the complex migration tasks.
No matter which approach you take, by the end of this step you will have a full plan of action for the migration in place.
- Take the output of the previous Analysis step and divvy up the tables, databases and migrations into logical phases. Phase 1 should typically involve tables that need minimal changes and are low impact to your business needs.
- It is best to plan a full vertical slice migration – i.e. end-to-end ingestion, migration, and consumption together. This helps isolate issues at every stage early in the process.
- Identify tools that can help speed up the migration process. It is generally recommend to not hand-code.
- Try out DataOps tools like Nexla for executing the migration. They can reduce time to market substantially by automating a significant portion of the migration and re-tooling. This becomes especially beneficial for repeatable steps in the phased migration approach.
Step 3: Setup Snowflake Account
With your plan-of-action in hand, let’s get started executing the migration. First up – you want to set up your Snowflake account to meet your needs.
Use the Snowflake UI / CLI to configure the following on Snowflake:
- Create databases and warehouses on Snowflake
- Create users and accounts on Snowflake.
Step 4: Migrate Data Definition Language (DDL)
Now that you have primed Snowflake with accounts and databases, you can start creating the database structure. This typically involves using DDL exports from Netezza, converting them to Snowflake compatible DDL and executing it. The table below provides a mapping between Netezza and Snowflake column types.
However, an intelligent DataOps platform like Nexla can handle this automatically, saving you a tedious and potentially error prone process. With Nexla you just need to select which Netezza tables and data-fields need to be migrated to Snowflake. Nexla will automatically create Snowflake DDL compliant Snowflake tables and handle the DDL migration based on the data coming from your Netezza database. Additionally, Nexla will automatically skip Netezza system tables during migration.
|Netezza Type||Snowflake Type||Comments|
|INTERVAL||N/A||Interval data types are not supported but date functions are available.|
|TIME WITH TIMEZONE||TIMESTAMP_TZ|
|XML, SPATIAL, VARBINARY, ST_ GEOMETRY other binary types||VARIANT|
Step 5: Load Data (One Time) from Netezza to Snowflake
After Step 4 your Snowflake table should be ready to handle data from Netezza. Now you can load data from Netezza into Snowflake!
If you are running Netezza is running on prem, it is advisable to push data first to a cloud storage like AWS S3 or Azure DataBox before loading into Snowflake.
This step is another reason for considering an intelligent DataOps platform like Nexla. Instead of writing scripts for performing this one time data load, with Nexla all you need to do is trigger a job for Nexla’s custom Netezza Exporter. The exporter will automatically extract data from Netezza, load it in into an intermediate cloud storage if needed, and subsequently load the data into Snowflake
Step 6: Load Data (Ongoing) from Data Sources to Snowflake
In Step 5, we replicated all historical data from Netezza into Snowflake. In order to decommission Netezza completely, we now need to connect Snowflake directly with the sources of data that are writing into Netezza.
We recommend connecting each source of data to Snowflake while it is still writing to Netezza. This will allow you to test applications in parallel and ensure seamless transition once you disconnect the source to Netezza link.
Snowflake has JDBC drivers which can be used to load data programmatically into Snowflake. But instead of writing custom connection scripts between your sources and Snowflake, we recommend trying out Nexla. You can use Nexla to load data into Snowflake from virtually any source. Nexla supports no code real-time integrations from files, databases, API’s, message buses, emails, into Snowflake and can significantly reduce time to market for ongoing data loads into Snowflake.
Check out all the benefits of using Nexla for writing to Snowflake here: https://www.nexla.com/snowflake/
Step 7: Update Applications to consume data from Snowflake
Once you have ensured that all your data is now being written into Snowflake, start tuning all your applications that were dependant on Netezza to read data from Snowflake. We recommend running two parallel versions of each application during the migration – one which continues consuming data from Netezza and another that is based on data from Snowflake.
Customizing scripts and applications can be complicated and might require significant re-engineering. Yet another reason for considering a DataOps platform like Nexla for building automated pipelines! Nexla enables self-serve, no-code, real-time data extraction from Snowflake to virtually any destination type – file, databases, APIs, message buses, etc.
Snowflake consumption patterns using Nexla look like the diagram below. With Nexla’s unique data flow branching model data from your Snowflake tables need only be read once even if multiple applications need different slices of data from those tables.
Step 8: Validation – Run QA with parallel stacks
Once you have configured parallel end-to-end stacks incorporating data flowing in and out of Snowflake and Netezza respectively, a comprehensive QA cycle should be run to compare functionality across two systems. In addition to validating the applications, you should compare row counts, sums of numeric fields, and hashes of string fields to identify discrepancies.
This step can be easily automated with Nexla. You can configure Nexla data flows which write data based on query results from Netezza and Snowflake into a file. This file can serve as a comparator between two systems.
We recommend doing validations in two phases:
Validation Phase 1: Netezza – Primary, Snowflake – Secondary
In the initial phase you continue with the Netezza stack as primary and Snowflake as secondary. Run this phase for a short period of time. During this period run the comparator as a scheduled process to monitor for differences. If no differences are found, you can move on to the next phase of validations.
Validation Phase 2: Snowflake – Primary, Netezza – Secondary
In this phase we flip over the stacks and make Snowflake as primary data system and Netezza as secondary. Just like the previous validation phase, run the comparator as a scheduled process to monitor for differences. If no differences are found Netezza system can be decommissioned.
If you chose the phased migration approach in Step 2, repeat steps 3-8 for each phase of your migration plan before moving on to Step 9.
Step 9: Decommission Netezza
Finally we are done with migration! Now that all the applications are running on Snowflake, inform all your Netezza users about their new Snowflake accounts and other changes. Turn off all Netezza load processes and revoke access to Netezza.
That was a comprehensive step-by-step plan for migrating from Netezza to Snowflake designed to avoid any hiccups pre or post migration. As we saw, a big chunk of the migration process can be achieved more quickly and reliably with an intelligent, modern DataOps platform like Nexla.