Tuesday, 30 December 2014

IBM Netezza Online Training

IBM Netezza is a powerful, highly parallelized data warehousing system that is simple to administer and maintain. It is commonly referred to as a data warehouse appliance: a system purpose-built for running complex data warehousing workloads. The appliance concept is realized by integrating the database, server, and storage into a single system that is easy to deploy and manage.
In any database system the main bottleneck is I/O. IBM Netezza reduces this bottleneck by using commodity FPGAs (Field-Programmable Gate Arrays) to push SQL processing closer to the silicon, which helps improve I/O performance. This core component of the appliance is referred to as the Database Accelerator.
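As a purely conceptual illustration of what "pushing SQL closer to the silicon" buys, the Python sketch below filters rows and drops unneeded columns while the data is still streaming off storage, so less data reaches the rest of the system. It is not Netezza's implementation; the function name, the sample rows, and the column names are all hypothetical.

```python
def scan_with_pushdown(rows, wanted_columns, predicate):
    """Yield only the rows and columns a query needs while scanning."""
    for row in rows:
        if predicate(row):  # restrict: discard non-matching rows early
            # project: keep only the requested columns
            yield {column: row[column] for column in wanted_columns}


# Hypothetical rows as they might sit on disk.
disk_rows = [
    {"id": 1, "region": "EU", "amount": 120, "notes": "first order"},
    {"id": 2, "region": "US", "amount": 300, "notes": "bulk order"},
    {"id": 3, "region": "EU", "amount": 75, "notes": "repeat customer"},
]

# Roughly: SELECT id, amount FROM sales WHERE region = 'EU'
result = list(
    scan_with_pushdown(disk_rows, ["id", "amount"], lambda r: r["region"] == "EU")
)
print(result)
```

Only two narrow rows leave the scan instead of three wide ones. On large tables this early reduction is what relieves the I/O bottleneck.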
The Database Accelerator and the other components of the IBM Netezza appliance were discussed in a short, high-level architecture overview presented at the beginning of the workshop. The presentation also covered the basics of administering and maintaining a Netezza database. These concepts were then reinforced through hands-on experience. Instead of an actual IBM Netezza appliance, a virtualized environment was provided, along with a lab manual outlining the steps and commands to run. The lab manual also explained each of the step-by-step instructions used in the exercises.
The agenda for the topics covered in the Hands-on-Lab exercises was:
1.    Create Netezza Database Users and Groups (and set privileges)
2.    Create the Workshop database
3.    Create tables in the Workshop database
4.    Load data into the Netezza Appliance with the nzload utility using the External Table framework
The workshop showed how simple it is to set up an IBM Netezza appliance after it has been delivered and configured. A factory-configured and installed IBM Netezza appliance includes, among others, the following components:
§  An IBM Netezza data warehouse appliance with pre-installed IBM Netezza software
§  A preconfigured Linux operating system (with Netezza modifications)
§  Several preconfigured Linux users and groups
§  An IBM Netezza database user named ADMIN. The ADMIN user is the database super-user, and has full access to all system functions and objects
The IBM Netezza appliance also includes a SQL dialect called Netezza Structured Query Language (NZSQL). You can use SQL commands to create and manage your Netezza databases, user access, and permissions for the databases, as well as to query and modify the contents of the databases.
On a new IBM Netezza appliance, there is one main database, SYSTEM, and a database template, MASTER_DB. IBM Netezza uses the MASTER_DB as a template for all other user databases that are created on the system.
Before the databases and tables were created, the virtualized environment used in the workshop was briefly explained, including how to connect to the Netezza appliance, which is done through the Netezza SMP host. After connecting to the appliance, participants created a set of new users, which were used for the remainder of the workshop. The concept of users and privileges was explored later, when the database and tables were created. This involved setting up a basic security access model that restricted or permitted certain actions on objects within the Netezza appliance.
After the Netezza database users were created, the database and tables for the workshop were created. The next step, as in any data warehouse environment, was to load data into the tables. This was easy with the Netezza utility nzload, which uses the External Table framework to load data into a Netezza database efficiently. The framework has several components, including:
§  External Tables -- These are tables stored as flat files on the host or client systems and registered like tables in the Netezza catalog. They can be used to load data into the Netezza appliance or unload data to the file system.
§  nzload -- This is a command-line wrapper around external tables that provides an easy method of loading data into the Netezza appliance.
§  Format Options -- These options control the format of the data read from and written to external tables.
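To make the nzload step more concrete, here is a minimal Python sketch that assembles an nzload command line for one delimited file. The database, table, file path, host, and user names are hypothetical, and the flags shown (-host, -u, -db, -t, -df, -delim) should be checked against the nzload reference for your release. The password is deliberately left out of the sketch.

```python
import shlex


def build_nzload_command(database, table, data_file,
                         delimiter="|", host="localhost", user="labuser"):
    """Return an nzload argument list for loading one delimited file into one table."""
    return [
        "nzload",
        "-host", host,       # Netezza host to connect to (assumed value)
        "-u", user,          # database user (assumed value)
        "-db", database,     # target database
        "-t", table,         # target table
        "-df", data_file,    # flat file holding the rows to load
        "-delim", delimiter, # field delimiter, one of the format options
    ]


# Hypothetical workshop names, not taken from the lab manual.
cmd = build_nzload_command("LABDB", "REGION", "/labs/region.del")
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so the result can be passed to `subprocess.run(cmd)` on a machine where the Netezza client tools are installed, without shell quoting problems around the delimiter.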
Once it was clear how to create and populate tables in a Netezza database, the discussion turned to the importance of data distribution. Because IBM Netezza is built on a massively parallel architecture that distributes data and workloads over a large number of processing and data nodes, the single most important tuning factor is choosing the right distribution key. The distribution key governs which data slice each row of a table is distributed to, so it is very important to choose an optimal key to avoid data skew and processing skew and to make joins co-located whenever possible. The concept is so important that a separate section was devoted to it. The exercises examined how to pick the best hash key for distributing each of the tables created in the workshop. These exercises used CTAS (CREATE TABLE AS SELECT) tables, which showed how easy it is to change the hash key of a table without having to manually recreate the table and reload its data.
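The effect of the distribution key on skew can be illustrated with a small simulation. The Python sketch below hashes each row's key to one of a few data slices and compares a unique key with a low-cardinality one. SHA-256 only stands in for the appliance's real hash function, and the slice count, the orders table, and its columns are hypothetical.

```python
import hashlib
from collections import Counter

NUM_SLICES = 4  # assumed slice count; real appliances have many more


def data_slice(key, num_slices=NUM_SLICES):
    """Map a distribution-key value to a data slice (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slices


def rows_per_slice(rows, key_column, num_slices=NUM_SLICES):
    """Count how many rows land on each data slice for a given distribution key."""
    counts = Counter(data_slice(row[key_column], num_slices) for row in rows)
    return [counts.get(s, 0) for s in range(num_slices)]


def skew(counts):
    """Largest slice divided by the average slice; 1.0 means perfectly even."""
    return max(counts) / (sum(counts) / len(counts))


# Hypothetical table: order_id is unique, status has only two distinct values.
orders = [
    {"order_id": i, "status": "OPEN" if i % 10 == 0 else "CLOSED"}
    for i in range(10_000)
]

for column in ("order_id", "status"):
    counts = rows_per_slice(orders, column)
    print(column, counts, round(skew(counts), 2))
```

Distributing on the unique order_id spreads rows almost evenly, while distributing on status leaves at least half of the slices empty and one slice holding most of the table. Because equal key values always hash to the same slice, two tables distributed on the same join column keep matching rows together, which is what makes a join co-located.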