Hello, you are not logged in.  Login or sign up
BLOGS
Search Toad World Search

Blogs
Toad and Database Commentaries

Toad World blogs are a mix of insightful how-tos from Quest experts as well as their commentary on experiences with new database technologies.  Have some views of your own to share?  Post your comments!  Note:  Comments are restricted to registered Toad World users.

Do you have a topic that you'd like discussed?  We'd love to hear from you.  Send us your idea for a blog topic.

Unicode Migration
 
Location: Blogs Johannes Ahrends' Blog    
 Johannes 11/3/2008 9:24 AM
Migrating to a database that supports Unicode is a challenge. You are safe if you are currently using  US7ASCII for you Oracle database because Oracle provides a mechanism with ALTER DATABASE SET CHARACTERSET UTF8 which allows you by just bouncing the database to switch to this new character set. But what if you are currently on a different character set? In Europe we have a bunch of WE8 (Western European) or EE8 (Eastern European) character sets and as they are not a strict subset of Unicode the ALTER Database does not work. So the only chance is to export the whole database, create a new one (with UTF8 or similar) and import it again.
Is it that easy? No! The reason is that in most databases the so called length semantics hasn’t been used while creating a table. Since Oracle9i the correct syntax for creating is:
CREATE TABLE test1
(custid   NUMBER(10),
lastname VARCHAR2(6 CHAR))
Have you used this syntax? No of course not! We all used the old version:
CREATE TABLE test1
(custid   NUMBER(10),
lastname VARCHAR2(6))

But what is the difference? The default behavior for an Oracle database is to use “BYTE” as length semantic. And this can cause huge problems while migrating to Unicode.

Let’s assume your current database is working with WE8MSWIN1252 (common on western European Windows Databases). Entering the following name into the customer table is safe:

INSERT INTO customer (custid, lastname)
VALUES
(1,’Müller’);
COMMIT;
If you try to enter the same row in an Oracle database with AL32UTF8 you get the following error message:
ERROR at line 2:
ORA-12899: value too large for column "DEMO"."TEST1"."LASTNAME" (actual: 7, maximum: 6)

The reason is that a German umlaut takes two bytes in Unicode as well as most special characters for other languages.

When using Oracle export/import or datapump the situation doesn’t change because both utilities are using the exported format while importing into an new database.

The only chance is to first create the objects and then do the import.

Toad has a very nice switch under options -> data types which is called “Include Byte/Char spec when creating DDL scripts from 9i databases”. This behavior is not limited to 9i but all existing Oracle versions you can switch off the “BYTE or CHAR” specification for all scripts.

Now you can generate the schema script (under menu export) for all schemas you want to recreate and execute a user export or full export of the database.

On your target you first create the user (maybe with CREATE LIKE…) using Toad and load your generated schema script. Before you execute it ,please add one command to the script:
ALTER SESSION SET NLS_LENGTH_SEMANTICS = 'CHAR';

This will switch the actual session into a mode where the default length semantic is now CHAR instead of BYTE. This works for a session as well as for the database but there are several known bugs if you switch it for the database so better only use it for the actual session.

The test table is now created like:

CREATE TABLE test1
(custid   NUMBER(10),
lastname VARCHAR2(6 CHAR));
which allows you to insert the lastname “Müller” now.

After all objects are successfully created you can run the import for the table data but don’t forget to switch on “IGNORE ERRORS” as the “CREATE TABLE” scripts are included in the export file.

 

Permalink |  Trackback
Search Blog Entries
 
Blogger and Topic List
 

 

All Recent Blog Entries
 

 

Johannes Ahrends
Unicode and Toad

Ben Boise
Toad SC Discussions

Kevin Dalton
Benchmark Factory

Steven Feuerstein
Oracle PL/SQL

Devin Gallagher
Toad SC discussions

  Henrik "Mauritz" Johnson
Toad Tips & Tricks on the "other" Toads
  Mark Kurtz
Toad SC discussions
Daniel Norwood
Toad for Data Analysts
John Pocknell
Toad for Oracle
Bert Scalzo Indicates Oracle ACE status
Toad for Oracle, Data Modeling, Benchmarking
Jeff Smith
Toad product family
Richard To
SQL Optimization
Jim Wankowski
DB2 - LUW and z/OS
John Weathington
Compliance
Doug Williams
Database Musings
  Toad World Editor
Toad World issues

  Toad Data Modeler Opens in a new window
Data Modeling
 
  Real Automated Code Testing for Oracle
Quest Code Tester blog
 

Copyright 2009 by Quest Software  | Terms Of Use | Privacy Statement | Contact Us