Hello, you are not logged in.  Login or sign up
Community >> Blogs
Search Toad World Search

Do you have a topic that you'd like discussed?  We'd love to hear from you.  Send us your idea for a blog topic.

Send a memo to yourself!
 
Location: Blogs Steven Feuerstein's Blog    
 StevenFeuersteinTW Thursday, March 29, 2007 9:37 AM

I rediscovered the joy of writing to myself a week ago, and I also was reminded of the danger of coding in isolation (that is, working by oneself).

I recently tangled with my implementation of column set management in Quest Code Tester.

"Column set management," you wonder, "what's that?"

In this amazing, automated code testing tool for PL/SQL, you can at a high level describe the tests you want Quest Code Tester to perform. It then generates all the PL/SQL code needed to execute the test.

One possible test is to compare the contents of a record returned by a function to a query that retrieves a row of data.

To generate the code for this test, we need to obtain the names and datatypes of the fields of the record, and the same for the elements in the query's select list. Each of these sets of fields and/or columns is called a "column set."

In the 1.2.5.5 release of Quest Code Tester (October 2006), our implementation was clumsy and scaled poorly. We would generate hundreds of lines of code, some of which would be used to declare several database objects (object type, nested table type, table function package) to supplement the core testing package. And if you had several different tests, we would generate multiple copies of this code, even if the record or query structures were the same.

So in 1.5.1 (the February 2007 production release), we came up with a much better solution: calculate the column sets for each of the elements of these dataset tests and then store that information in the qu_column_set table. That worked a whole lot better.

But soon we uncovered some problems in that implementation. Now, I could go into all the details, but surely you will find it a bit boring. If you actually do want to read about the 1.5.1 limitations, see Appendix 151 of this blog entry (J). This text is lifted directly out of the document I started to help me sort through all the issues related to column sets.

So as part of the enhancements for 1.6, I resolved to "get it right" this time.

I thought about it for awhile and came up with a plan that looked like it would do the job. Unfortunately, it also made my life and code more complicated. So late one Saturday night, I jotted down a list of things I could think of in a Eudora email note (don't worry, again I am not expecting you to understand what the heck I am talking about, just the fact that it was a lot of stuff:

·          Add column_set_guid to qu_attributes (already in the create ad upgrade files)

·          Generate new table API packages and make all necessary changes to test builder etc. to manage attribute rows.

·          Upgrade script to move column_set_guids from outcome to attributes.

·          Don't put a colset guid there if null. That is, don't copy from one to the other.

·          The column set GUID for an attribute is specific to that dataset and is based on a strict compatibility analysis (hash of query).

·          Compatibility for purposes of assertion logic is based on weak analysis: number and type match.

·          We use the column set names for each side independently (as in the records equal logic).

·          I will need to substantially rework some very tricky logic in the qu_outcome_xp package to generate column sets, and also validate outcomes.

·          (consider) Don't clean up/delete col set rows when an outcome is changed. Minimize chance of bug and what difference does it make if it stays there.

And then I went to bed.....and woke up Sunday morning at my usual 7 AM or so, my mind bubbling over with ideas about how I was going to implement the above list.

Usually, at this point, I would simply dive right in and start making changes to my code. Fortunately, perhaps because it was so early and my mind was a bit fuzzy, I felt intimidated about all the changes to the code I faced. Now, if I was working in an office with others, I could have called a meeting to discuss my design and see what everyone thought. But my wife was still sleeping and anyway she is a ceramicist (a potter), not a programmer.

So instead I opened up my QCTO Backend Architecture document and started up a new section titled " Better way to do this in 1.6" and just started typing down my thoughts. In particular, I let my mind range over the different situations I needed to support (table against query, record against record, record against query, etc.) and tested my design against these.

Everything was looking OK until I thought to myself: what if the user adds a column to a table from which my record structure is deduced? We wouldn't realize that change had taken place, and now the column set information would be out of date.

Yikes! I suddenly realized that the big downside of my calculation and storage of the column set information is that it is a static copy. I was assuming that once the user specified it, it wouldn't change, but that clearly wasn't the case (or could not be assumed).

My whole idea was fundamentally flawed!

Fortunately, as I thought about it further (damping down feelings of panic), I realized that I could shift to dynamically generating my column set information as I needed it. In other words, I could stop using the qu_column_set table entirely, and end up with much more flexible code. I wrote these ideas down (see Appendix 16 below if you really want to get into the details), then took a look at my code. It soon became clear that I could leave most of my complicated logic unchanged, and simply change the implementation of a few key functions that already hid the way that I stored and retrieved column set information.

I could, in fact, get away with doing little more than revamping the body of the  qu_column_set_xp.column_sets_for_harness function: rather than query from the underlying table, I would now generate the column sets from the outcome definitions!

A few hours later, I was able to put the new approach in place, and most importantly minimize the impact on the code base (always critical when dealing with an application already in production).

Looking back on that experience, I was struck at how much my thinking changed in a short period of time, at how close I came to greatly complicating my life and my code (and still ending up with something inadequate), and how important it had been for me to write down my thoughts.

And so I urge you to take this advice:

If you can't talk to someone else, write a memo to yourself.

When I write code in isolation, I lose a valuable sounding board. Being able to turn to a co-worker and share ideas and frustrations, and to ask for help, has got to be among the most important arguments against working from home.

If you spend too much time inside your own head (and your own problems) you lose objectivity. It is very hard to question your own assumptions, but it is an absolute delight to uncover and challenge the assumptions of others.

Yet we often do write code in isolation (either because we work from home, we are a one-person shop, or we don't feel comfortable asking people for help). In this case, I have found it most helpful to externalize my thinking as much as possible. Saying my idea out load is probably the best way to emulate talking to another human, but that can feel really silly. Second best is to write down your ideas.

Seeing them written down, outside of your head, gives you a whole new perspective, allows you to critique your own thinking more effectively.

So the next time you are designing a new feature or trying to figure out the solution to a problem, and you are by yourself, get them out of your head. Type out your ideas. Heck, you can even write it as though you were having a conversation.

You will almost certainly uncover weaknesses or problems, you will be forced to address those inadequacies, and you will end up with a much stronger solution.

Appendix 151

The central problem is this: the column sets are actually needed in two different ways.

1. To test compatibility between changed by program and expected results datasets (can I compare the employees to departments table? No!)

2. To generate the queries and table function code for the test package. Particularly with cursor variables (and records), the table or query provided determines the structure of data being fetched.

Sadly, the requirements for these two uses are different. The compatibility analysis (1) can be more forgiving than the generated code (2), because with (2) I need in a number of situations to get the right column name as well as the datatype.

Another problem is that the compatibility analysis can be too  forgiving, resulting in the use of the wrong query set. In other words, when I check to see if I can reuse an existing column set to define my outcome, I need stricter rules (probably something like what Oracle does for reusing parsed plans of SQL statements). But when checking to see if two datasets are compatible for code generation, the implicit conversions are OK.

Finally, another problem I see is that we have a single column_set_guid column for qu_outcome. But I may have two different datasets (result and control attributes) with different column lists, and I will need a separate column set for each.

Appendix 16

We need to start managing column set information on an attribute, not outcome, basis. And this should be done dynamically. That is: stop using the column set guid column entirely.

After all, why do I need to maintain this information separately, at all? Why not simply generate what I need at runtime from the query or table? After all, those structures can change (!), even outside of the product (I can add a column!), which would render my static expansion into a column set invalid.

What if I use the column sets to generate logic within the test package (table functions)? I then use those table functions to retrieve information that is used throughout the test case executions. For example, with a cursor variable, I extract the data once and then reuse that data set throughout the test case outcomes (can be more than one).

So once the test code is generated, I need to be able to make an assertion that "this dataset is that the same as that dataset" and automatically apply the correct dataset in the test case.

Well, since I don't need the information until generate time, why not in essence produce the column set guids at the time of generation, do not put anything in tables (outside of the value1 values used to identify columns), and then work from those?

That way, we can "store" column set information at the attribute level without actually needing those columns.

Copyright ©2007 Quest Software Inc.
Permalink |  Trackback

Comment:
Add Comment   Cancel 
Search Blog Entries
 
Copyright 2008 by Quest Software  | Terms Of Use | Privacy Statement | Contact Us