Refactoring the API .. and Scenario Wrapping

I realize that a lot of the readers of this blog don’t have much direct interest in programming – they are decision makers, strategists, managers, analysts etc.

But code affects us so much, and code “appliances” are going to be so much a part of your daily life as a user, and quite probably as a commissioner of such things (the idea creator who outsources the task), that it is interesting and useful to talk about the key issue in code design and implementation: managing complexity.

(There are actually some statistical aspects of this too, as methods of managing complexity also affect performance and reliability testing .. which is a statistical problem.)

I recently had the task (I won’t say it was fun, because it was not) of writing a class to copy a file. Note that I say a class .. I rarely do functional/procedural programming anymore, preferring to deal with “things” (as in objects, classes, OOP: object-oriented programming).

Copy a file, you say? A trivial task, surely.

Well, yes and no.

The actual mechanics of reading from one file and writing to another are trivial (well, not quite: there are issues about huge files, restart points, feedback during the copy, the extent to which the process can be assigned to a background thread and, if so, what you do about completion notification, the buffer size you use, etc etc).

The API (the Windows Application Programming Interface, the core Windows engine) supplies several ways of doing a copy, and each has a catch:

· CopyFile .. gives no feedback, and does not create folders if needed

· CopyFileEx .. more promising, but pity about that: it does not run on 98

· SHFileOperation .. no feedback unless you set a copy hook handler, which requires version 4 or later of Shell32.dll

· Windows Script Host .. not always installed

· IO Completion Ports (CreateIoCompletionPort) .. very nice, but do not run on 98 or ME

etc etc.

So, more complexities than you ever want to think about, and we have not even got into esoterica like Alternate Data Streams (supported on NTFS only) or associated/linked files and what happens to them during a copy.
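
To make the “trivial” mechanics concrete, here is a minimal sketch of a buffered copy with feedback. It is written in Python rather than Delphi purely so that it is easy to run, and the names (copy_in_chunks, on_buffer_transferred) are my own inventions for illustration, not anything from the Windows API. It deliberately ignores restart points, threading and everything else on the list above.

```python
import os
from typing import Callable, Optional


def copy_in_chunks(
    source: str,
    destination: str,
    buffer_size: int = 64 * 1024,
    on_buffer_transferred: Optional[Callable[[int, int], None]] = None,
) -> int:
    """Copy source to destination one buffer at a time.

    After each buffer is written, the optional callback receives
    (bytes_so_far, total_bytes). Returns the number of bytes copied.
    """
    total = os.path.getsize(source)
    transferred = 0
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            chunk = src.read(buffer_size)
            if not chunk:  # an empty read means end of file
                break
            dst.write(chunk)
            transferred += len(chunk)
            if on_buffer_transferred is not None:
                on_buffer_transferred(transferred, total)
    return transferred
```

A caller who wants a progress bar passes a function as on_buffer_transferred; a caller who does not care leaves it out. Even this toy already forces two decisions (buffer size, feedback or not) onto whoever calls it.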

If your eyes are starting to glaze over, bear with me because I want to talk about how to manage and (partially) hide this complexity, how to give it reasonable exposure, and how to talk to coders/programmers about it.

The bottom line is that it is not much use to you if you have an app developed that rests on a set of assumptions WITHOUT YOU KNOWING ABOUT IT.

Finding out what you really need to know without becoming an expert yourself is really, really tough, and it is not sufficient to issue a diktat that “it must run on XYZ operating system”.

That is because there is a FURTHER level of complexity – remember we are talking about a “simple” file copy here – and that is the environment in which it will run.

By environment I do not mean the machine or the operating system; I mean the user-specific current set of files and his/her preferences. What happens if a file already exists? What should we do with a read-only file if we are copying over it .. automatically change it to read-write? Should we preserve the original dates on the file, or should we do something brain-dead like the shell copy does (set the created date to today but leave the modified date as it was, leaving us with a file that has ostensibly been created after it was modified)?
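
Each of those questions ends up as a branch in somebody’s code. A rough sketch of what answering them explicitly might look like, again in Python and with hypothetical names (copy_with_policy, CopyRefused) that I have made up for the purpose. One assumption to flag loudly: Python’s standard library cannot set a file’s creation date portably, so this sketch clones only the last-access and modified times.

```python
import os
import shutil
import stat


class CopyRefused(Exception):
    """Raised when the caller's stated policy does not allow the copy."""


def copy_with_policy(source, destination, *, copy_over_existing=False,
                     copy_over_read_only=False, clone_dates=True):
    """Copy a file, making each environmental decision explicit."""
    if os.path.exists(destination):
        if not copy_over_existing:
            raise CopyRefused(f"{destination} already exists")
        mode = os.stat(destination).st_mode
        if not mode & stat.S_IWRITE:  # owner write bit clear: read-only
            if not copy_over_read_only:
                raise CopyRefused(f"{destination} is read-only")
            # Policy allows it: make the destination writable first.
            os.chmod(destination, mode | stat.S_IWRITE)
    shutil.copyfile(source, destination)
    if clone_dates:
        # Carry over access and modified times (not the creation date).
        src_stat = os.stat(source)
        os.utime(destination, (src_stat.st_atime, src_stat.st_mtime))
```

The defaults here are the cautious ones: refuse rather than overwrite. That choice is itself a policy decision, which is rather the point.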

Who is the housekeeper?

Housekeeping issues, you say. Yes, but where should they be managed? In the CLIENT code (that is, the code that calls this FileCopier class) or IN THE CLASS ITSELF?

If we take the former, we are relying on the USER (that is, the programmer who instantiates this class) to think about and do a good job of this environmental complexity. That is effectively the Windows API approach (a zillion functions, separate documentation, reliance on an extremely knowledgeable, forward-looking and patient programmer). It is unlikely to work, and will lead to spaghetti code full of holes (pardon the mixed metaphors).

So, ONE way you can know that you are managing complexity is to have a class that EXPOSES the complexity (doesn’t ignore it, doesn’t wish it away, doesn’t hide it).

If you DON’T get a class that looks something like this

 
type
  TFileCopier = class
    // outline only: property types and read/write specifiers are left out
  public

    // initial setup
    constructor Create(Logging: Boolean);
    destructor Destroy; override;
    property BufferSize

    // setup for a particular copy/clone operation
    property DestinationIsFolder
    property SourceFile
    property Destination

    property CopyOverExisting
    property CopyOverReadOnly

    property ForceDirectoriesAllowed
    property CheckPathValidityBeforeCopy
    property CloneDates
    property CloneAttributes
    property MakeDestinationRO
    property MakeDestinationRW

    property VerifyCopy

    // action
    procedure DoCopyClone;

    // progress indicators and outcomes
    property BytesTransferred
    property TimeTaken

    property SourceFileSize
    property SourceFileAttributes
    property SourceCreationDate
    property SourceLastAccessDate
    property SourceModifiedDate

    property TransferStatus

    // progress events
    property OnBufferTransferred
    property OnDataTransferCompleted

    // log
    property Logging
    property LogStrings
  end;                              // end class declaration 

then you KNOW that it is not going to be robust.
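
For anyone who wants to poke at the idea, here is a cut-down, runnable sketch of the same shape in Python. It carries only a handful of the properties in the outline above, the snake_case names are my translation, and the status strings are invented for illustration. It is a sketch of the design under those assumptions, not the class I actually wrote.

```python
import filecmp
import os
import shutil
import time
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileCopier:
    # setup for a particular copy/clone operation
    source_file: str = ""
    destination: str = ""
    destination_is_folder: bool = False
    copy_over_existing: bool = False
    force_directories_allowed: bool = False
    verify_copy: bool = False
    # outcomes and log
    bytes_transferred: int = 0
    time_taken: float = 0.0
    transfer_status: str = "not started"
    log_strings: List[str] = field(default_factory=list)

    def _refuse(self, reason: str) -> bool:
        self.transfer_status = "refused: " + reason
        self.log_strings.append(self.transfer_status)
        return False

    def do_copy_clone(self) -> bool:
        target = self.destination
        if self.destination_is_folder:
            target = os.path.join(target, os.path.basename(self.source_file))
        folder = os.path.dirname(target)
        if folder and not os.path.isdir(folder):
            if not self.force_directories_allowed:
                return self._refuse("destination folder missing")
            os.makedirs(folder)
            self.log_strings.append("created " + folder)
        if os.path.exists(target) and not self.copy_over_existing:
            return self._refuse("destination exists")
        started = time.perf_counter()
        shutil.copyfile(self.source_file, target)
        self.time_taken = time.perf_counter() - started
        self.bytes_transferred = os.path.getsize(target)
        # Optional byte-for-byte comparison of source and copy.
        if self.verify_copy and not filecmp.cmp(self.source_file, target, shallow=False):
            return self._refuse("verification mismatch")
        self.transfer_status = "completed"
        self.log_strings.append("copied to " + target)
        return True
```

Notice that nothing is decided silently: a missing folder or an existing file produces a refusal with a reason in transfer_status and in the log, unless the caller has explicitly said otherwise.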

But wait a minute .. we started with a complex situation, and now we have a complex class to manage it! OK, it is exposed and managed, but it is still complex.

True, and although we want to make things as simple as possible but no simpler (Einstein, I think) there are a couple of nice things we can do : friends/helpers and scenario wrapping.

Complex classes are hard to use : they lead more or less directly to complex or arcane user interfaces – you know the sort, 200 check boxes to set all aspects of the operation with descriptions that are terse or obscure (it really is not sensible to expect an ordinary mortal to know what a subnet mask is for TCP/IP settings).

Property Grouping

Some languages (notably Delphi and later VB) allow you to group the properties of a component in the Object Inspector in the IDE.

This is not, imho, much of a solution. The IDE (Integrated Development Environment) is not part of the language, not part of the class design .. it is just a sort of fancy desktop that makes it easier, supposedly, to develop your app… the “visual programming” paradigm.

And components are not classes .. they are classes with mechanisms that allow them to register themselves with the IDE, and mostly intended for visual property setting, point and click rather than type.

We should not have to turn a class into a component just so we can manage its exposed properties in some semantically and logically meaningful way.

And it should not be forgotten that the component based visual approach to programming has significant implications for complexity management.

Because the programmer points and clicks, maybe drags, or selects from a drop down menu (all visual and mousy), there is no exposed record of the actions and choices made. Thus the complexity gets hidden (again) .. a complex component, coupled with complex and unexplained choices by the programmer. In fact this problem can become so bad that special tools are needed to list the property settings of the components or convert them into visible, readable code. The Components to Code expert in GExperts is imho the best of breed of these tools.

We do NOT want HIDDEN complexity, we want MANAGED and MEANINGFUL complexity.

Meta Properties

If we are prepared, as coders, to go the extra mile and define a few extra classes as ‘property containers’ then we can have a class declaration that looks something like this:

 
metaproperty ConditionsAllowingCopy
    property CopyOverExisting
    property CopyOverReadOnly
    property ForceDirectories

metaproperty StampOnCompletion
    property CloneDates
    property CloneAttributes
    property MakeDestinationRO
    property MakeDestinationRW 

Referencing the class would then use notation like

 
mycopier.ConditionsAllowingCopy.CopyOverExisting := True; 
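
Since “metaproperty” is my own shorthand and not a keyword in any language I know of, here is how the same grouping could be sketched with ordinary classes, in Python. The container classes and the default values are assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class ConditionsAllowingCopy:
    copy_over_existing: bool = False
    copy_over_read_only: bool = False
    force_directories: bool = False


@dataclass
class StampOnCompletion:
    clone_dates: bool = True
    clone_attributes: bool = True
    make_destination_ro: bool = False
    make_destination_rw: bool = False


@dataclass
class FileCopier:
    # Each group is a small "property container" object; default_factory
    # gives every copier its own containers rather than shared ones.
    conditions_allowing_copy: ConditionsAllowingCopy = field(
        default_factory=ConditionsAllowingCopy)
    stamp_on_completion: StampOnCompletion = field(
        default_factory=StampOnCompletion)


mycopier = FileCopier()
mycopier.conditions_allowing_copy.copy_over_existing = True
```

The two container classes exist only to hold switches, which is exactly the extra work the approach demands.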

Ok, we have achieved some LOGICAL grouping of the properties, but there is something unsatisfying about this.

· we have a bit of verbosity in usage, but that is tolerable

· it requires the coder to write extra classes only for the purpose of grouping .. tedious at best and, at worst, breaks encapsulation

· there is some logical grouping, maybe with some halfway decent semantics, but that is at the level of functional organization (what the properties DO) .. we still have not embedded any concept of common usage patterns or scenarios : in other words, no organization by how the class can be applied, by the user’s common requirements

I suspect you won’t see much code like this.

Storing Learning With a Friend

And so we go to friends and helpers.. (not friends in the C++ sense). If you as a programmer or a user are given a class with complex functionality, and it is a sealed or third-party class or component, it is going to take time for you to learn how to use it.

Where should that learning, the specific set of settings that accomplishes the task you want, be stored?

Not in the application code, but in a special helper class that exists only as a manager of/friend to/simplifier of the complex component.

That way, when you next come back to the class you do not attempt to remember or re-learn all the 200 properties and their interrelationships. You go straight to the helper class and say “aha, yes, that is how I want it, but I need to tweak one or two switches a bit differently”.
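
A minimal sketch of such a helper, in Python. Everything in it is hypothetical: SealedCopier stands in for a third-party class we cannot modify, and the recipe names and their settings are invented examples of “learning” one might have accumulated.

```python
class SealedCopier:
    """Stand-in for a third-party class we cannot change (hypothetical)."""

    def __init__(self):
        self.copy_over_existing = False
        self.copy_over_read_only = False
        self.force_directories_allowed = False
        self.clone_dates = False
        self.verify_copy = False


class CopierHelper:
    """Stores the settings we have learned, outside the application code."""

    RECIPES = {
        "nightly_backup": {
            "copy_over_existing": True,
            "copy_over_read_only": True,
            "force_directories_allowed": True,
            "clone_dates": True,
            "verify_copy": True,
        },
        "cautious_import": {
            "copy_over_existing": False,
            "copy_over_read_only": False,
            "verify_copy": True,
        },
    }

    @classmethod
    def configure(cls, copier, recipe, **tweaks):
        """Apply a named recipe to copier, then any one-off tweaks."""
        settings = dict(cls.RECIPES[recipe])  # copy, so the recipe is untouched
        settings.update(tweaks)
        unknown = [name for name in settings if not hasattr(copier, name)]
        if unknown:
            raise AttributeError("unknown settings: " + ", ".join(unknown))
        for name, value in settings.items():
            setattr(copier, name, value)
        return copier
```

Usage would be something like CopierHelper.configure(SealedCopier(), "nightly_backup", verify_copy=False): the recipe holds what was learned last time, and the keyword argument is the one switch being tweaked this time.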

Scenario Wrapping

Scenario wrapping is much the same thing, but more applicable when you are the class author or can extend it .. simply add some well named entry points as alternate constructors or property setters .. something like

 
	SetupFor_CopyFile2Folder_ForSure
	SetupFor_CopyFile2File_OnlyIfClear  

Not exactly literate programming, but at least semi-literate. And the function names (actually they are class methods) invite some inspection of the code or documentation : that code sets several properties at once in some meaningful fashion to reflect a real-life usage strategy.
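
As a rough illustration, the two entry points could be sketched as alternate constructors in Python. Which switches each scenario flips is my guess at a sensible reading of the names, so treat the particular settings as assumptions rather than as the definition of either scenario.

```python
from dataclasses import dataclass


@dataclass
class FileCopier:
    destination_is_folder: bool = False
    copy_over_existing: bool = False
    copy_over_read_only: bool = False
    force_directories_allowed: bool = False
    verify_copy: bool = False

    @classmethod
    def setup_for_copy_file_to_folder_for_sure(cls):
        """Get the file into the folder, whatever is in the way."""
        return cls(
            destination_is_folder=True,
            copy_over_existing=True,
            copy_over_read_only=True,
            force_directories_allowed=True,
            verify_copy=True,
        )

    @classmethod
    def setup_for_copy_file_to_file_only_if_clear(cls):
        """Copy to a named file only if nothing would be disturbed."""
        return cls(
            destination_is_folder=False,
            copy_over_existing=False,
            copy_over_read_only=False,
            force_directories_allowed=False,
            verify_copy=True,
        )
```

A caller writes FileCopier.setup_for_copy_file_to_folder_for_sure() and gets five consistent settings in one go, and anyone curious about what “for sure” means can read them off in one place.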

Incremental learning, knowledge encapsulation, re-use. Just what we want.

I’d have to say that the API is a long way from this sort of thing, and so are the functional code wrappers that every man and his dog writes to “hide the complexity” of the API (thereby making sure that you must either go on a detective hunt before you dare use that code, or blithely accept unknown and unmanaged risk). It is no wonder that even fundamental programming tasks like “copy a file” are either time consuming or accidents waiting to happen (to you).
