DataSciencesAnalytics
Weblog


Hi there

I'm John Aitchison, an Australian statistical consultant based just out of Sydney.

And this is my weblog "experiment".

My work can best be described as data analytics .. that is, I have had more than 20 years' experience with, and enjoyment of, the probing, exploration, analysis, modelling, interpretation and reporting of datasets: datasets from market research surveys and elsewhere.

Multivariate statistical analysis with a data mining/machine learning flavor (flavour).

Where my work differs from that of other statisticians is, I suppose, that I work mostly with data that others have collected (it could be market research survey data, or financial datasets, or business records, or databases). And I probably do more programming than most - ..

..applications programming such as accessing the databases, delivering the results in some computer based model or application program, and algorithmic programming - I recently completed a Support Vector Machine implementation in Delphi, and am currently working on a visual/spatial hybrid between classification and clustering.

While I do get involved in research design - and I have a viewpoint on this : research design should be driven by a clear vision of what analyses the generated dataset will permit - and have years of hands on experience in data collection (but we don't do that any more), "statistics" is such a vast and expanding area that I think it sensible to state my area of expertise as "data analytics", rather than general purpose statistical consulting.

OK, enough about me.

I built this particular site because I wanted to explore the potential of the weblog (aka "blog") technology to put some thoughts and ideas in front of clients (and anyone else who is interested).

For some time I have been aware of the "barriers to publishing" .. the dreary business of composing and building and deploying a document, converting from an authoring package to HTML and checking that HTML, FTP'ing it up to the site, checking cross browser compatibility.

Doing a whitepaper that way is often just too much of a hassle, so the notes get written and the draft filed away.. never to see the light of day.

Admittedly it is much less of a hassle than sending out a paper newsletter - I used to issue "The Data Sciences Newsletter" once a month, and clients liked it, but despite the many advantages of a physical paper document (read anywhere, scribble on it, tear it up ..) I cannot see that happening again any time soon.

So, the weblog.

I'll talk a bit about the underlying technologies in a moment, also about why I think it could be feasible - without taking over my life or bombarding our clients with "fluff text".

But firstly I'd like to muse a bit on "info bites", how and when fully developed whitepapers are needed or justified, and how technology affects content. Not content delivery, but content itself.. ..

The short answer on whitepapers is "probably never". Most of us don't have the time to develop the argument or the idea fully and, if we do, the matter is probably the basis of a commercial product or a report to a client, so it won't appear here.

I am not so sure that clients or readers have the time for whitepapers, either. Or the patience or the headspace to get involved with a long discussion.

I am not suggesting that a closely reasoned piece is not a joy and a delight (quite the contrary), nor am I endorsing sloppy thinking or hand waving.

But I do believe that there is a very serious place for a quick "creative jolt", an "info bite" (as in "sound bite"), or maybe that should be an "idea bite".

In my experience, humans are incremental learners (like some machine learning algorithms..) : it takes quite a long time, presumably with the subconscious hard at work, before the pieces can fall into place, before this bit can build on that foundation, before the RELEVANCE or APPLICATION of some idea or datum becomes apparent.

If true, that is a strong argument for getting the ideas out there NOW rather than later: incomplete, inchoate, malformed as they may be. I suspect also, without any evidence, that we are getting rather more used to processing larger quantities of information but in smaller packets - a study of mobile phone communications, email and texting might provide some support for this notion.

Actually, the process of writing will be of direct benefit to me. Writing has always been a great idea clarifier, and I have for some time been keeping a "running sheet", a word processor based personal log of project progress - essentially a daily summary of what has been done and what happens next.

So, my weblog experiment.

How it is implemented and how it affects the content.

Well, the base technology is straightforward enough. Some PHP and a MySQL database. A web based editor, and some upload facilities. I chose to use WordPress as the weblog engine and to host this on a UNIX server with cPanel which also had Fantastico running, so it did the install (more or less) for me. The blog appearance is easily adjusted with themes, which are just CSS when it comes down to it.

Since what you have is essentially a content management system (CMS) - a database backend with some HTML front ends, you are not overly restricted in how you display the data, or what else you have on the site. ..
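As a rough illustration of that point (and only that), here is a minimal sketch in Python rather than PHP, with an in-memory SQLite database standing in for MySQL. The `posts` table, its columns and the two render functions are hypothetical names invented for this example, not WordPress's actual schema or code. It shows the same stored rows feeding two different HTML front ends.

```python
import sqlite3
from html import escape


def make_store():
    """Create a throwaway in-memory database with a hypothetical posts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO posts (title, body) VALUES (?, ?)",
        [
            ("Info bites", "Short ideas, published early."),
            ("Running sheets", "A daily log of project progress."),
        ],
    )
    return conn


def render_full(conn):
    """Front end 1: every post in full, as headings and paragraphs."""
    rows = conn.execute("SELECT title, body FROM posts ORDER BY id").fetchall()
    return "\n".join(
        f"<h2>{escape(title)}</h2>\n<p>{escape(body)}</p>" for title, body in rows
    )


def render_index(conn):
    """Front end 2: the same rows shown as a bare list of titles."""
    rows = conn.execute("SELECT title FROM posts ORDER BY id").fetchall()
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return f"<ul>{items}</ul>"


conn = make_store()
print(render_index(conn))
print(render_full(conn))
```

The content lives in one place; how it is displayed is a separate decision, which is why the site and the blog need not be the same thing.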

The site does not have to equate to the blog.
This site, for example, has a front page quite separate from the blog.

Put another way, a weblog engine seems quite a sensible starting point for building a general purpose website, particularly one which is content rich and where the content is evolving.

I am not even sure that people quite understand that a blog is just an ordinary website (OK it has a database backend, but that is of no interest to the user). And that all you need is a browser, you don't have to get into the arcana of RSS and Atom and syndication just to be able to read it.

And perhaps there is a prevailing misconception that a blog is only suited to the daily musings of self appointed political gurus and self-indulgent teenagers : cyberspace does not need more of those.

A blog does NOT have to be updated daily or at any frequency. As the author, I most certainly would shy away from the concept were that the case - I only want to write when I have ideas, when I have something to say. If I find the blog becoming the master rather than an amenable servant, well it will be killed off.

WordPress has a concept of "pages" and of "posts". Pages are just normal HTML pages, authored in the same way as posts .. perhaps they are intended as the repository for longer content and the posts more for info bites, but it is certainly possible to construct a blog that is all pages or all posts. Exactly how our blog balances out, we shall see.

Posts can be (but don't have to be) organized by category, and a post can sit in multiple overlapping categories. There are semantic issues about categorization that maybe I will talk about one day on the blog.
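A minimal sketch of what overlapping, optional categories amount to as a data structure, in Python. The post titles, category names and the `index_by_category` helper are all made up for illustration; this is not how WordPress stores them internally.

```python
from collections import defaultdict


def index_by_category(posts):
    """Map each category to the titles filed under it.

    A post with several categories appears under each of them;
    a post with no categories appears under none.
    """
    index = defaultdict(list)
    for title, categories in posts:
        for category in categories:
            index[category].append(title)
    return dict(index)


# Hypothetical posts: (title, list of categories)
posts = [
    ("SVM in Delphi", ["machine learning", "programming"]),
    ("Why info bites", ["writing"]),
    ("Classification meets clustering", ["machine learning"]),
    ("An uncategorized note", []),
]

index = index_by_category(posts)
for category, titles in index.items():
    print(category, "->", titles)
```

The first post shows up under both "machine learning" and "programming", while the uncategorized note is simply absent from the index. It still exists as a post, it is just not filed anywhere.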

Content for posts is ALWAYS added at the top .. a big plus imo, and in fact this is how my internal running sheet is organized. Newest content topmost, rather than at the bottom which is the traditional way when building a document, seems to work well for summarizing the status quo and for outlining future directions.
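The newest-topmost ordering is just a reverse chronological sort. A minimal sketch in Python, where the dates and notes are invented sample data and `newest_first` is a hypothetical helper:

```python
from datetime import date


def newest_first(entries):
    """Return (date, note) entries sorted so the most recent comes first."""
    return sorted(entries, key=lambda entry: entry[0], reverse=True)


# Hypothetical running-sheet entries, deliberately out of order
entries = [
    (date(2005, 3, 1), "Outlined the project"),
    (date(2005, 3, 9), "Next: write up results"),
    (date(2005, 3, 4), "First model fitted"),
]

for day, note in newest_first(entries):
    print(day.isoformat(), note)
```

Read from the top, the first lines give the current status and what happens next; the history is there underneath for anyone who wants to scroll.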

I intend to use the blog to report on our progress with some projects of interest, so it will be interesting to see how well that usage of a blog is received.

You can go to the dsAnalytics WebLog here

To get in touch with me, in Australia phone 0427-791495, or send me comments/feedback here

Thank you for your interest.

John Aitchison


About the design ..

The inspiration for this design is an Open Source Web Design by dreamLogic .. visit them at dreamLogic Web Design.

Content development by dsAnalytics.
CSS redesign and validation likewise.

CSS and XHTML have been validated by W3C, site tested and passed quality control checks under Mozilla Firefox, IE 5.5 and IE 6, Opera 8 and screen resolutions of 800x600 and 1024x768, small and large fonts.


And thanks also to the builders of WordPress, and the community at Open Source Web Design.