<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.1" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Mega-Clustering, the Representation Problem and Representational Invariance</title>
	<link>http://dsanalytics.com/dsblog/mega-clustering-the-representation-problem-and-representational-invariance_117</link>
	<description>Data Analytics- the art and science of analyzing data</description>
	<pubDate>Sun, 06 Jul 2008 04:24:44 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.1</generator>

	<item>
		<title>by: mwexler</title>
		<link>http://dsanalytics.com/dsblog/mega-clustering-the-representation-problem-and-representational-invariance_117#comment-63</link>
		<pubDate>Mon, 23 Jul 2007 16:34:38 +0000</pubDate>
		<guid>http://dsanalytics.com/dsblog/mega-clustering-the-representation-problem-and-representational-invariance_117#comment-63</guid>
					<description>Enjoyed this post; this is a very common complaint among beginning analysts, who seem to understand that the problem is more complex than it first appears.  Over time, they learn the &quot;usual ways&quot; and take them for granted... and then run into the same problems again as the data gets more complex.

The problem is also exacerbated by commercial software packages, which rarely describe how they expect the data to be formatted: Do they need one line per person or unit of analysis?  Do they want each observation on its own line, or aggregated?  Can the procedure handle categorical data, or just continuous?

At the end of the day, I think variable &quot;creation&quot;, as you phrase it, is a hugely important part of analytic practice, and one that many practitioners gloss over.</description>
		<content:encoded><![CDATA[<p>Enjoyed this post; this is a very common complaint among beginning analysts, who seem to understand that the problem is more complex than it first appears.  Over time, they learn the &#8220;usual ways&#8221; and take them for granted&#8230; and then run into the same problems again as the data gets more complex.</p>
<p>The problem is also exacerbated by commercial software packages, which rarely describe how they expect the data to be formatted: Do they need one line per person or unit of analysis?  Do they want each observation on its own line, or aggregated?  Can the procedure handle categorical data, or just continuous?</p>
<p>At the end of the day, I think variable &#8220;creation&#8221;, as you phrase it, is a hugely important part of analytic practice, and one that many practitioners gloss over.</p>
]]></content:encoded>
				</item>
</channel>
</rss>
