<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.1" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Mining Terabytes on the Desktop</title>
	<link>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103</link>
	<description>Data Analytics- the art and science of analyzing data</description>
	<pubDate>Sun, 06 Jul 2008 04:21:47 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.1</generator>

	<item>
		<title>by: John Aitchison</title>
		<link>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-70</link>
		<pubDate>Wed, 08 Aug 2007 08:16:24 +0000</pubDate>
		<guid>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-70</guid>
					<description>Thanks for the comment, Joe. At various stages of the Netflix analysis I did end up keeping everything in core, but in the end it works quite well to keep the data on disk and load the needed parts as required, making sure that the OS caches them for you. Disk caching == virtual memory, more or less. And yes, I do have 2 GB of memory, but I never need to use more than about 1 GB.</description>
		<content:encoded><![CDATA[<p>Thanks for the comment, Joe. At various stages of the Netflix analysis I did end up keeping everything in core, but in the end it works quite well to keep the data on disk and load the needed parts as required, making sure that the OS caches them for you. Disk caching == virtual memory, more or less. And yes, I do have 2 GB of memory, but I never need to use more than about 1 GB.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Joe Smith</title>
		<link>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-69</link>
		<pubDate>Mon, 06 Aug 2007 22:35:05 +0000</pubDate>
		<guid>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-69</guid>
					<description>If you browse around the Netflix site, you will find people putting the data into structures of a few hundred megabytes, so it is possible (for some algorithms) to fit all of the data into the RAM of a modern desktop. Simon Funk says you should have 2 gigabytes, but there are obviously people doing it in less.</description>
		<content:encoded><![CDATA[<p>If you browse around the Netflix site, you will find people putting the data into structures of a few hundred megabytes, so it is possible (for some algorithms) to fit all of the data into the RAM of a modern desktop. Simon Funk says you should have 2 gigabytes, but there are obviously people doing it in less.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: &#187; psst .. want some cheap n-grams? [ Data Sciences Analytics ]</title>
		<link>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-62</link>
		<pubDate>Sat, 21 Jul 2007 01:51:13 +0000</pubDate>
		<guid>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-62</guid>
					<description>[...] And lots of other uses. But beware .. you will need your system set up for mining terabytes on the desktop .. the corpus comes on 6 DVDs [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] And lots of other uses. But beware .. you will need your system set up for mining terabytes on the desktop .. the corpus comes on 6 DVDs [&#8230;]
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: &#187; Managing Terabytes - on line and off line [ Data Sciences Analytics ]</title>
		<link>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-38</link>
		<pubDate>Mon, 28 May 2007 09:13:28 +0000</pubDate>
		<guid>http://dsanalytics.com/dsblog/mining-terabytes-on-the-desktop_103#comment-38</guid>
					<description>[...] I have previously written about mining terabytes of data with a desktop machine.. some of the strategies that work for me. [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] I have previously written about mining terabytes of data with a desktop machine.. some of the strategies that work for me. [&#8230;]
</p>
]]></content:encoded>
				</item>
</channel>
</rss>
