<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Mobster Music</title>
	<atom:link href="http://geoffhutchison.net/blog/archives/2004/11/28/mobster-music/feed/" rel="self" type="application/rss+xml" />
	<link>http://geoffhutchison.net/blog/archives/2004/11/28/mobster-music/</link>
	<description>Geoff Hutchison's personal weblog on chemistry, nanotechnology, science, books, life and software development -- in no particular order.</description>
	<pubDate>Fri, 04 Jul 2008 18:34:55 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.2-alpha</generator>
		<item>
		<title>By: Geoff</title>
		<link>http://geoffhutchison.net/blog/archives/2004/11/28/mobster-music/#comment-41</link>
		<dc:creator>Geoff</dc:creator>
		<pubDate>Wed, 01 Dec 2004 18:30:48 +0000</pubDate>
		<guid isPermaLink="false">http://geoffhutchison.net/blog/archives/2004/11/28/mobster-music/#comment-41</guid>
		<description>Oh, I understand the problem with the semi-duplicates completely. Of course, since the real purpose of MusicMobs and Mobster is to generate recommendations, the duplicates don't matter so much if they don't show up in the recommendation list. One hack is to let the recommendation process look for artists with basically the same name (i.e., do a case-insensitive match, strip out whitespace and punctuation, and possibly drop words like "The") and add in some extra queries. Alas, this sort of fuzzy match also adds the overhead of multiple artist lookups, meaning a good 4x (or likely more) as many queries for every recommendation. As I said, this sort of non-uniform data is a bug with just about every database I've used.</description>
		<content:encoded><![CDATA[<p>Oh, I understand the problem with the semi-duplicates completely. Of course, since the real purpose of MusicMobs and Mobster is to generate recommendations, the duplicates don&#8217;t matter so much if they don&#8217;t show up in the recommendation list. One hack is to let the recommendation process look for artists with basically the same name (i.e., do a case-insensitive match, strip out whitespace and punctuation, and possibly drop words like &#8220;The&#8221;) and add in some extra queries. Alas, this sort of fuzzy match also adds the overhead of multiple artist lookups, meaning a good 4x (or likely more) as many queries for every recommendation. As I said, this sort of non-uniform data is a bug with just about every database I&#8217;ve used.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Toby</title>
		<link>http://geoffhutchison.net/blog/archives/2004/11/28/mobster-music/#comment-40</link>
		<dc:creator>Toby</dc:creator>
		<pubDate>Wed, 01 Dec 2004 06:56:32 +0000</pubDate>
		<guid isPermaLink="false">http://geoffhutchison.net/blog/archives/2004/11/28/mobster-music/#comment-40</guid>
		<description>Hey, just caught your post. Excellent review. The semi-duplicates exist because I don't have the processing power to smooth out all of the data and haven't had the dev time to work out another solution. I'm planning on doing some sort of aliasing sometime in the near future. For now, though, it's an inconvenience but not a fatal one.</description>
		<content:encoded><![CDATA[<p>Hey, just caught your post. Excellent review. The semi-duplicates exist because I don&#8217;t have the processing power to smooth out all of the data and haven&#8217;t had the dev time to work out another solution. I&#8217;m planning on doing some sort of aliasing sometime in the near future. For now, though, it&#8217;s an inconvenience but not a fatal one.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
