<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Marketing is Hard and Scary</title>
	<atom:link href="http://blog.snowtide.com/2005/06/19/marketing-is-hard-and-scary/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.snowtide.com/2005/06/19/marketing-is-hard-and-scary</link>
	<description>building complex, innovative software and the business that goes with it</description>
	<pubDate>Fri, 04 Jul 2008 12:53:26 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: cemerick</title>
		<link>http://blog.snowtide.com/2005/06/19/marketing-is-hard-and-scary#comment-3</link>
		<dc:creator>cemerick</dc:creator>
		<pubDate>Mon, 14 Aug 2006 14:04:51 +0000</pubDate>
		<guid isPermaLink="false">http://blog.snowtide.com/?p=9#comment-3</guid>
		<description>&lt;p&gt;We published the benchmark, the related code, and our testing methodology in such painstaking detail to avoid any appearance of dishonesty or of making false claims. The bottom line is that, if you doubt the results, you can download the code and run the benchmarks yourself. I'm not sure how we could be more honest than that.&lt;/p&gt;

&lt;p&gt;Etymon PJ was abandoned in favor of PJx years ago, but PJx hasn't been under active development since April of 2004 (see http://sourceforge.net/projects/pjx/), and in its current state it provides no API for text extraction that we can see. Our original benchmarks nevertheless showed the older PJ library to be the fastest of the other available libraries (second only to PDFTextStream), so we included it even though Etymon doesn't appear to support it anymore.&lt;/p&gt;

&lt;p&gt;JPedal r14 is the current major release. It looks like there's been a minor point upgrade since we last downloaded that library, with no mention of significant performance improvements in the changelog. However, we have downloaded the most recent build (corresponding to v2.40b38), rerun the benchmark, and posted the new results. They are nearly identical to those gathered in previous tests.&lt;/p&gt;

&lt;p&gt;You do have a point with PDFBox. v0.7.1 is out now, and we've been benchmarking against the v0.6.6 release (although that represents an 8-month lag, not a 2-year lag). So, we've downloaded v0.7.1 and tested against it. PDFBox's numbers improved somewhat, but not enough to significantly affect PDFTextStream's performance advantage: it was 7.5x faster, and now it's 6.7x faster. I'm glad for the improvements Ben Litchfield et al. have made in that direction; it'll keep us on our toes, and in the end, that's what this is all about.&lt;/p&gt;

&lt;p&gt;I'll have more to say about this in a regular post soon.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>We published the benchmark, the related code, and our testing methodology in such painstaking detail to avoid any appearance of dishonesty or of making false claims. The bottom line is that, if you doubt the results, you can download the code and run the benchmarks yourself. I&#8217;m not sure how we could be more honest than that.</p>
<p>Etymon PJ was abandoned in favor of PJx years ago, but PJx hasn&#8217;t been under active development since April of 2004 (see <a href="http://sourceforge.net/projects/pjx/" rel="nofollow">http://sourceforge.net/projects/pjx/</a>), and in its current state it provides no API for text extraction that we can see. Our original benchmarks nevertheless showed the older PJ library to be the fastest of the other available libraries (second only to PDFTextStream), so we included it even though Etymon doesn&#8217;t appear to support it anymore.</p>
<p>JPedal r14 is the current major release. It looks like there&#8217;s been a minor point upgrade since we last downloaded that library, with no mention of significant performance improvements in the changelog. However, we have downloaded the most recent build (corresponding to v2.40b38), rerun the benchmark, and posted the new results. They are nearly identical to those gathered in previous tests.</p>
<p>You do have a point with PDFBox. v0.7.1 is out now, and we&#8217;ve been benchmarking against the v0.6.6 release (although that represents an 8-month lag, not a 2-year lag). So, we&#8217;ve downloaded v0.7.1 and tested against it. PDFBox&#8217;s numbers improved somewhat, but not enough to significantly affect PDFTextStream&#8217;s performance advantage: it was 7.5x faster, and now it&#8217;s 6.7x faster. I&#8217;m glad for the improvements Ben Litchfield et al. have made in that direction; it&#8217;ll keep us on our toes, and in the end, that&#8217;s what this is all about.</p>
<p>I&#8217;ll have more to say about this in a regular post soon.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: anon</title>
		<link>http://blog.snowtide.com/2005/06/19/marketing-is-hard-and-scary#comment-2</link>
		<dc:creator>anon</dc:creator>
		<pubDate>Mon, 14 Aug 2006 14:02:52 +0000</pubDate>
		<guid isPermaLink="false">http://blog.snowtide.com/?p=9#comment-2</guid>
		<description>&lt;p&gt;If the product is so good, why do your speed comparisons pit your latest version against 2-year-old products?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>If the product is so good, why do your speed comparisons pit your latest version against 2-year-old products?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
