<?xml version="1.0" encoding="utf-8"?><!-- generator="whissip/4.1.0-beta" -->
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>Daniels Blog - Latest Comments on Privoxy Filter for Google Analytics</title>
		<link>http://daniel.hahler.de/?disp=comments</link>
		<atom:link rel="self" type="application/rss+xml" href="http://daniel.hahler.de/?tempskin=_rss2&amp;disp=comments&amp;p=900" />
		<description></description>
		<language>de-DE</language>
		<docs>http://backend.userland.com/rss</docs>
		<admin:generatorAgent rdf:resource="http://b2evolution.net/?v=4.1.0-beta"/>
		<ttl>60</ttl>
				<item>
			<title> Dan [Visitor] in reply to: Privoxy Filter for Google Analytics</title>
			<pubDate>Mon, 22 Jun 2009 22:09:13 +0000</pubDate>
			<dc:creator>Dan [Visitor]</dc:creator>
			<guid isPermaLink="false">c12659@http://daniel.hahler.de/</guid>
			<description>I believe you can also just do this:&lt;br /&gt;
&lt;br /&gt;
{ +block{Google crap} +handle-as-empty-document}&lt;br /&gt;
google-analytics.com/.*\.js$&lt;br /&gt;
googlesyndication.com/.*\.js$</description>
			<content:encoded><![CDATA[I believe you can also just do this:<br />
<br />
{ +block{Google crap} +handle-as-empty-document}<br />
google-analytics.com/.*\.js$<br />
googlesyndication.com/.*\.js$]]></content:encoded>
			<link>http://daniel.hahler.de/privoxy-filter-for-google-analytics#c12659</link>
		</item>
				<item>
			<title> Casey Jones [Visitor] in reply to: Privoxy Filter for Google Analytics</title>
			<pubDate>Fri, 22 May 2009 09:44:13 +0000</pubDate>
			<dc:creator>Casey Jones [Visitor]</dc:creator>
			<guid isPermaLink="false">c12554@http://daniel.hahler.de/</guid>
			<description>Thanks very much, Terry.  Not long after posting that, I did come to the realization that Privoxy uses UNIX regular expressions.  Believe it or not, I hadn&#039;t known of their existence before, and thought that Privoxy was using some kind of proprietary pattern-matching expressions which they didn&#039;t explain very well...  But it was written directly at the beginning of Chapter 9 of Privoxy&#039;s manual that one should &quot;be familiar with HTML syntax, and, of course, regular expressions.&quot;  A forehead-slap is in order!  Now I am enlightened...</description>
			<content:encoded><![CDATA[Thanks very much, Terry.  Not long after posting that, I did come to the realization that Privoxy uses UNIX regular expressions.  Believe it or not, I hadn't known of their existence before, and thought that Privoxy was using some kind of proprietary pattern-matching expressions which they didn't explain very well...  But it was written directly at the beginning of Chapter 9 of Privoxy's manual that one should "be familiar with HTML syntax, and, of course, regular expressions."  A forehead-slap is in order!  Now I am enlightened...]]></content:encoded>
			<link>http://daniel.hahler.de/privoxy-filter-for-google-analytics#c12554</link>
		</item>
				<item>
			<title> terry chay [Visitor] in reply to: Privoxy Filter for Google Analytics</title>
			<pubDate>Wed, 24 Dec 2008 22:46:57 +0000</pubDate>
			<dc:creator>terry chay [Visitor]</dc:creator>
			<guid isPermaLink="false">c12103@http://daniel.hahler.de/</guid>
			<description>1) It means any run of characters other than a greater-than sign.&lt;br /&gt;
2) It makes the match non-greedy (i.e. .*? captures the minimal amount of text that matches the pattern).&lt;br /&gt;
3) It&#039;s a word boundary. It means that the match has to start at the beginning of a &quot;word&quot; (e.g. theurchinTracker() won&#039;t match but urchinTracker() will).&lt;br /&gt;
4) Try refreshing, and remember to include the filter rule in user.action with a trailing &quot;/&quot; so it filters all URLs.&lt;br /&gt;
&lt;br /&gt;
I hope this helps.&lt;br /&gt;
&lt;br /&gt;
Take care,&lt;br /&gt;
&lt;br /&gt;
terry</description>
			<content:encoded><![CDATA[1) It means any run of characters other than a greater-than sign.<br />
2) It makes the match non-greedy (i.e. .*? captures the minimal amount of text that matches the pattern).<br />
3) It's a word boundary. It means that the match has to start at the beginning of a "word" (e.g. theurchinTracker() won't match but urchinTracker() will).<br />
4) Try refreshing, and remember to include the filter rule in user.action with a trailing "/" so it filters all URLs.<br />
<br />
I hope this helps.<br />
<br />
Take care,<br />
<br />
terry]]></content:encoded>
			<link>http://daniel.hahler.de/privoxy-filter-for-google-analytics#c12103</link>
		</item>
				<item>
			<title> Casey Jones [Visitor] in reply to: Privoxy Filter for Google Analytics</title>
			<pubDate>Fri, 05 Dec 2008 05:06:45 +0000</pubDate>
			<dc:creator>Casey Jones [Visitor]</dc:creator>
			<guid isPermaLink="false">c12051@http://daniel.hahler.de/</guid>
			<description>&lt;p&gt;Oh my, I&#039;m terribly sorry! Let&#039;s try that once more...&lt;/p&gt;

&lt;p&gt;As you&#039;re probably aware, since you wrote this helpful filter Google has released a newer Analytics script, called &quot;ga.js&quot; instead of &quot;urchin.js&quot;, and its function/method is called _gat._getTracker() rather than urchinTracker(). Here&#039;s an article about it, for anyone who hasn&#039;t come across it yet:&lt;/p&gt;

&lt;p&gt;http://www.epikone.com/blog/2007/10/16/gajs-new-google-analytics-tracking-code/&lt;/p&gt;

&lt;p&gt;I&#039;ve tried my best to write a couple of supplemental lines for your filter (supplemental because the old Analytics code your filter deals with is also still in use), but I&#039;m a complete newb with Privoxy and would appreciate any syntax-checking that you or anyone else can offer:&lt;/p&gt;

&lt;p&gt;s|&amp;lt;script.*?google-analytics\.com/ga\.js*.?&amp;lt;/script&amp;gt;||gis
&lt;br /&gt;s|&amp;lt;script\s[^&amp;gt;]*?_gat\._getTracker(*.).*?&amp;lt;/script&amp;gt;||gis&lt;/p&gt;

&lt;p&gt;There are a few things about Privoxy&#039;s filter syntax that I still don&#039;t grasp after reading the bulk of Chapter 9: &quot;Filter Files&quot; in the manual. This chapter has a sort of storytelling approach, when what I would wish for is a very simple, cut-and-dried reference of filter operators.&lt;/p&gt;

&lt;p&gt;Perhaps some of the things they seem to have left out are more obvious to *nix-minded people; for better or for worse, I come from a Windows background. I&#039;m hoping someone can help me with the answers to these questions:&lt;/p&gt;

&lt;p&gt;1) What does &quot;[^&amp;gt;]*&quot; mean? The manual says the following three things, but I still don&#039;t get it:&lt;/p&gt;

&lt;p&gt;- &quot;* means: &#039;Match an arbitrary number of the element left of myself&#039;&quot;
&lt;br /&gt;- &quot;The [&#039;&quot;] construct means: &#039;a single or a double quote&#039;.&quot;
&lt;br /&gt;- &quot;s/(&amp;lt;body [^&amp;gt;]*)onunload(.*&amp;gt;)/$1never$2/iU
&lt;br /&gt;&quot;... we had to use [^&amp;gt;]* instead of .* to prevent the match from exceeding the &amp;lt;body&amp;gt; tag if it doesn&#039;t contain &quot;OnUnload&quot;, but the page&#039;s content does.&quot;&lt;/p&gt;

&lt;p&gt;My interpretation is that this matches any number of caret OR closing-bracket symbols. However, I don&#039;t know where caret (^) symbols are used in HTML... or how this would help keep the filter from exceeding the body tag. Probably I&#039;m just ignorant of something simple, about which I hope someone can enlighten me!&lt;/p&gt;

&lt;p&gt;2) What does a question-mark indicate? The manual says only the following two things:&lt;/p&gt;

&lt;p&gt;- &quot;s/window\.status\s*=\s*([&#039;&quot;]).*?\1/dUmMy=1/ig
&lt;br /&gt;&quot;The ? in .*? makes this matching of arbitrary text ungreedy. (Note that the U option is not set).&quot;
&lt;br /&gt;- &quot;s/microsoft(?!\.com)/MicroSuck/ig
&lt;br /&gt;&quot;Note the (?!\.com) part (a so-called negative lookahead) in the job&#039;s pattern, which means: Don&#039;t match, if the string &quot;.com&quot; appears directly following &quot;microsoft&quot; in the page.&quot;&lt;/p&gt;

&lt;p&gt;Maybe I&#039;m just dumb, but I still can&#039;t work out the general meaning of the question mark from these two anecdotal explanations. Isn&#039;t any string you write in the filters searched for regardless of whether or not the question mark precedes it? I don&#039;t understand how it changes the filter&#039;s behavior. I&#039;ve tried to use this syntax, er... intuitively in my additions (I guessed!).&lt;/p&gt;

&lt;p&gt;3) The second line of your filter begins with &quot;\b&quot;. What&#039;s that? The manual explains that &quot;\s&quot; indicates a variable amount of whitespace (or none), but I find no mention of &quot;\b&quot;.&lt;/p&gt;

&lt;p&gt;4) Although my additions to your filters are reported to work by the script...
&lt;br /&gt;http://config.privoxy.org/show-url-info?url=[insert URL for testing]
&lt;br /&gt;... the theoretically-filtered scripts still show up when I view the source-code of a page. Why is this? Probably it&#039;s explained somewhere in the manual, but I don&#039;t know where to begin looking and I&#039;m a bit discouraged by my experience with its chapter on filters.&lt;/p&gt;

&lt;p&gt;One fella&#039; reports the same unchanged-source behavior I&#039;m describing, in regard to your own filter, at this link:&lt;/p&gt;

&lt;p&gt;http://sysblogd.wordpress.com/2007/12/06/how-to-among-others-block-google-analytics-java-script-urchinjs-from-revealing-your-site-usage/&lt;/p&gt;

&lt;p&gt;He seems to think it&#039;s working, but neither he nor I am sure. Please help explain if you can.&lt;/p&gt;

&lt;p&gt;PS: I noticed that you didn&#039;t escape the periods in the URL. It ought to process just the same, but formally shouldn&#039;t it be written &quot;google-analytics\.com/urchin\.js&quot;?&lt;/p&gt;</description>
			<content:encoded><![CDATA[<p>Oh my, I'm terribly sorry! Let's try that once more...</p>

<p>As you're probably aware, since you wrote this helpful filter Google has released a newer Analytics script, called "ga.js" instead of "urchin.js", and its function/method is called _gat._getTracker() rather than urchinTracker(). Here's an article about it, for anyone who hasn't come across it yet:</p>

<p>http://www.epikone.com/blog/2007/10/16/gajs-new-google-analytics-tracking-code/</p>

<p>I've tried my best to write a couple of supplemental lines for your filter (supplemental because the old Analytics code your filter deals with is also still in use), but I'm a complete newb with Privoxy and would appreciate any syntax-checking that you or anyone else can offer:</p>

<p>s|&lt;script.*?google-analytics\.com/ga\.js*.?&lt;/script&gt;||gis
<br />s|&lt;script\s[^&gt;]*?_gat\._getTracker(*.).*?&lt;/script&gt;||gis</p>

<p>There are a few things about Privoxy's filter syntax that I still don't grasp after reading the bulk of Chapter 9: "Filter Files" in the manual. This chapter has a sort of storytelling approach, when what I would wish for is a very simple, cut-and-dried reference of filter operators.</p>

<p>Perhaps some of the things they seem to have left out are more obvious to *nix-minded people; for better or for worse, I come from a Windows background. I'm hoping someone can help me with the answers to these questions:</p>

<p>1) What does "[^&gt;]*" mean? The manual says the following three things, but I still don't get it:</p>

<p>- "* means: 'Match an arbitrary number of the element left of myself'"
<br />- "The ['"] construct means: 'a single or a double quote'."
<br />- "s/(&lt;body [^&gt;]*)onunload(.*&gt;)/$1never$2/iU
<br />"... we had to use [^&gt;]* instead of .* to prevent the match from exceeding the &lt;body&gt; tag if it doesn't contain "OnUnload", but the page's content does."</p>

<p>My interpretation is that this matches any number of caret OR closing-bracket symbols. However, I don't know where caret (^) symbols are used in HTML... or how this would help keep the filter from exceeding the body tag. Probably I'm just ignorant of something simple, about which I hope someone can enlighten me!</p>

<p>2) What does a question-mark indicate? The manual says only the following two things:</p>

<p>- "s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig
<br />"The ? in .*? makes this matching of arbitrary text ungreedy. (Note that the U option is not set)."
<br />- "s/microsoft(?!\.com)/MicroSuck/ig
<br />"Note the (?!\.com) part (a so-called negative lookahead) in the job's pattern, which means: Don't match, if the string ".com" appears directly following "microsoft" in the page."</p>

<p>Maybe I'm just dumb, but I still can't work out the general meaning of the question mark from these two anecdotal explanations. Isn't any string you write in the filters searched for regardless of whether or not the question mark precedes it? I don't understand how it changes the filter's behavior. I've tried to use this syntax, er... intuitively in my additions (I guessed!).</p>

<p>3) The second line of your filter begins with "\b". What's that? The manual explains that "\s" indicates a variable amount of whitespace (or none), but I find no mention of "\b".</p>

<p>4) Although my additions to your filters are reported to work by the script...
<br />http://config.privoxy.org/show-url-info?url=[insert URL for testing]
<br />... the theoretically-filtered scripts still show up when I view the source-code of a page. Why is this? Probably it's explained somewhere in the manual, but I don't know where to begin looking and I'm a bit discouraged by my experience with its chapter on filters.</p>

<p>One fella' reports the same unchanged-source behavior I'm describing, in regard to your own filter, at this link:</p>

<p>http://sysblogd.wordpress.com/2007/12/06/how-to-among-others-block-google-analytics-java-script-urchinjs-from-revealing-your-site-usage/</p>

<p>He seems to think it's working, but neither he nor I am sure. Please help explain if you can.</p>

<p>PS: I noticed that you didn't escape the periods in the URL. It ought to process just the same, but formally shouldn't it be written "google-analytics\.com/urchin\.js"?</p>]]></content:encoded>
			<link>http://daniel.hahler.de/privoxy-filter-for-google-analytics#c12051</link>
		</item>
				<item>
			<title> Forest [Visitor] in reply to: Privoxy Filter for Google Analytics</title>
			<pubDate>Sun, 20 Jul 2008 03:07:40 +0000</pubDate>
			<dc:creator>Forest [Visitor]</dc:creator>
			<guid isPermaLink="false">c11683@http://daniel.hahler.de/</guid>
			<description>Thanks!</description>
			<content:encoded><![CDATA[Thanks!]]></content:encoded>
			<link>http://daniel.hahler.de/privoxy-filter-for-google-analytics#c11683</link>
		</item>
			</channel>
</rss>
