Skip to content
This repository has been archived by the owner on Nov 6, 2023. It is now read-only.

Batch trivialize unnecessary cloudfront.net rewrites #15998

Merged
merged 20 commits into from
Jul 17, 2018
Merged

Batch trivialize unnecessary cloudfront.net rewrites #15998

merged 20 commits into from
Jul 17, 2018

Conversation

cschanaj
Copy link
Collaborator

@cschanaj cschanaj commented Jul 2, 2018

Attached the source code used to perform the batch rewrites

@cschanaj cschanaj closed this Jul 2, 2018
@cschanaj cschanaj reopened this Jul 2, 2018
@cschanaj cschanaj closed this Jul 2, 2018
@cschanaj cschanaj reopened this Jul 2, 2018
@@ -55,7 +55,7 @@


<rule from="^http://c\.cksource\.com/"
to="https://d2xc7e2akbvihw.cloudfront.net/" />
to="https://c.cksource.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@@ -87,7 +87,7 @@
to="https://www.comixology.com/" />

<rule from="^http://support\.comixology\.com/"
to="https://d3jyn100am7dxp.cloudfront.net/" />
to="https://support.comixology.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@@ -28,9 +28,9 @@
to="https://$1dealerrater.com/" />

<rule from="^http://cdn-static\.dealerrater\.com/"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This subdomain is not listed in the targets

@@ -28,9 +28,9 @@
to="https://$1dealerrater.com/" />

<rule from="^http://cdn-static\.dealerrater\.com/"
to="https://d3go5n7g2hd1g0.cloudfront.net/" />
to="https://cdn-static.dealerrater.com/" />

<rule from="^http://cdn-user\.dealerrater\.com/"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This subdomain is not listed in the targets

@@ -44,7 +44,7 @@
to="https://s3.amazonaws.com/$1.filemobile.com/" />

<rule from="^http://cimg\.filemobile\.com/"
to="https://d2ksawk1637r8x.cloudfront.net/" />
to="https://cimg.filemobile.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@@ -70,7 +70,7 @@ Non-2xx HTTP code: http://cfs1.gcaptain.com/ (200) => https://d2m8pnbf2v4ae2.clo
to="https://d38ecmhxsvwui3.cloudfront.net/" />

<rule from="^http://cfs1\.gcaptain\.com/"
to="https://d2m8pnbf2v4ae2.cloudfront.net/" />
to="https://cfs1.gcaptain.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@@ -47,7 +47,7 @@ Non-2xx HTTP code: http://www.thinksteroids.com/ (200) => https://www.thinkstero


<rule from="^http://cdn\.thinksteroids\.com/"
to="https://d1uxx6tvzy89fj.cloudfront.net/" />
to="https://cdn.thinksteroids.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@@ -65,7 +65,7 @@
<test url="http://assets.rovio.com/rovio/favicon.ico" />

<rule from="^http://assets-production\.rovio\.com/"
to="https://ductl1gjw76cl.cloudfront.net/" />
to="https://assets-production.rovio.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@@ -112,7 +112,7 @@ Fetch error: http://me.sailthru.com/ => https://www.sailthru.com/: Too many redi
to="https://na-sj08.marketo.com/" />

<rule from="^http://media\.sailthru\.com/"
to="https://d1t6td7a2qcurl.cloudfront.net/" />
to="https://media.sailthru.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@@ -60,7 +60,7 @@
to="https://www.scmagazine.com/" />

<rule from="^http://media\.scmagazine\.com/"
to="https://dw5yb2c18zulu.cloudfront.net/" />
to="https://media.scmagazine.com/" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule can be removed (covered by the trivial rewrite)

@Bisaloo
Copy link
Collaborator

Bisaloo commented Jul 2, 2018

I suggested simplifications. It's okay if you think the manual removal of redundant rules is outside the scope of this PR.

On the other hand, #15998 (review) and #15998 (review) should be addressed.

@Bisaloo Bisaloo self-assigned this Jul 2, 2018
@cschanaj
Copy link
Collaborator Author

cschanaj commented Jul 2, 2018

@Bisaloo I have removed the redundant rules and addressed the missing targets. It seems that DealerRater.com.xml require a bit more modifications. Travis is passing now. Thank you very much!!

@Bisaloo
Copy link
Collaborator

Bisaloo commented Jul 2, 2018

I'll wait a bit before merging in case @gloomy-ghost @Hainish @jeremyn or @J0WI wants to double check.

@J0WI
Copy link
Contributor

J0WI commented Jul 3, 2018

I'm not so familiar with nodejs, but I think this is a valuable enhancement.
How did you find the hosts in list.js?

@Hainish
Copy link
Member

Hainish commented Jul 4, 2018

In general ruleset parsing scripts should be checked in, along with the methodology to generate any data sets that the parsing script relies on (such as the hosts in list.js).

Some of the host clean-up scripts that @RReverser has run haven't been checked in, but I think this is a good practice in general, and we may want to start a util/ just for this.

@RReverser
Copy link
Contributor

Some of the host clean-up scripts that @RReverser has run haven't been checked in

I think I have checked in all of them, which one you have in mind?

@RReverser
Copy link
Contributor

Looking at the diff in this PR, many of these should be possible to actually trivialise further by running node utils/trivialize-rules on top of these changes. @cschanaj would you want to try that?

@Bisaloo
Copy link
Collaborator

Bisaloo commented Jul 5, 2018

@RReverser, it looks like this was done already in 61cec8e. Do you think the subsequent commits allow for further simplification by the script?

@RReverser
Copy link
Contributor

@Bisaloo Ah, I've just seen Adk2.xml which can be trivialised further, but it can't by automatic script as there is securecookie tag (which might be used for cookie on other subdomains).

@cschanaj
Copy link
Collaborator Author

cschanaj commented Jul 5, 2018

@J0WI @Hainish the script I used to find the host is rather dirty and include a bit of manual steps.

  1. ./make.sh
  2. node index.js > input.txt
  3. cat input.txt | while read host; do curl -IsS https://$host/ -o /dev/null && echo $host; done > success.txt
  4. manual changes to success.txt to list.json

https://gist.github.com/cschanaj/2ee89cb791eeda0e764b167c5c3f4ede

@RReverser I agree that some rulesets can be futher trivialized but the securecookie tags forbidden that. I tried to make some changes (still work in progress) to the script to identify static cookies and it seem to work. I can submit a PR later if I can verify the modified script is really working. https://github.com/cschanaj/https-everywhere/blob/trivialize-rules.js/utils/trivialize-rules/trivialize-rules.js

@Hainish
Copy link
Member

Hainish commented Jul 5, 2018

@cschanaj it's okay if the scripts are sloppy, I think the important part is that the process is documented and can be reproduced in the future. If someone comes along and cleans it up (or you have time to down the line), all the better.

@Bisaloo
Copy link
Collaborator

Bisaloo commented Jul 14, 2018

@cschanaj, it would be great if you could add your script to the repo as suggested by @Hainish.

Otherwise, this is ready to merge.

@cschanaj
Copy link
Collaborator Author

@Bisaloo @Hainish I have pushed a final version of the script in 87a9087. Sorry for force pushing a few time. This PR is ready to merge if you agree.

@Bisaloo
Copy link
Collaborator

Bisaloo commented Jul 16, 2018

Can I edit the commit message to remove intermediate commits and just keep Batch trivialize unnecessary cloudfront.net rewrites when I squash and merge or do you wish to keep all intermediate steps?

@cschanaj
Copy link
Collaborator Author

@Bisaloo please feel free to edit the commit message. thanks!!

@Hainish
Copy link
Member

Hainish commented Jul 17, 2018

Thanks for this PR @cschanaj and for the review @Bisaloo!

@Bisaloo Bisaloo merged commit 812d5bb into EFForg:master Jul 17, 2018
@Bisaloo Bisaloo removed their assignment Jul 17, 2018
@cschanaj cschanaj deleted the cf-rewrite branch July 17, 2018 12:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants