Delete page forbidden #270

amihaiemil · 2017-09-18T17:16:21Z

Delete page does not work: AWS returns status 403 FORBIDDEN. Log: http://charles.amihaiemil.com/logs.html?log=/d0181af5-5eda-444c-8654-eb488275935a.log

amihaiemil · 2017-09-30T17:14:43Z

@charlesmike delete this page please

charlesmike · 2017-09-30T17:14:48Z

@charlesmike delete this page please

@amihaiemil Some steps failed when processing your command. See logs for details.
Try again and if the error persists please, open an issue.

amihaiemil · 2017-10-05T14:00:39Z

@charlesmike delete this page please

charlesmike · 2017-10-05T14:00:43Z

@charlesmike delete this page please

@amihaiemil Some steps failed when processing your command. See logs for details.
Try again and if the error persists please, open an issue.

amihaiemil · 2017-10-11T06:55:12Z

@SherifWaly I will explain later today what the problem is

SherifWaly · 2017-10-12T12:31:45Z

@amihaiemil What is the problem here?

amihaiemil · 2017-10-12T13:00:47Z

@SherifWaly When we index each page, we use its URL in plain format as the ID (e.g. the ID of an indexed document is http://example.com/path/to/page.html).

We also use this ID for deletion. I think the problem is that, because of the special characters in the URL (e.g. /), the AWS signature is not generated correctly, so we get 403 FORBIDDEN when we try to perform the operation.
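To make the suspicion concrete, here is a minimal sketch of what a raw URL does to a document path. It assumes the document is addressed as /{index}/{type}/{id}; the class name, the helper names and the "example"/"page" values are hypothetical, and the sketch does not reproduce the AWS signing step itself.

```java
// Hypothetical sketch: shows how a plain URL used as the ID changes the shape
// of a REST-style document path. Not the project's real code.
class RawIdPathSketch {

    // Assumption: the document is addressed as /{index}/{type}/{id}.
    static String documentPath(String index, String type, String id) {
        return "/" + index + "/" + type + "/" + id;
    }

    // Counts the non-empty segments of a path.
    static int segments(String path) {
        int count = 0;
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String raw = documentPath("example", "page", "http://example.com/path/to/page.html");
        String simple = documentPath("example", "page", "abc");
        System.out.println(raw + " has " + segments(raw) + " segments");
        System.out.println(simple + " has " + segments(simple) + " segments");
    }
}
```

With a simple ID the path has the expected three segments; with the raw URL it has more, plus an empty segment from the "//". Whether that is exactly what breaks the signature is still only a guess.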

We need to stop using the plain URL as the ID and turn it into a Base64-encoded String instead. I will come back with the details in about an hour :)

amihaiemil · 2017-10-13T11:54:37Z

@SherifWaly The problem is quite straightforward: everywhere we use the ID, we have the URL of the page and need to turn it into a Base64-encoded String. See an example of encoding here (first answer) -- we don't have Java 8, so use the class from Apache Commons (if we don't have the dependency, declare it with Maven).
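A minimal sketch of the encoding step follows. Two things in it are assumptions rather than decisions from this thread. First, it uses java.util.Base64 (Java 8+) only to stay dependency-free; in this project the Apache Commons Codec class would take its place. Second, it uses the URL-safe Base64 variant without padding, because the standard alphabet contains / and +, which could bring back the very characters we are trying to avoid. Commons Codec offers Base64.encodeBase64URLSafeString for the same purpose. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: turns a page URL into an ID that contains no '/', '+' or '='.
class UrlIdSketch {

    // Encodes the URL with the URL-safe Base64 alphabet, without padding.
    static String encode(String url) {
        return Base64.getUrlEncoder()
            .withoutPadding()
            .encodeToString(url.getBytes(StandardCharsets.UTF_8));
    }

    // Restores the original URL from the encoded ID.
    static String decode(String id) {
        return new String(Base64.getUrlDecoder().decode(id), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url = "http://example.com/path/to/page.html";
        String id = encode(url);
        System.out.println(id);
        System.out.println(decode(id));
    }
}
```

The encoding is reversible, so the original URL can still be recovered from the ID if it is ever needed.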

Now, so far, the ID is used in 2 places:

1) When indexing the page/pages

When we index the pages, we turn them into a JSON "bulk", specific to Elasticsearch (more details about the _bulk API of Elasticsearch here, if you're curious).

The class responsible for turning page(s) into the bulk object is EsBulkJson. There, in the method preparePage, you have to encode the page's URL before assigning it as the ID.

2) When we perform the delete page operation

In class AmazonElasticSearch, method delete(final String type, final String id), the ID also has to be encoded.

Just these 2 changes, plus fixing any failing unit tests.
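The two places could look roughly like the sketch below. It is not the real EsBulkJson or AmazonElasticSearch code: the class, the helper names, the extra index parameter and the /{index}/{type}/{id} path layout are all assumptions for illustration, and java.util.Base64 again stands in for the Apache Commons class. The point is only that both sides derive the ID from the URL in the same way, so the ID written at indexing time is the one used at deletion time.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch of the two places where the encoded ID would be used.
class EncodedIdUsageSketch {

    // Shared encoding step (URL-safe Base64, no padding: an assumption).
    static String encodeId(String url) {
        return Base64.getUrlEncoder()
            .withoutPadding()
            .encodeToString(url.getBytes(StandardCharsets.UTF_8));
    }

    // 1) Indexing: action line of a _bulk "index" operation, with the encoded ID.
    //    Assumes index and type are simple names that need no JSON escaping.
    static String bulkAction(String index, String type, String url) {
        return "{\"index\":{\"_index\":\"" + index
            + "\",\"_type\":\"" + type
            + "\",\"_id\":\"" + encodeId(url) + "\"}}";
    }

    // 2) Deletion: path of the document to delete, with the same encoded ID.
    static String deletePath(String index, String type, String url) {
        return "/" + index + "/" + type + "/" + encodeId(url);
    }

    public static void main(String[] args) {
        String url = "http://example.com/path/to/page.html";
        System.out.println(bulkAction("example", "page", url));
        System.out.println(deletePath("example", "page", url));
    }
}
```

Pages indexed before the change would still be stored under their plain-URL IDs, so they would probably need to be re-indexed before they can be deleted this way.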

Don't hurry with this one, you can also do it next week (this weekend I won't have my laptop with me anyway, until Sunday evening)

amihaiemil added 30 min bug labels Sep 18, 2017

amihaiemil assigned SherifWaly Oct 11, 2017

amihaiemil added the @SherifWaly label Oct 11, 2017
