Add support for bounding box in GeoSearchGenerator #57

Closed
zstadler opened this issue Oct 6, 2019 · 11 comments · Fixed by #58

Comments

@zstadler

zstadler commented Oct 6, 2019

The Wikimedia geosearch API supports a bounding box as an alternative to coordinates+radius as a geographic selector:

gsbbox: Bounding box to search in: pipe (|) separated coordinates of top left and bottom right corners.

and provides an example:

api.php?action=query&list=geosearch&gsbbox=37.8|-122.3|37.7|-122.4

Since the coordinates+radius approach is limited to a 10000 meter radius, combining multiple requests to cover a larger area is a challenge. By contrast, bounding-box searches are easier to aggregate and to integrate with other geographic systems.

Please consider adding support for search based on a bounding box.

@CXuesong CXuesong self-assigned this Oct 7, 2019
@CXuesong CXuesong added this to the v0.7.0 milestone Oct 7, 2019
@zstadler
Author

zstadler commented Oct 7, 2019

See also this Wikimedia API bug report related to the use of gsbbox for geosearch.

@CXuesong
Owner

CXuesong commented Oct 7, 2019

Thanks for the links, @zstadler! I will look into this and work on the implementation after the holiday, that is, tomorrow 😄

@CXuesong
Owner

Published v0.7.0-int.6. You may now use GeoSearchGenerator.BoundingRectangle to specify a small rectangle by its left (longitude), top (latitude), width, and height, and search for pages within it.

I'm planning to refactor the GeoSearch, GeoCoordinate and GeoCoordinateRectangle APIs. I'm going to extract Dimension and Global from the GeoCoordinate structure, and GeoCoordinateRectangle may need some polishing. If you have any more suggestions or feature requests regarding these APIs, feel free to open another issue and let me know 😉

@HarelM

HarelM commented Oct 17, 2019

Thanks for this! :-)
What's a small rectangle?
I'm getting the following error:
OperationFailedException: toobig: Bounding box is too big
I think the exception should indicate how large a bbox I'm allowed to use...
Also, toobig is missing a space :-)

@CXuesong
Owner

CXuesong commented Oct 17, 2019

I've tried this roughly, and ranges of less than 0.2 degrees in longitude and latitude seem okay.

[Fact]
public async Task WpEnGeoSearchTest2()
{
    var site = await WpEnSiteAsync;
    var gen = new GeoSearchGenerator(site) { BoundingRectangle = new GeoCoordinateRectangle(1.9, 47.1, 0.2, 0.2) };
    var result = await gen.EnumItemsAsync().Take(20).FirstOrDefaultAsync(r => r.Page.Title == "France");
    ShallowTrace(result);
    Assert.NotNull(result);
    Assert.True(result.IsPrimaryCoordinate);
}

My hypothesis is that the MW API server ultimately applies the same radius limitation to GeoSearch, so a bounding box cannot bypass it. 10 km is roughly 0.09 degrees of latitude on Earth, so a 20 km diameter is about 0.18 degrees.

So if you are planning to scan a larger area of the earth, you may need to split your range into a grid and request the smaller tiles one by one from the client.
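
The grid approach described above can be sketched roughly as follows. This is only an illustration and not part of WikiClientLibrary: the Tile struct and BoundingBoxGrid.Split are hypothetical names, and the sketch assumes that top is the northern edge and that height extends south, all in degrees. The 0.2-degree tile size used in the usage note comes from the rough experiment above and is not a documented limit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type (not a WikiClientLibrary type).
// Follows the (left, top, width, height) convention used in this thread.
public readonly struct Tile
{
    public Tile(double left, double top, double width, double height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    public double Left { get; }
    public double Top { get; }
    public double Width { get; }
    public double Height { get; }
}

public static class BoundingBoxGrid
{
    // Splits a rectangle into equally sized tiles whose sides are
    // at most maxSize degrees. Assumes top is the northern edge.
    public static List<Tile> Split(double left, double top, double width, double height, double maxSize)
    {
        if (width <= 0 || height <= 0 || maxSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxSize), "Width, height and maxSize must be positive.");

        // Number of columns and rows needed so that no tile exceeds maxSize.
        int cols = (int)Math.Ceiling(width / maxSize);
        int rows = (int)Math.Ceiling(height / maxSize);
        double tileWidth = width / cols;
        double tileHeight = height / rows;

        var tiles = new List<Tile>(cols * rows);
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                // Move east by column and south by row.
                tiles.Add(new Tile(left + c * tileWidth, top - r * tileHeight, tileWidth, tileHeight));
            }
        }
        return tiles;
    }
}
```

For example, BoundingBoxGrid.Split(34.0, 33.0, 0.5, 0.25, 0.2) yields six tiles (3 columns by 2 rows). Each tile could then be turned into a new GeoCoordinateRectangle(tile.Left, tile.Top, tile.Width, tile.Height), as in the test above, and queried separately.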

And toobig is actually the error code from MW API response, like permissiondenied or badtoken.

@HarelM

HarelM commented Oct 17, 2019

Thanks for the quick response!
This is what I do right now with the 10 km radius search, only the circles overlap, and I thought I'd be able to do it in one bbox call instead of around 1000.
Here's the relevant code I was hoping to simplify... :-/
https://github.com/IsraelHikingMap/Site/blob/5bf63fc2a0e2c1a22bf82d3f1175141b45c25356/IsraelHiking.API/Services/Poi/WikipediaPointsOfInterestAdapter.cs#L77

@HarelM

HarelM commented Jan 18, 2020

When using the GeoSearchGenerator, it seems that I can't get past the pagination size of 500 results.
The following returns 500 items, but I don't know how to continue to the next page:

var geoSearchGenerator = new GeoSearchGenerator(new WikiSite(wikiClient, new SiteOptions($"https://he.wikipedia.org/w/api.php")))
{
    BoundingRectangle = GeoCoordinateRectangle.FromBoundingCoordinates(34.75, 32, 34.9, 32.15),
    PaginationSize = 1000 // this is ignored
};
var results = await geoSearchGenerator.EnumItemsAsync().ToListAsync(); // this returns only 500...

Let me know if you'd like me to open a new issue for this, or whether I'm missing something.

@HarelM

HarelM commented Jan 18, 2020

Same request from the browser:
https://he.wikipedia.org/w/api.php?action=query&maxlag=5&list=geosearch&gsradius=10&gsprimary=primary&gslimit=500&gsbbox=32.15%7C34.75%7C32%7C34.9
It seems the response doesn't have a continuation parameter? Not sure...
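
For reference, the gsbbox value in the request above follows the top|left|bottom|right order from the API description quoted at the start of this issue. A minimal sketch of building such a value by hand is shown below. GsbboxFormatter is a hypothetical helper, not a WikiClientLibrary API, and it assumes latitudes and longitudes are plain doubles in degrees.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of WikiClientLibrary).
public static class GsbboxFormatter
{
    // Builds the pipe-separated gsbbox value: top|left|bottom|right.
    public static string Format(double top, double left, double bottom, double right)
    {
        if (top < bottom)
            throw new ArgumentException("top must not be south of bottom.");

        // InvariantCulture keeps '.' as the decimal separator on every locale.
        return string.Join("|",
            top.ToString(CultureInfo.InvariantCulture),
            left.ToString(CultureInfo.InvariantCulture),
            bottom.ToString(CultureInfo.InvariantCulture),
            right.ToString(CultureInfo.InvariantCulture));
    }

    // Percent-encodes the value for use in a query string ('|' becomes %7C).
    public static string FormatEscaped(double top, double left, double bottom, double right)
    {
        return Uri.EscapeDataString(Format(top, left, bottom, right));
    }
}
```

With the coordinates from the request above, GsbboxFormatter.FormatEscaped(32.15, 34.75, 32, 34.9) gives 32.15%7C34.75%7C32%7C34.9, which matches the gsbbox parameter in that URL.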

@CXuesong
Owner

It seems so. GeoSearch does not support pagination for now. Here is an example response from https://en.wikipedia.org/w/api.php?action=query&maxlag=5&list=geosearch&gsradius=10&gsprimary=primary&gslimit=2&gsbbox=32.15%7C34.75%7C32%7C34.9

{
    "batchcomplete": "",
    "query": {
        "geosearch": [
            {
                "pageid": 18328987,
                "ns": 0,
                "title": "Beit Zvi",
                "lat": 32.078408333333336,
                "lon": 34.821713888888894,
                "dist": 489.4,
                "primary": ""
            },
            {
                "pageid": 46324352,
                "ns": 0,
                "title": "HaAliya HaShniya Garden",
                "lat": 32.0697,
                "lon": 34.8148,
                "dist": 1127.4,
                "primary": ""
            }
        ]
    }
}

@CXuesong
Owner

I think the continuation problem was originally tracked in phab:T95241, which was closed as a duplicate of phab:T78703.

Unfortunately, I don't think T78703 is going to be resolved soon...

@CXuesong
Owner

Let's use #64 to track this.


3 participants