Updated geolocate speed #53

Merged on Apr 30, 2021 (4 commits)


Arkoniak (Collaborator)

Closes #52

With this PR, `geolocate` is significantly faster: `@btime` reports 27 ns per lookup, which is about 10^7 times faster than the current implementation.
Unfortunately, this comes at the cost of ~9 GB of geodata in memory and almost a minute of data loading.

So the extra memory usage should be resolved before merging.

codecov-commenter commented Apr 29, 2021

Codecov Report

Merging #53 (32ebeaa) into master (66cc4a7) will increase coverage by 0.58%.
The diff coverage is 94.00%.

@@            Coverage Diff             @@
##           master      #53      +/-   ##
==========================================
+ Coverage   87.83%   88.42%   +0.58%     
==========================================
  Files           2        2              
  Lines          74       95      +21     
==========================================
+ Hits           65       84      +19     
- Misses          9       11       +2     
| Impacted Files | Coverage Δ |
|---|---|
| src/geoip-module.jl | 89.28% <91.30%> (-0.19%) ⬇️ |
| src/data.jl | 88.05% <96.29%> (+0.78%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Arkoniak (Collaborator, Author) commented Apr 30, 2021

With the 0a0375b update, I get the following benchmarks:

using BenchmarkTools
using GeoIP
using StableRNGs
import Sockets: IPv4, @ip_str

db = load(zipfile = "GeoLite2-City-CSV_20210427.zip");

rng = StableRNG(2021)
smp = rand(rng, db.index, 100)
ips = map(smp) do net
    IPv4(net.netaddr + 1)
end

ip = ips[1]    # ip"201.186.185.1"
ipnet = IPv4Net(ip, 32)

julia> @btime geolocate.($db, $ips);
  392.947 μs (4144 allocations: 284.28 KiB)

julia> @btime geolocate($db, $ip)
  2.947 μs (41 allocations: 2.84 KiB)
Dict{String, Any} with 21 entries:

This can still be improved: the lookup itself takes only ~40 ns, so most of the 2.947 μs is spent building the dictionary.
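
To make the split between lookup cost and dictionary-building cost concrete, here is a minimal sketch. It is not GeoIP.jl's actual code or data layout: `STARTS`, `ROWS`, `find_row`, `row_to_dict`, `lookup_row`, and `lookup_dict` are hypothetical names, and the toy table matches on network start addresses only, ignoring prefix lengths. It assumes the index can be searched with a binary search over a sorted vector.

```julia
# Hypothetical toy table (NOT GeoIP.jl's layout): sorted network start
# addresses and one record per network.
const STARTS = UInt32[0x0a000000, 0x0a000100, 0xc0a80000]  # 10.0.0.0, 10.0.1.0, 192.168.0.0
const ROWS = [
    (city = "A", country = "X"),
    (city = "B", country = "Y"),
    (city = "C", country = "Z"),
]

# Cheap step: binary search for the last network start <= addr.
# Returns 0 when addr lies before the first network.
find_row(addr::UInt32) = searchsortedlast(STARTS, addr)

# Expensive step: allocate a Dict{String, Any} from the matched row.
function row_to_dict(row::NamedTuple)
    d = Dict{String, Any}()
    for (k, v) in pairs(row)
        d[String(k)] = v
    end
    return d
end

# Returning the row itself skips the Dict allocation entirely.
function lookup_row(addr::UInt32)
    i = find_row(addr)
    return i == 0 ? nothing : ROWS[i]
end

# Same lookup, followed by the dictionary construction.
function lookup_dict(addr::UInt32)
    row = lookup_row(addr)
    return row === nothing ? nothing : row_to_dict(row)
end
```

Under these assumptions, benchmarking `lookup_row` and `lookup_dict` separately would show how much of the total time goes to building the dictionary rather than to the search.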

Arkoniak marked this pull request as ready for review on April 30, 2021, 10:09
Arkoniak merged commit a000677 into JuliaWeb:master on Apr 30, 2021
Arkoniak deleted the geolocate_improve branch on April 30, 2021, 10:14
Linked issue: geolocate is slow