A solution for TinyURL
This is a problem from LeetCode; you can find the original problem here. The detailed requirements are:
- Convert any URL to a short URL like: http://tinyurl.com/4e9iAk
- The last segment is the id, which has 6 characters, each drawn from '0-9a-zA-Z'
- No two different URLs have the same short URL
- Total count of possible short URLs is 62^6 = 56,800,235,584; if ids start at 0, the largest id is 62^6 - 1 = 56,800,235,583, which we can name MAX_ID.
- For a given URL, assign it a free id (one that has not been used yet) within [0, MAX_ID].
- Store the <id, url> pair in a database such as MySQL or Redis, or even a plain text file.
- Convert the numeric id to the short-URL id, i.e., from base 10 to base 62.
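The base-10 to base-62 conversion above can be sketched in Python. The function names `encode`/`decode`, the zero-padding to 6 characters, and the digit order (following the '0-9a-zA-Z' character set above) are assumptions of this sketch:

```python
# Base-62 alphabet in the order given above: '0'-'9', 'a'-'z', 'A'-'Z'.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62

def encode(num: int) -> str:
    """Convert a numeric id to a base-62 string, zero-padded to 6 chars."""
    digits = []
    while num > 0:
        num, rem = divmod(num, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits)).rjust(6, "0")

def decode(code: str) -> int:
    """Convert a base-62 string back to the numeric id."""
    num = 0
    for ch in code:
        num = num * BASE + ALPHABET.index(ch)
    return num
```

For example, `encode(125)` gives `"000021"` (125 = 2 x 62 + 1), and `decode` inverts it; the largest id, 62^6 - 1, encodes to `"ZZZZZZ"`.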
- How many unique identifiers possible? Will you run out of unique URLs?
62^6 = 56,800,235,584 unique identifiers are possible. Yes, the service will eventually run out of unique short URLs, though only after tens of billions of entries.
- Should the identifier be incremental or not? Which is easier to design? Pros and cons?
It could be either. An incremental identifier is easier to design (a single counter guarantees uniqueness), but it exposes creation order and the counter can become a bottleneck when sharded; a random id avoids both at the cost of collision checks.
- Mapping an identifier to a URL and its reverse - does this problem ring a bell to you?
- How do you store the URLs? Does a simple flat file database work?
Yes, at small scale a simple flat file of <id, url> pairs works, though a real database handles concurrent access and lookups better.
- What is the bottleneck of the system? Is it read-heavy or write-heavy?
- Estimate the maximum number of URLs a single machine can store.
- Estimate the maximum number of queries per second (QPS) for decoding a shortened URL in a single machine.
- How would you scale the service? For example, a viral link which is shared in social media could result in a peak QPS at a moment's notice.
- How could you handle redundancy? I.e., if a server is down, how could you ensure the service is still operational?
- Keep URLs forever or prune them? Pros and cons? How would we do pruning? (Contributed by @alex_svetkin)
- What API would you provide to a third-party developer? (Contributed by @alex_svetkin)
- If you can enable caching, what would you cache and what's the expiry time? (Contributed by @Humandroid)
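Putting the pieces above together, here is a minimal in-memory sketch of the service: an incremental counter for id assignment, a dict standing in for the MySQL/Redis <id, url> table, and a reverse index so no two different URLs get the same short URL. The class and method names are illustrative assumptions, not part of the original problem:

```python
# Base-62 alphabet: '0'-'9', 'a'-'z', 'A'-'Z'.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

class TinyUrl:
    def __init__(self):
        self.counter = 0     # incremental id generator
        self.id_to_url = {}  # stands in for the <id, url> database table
        self.url_to_id = {}  # reverse index: one short URL per long URL

    def shorten(self, url: str) -> str:
        """Assign a free id to the URL (or reuse its existing one)."""
        if url in self.url_to_id:
            return self._to_code(self.url_to_id[url])
        self.counter += 1
        self.id_to_url[self.counter] = url
        self.url_to_id[url] = self.counter
        return self._to_code(self.counter)

    def resolve(self, code: str) -> str:
        """Decode the base-62 id and look up the original URL."""
        num = 0
        for ch in code:
            num = num * 62 + ALPHABET.index(ch)
        return self.id_to_url[num]

    @staticmethod
    def _to_code(num: int) -> str:
        """Numeric id -> 6-character base-62 string."""
        digits = []
        while num > 0:
            num, rem = divmod(num, 62)
            digits.append(ALPHABET[rem])
        return "".join(reversed(digits)).rjust(6, "0")
```

A production service would replace the dicts with a durable store, make the counter increment atomic (e.g. a database sequence or Redis INCR), and put a cache in front of `resolve`, since reads dominate writes.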