How I Always Get The Best Ferry Discounts (and Why Length Matters)

An important thing to remember when developing web applications is that your own front-end JavaScript app isn't the only client that can use your API. Your API might not be intended to be public, but it effectively is: if the server's HTTP API is available at a public address, anyone can try to use it with an HTTP client of their choice.

This is a short story about how I found all available discount codes for a ferry operator's web shop, and successfully paired the codes with ferry trips.

One day, I was just checking prices, with no extra intentions, for a trip I was going to make. While browsing, I noticed that the ferry operator's site contained a very familiar looking form for inserting discount codes.

Very familiar looking form for inserting discount codes

For some reason, I wanted to see how the form would react to made up codes.

Site tells you immediately after inserting a code whether or not it is valid

It told you immediately after inserting a code whether or not it was valid.

Figuring Out The Inner Workings of The Web Site

Being a curious developer, I decided to take a look at how the voucher validation worked. Do the codes follow a certain pattern with a check digit that can be validated in the front end? Is every code validated by the server? I opened the Network tab in Chrome developer tools and saw that each entered code was indeed validated with an HTTP request to an API endpoint.

Chrome developer tools

In the validation request, the POST payload was simply the voucher entered by the user as a string "12345", and the API responded with the JSON {"valid": false}.

Realising a Vulnerability

I happened to have access to a valid discount code. (One of those almost worthless codes that every web-shop is sharing eagerly in hopes of new customers.) The code I had was a 5-digit-long number.

5 digits seemed short, so out of curiosity, I did some quick calculations. Assuming that for some reason all the vouchers are 5 digits long, there are 100 000 unique possibilities [00000, 00001, ..., 99999].

A single request from my browser took a little less than 150ms. 100 000 * 150ms = 15 000 s, or about 4.2 hours.

Going through every possible 5-digit voucher one at a time would take just over 4 hours. Since most of the 150ms of a single roundtrip is spent waiting for the request or the response to be delivered over the internet, checking can be made much faster with concurrent requests.
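The estimate is easy to reproduce in a few lines. This is only a rough sketch: the 150ms latency is the figure measured above, the worker counts are illustrative assumptions, and it presumes that the server's response time does not grow under load.

```python
# Rough estimate of how long an exhaustive search would take.
CODES = 10 ** 5        # every 5-digit code, 00000 through 99999
LATENCY_S = 0.150      # observed round trip per request, in seconds

def total_hours(codes, latency_s, workers=1):
    """Hours needed if `workers` requests are always in flight."""
    return codes * latency_s / workers / 3600

for workers in (1, 10, 50):   # worker counts are assumptions
    hours = total_hours(CODES, LATENCY_S, workers)
    print(f"{workers:>2} workers: {hours:.2f} h")
```

With ideal scaling, the run time simply divides by the number of requests in flight, which is why concurrency looked so attractive here.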

After doing the calculations, I thought why not. I could practice my Python skills and try brute forcing all the discount codes!

At this point, I didn't rate my chances of success very highly. Maybe I was missing something that would block someone from making the requests programmatically. Maybe the API would start throttling or discarding requests made in large volumes by the same client, even if the first few succeeded.

Making Voucher Checks with a Script

After some quick research, I decided to try Python with the Requests library for making requests to the voucher API.

A POST request can be made in just one line.

r = requests.post('http://httpbin.org/post', data={'key': 'value'})

I had to play around for a while before I found the right set of headers and cookies that the server needed. I had a complete valid request in Chrome developer tools, so I just added pieces of it to my Python request one by one until it started working. It turned out that the following was about the minimum request that got an OK response. No cookies were needed, just the correct Content-Type header.

import requests

VOUCHER_URL = "https://booking.ferryoperator.com/api/voucher"  
HEADERS = {  
    "Content-Type": "application/json; charset=UTF-8"
}
voucher = "12345"

response = requests.post(VOUCHER_URL,  
    data=voucher,
    headers=HEADERS)

print(response.text)  
# {"valid": false}

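I had to check, and a bare literal like "12345" without any curly braces actually is valid JSON (as the required Content-Type header value application/json was suggesting). Strictly speaking it is a JSON value rather than a JSON object; since RFC 7159, any JSON value is allowed as a complete JSON text.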
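Python's own json module offers a quick way to check this kind of thing. The sketch below only illustrates what the JSON grammar accepts; it says nothing certain about how the ferry operator's server parsed the body. One detail worth knowing: Requests sends a str passed as data exactly as given, so whether the body is a JSON string or a JSON number depends on whether the quotes are part of the value.

```python
import json

# A bare scalar is a complete JSON document on its own.
as_string = json.loads('"12345"')   # quoted: parsed as a JSON string
as_number = json.loads('12345')     # unquoted: parsed as a JSON number
print(type(as_string).__name__, type(as_number).__name__)

# A number with a leading zero is not valid JSON, so parsing fails.
try:
    json.loads('01234')
    leading_zero_ok = True
except json.JSONDecodeError as err:
    leading_zero_ok = False
    print("invalid JSON:", err)
```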

Making Concurrent Requests from Python Script

Now that I could make a single voucher code check from my script, I needed to figure out how to do multiple parallel checks. Here is what I learned.

Parallel requests can be made with the threading and queue modules from the Python standard library.

The basic structure for computing parallel tasks from a queue is like this:

from threading import Thread
from queue import Queue


q = Queue()

def check_voucher(code):
    # check the code with an HTTP request
    pass

# The worker thread pulls an item from the queue and processes it.
def worker():
    while True:
        item = q.get()
        check_voucher(item)
        q.task_done()

# Create the thread pool.
for i in range(10):
    t = Thread(target=worker)
    # Daemon threads die when the main thread (the only non-daemon thread) exits.
    t.daemon = True
    t.start()

# Now we have 10 worker threads running and waiting for items to
# appear on the queue.

# Put work items on the queue: every code from 00000 to 99999.
for x in range(100000):
    potential_voucher = str(x).zfill(5)
    q.put(potential_voucher)

# Block until all tasks are done.
q.join()
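As an aside, the same fan-out can be written more compactly with concurrent.futures from the standard library. This is an alternative sketch, not the script I actually ran: check_voucher here is a hypothetical stand-in that touches no network, and the rule it uses for "valid" codes is made up purely so the example runs.

```python
from concurrent.futures import ThreadPoolExecutor

def check_voucher(code):
    """Stand-in for the real HTTP check.

    Hypothetical rule: every code ending in "7" counts as valid, so
    the example runs without sending any requests.
    """
    return code.endswith("7")

def find_valid(codes, workers=10):
    """Check codes concurrently and return the valid ones in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(check_voucher, codes)   # preserves input order
        return [code for code, ok in zip(codes, results) if ok]

codes = [str(n).zfill(5) for n in range(100)]   # small demo range
print(find_valid(codes))
```

The executor takes care of creating the threads, handing out work and waiting for completion, which removes the explicit queue and daemon-thread bookkeeping.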

It seemed that vouchers starting with '0' were never valid. Requests with a leading-zero voucher code did not return false, just a server error. Quick tests with vouchers that were not in the 5-digit format gave similar errors. This hinted that all vouchers might actually share the same 5-digit format.

After letting the script run, I got a nice text file with about 100 unique 5-digit numbers.

$ tail vouchers.txt
99591  
99602  
99646  
99647  
99690  
99846  
99881  
99882  
99956  
99958  
$

A quick manual check at the web store confirmed that the codes actually were valid and accepted!

Accepted voucher codes in web store

Matching Vouchers with Cruise Trips

Next, I realised that while the web shop accepts any valid voucher code in the voucher form, not every voucher gives you a discount on every trip.

Luckily, there seemed to be another API endpoint that would take a trip ID and a voucher as input and return whether the voucher was applied and how much discount it gives for the selected trip.

Since I had already come this far, I decided to try to write a script that would list all voucher codes that give a discount for a selected trip.

The JSON payload for this request was much more complicated. Instead of string manipulation, I decided to use the json library for
1. parsing an example request JSON string (copy-pasted from Dev tools) into a Python object
2. modifying the request data in object form to contain the needed voucher code
3. encoding the Python object back into a JSON string for the HTTP request.

import json

def make_request_data(voucher):  
    example_request = '{"clubClientIds":[null],"passengerTypes":{"adults":1,"children":0,"juniors":0,"youths":0},"sails":{"outwardSailId":5921775},"travelClasses":{"outwards":{"LUX":1},"returns":{},"sameForAll":false},"outwardVehicleCodes":[],"paymentWithClubPoints":false,"isHotelPackage":false,"voyageType":"CRUISE","extraServices":{"outwardExtraServices":[],"returnExtraServices":[]},"landServices":[],"voucherIds":["99246"]}'
    obj = json.loads(example_request)
    # Manipulate the JSON data as a Python dictionary object
    obj['voucherIds'] = [str(voucher)]
    return json.dumps(obj)

In this case, straightforward string concatenation would have been feasible, but this technique could be useful if the example JSON required more modifications per request. As a bonus, the decode-and-encode approach allowed me to easily paste in new example requests with different trip data and so on.

Besides the more complicated request and response format, the second script was very similar to the first one. The second script took as an input the .txt file with all vouchers the first script produced. The second script then made a subset of the list with vouchers that gave a discount for a given trip.
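The filtering step of the second script can be sketched roughly as follows. The actual response format is not reproduced here, so get_discount is a hypothetical callable standing in for "send the trip request for this voucher and read the discount from the response"; the voucher codes and amounts in the demo are made up.

```python
def vouchers_for_trip(voucher_lines, get_discount):
    """Return {voucher: discount} for vouchers that give a discount.

    `get_discount` is a hypothetical callable that would send the trip
    request for one voucher and return the discount amount (0 if none).
    """
    hits = {}
    for line in voucher_lines:
        voucher = line.strip()
        if not voucher:
            continue              # skip blank lines in the file
        discount = get_discount(voucher)
        if discount > 0:
            hits[voucher] = discount
    return hits

# Demo with a fake lookup table instead of real HTTP requests.
fake_discounts = {"10001": 0, "10002": 15, "10003": 5}   # made-up data
lines = ["10001\n", "10002\n", "\n", "10003\n"]
found = vouchers_for_trip(lines, lambda v: fake_discounts.get(v, 0))
print(found)
```

In a real run, voucher_lines would be the open vouchers.txt file, since iterating over a file object yields its lines.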

Now I had a working Python program that I could use, if I wanted, to find all available discounts for a given trip! All I had to do was manually get the ID of the trip from the web shop with, for example, Chrome Dev tools.

Going for a discount trip

Lessons Learned

The moral of the story.

1. If users are not supposed to guess it, use complex enough values

A 5-digit code might be infeasible for a human to guess, but for a machine that faces no forced throttling, it is far too simple. It might be tempting to select short values for better usability, but is it worth the security risk?

A better option would be to use both letters and numbers and make the voucher code partly meaningful. For example, SUMMERCRUISER20 or 15EUROFF2016XMAS is a better compromise between security and memorability/usability than 96320, as long as the form and the words used in the vouchers vary enough. 15-character alphanumeric vouchers have 36^15 = 2.2107392e+23 unique possibilities. That is 2.2107392e+18 times more possibilities than with 5-digit numbers.

This assumes that at least some vouchers are not meant to be public knowledge and are targeted at, for example, a single user as a gift certificate or a small group of special gold club customers.
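The numbers can be checked with a few lines of Python. Note that the full 36^15 figure only applies when every character is chosen independently at random; codes built from dictionary words are more guessable than the raw count suggests. The random_code helper is a hypothetical illustration using the standard secrets module, not something the ferry operator uses.

```python
import secrets
import string

five_digit = 10 ** 5        # 5 positions, 10 digits each
fifteen_alnum = 36 ** 15    # 15 positions, 26 letters + 10 digits each

print(five_digit)
print(f"{fifteen_alnum:.7e}")
print(f"{fifteen_alnum / five_digit:.7e}")

def random_code(length=15):
    """Hypothetical helper: a random uppercase alphanumeric code."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(random_code())
```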

2. The API intended to be used by your own front end can also be used by other clients

Nothing stops someone from opening Developer tools in their browser (or Wireshark or something similar) and seeing examples of how your own client code makes requests to your back end. From a couple of examples it is usually pretty easy to extrapolate and figure out ways to make requests that are valid but not supposed to be made. Authentication helps, but every token or cookie that the browser uses can be copied into self-made custom requests.

The best way is to design your back end API with this in mind and not to assume that the client only uses the API the right way.

  • Brute force/DoS requests should be detected and throttled/discarded.
  • Even if you have data validation in front end, back end should have the final say.
  • Access to other users' data must be blocked in back end. If you have REST API with URLs like this /user/<id>/trip-history/ the back end must restrict the access so that the authenticated user can only request their own data.
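As an illustration of the first point, throttling can be as simple as counting recent requests per client. The sketch below is a minimal in-memory sliding-window limiter under stated assumptions: the class name, the limits and the use of an IP address as the client ID are all hypothetical, and a production service would more likely rely on its web framework, reverse proxy or a shared store for this.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most `limit` calls per `window` seconds for each client."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock                 # injectable for testing
        self.calls = defaultdict(deque)    # client id -> recent timestamps

    def allow(self, client_id):
        now = self.clock()
        recent = self.calls[client_id]
        # Forget timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False                   # over the limit: reject
        recent.append(now)
        return True

# Example: 5 voucher checks per minute per client (limits are assumptions).
limiter = RateLimiter(limit=5, window=60)
results = [limiter.allow("203.0.113.7") for _ in range(7)]
print(results)
```

With a limit like this, walking through 100 000 codes from a single client would take days instead of hours, and the rejected requests would make the attempt easy to spot in the logs.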

I have never used discount codes obtained this way nor do I intend to. I have informed the ferry operator about this issue before making the post.
