Exporting Nike Run Club Data with Python

Nike Run Club (NRC) is a popular app for tracking running workouts and is widely used by runners to log their activities and monitor key performance metrics like distance, pace, and elevation. However, despite its robust tracking capabilities, it lacks an easy way to export data out of the app for deeper analysis.

This became a challenge when we tried to use Nike Run Club data to prepare for an ultra-marathon, where we needed to analyze past runs to help plan pacing and calorie goals over long distances. Fortunately, we discovered a few REST API endpoints that allow us to export the data, and we’ll show you how to use them in this article. We’ll cover everything from downloading the data from the API to re-formatting it from JSON to CSV with Python to exploring the results in Power BI.

So, without further ado, let’s get into it, starting with the downloads from the API.

Download from API

Authentication

First things first, we need an API token to authenticate our API calls, which proves to the server that we’re allowed to access the data for our account. Nike doesn’t provide a direct way to get one, but we found a workaround.

Start by going to https://www.nike.com/member/profile. If you aren’t already signed in, you’ll be prompted with a sign-in screen. If you are already signed in, sign out first, since we’ll grab the access token during the sign-in process.

Before signing in, open the browser’s developer tools. In Google Chrome, you can do this by right-clicking the page and selecting Inspect. Navigate to the Network tab to record all of the traffic that goes through during the sign-in process.

You can now sign in to your account, and as you do, you’ll see the Network tab fill up with a whole bunch of different requests. Look through those requests for one containing your API token; we can usually find it in one of the requests named NIKECOM, under the Request Headers section, in the Authorization field after the word Bearer.

That whole big string after Bearer is your API token, so we’ll copy it and paste it in as a variable in our Jupyter notebook. It goes into a headers dictionary, which gets passed with each API call to authorize access to our Nike Run Club account.
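As a minimal sketch, that setup looks something like this (the token value below is a hypothetical placeholder, not a real token):

```python
# Paste the long string that appears after "Bearer" in the Authorization
# header. The value below is a hypothetical placeholder, not a real token.
TOKEN = "eyJhbGciOiJSUzI1NiIs..."

# The headers dictionary that gets passed with every API call
headers = {"Authorization": f"Bearer {TOKEN}"}
```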

For security reasons, the token only works for a limited time, so you’ll want to repeat this process to grab a fresh token each time before you run the calls.

Now that we have our API token, we’re ready to start calling the API and exporting our data.

Activity summaries

We’ll start with the first endpoint, which gives a basic summary of each activity, or run. It returns a dictionary listing all of the runs we’ve tracked, along with a bunch of info and summary metrics about each.

Note that the endpoint only returns about 30 results at a time, so if you have more than 30 runs tracked, we’ll have to use the paging info at the bottom of the response.

The paging info contains an after_id attribute, which tells us where we left off so we can request the next set of results. Once we’ve retrieved all available results, there’s no after_id anymore, so we can set up a while loop that keeps calling the API until we no longer get an after_id.

We use the after_id by plugging it into the endpoint URL with an f-string, and we keep a list of results that we append each new batch to. Finally, we’ll put this list into a dictionary so that we can save the whole thing as a JSON file.
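Here’s a minimal sketch of that loop, reusing the headers from earlier. The endpoint URLs follow the pattern shared in community write-ups of this API rather than any official documentation, and the "activities" and "paging" keys are our assumptions about the response shape:

```python
import json
import requests

# Endpoint pattern taken from community write-ups of this API;
# it isn't officially documented, so treat it as an assumption.
BASE_URL = "https://api.nike.com/sport/v3/me/activities"

all_activities = []
after_id = None

while True:
    # The first call starts from the beginning; later calls resume
    # from wherever the previous page left off.
    if after_id:
        url = f"{BASE_URL}/after_id/{after_id}"
    else:
        url = f"{BASE_URL}/after_time/0"
    data = requests.get(url, headers=headers).json()

    # "activities" and "paging" are our guesses at the response keys
    all_activities.extend(data.get("activities", []))
    after_id = data.get("paging", {}).get("after_id")
    if not after_id:
        break  # no after_id means we've retrieved everything

# Wrap the list in a dictionary and save the whole thing as one JSON file
with open("activity_summaries.json", "w") as f:
    json.dump({"activities": all_activities}, f)
```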

We now have a solid summary of each of the runs that we tracked, but there’s a second endpoint we’ll have to use to get the full details.

Activity details

The second endpoint provides the full details on one of the runs that we tracked, and each run is given an identifier called activity_id that we can use to specify which run we want details for.

Since the summaries dictionary lists out all of the runs we’ve logged, we can use a for loop to iterate through them and get the full details for each. We’ll start by getting each run’s activity_id and plugging it into a custom URL that requests the details for that specific activity. Once we have the URL, we make the API call and once again parse the output as JSON, with all of the activity details listed out.

Now we’ll save each of these outputs as its own JSON file, and as the notebook iterates through the different activities, we can watch the individual files being written to disk.
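A sketch of that loop follows; the endpoint pattern comes from the same community write-ups, and the "id" key is our guess at the field holding each activity_id:

```python
import json
import requests

with open("activity_summaries.json") as f:
    summaries = json.load(f)

for activity in summaries["activities"]:
    # "id" is our guess at the field holding the activity_id
    activity_id = activity["id"]
    url = f"https://api.nike.com/sport/v3/me/activity/{activity_id}?metrics=ALL"
    details = requests.get(url, headers=headers).json()

    # Write each activity's full details to its own JSON file
    with open(f"activity_{activity_id}.json", "w") as out:
        json.dump(details, out)
```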

With all of our data exported out of Nike Run Club and downloaded to our computer, the next step is to put it into a format where we can actually start analyzing it.

Formatting the Data

The REST API returns the data in JSON format, but we usually like to work with data in a tabular format, like a CSV file. We’re going to transform it into three different CSV files:

  • An activity summaries file that lists out the basic summary metrics of each run (i.e. each run has its own row)
  • An activity details file with detailed metrics throughout the course of each run, all put together into one long format
  • A pivoted location data file that has the latitude, longitude, and elevation together over the course of the run

Activity summaries

The activity summaries that we exported contain all kinds of info about each run, including the start and stop times, total distance, average pace, etc. Each of these can be added to the activity summaries file, but for now we’ll start with just the activity_id for each workout.

We still have the activity summaries listed out in our combined data dictionary, so we can once again iterate through each activity, grab its activity_id, and add them all to a dataframe.

Each summary also contains the start and end time of the run, so we can add those as columns too. However, the start and end times are stored as Unix integers, which represent each time as the number of milliseconds since January 1st, 1970. We want a normal timestamp format, so we’ll define a quick function that converts each Unix value into the desired timestamp.

We can then use our function to add the start and end times as additional columns in our dataframe.
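Here’s one way those steps might look. The field names ("id", "start_epoch_ms", "end_epoch_ms") are our guesses at the JSON structure, so adjust them to match your actual data:

```python
import json
import pandas as pd

# Reload the combined summaries saved earlier
with open("activity_summaries.json") as f:
    summaries = json.load(f)

def convert_unix_ms(value):
    # NRC stores times as milliseconds since January 1st, 1970 (the Unix epoch)
    return pd.to_datetime(value, unit="ms")

# Field names ("id", "start_epoch_ms", "end_epoch_ms") are assumptions
# about the JSON structure
runs = pd.DataFrame({
    "ACTIVITY_ID": [a["id"] for a in summaries["activities"]],
    "START_TIME": [convert_unix_ms(a["start_epoch_ms"]) for a in summaries["activities"]],
    "END_TIME": [convert_unix_ms(a["end_epoch_ms"]) for a in summaries["activities"]],
})
```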

Now let’s grab all of the summary metrics that we want from the data. Each summary metric sits in its own dictionary inside a list, so it takes a little extra code to find the dictionary we want and extract the value from it. We’ll code up another function for this: it takes the list of dictionaries along with the metric we want and how we want it summarized, and it returns the corresponding value.

We can then use this function to add each of the summary values that we want as columns in the dataframe, including average pace, average speed, total distance, total ascent, total descent, and total calories.
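A sketch of that helper, with two of the columns added as examples; the "metric", "summary", and "value" keys and the "summaries" field are assumptions about how the JSON is laid out:

```python
def get_summary_value(summary_list, metric, summary_type):
    # Each entry looks something like
    # {"metric": "distance", "summary": "total", "value": 12.3};
    # the key names here are assumptions about the JSON layout.
    for item in summary_list:
        if item.get("metric") == metric and item.get("summary") == summary_type:
            return item.get("value")
    return None  # the run didn't record this metric

# Add summary values as columns, e.g. total distance and total calories
runs["TOTAL_DISTANCE"] = [
    get_summary_value(a.get("summaries", []), "distance", "total")
    for a in summaries["activities"]
]
runs["TOTAL_CALORIES"] = [
    get_summary_value(a.get("summaries", []), "calories", "total")
    for a in summaries["activities"]
]
```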

Finally, we’ll save this dataframe as our activity summaries CSV, and we’re ready to move on to the next one.

Activity details

The next CSV that we want is a full activity details one that lists out a bunch of different metrics throughout each run.

We’ll start by defining a list of all the metrics we want to pull from the JSON data. Since the data is broken out into a separate JSON file for each activity, we’ll use a for loop to iterate through the activities and read in each file. You’ll notice that each JSON file lists the different metric types as dictionaries, each defining what the metric is, what its units are, and all of its values over time.

We’ll create a function that takes this list of metrics and returns the desired metric in a dataframe format.
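A sketch of such a function, assuming each metric dictionary carries "type" and "values" keys (our guess at the structure):

```python
import pandas as pd

def metric_to_frame(metrics_list, metric_type):
    # Scan the list of metric dictionaries for the requested type and
    # return its values as a dataframe. The "type" and "values" key
    # names are assumptions about the JSON structure.
    for m in metrics_list:
        if m.get("type") == metric_type:
            return pd.DataFrame(m["values"])
    return None  # this run didn't track the requested metric
```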

With this function, we can iterate through each of the desired metrics and get each set of values as a dataframe. Provided that the desired metric was found, we’ll add in the activity ID and append everything to one big dataframe.

Note that, as before, we’ll have to use our Unix conversion function to turn the Unix integers back into normal timestamps.
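Putting it together, the loop below reads each activity file, pulls out every desired metric, tags it with the activity ID, and stacks everything into one long dataframe. The metric names in DESIRED_METRICS are illustrative, and the file and key names follow the earlier sketches:

```python
import glob
import json

# Metric names here are illustrative; swap in whichever ones you want
DESIRED_METRICS = ["latitude", "longitude", "elevation", "pace", "heart_rate"]

frames = []
for path in glob.glob("activity_*.json"):
    with open(path) as f:
        details = json.load(f)
    for metric in DESIRED_METRICS:
        frame = metric_to_frame(details.get("metrics", []), metric)
        if frame is not None:  # only keep metrics the run actually tracked
            frame["ACTIVITY_ID"] = details["id"]
            frame["METRIC"] = metric
            frames.append(frame)

details_df = pd.concat(frames, ignore_index=True)

# Convert the Unix millisecond columns to normal timestamps, as before
for col in ("start_epoch_ms", "end_epoch_ms"):
    if col in details_df.columns:
        details_df[col] = pd.to_datetime(details_df[col], unit="ms")
```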

Finally, we can save it as our activity details CSV, and we’re ready to move on to our third and final one.

Pivot location data

Although we have a CSV with all of the metrics in it, it’s hard to analyze when they’re broken out onto different rows. Instead, we want to pivot the different metric types out into their own columns so that we can analyze them together.

For now, we’re just going to pivot out the latitude, longitude, and elevation metrics, since these are usually tracked at the same cadence. We’ll start by filtering to just these three metrics, and then use the pivot_table method to pivot each of them out into its own column.

Let’s make sure every row has all three metrics by filtering out any rows with NAs. Also, since Python is case-sensitive, we like to standardize all column names to upper case, which we can do with a quick little loop.
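A sketch of the pivot, working from the long-format dataframe built above; the "value" column name is an assumption about how the metric values come through:

```python
# Filter to the three location metrics and pivot them into columns;
# the "value" column name is an assumption about the long-format data.
location = details_df[details_df["METRIC"].isin(["latitude", "longitude", "elevation"])]
location = location.pivot_table(
    index=["ACTIVITY_ID", "start_epoch_ms"],
    columns="METRIC",
    values="value",
).reset_index()

# Drop any rows missing one of the three metrics
location = location.dropna(subset=["latitude", "longitude", "elevation"])

# Standardize every column name to upper case with a quick loop
location.columns = [str(col).upper() for col in location.columns]
```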

Although we have a timestamp for each measurement, we want to know the time of each relative to when the run started, i.e. the number of hours into the run. To calculate this, we first need the timestamp of when the run started, which is still stored in our runs dataframe. We can join the start times in and calculate the time into each run by taking the difference between each measurement’s timestamp and the run’s start time.
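That join and calculation might look like this, continuing from the dataframes sketched above:

```python
# Join each run's start time in from the runs dataframe, then compute
# the number of hours into the run for each measurement
location = location.merge(
    runs[["ACTIVITY_ID", "START_TIME"]], on="ACTIVITY_ID", how="left"
)
location["HOURS_INTO_RUN"] = (
    location["START_EPOCH_MS"] - location["START_TIME"]
).dt.total_seconds() / 3600
```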

Since we have latitude and longitude coordinates for each data point, we can calculate the distance traveled between points by comparing the current and previous sets of coordinates. First, we’ll define new columns with the previous latitude and longitude for each row. Then, we’ll define a function that compares the two coordinate pairs and calculates the distance between them.

We’ll apply it to the dataframe to get the distance between each current and previous data point, and then calculate the total distance run at each point by taking a cumulative sum of the segment distances.
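Here’s one way to do that with a standard haversine formula; the article doesn’t name the exact distance formula used, so treat this as one reasonable choice, computed vectorized rather than row by row:

```python
import numpy as np

def haversine_miles(lat1, lon1, lat2, lon2):
    # Great-circle distance between two coordinate pairs, in miles
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 3956 * 2 * np.arcsin(np.sqrt(a))  # 3956 miles is roughly Earth's radius

# Make sure points are ordered in time within each run before shifting
location = location.sort_values(["ACTIVITY_ID", "START_EPOCH_MS"])
location["PREV_LATITUDE"] = location.groupby("ACTIVITY_ID")["LATITUDE"].shift(1)
location["PREV_LONGITUDE"] = location.groupby("ACTIVITY_ID")["LONGITUDE"].shift(1)

# Distance between consecutive points, then a running total per activity
location["SEGMENT_DISTANCE"] = haversine_miles(
    location["PREV_LATITUDE"], location["PREV_LONGITUDE"],
    location["LATITUDE"], location["LONGITUDE"],
).fillna(0)
location["TOTAL_DISTANCE"] = location.groupby("ACTIVITY_ID")["SEGMENT_DISTANCE"].cumsum()
```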

Lastly, we’ll save this as our third and final CSV, the pivoted location data file.

Power BI Visualizations

Now that we have the data in a nice CSV format, we can do some quick exploratory analysis on it in Power BI.

We’ll open up Power BI, use the “Get Data” button to select our CSV files, and hit “Apply Changes” to load the data in.

Once it loads, you’ll notice that Power BI automatically sets up a one-to-many relationship between the activity summaries table and the location data table.

We’ll put together some quick exploratory visuals, starting with a table that lists a summary of each activity.

We can also create an elevation profile with a line chart: average elevation on the y-axis and total distance on the x-axis. Note that this won’t make much sense until we filter to one of our runs by selecting a row in the summaries table.

We can also create an interactive map using the latitude and longitude coordinates in our location data to map out the path of each run.

And with that, we have a nice little interactive report where we can toggle through the different runs to see the path and elevation profile of each.

Conclusion

Exporting data from Nike Run Club may not be straightforward, but it can be done with the right tools and a bit of Python code. For us, this data was invaluable in helping to plan pace goals and review performance leading up to an ultra-marathon.

We went through it all a bit fast, so feel free to download the full notebook of Python code to try it out yourself.

If you found this guide helpful, feel free to drop a comment below with any questions or feedback. Your input helps us improve future posts and inspires new ideas.
