My pace during stadium runs

December 21, 2025

I often train by running up and down the rows of concrete stairs at Harvard Stadium, near where I live. My watch records a lot of information during these runs. For example, the latitude, longitude, and elevation are saved at regular time intervals. Just for fun, I decided to take a closer look at this data.

In previous posts, I showed how I retrieved metrics and weather data from my Garmin watch, along with some basic analysis of this data, where I concluded that external factors (weather conditions) had an effect on my running performance. Here I want to look at "internal" factors: how do I pace myself during the run?

First, we need to retrieve the data from the Garmin servers. Here is the code I used:

Show code
#!/usr/bin/env python

import argparse
import json

from garminconnect import Garmin, GarminConnectAuthenticationError
# GarthHTTPError comes from the garth package, a garminconnect dependency
from garth.exc import GarthHTTPError

# lat, lon of the stadium
LAT_STADIUM =  42.3669
LON_STADIUM = -71.1260

def is_stadium_run(activity):
    """Heuristic: a run that starts near the stadium and gains 300-400 m."""
    try:
        lat = activity['startLatitude']
        lon = activity['startLongitude']
        el = activity['elevationGain']
    except (KeyError, TypeError):
        return False

    if lat is None or lon is None or el is None:
        return False

    # Crude conversion from degrees to meters (about 110 km per degree);
    # plenty accurate to test whether the start is within 100 m.
    m_per_degrees = 110 * 1000
    dist = m_per_degrees * ((lat - LAT_STADIUM)**2 + (lon - LON_STADIUM)**2)**(1/2)
    dist_threshold = 100  # m

    return dist < dist_threshold and 300 < el < 400

def retrieve_stadium_data(garmin):
    activities = garmin.get_activities(start=0, limit=1000, activitytype='running')
    all_data = []
    for activity in activities:
        if is_stadium_run(activity):
            activity_id = activity["activityId"]
            details = garmin.get_activity_details(activity_id)
            metrics = details["activityDetailMetrics"]
            descriptors = details["metricDescriptors"]

            # Map each metric key (e.g. 'directElevation') to its index
            # in each sample's 'metrics' array.
            desc_key2metrics_id = {}
            for desc in descriptors:
                key = desc['key']
                metrics_id = desc['metricsIndex']
                desc_key2metrics_id[key] = metrics_id

            time_id = desc_key2metrics_id['directTimestamp']
            time = [m['metrics'][time_id] for m in metrics]

            duration_id = desc_key2metrics_id['sumDuration']
            duration = [m['metrics'][duration_id] for m in metrics]

            distance_id = desc_key2metrics_id['sumDistance']
            distance = [m['metrics'][distance_id] for m in metrics]

            elevation_id = desc_key2metrics_id['directElevation']
            elevation = [m['metrics'][elevation_id] for m in metrics]

            all_data.append(
                {
                    'time': time,
                    'duration': duration,
                    'distance': distance,
                    'elevation': elevation,
                }
            )
    return all_data


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--out', type=str, default='data.json')
    args = parser.parse_args()

    # Resume the session from tokens cached by a previous login.
    tokenstore = '~/.garminconnect'
    try:
        garmin = Garmin()
        garmin.login(tokenstore)
    except (FileNotFoundError, GarthHTTPError, GarminConnectAuthenticationError):
        print(f"Login tokens not found in {tokenstore}.")
        return

    data = retrieve_stadium_data(garmin)

    with open(args.out, 'w') as f:
        json.dump(data, f)


if __name__ == '__main__':
    main()
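Assuming valid Garmin Connect tokens are already cached in ~/.garminconnect (for instance from a previous interactive login), and with the script saved as retrieve_stadium_runs.py (a name I picked for illustration), it is run as:

python retrieve_stadium_runs.py --out data.json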

This gives time series of the elevation and the corresponding duration from the start of each run. The time series are remarkably smooth. We can easily distinguish the ups and downs of each row (recall: there are 37 flights of stairs in total to run up).

The elevation against time of 4 of my runs. I only show the first five minutes for clarity.

In the above plot, I marked the location of the peaks, computed with the scipy function find_peaks. These peaks correspond to the highest point of each flight of stairs, and they mark out the total progress at regular intervals. This is a very robust way of keeping track of the progress of each run; using the GPS-computed distance would be a lot noisier, especially due to the sharp turns at the end of each row.
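As a minimal sketch of this step (assuming data.json is the file produced by the retrieval script above), the peaks of a single run and the times between consecutive tops can be extracted like this:

import json

import numpy as np
from scipy.signal import find_peaks

NUM_ROWS = 37  # flights of stairs in one stadium run

with open('data.json') as f:  # file produced by the retrieval script
    data = json.load(f)

run = data[0]  # pick any single run
t = np.array(run['duration'])   # seconds since the start of the run
e = np.array(run['elevation'])  # meters

# Same parameters as the full scripts below: a top must stand out by one
# standard deviation of the elevation, and two tops must be at least half
# a row's worth of samples apart.
peaks, _ = find_peaks(e, prominence=np.std(e), distance=len(e) // NUM_ROWS // 2)

print(np.diff(t[peaks]))  # running time between consecutive tops, in seconds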

We can now look at the running time between consecutive rows:

Running time between consecutive rows against the index of the starting row. Symbols show the mean over all my recorded runs, and error bars indicate the 5th to 95th percentiles.

We can observe a few things.

  1. On average, I start fast compared to the overall pace, and then slow down steadily over time.
  2. There is a small deviation from this steady slowdown in the middle. I believe this is related to the slightly different setup of these particular flights of stairs, which have a few obstacles along the way.
  3. I tend to accelerate near the end: that's where I empty the tank, and it's also a tradition to run all the way up the 37th.

All these observations make sense - although I had never noticed that steady decrease in pace before.

Is there a better way to pace the run? Let's add the average of my 10 best runs on top of this data:

Same plot as before, with the mean computed from my 10 best runs of all time.

Well, it seems like I won't learn much from the global progress alone: my best runs were simply faster than the average run, consistently across rows. That can't be the whole story. Let's zoom in on individual segments of the time series.

Below, I plot the elevation profiles of each up and down, centered around the top of each row. Each plot thus shows 36 up-and-down cycles (I skip row 37, which is up only, with no way down). These are 3 of my best runs:

And 3 slower runs:

From these plots, we can see that my runs follow a three-phase structure for each up and down:

  1. Run up the first part of the flight of stairs.
  2. Walk up the remaining part, to the top.
  3. Run down the other side.

This is clearly visible in the first set of graphs. Going down takes about the same time as going up.
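For reference, each curve in these plots is one up-and-down cycle, sliced between two consecutive bottoms and centered on its top. A minimal sketch of that centering step (a slightly simplified variant of what the full script at the end of the post does):

import numpy as np

def center_cycle(t, e, bottom, top, next_bottom):
    # bottom, top, and next_bottom are indices into t and e: a low peak,
    # the following high peak, and the next low peak.
    t_cycle = t[bottom:next_bottom] - t[top]  # top of the row sits at t = 0
    # normalize elevation: 0 at the starting bottom, 1 at the top
    e_cycle = (e[bottom:next_bottom] - e[bottom]) / (e[top] - e[bottom])
    return t_cycle, e_cycle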

Let's quantify the time spent in each zone. For each up-and-down trajectory, I assume a constant vertical speed within each zone. This leads to a piecewise linear approximation of the elevation against time. A fit over all cycles is illustrated below for the three "fast" runs:
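Concretely, in the notation of the fitting code at the end of the post (time t is measured from the top of the row, and tau12 < 0 marks the transition from running to walking on the way up), the model reads:

    elevation(t) = bup1 + aup1 * t    for t < tau12         (zone 1: run up)
    elevation(t) = bup2 + aup2 * t    for tau12 <= t < 0    (zone 2: walk up)
    elevation(t) = bup2 + ado * t     for t >= 0            (zone 3: run down)

with bup1 = bup2 + tau12 * (aup2 - aup1), so that the two uphill segments join at t = tau12. The zone durations then follow directly from the fitted parameters: zone 1 lasts tau12 + bup1 / aup1 (from elevation 0 on the first segment to the transition), zone 2 lasts -tau12, and zone 3 lasts -bup2 / ado.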

The black dot indicates the transition from running to walking up. The average time spent in each zone can be extracted from this approximation. It is shown below for all runs and for my ten fastest runs:

Average time spent in each zone. Zone 1: run up; Zone 2: walk up; Zone 3: run down.

In my best 10 runs, I was faster in zone 1 (run up) and zone 3 (run down), but not in zone 2 (walk up).

This was a fun deep dive into the elevation data recorded by my watch over my past 77 stadium runs. I now have evidence that my best strategy so far is not to sprint the whole thing, but to go hard at the start of each flight of stairs, walk steadily up to the top, and then keep a steady but relatively high pace on the way down. All of that, repeated 36 times, before running up the last flight of stairs as fast as I can.

The results were produced with the following code:

Show code
#!/usr/bin/env python

import argparse
import json
import matplotlib.pyplot as plt
import numpy as np
import os

from scipy.signal import find_peaks

NUM_ROWS = 37

def plot_peaks(data, samples, out_dir=None):
    """Plot elevation against time for a few sample runs, with row tops marked."""
    fig, ax = plt.subplots()
    for j, sid in enumerate(samples):
        t = np.array(data[sid]['duration']) / 60
        e = np.array(data[sid]['elevation'])
        # A top must stand out by one standard deviation of the elevation,
        # and two tops are at least half a row's worth of samples apart.
        pids, _ = find_peaks(e, prominence=np.std(e), distance=len(e) // NUM_ROWS // 2)
        ax.plot(t, e, color=f'C{j}')
        ax.plot(t[pids], e[pids], 'o', ms=10, color=f'C{j}')

    ax.set_xlim(0, 5)
    ax.set_xlabel('duration (min)')
    ax.set_ylabel('elevation (m)')
    plt.tight_layout()
    if out_dir is None:
        plt.show()
    else:
        plt.savefig(os.path.join(out_dir, 'peaks.svg'))
    plt.close(fig)

def plot_p2p_durations(data, out_dir=None):
    """Plot the mean time between consecutive row tops, with 5th-95th percentile bars."""
    diffs = []
    for d in data:
        t = np.array(d['duration'])
        e = np.array(d['elevation'])
        pids, _ = find_peaks(e, prominence=np.std(e), distance=len(e) // NUM_ROWS // 2)
        # The top of row 37 is the last sample of the run (no descent
        # follows it), so it is not detected as a peak; append it by hand.
        pids = np.append(pids, [len(t)-1])

        if len(pids) >= NUM_ROWS:
            t_ = t[pids[:NUM_ROWS]]
            diffs.append(np.diff(t_))

    diffs = np.array(diffs)
    fig, ax = plt.subplots()

    x = np.arange(1, NUM_ROWS)
    y = np.mean(diffs, axis=0)
    ylo = np.quantile(diffs, q=0.05, axis=0)
    yhi = np.quantile(diffs, q=0.95, axis=0)
    yerr = np.stack((y-ylo, yhi-y), axis=0)
    ax.errorbar(x, y, yerr, ls='', marker='o', capsize=5, clip_on=False)
    ax.set_xlabel('row number')
    ax.set_ylabel('time between two tops (s)')
    ax.set_xlim(0, NUM_ROWS)
    plt.tight_layout()
    if out_dir is None:
        plt.show()
    else:
        plt.savefig(os.path.join(out_dir, 'p2pdurations.svg'))
    plt.close(fig)


def plot_p2p_durations_and_best(data, out_dir=None):
    """Same as plot_p2p_durations, with the mean of the 10 fastest runs overlaid."""
    diffs = []
    durations = []
    for d in data:
        t = np.array(d['duration'])
        e = np.array(d['elevation'])
        pids, _ = find_peaks(e, prominence=np.std(e), distance=len(e) // NUM_ROWS // 2)
        pids = np.append(pids, [len(t)-1])

        if len(pids) >= NUM_ROWS:
            t_ = t[pids[:NUM_ROWS]]
            diffs.append(np.diff(t_))
            durations.append(t[-1])

    diffs = np.array(diffs)
    durations = np.array(durations)
    # Sort runs by total duration so that diffs[:10] are the 10 fastest.
    order = np.argsort(durations)
    durations = durations[order]
    diffs = diffs[order]

    fig, ax = plt.subplots()
    x = np.arange(1, NUM_ROWS)
    y = np.mean(diffs, axis=0)
    ylo = np.quantile(diffs, q=0.05, axis=0)
    yhi = np.quantile(diffs, q=0.95, axis=0)
    yerr = np.stack((y-ylo, yhi-y), axis=0)
    ax.errorbar(x, y, yerr, ls='', marker='o', capsize=5, clip_on=False)

    # Overlay the mean over the 10 fastest runs.
    y = np.mean(diffs[:10,:], axis=0)
    ax.plot(x, y, '-o', ms=7)

    ax.set_xlabel('row number')
    ax.set_ylabel('time between two tops (s)')
    ax.set_xlim(0, NUM_ROWS)
    plt.tight_layout()
    if out_dir is None:
        plt.show()
    else:
        plt.savefig(os.path.join(out_dir, 'p2pdurations_and_best.svg'))
    plt.close(fig)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('json', type=str, help='data to plot')
    parser.add_argument('--out-dir', type=str, default=None, help='output directory for plots')
    args = parser.parse_args()

    out_dir = args.out_dir

    with open(args.json, 'r') as f:
        data = json.load(f)

    plot_peaks(data, samples=[20, 21, 22, 23], out_dir=out_dir)
    plot_p2p_durations(data, out_dir=out_dir)
    plot_p2p_durations_and_best(data, out_dir=out_dir)


if __name__ == '__main__':
    main()
Show code
#!/usr/bin/env python

import argparse
import json
import matplotlib.pyplot as plt
import numpy as np
import os
from scipy.signal import find_peaks

NUM_ROWS = 37



def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('json', type=str, help='data to plot')
    parser.add_argument('--run-ids', type=int, nargs='+', default=None, help='plot only for these run ids')
    parser.add_argument('--out-dir', type=str, default=None, help='output directory for plots')
    args = parser.parse_args()

    out_dir = args.out_dir
    run_ids = args.run_ids

    if out_dir is not None:
        os.makedirs(out_dir, exist_ok=True)

    with open(args.json, 'r') as f:
        data = json.load(f)


    plot_id = 0
    for run_id, d in enumerate(data):
        duration = np.array(d['duration'])
        elevation = np.array(d['elevation'])

        # High peaks (top of each row)
        hpids, _ = find_peaks(elevation,
                              prominence=np.std(elevation),
                              distance=len(elevation) // NUM_ROWS // 2)
        hpids = np.append(hpids, [len(duration)-1])

        if len(hpids) != NUM_ROWS:
            continue

        # Low peaks (bottom of each row)
        lpids, _ = find_peaks(-elevation,
                              prominence=np.std(elevation),
                              distance=len(elevation) // NUM_ROWS // 2)
        lpids = np.append([0], lpids)

        if len(lpids) != NUM_ROWS:
            continue

        if run_ids is not None and run_id not in run_ids:
            continue

        # go over every up and down;
        # row 37 doesn't count since we do not go down it.

        fig, ax = plt.subplots(figsize=(4,4))
        for i in range(NUM_ROWS-1):
            start = lpids[i]
            mid = hpids[i]
            end = lpids[i+1]

            tup = duration[start:mid+1]
            eup = elevation[start:mid+1]
            tdown = duration[mid:end]
            edown = elevation[mid:end]

            # center: shift time so that the top of the row sits at t = 0
            tshift = tup[-1]
            tup = tup - tshift
            tdown = tdown - tshift

            # normalize elevation: 0 at the bottom, 1 at the top
            eshift = eup[0]
            escale = edown[0] - eup[0]

            edown = (edown - eshift) / escale
            eup = (eup - eshift) / escale

            # later rows are drawn more opaque; the +1 keeps row 1 visible
            ax.plot(tup, eup, color='C0', alpha=(i + 1) / NUM_ROWS)
            ax.plot(tdown, edown, color='C1', alpha=(i + 1) / NUM_ROWS)

        ax.set_xlabel(r'$t$(s)')
        ax.set_ylabel('normalized elevation')
        ax.set_xlim(-30, 30)
        ax.set_ylim(0, 1)
        ax.set_title(f"{int(duration[-1]//60)} minutes {int(duration[-1]%60)} sec")
        plt.tight_layout()

        print(f'plotted run {run_id}')

        if out_dir is None:
            plt.show()
        else:
            plt.savefig(os.path.join(out_dir, f'{plot_id:03d}.svg'))

        plot_id += 1
        plt.close(fig)


if __name__ == '__main__':
    main()
Show code
#!/usr/bin/env python

import argparse
import json
import matplotlib.pyplot as plt
import numpy as np
import os
from scipy.signal import find_peaks
from scipy.optimize import least_squares

NUM_ROWS = 37

def piecewise_linear(t, aup1, tau12, aup2, bup2, ado):
    """Three-segment model of one up-and-down cycle.

    Time t is measured from the top of the row: tau12 < 0 is the
    transition from running up (slope aup1) to walking up (slope aup2),
    and ado < 0 is the slope on the way down.
    """
    # continuity at t = tau12: bup1 + aup1 * tau12 = bup2 + aup2 * tau12
    bup1 = bup2 + tau12 * (aup2 - aup1)
    # continuity at t = 0: the downhill segment starts from bup2
    bdo = bup2
    return np.where(
        t < tau12,
        np.maximum(bup1 + aup1 * t, 0),
        np.where(t < 0,
                 bup2 + aup2 * t,
                 bdo + ado * t))

def find_piecewise_linear_params(t, e):
    def fun(p):
        aup1, tau12, aup2, bup2, ado = p
        pred = piecewise_linear(t, *p)
        diff = pred - e
        # Soft penalties appended as extra residuals (acting like priors)
        # to keep the parameters in a sensible range:
        logprior_res = [(tau12 + 15) / 5,           # tau12 should be near -15 s
                        max(aup2 - aup1, 0) / 0.01, # prior: aup1 > aup2
                        max(-aup1, 0) / 0.01,       # prior: aup1 > 0
                        max(-aup2, 0) / 0.01,       # prior: aup2 > 0
                        max(ado, 0) / 0.01]         # prior: ado < 0

        diff = np.append(diff, logprior_res)
        return diff

    res = least_squares(fun, [0.6, -15, 0.4, 13, -0.5])
    return res.x


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('json', type=str, help='data to plot')
    parser.add_argument('--run-ids', type=int, nargs='+', default=None,
                        help='plot details for these run ids')
    parser.add_argument('--out-dir', type=str, default=None, help='output directory for plots')
    args = parser.parse_args()

    out_dir = args.out_dir
    run_ids = args.run_ids

    if out_dir is not None:
        os.makedirs(out_dir, exist_ok=True)
        if run_ids is not None:
            os.makedirs(os.path.join(out_dir, 'detailed'), exist_ok=True)

    with open(args.json, 'r') as f:
        data = json.load(f)

    all_params = []
    durations = []

    plot_id = 0
    for run_id, d in enumerate(data):
        duration  = np.array(d['duration'])
        elevation = np.array(d['elevation'])

        # High peaks (top of each row)
        hpids, _ = find_peaks(elevation,
                              prominence=np.std(elevation),
                              distance=len(elevation) // NUM_ROWS // 2)
        hpids = np.append(hpids, [len(duration)-1])

        if len(hpids) != NUM_ROWS:
            continue

        # Low peaks (bottom of each row)
        lpids, _ = find_peaks(-elevation,
                              prominence=np.std(elevation),
                              distance=len(elevation) // NUM_ROWS // 2)
        lpids = np.append([0], lpids)

        if len(lpids) != NUM_ROWS:
            continue

        # go over every up and down;
        # row 37 doesn't count since we do not go down it.

        ts = []
        es = []
        ps = []
        for i in range(NUM_ROWS-1):
            start = lpids[i]
            mid = hpids[i]
            end = lpids[i+1]

            t = duration[start:end]
            e = elevation[start:end]

            # normalize
            t -= t[mid - start]
            e -= e[0]

            ps.append(find_piecewise_linear_params(t, e))
            ts += t.tolist()
            es += e.tolist()
        ts = np.array(ts)
        es = np.array(es)
        ps = np.array(ps)

        if run_ids is not None and run_id in run_ids:
            fig, ax = plt.subplots(figsize=(4,4))
            aup1, tau12, aup2, bup2, ado = find_piecewise_linear_params(ts, es)
            t_ = np.linspace(np.min(ts), np.max(ts), 1000)
            e_ = piecewise_linear(t_, aup1, tau12, aup2, bup2, ado)

            ax.plot(ts, es, '+', color='C0')
            ax.plot(t_, e_, '-', color='C1')
            ax.plot([tau12], [bup2 + aup2 * tau12], 'ok')

            ax.set_xlabel(r'$t$(s)')
            ax.set_ylabel('elevation (m)')
            ax.set_xlim(-30, 30)
            ax.set_ylim(0, 14)
            ax.set_title(f"{int(duration[-1]//60)} minutes {int(duration[-1]%60)} sec")
            plt.tight_layout()
            if out_dir is None:
                plt.show()
            else:
                plt.savefig(os.path.join(out_dir, 'detailed', f"{plot_id:03d}.svg"))
            plt.close(fig)
            plot_id += 1

        all_params.append(ps.copy())
        durations.append(duration[-1])

    durations = np.array(durations)
    all_params = np.array(all_params) # shape (nsamples, nrows, nparams)
    order = np.argsort(durations)
    durations = durations[order]
    all_params = all_params[order]

    # Unpack into one array of shape (nsamples, nrows) per parameter.
    aup1, tau12, aup2, bup2, ado = np.transpose(all_params, (2, 0, 1))
    bup1 = bup2 + tau12 * (aup2 - aup1)
    bdo = bup2
    # Zone durations from the fitted parameters: zone 1 runs from
    # elevation 0 on the first uphill segment to the transition at tau12,
    # zone 2 from tau12 to the top at t = 0, and zone 3 from the top
    # back down to elevation 0.
    tzone1 = bup1 / aup1 + tau12
    tzone2 = -tau12
    tzone3 = -bdo / ado

    tzones = [
        np.mean(tzone1),
        np.mean(tzone2),
        np.mean(tzone3)
    ]

    # The same averages over the 10 fastest runs only (runs are sorted by duration).
    tzones_fast = [
        np.mean(tzone1[:10]),
        np.mean(tzone2[:10]),
        np.mean(tzone3[:10])
    ]

    fig, ax = plt.subplots()
    ax.bar(np.arange(1,4)-0.15, tzones, width=0.3, label='all runs',
           hatch='///', facecolor='none', edgecolor='C0', linewidth=2)
    ax.bar(np.arange(1,4)+0.15, tzones_fast, width=0.3, label='faster runs',
           hatch='\\\\\\', facecolor='none', edgecolor='C1', linewidth=2)
    ax.set_ylabel('duration (s)')
    ax.set_xticks([1, 2, 3])
    ax.set_xticklabels([f'zone {i}' for i in range(1,4)])
    ax.legend()
    plt.tight_layout()
    if out_dir is None:
        plt.show()
    else:
        plt.savefig(os.path.join(out_dir, "time_zones_comp.svg"))
    plt.close(fig)



if __name__ == '__main__':
    main()