How to determine Centers-of-Gravity

Designing a supply chain network often starts with a demand Centers-of-Gravity analysis that provides a good first view on what warehouse locations to consider. This page visualizes and describes in detail how to determine and calculate Centers-of-Gravity, covering the core algorithms of the Centers-of-Gravity Calculator.

The page concludes with notes about data preparation and things to consider when doing a Centers-of-Gravity analysis.


Centers‑of‑Gravity indicate warehouse locations that minimize transport costs

Centers-of-Gravity are those locations that minimize the sum of weighted distances. Weighted distance is the distance from warehouse to customer multiplied by demand. If customer A has a demand of 10 and is located 25 kilometers (km) from its warehouse, then the weighted distance is 250 km. The sum of weighted distances over all customers acts as an indicator for transport costs.
In an extended version, weighted distance is defined as distance × demand (1st weight) × transport rate (2nd weight). This enables differentiation between a demand volume transported in small shipments and the same volume transported in large shipments, which is less expensive.
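
As a minimal sketch of this goal value, assuming flat-plane coordinates in kilometers and a hypothetical tuple layout (x, y, demand, rate) that is not taken from the calculator itself, the sum of weighted distances could be computed like this:

```python
import math

def weighted_distance_sum(warehouse, customers):
    """Sum of distance x demand x rate over all customers.

    warehouse: (x, y) position.
    customers: list of (x, y, demand, rate) tuples (assumed layout).
    Use rate = 1.0 for the basic, demand-only definition.
    """
    wx, wy = warehouse
    return sum(math.hypot(x - wx, y - wy) * demand * rate
               for x, y, demand, rate in customers)

# Customer A from the text: demand 10, located 25 km from its warehouse.
customers = [(25.0, 0.0, 10.0, 1.0)]
print(weighted_distance_sum((0.0, 0.0), customers))  # 250.0
```

Setting the rate of a customer below 1.0 is one way to express that its volume travels in larger, relatively cheaper shipments.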

The underlying assumption of the Centers-of-Gravity analysis is: transport cost = rate/kilometer (or mile) × distance.
This assumption is only partly valid: parcel rates are often distance-independent within a smaller country or region, the FTL pallet rate/km is lower than the LTL pallet rate/km, and macro-economic imbalances cause direction-dependent rates. In addition, (upscaled) as-the-crow-flies distances are used as an approximation of road distances.

Of course, transport costs are only a part of supply chain costs. The optimal number of warehouses and their locations are driven by many quantitative and qualitative factors such as (future) transport and warehousing rates, future demand (and supply), lead time requirements, inventory effects, supply chain risk/redundancy, and contractual obligations. Nevertheless, it is common practice to run a Centers-of-Gravity analysis to get a view on what warehouse locations to consider when designing a supply chain network.

Towards the algorithms - Introductory thoughts

Customer A has a demand of 10 and customer B a demand of 1.
Where is the Center-of-Gravity (CoG)? Somewhere on line A-B, closer to A?

Well, customer A pulls 10 times harder than customer B.
If the CoG moves a distance d towards A, the sum of weighted distances decreases by (10 × d) − (1 × d) = 9 × d.

So, the CoG is right on top of customer A, not somewhere in between A and B!

Though the goal value is the sum of weighted distances, the distances themselves are 'less relevant' when figuring out in what direction to move the CoG.
The overall demand force is leading. In the example it has a length of 9 and points towards customer A.

The overall demand force points the right direction, but it does not tell how far to move the CoG. This far?

Or this far?

The smaller the move distance, the smaller the chance of bypassing the CoG, but the more moves are needed. So, start with big moves. If the CoG is bypassed ('overshooting'), the sum of weighted distances will have increased instead of decreased, and the move size should be halved. By then, the force arrow will have reversed its direction (it always points in the right direction). By taking smaller steps each time the CoG has been bypassed, you will finally end up on top of it. You may stop moving back and forth once the move size has become very small. Alternatively, do not move to a next position if that would increase the sum of weighted distances, but reduce the step size first (then the force arrow will never completely reverse its direction).

Below you will find a real-time generated animation that visualizes the above process, step-by-step.

Single-Center-of-Gravity algorithm

Initial Center-of-Gravity position

The Center-of-Gravity is initially positioned at the weighted average X and Y coordinates of its assigned customers. This initial Center-of-Gravity position is not optimal (though others often state it is). Imagine there are only two customers: customer A with demand 10 at position (0 , 0) and customer B with demand 1 at position (100 , 0). The weighted X,Y position is at (9.09 , 0), with 9.09 calculated as (10 × 0 + 1 × 100)/(10 + 1). If the Center-of-Gravity moves a distance d towards customer A, the goal value improves by 10 × d (closer to A) − 1 × d (further from B) = 9 × d. So the optimal position is on top of customer A, not at the weighted X,Y position! The optimal Center-of-Gravity has a goal value of (10 × 0 + 1 × 100) = 100, whereas the initial Center-of-Gravity at (9.09 , 0) has a much higher value of (10 × 9.09 + 1 × 90.91) = 182, so 82% higher costs. In realistic situations with multiple customers, the relative difference will be much smaller - less than 5% - as can also be seen in the visual simulation further below.
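
The arithmetic of this two-customer example can be checked with a few lines of code. This is only an illustrative sketch; the names customers, goal_value and start_x are chosen here and do not come from the calculator.

```python
# Two customers on a line, given as (x position, demand): A and B from the text.
customers = [(0.0, 10.0), (100.0, 1.0)]

def goal_value(cog_x):
    """Sum of weighted distances for a Center-of-Gravity at x = cog_x."""
    return sum(abs(x - cog_x) * demand for x, demand in customers)

total_demand = sum(demand for _, demand in customers)
# Weighted average X coordinate: the initial (non-optimal) position.
start_x = sum(x * demand for x, demand in customers) / total_demand

print(round(start_x, 2))              # 9.09
print(round(goal_value(start_x), 1))  # 181.8
print(goal_value(0.0))                # 100.0 (on top of customer A)
```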

Moving the Center-of-Gravity in the direction of the overall pull force will decrease the sum of weighted distances, if moved the right distance
All forces pulling the Center-of-Gravity need to be summed up to get the overall pull force, which is a vector with a size (irrelevant) and a direction.
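
A possible way to compute this overall pull force on a flat plane is sketched below. It assumes that each customer pulls with a force equal to its demand, in the direction of the customer, and that customers are given as hypothetical (x, y, demand) tuples. Skipping a customer that sits exactly on the Center-of-Gravity is a simplification made for this sketch.

```python
import math

def pull_force(cog, customers):
    """Return the overall pull force (fx, fy) acting on the Center-of-Gravity.

    Each customer contributes a vector of length 'demand' that points
    from the Center-of-Gravity towards that customer.
    """
    cx, cy = cog
    fx = fy = 0.0
    for x, y, demand in customers:
        dist = math.hypot(x - cx, y - cy)
        if dist == 0.0:
            continue  # assumption: no direction defined, so this customer is skipped
        fx += demand * (x - cx) / dist
        fy += demand * (y - cy) / dist
    return fx, fy

# Customers A (demand 10) and B (demand 1), CoG halfway between them.
print(pull_force((50.0, 0.0), [(0.0, 0.0, 10.0), (100.0, 0.0, 1.0)]))  # (-9.0, 0.0)
```

The result has length 9 and points towards customer A, as in the introductory example. Note that the distances only determine the direction of each pull, not its strength.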


This algorithm resembles the gradient descent method.
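
The procedure described in the introductory section can be sketched as follows. This is not the calculator's actual code: it assumes a flat plane, customers as hypothetical (x, y, demand) tuples, the variant that halves the move size instead of moving to a worse position, and an initial move size equal to the distance to the farthest customer. The parameters min_step and max_iterations are likewise assumptions.

```python
import math

def goal_value(cog, customers):
    """Sum of demand x straight-line distance over all customers."""
    return sum(d * math.hypot(x - cog[0], y - cog[1]) for x, y, d in customers)

def single_cog(customers, min_step=1e-6, max_iterations=100_000):
    total = sum(d for _, _, d in customers)
    # Initial position: weighted average X and Y coordinates.
    cog = (sum(x * d for x, _, d in customers) / total,
           sum(y * d for _, y, d in customers) / total)
    best = goal_value(cog, customers)
    # Assumed initial move size: distance to the farthest customer.
    step = max(math.hypot(x - cog[0], y - cog[1]) for x, y, _ in customers)
    for _ in range(max_iterations):
        if step <= min_step:
            break  # move size has become very small: stop
        # Overall pull force: every customer pulls with its demand.
        fx = fy = 0.0
        for x, y, d in customers:
            dist = math.hypot(x - cog[0], y - cog[1])
            if dist > 0.0:  # a customer exactly at the CoG is skipped
                fx += d * (x - cog[0]) / dist
                fy += d * (y - cog[1]) / dist
        size = math.hypot(fx, fy)
        if size == 0.0:
            break  # all forces cancel out
        candidate = (cog[0] + step * fx / size, cog[1] + step * fy / size)
        value = goal_value(candidate, customers)
        if value < best:
            cog, best = candidate, value  # accept the move
        else:
            step /= 2.0  # would get worse: stay put and halve the move size
    return cog, best

cog, value = single_cog([(0.0, 0.0, 10.0), (100.0, 0.0, 1.0)])
print(cog, value)  # ends up (almost) on top of customer A, goal value close to 100
```

Only the direction of the force is used for the move; its size is divided out, in line with the remark that the size of the force vector is irrelevant.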

Real-time visual simulation - press the button

Earth is a globe

Note that X,Y on a flat plane (Cartesian coordinate system) needs to be translated into latitude, longitude on a globe (spherical coordinate system).

All principles remain the same, but formulas become more complicated.
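
As an illustration of the distance part only, the sketch below uses the haversine formula for the as-the-crow-flies distance between two latitude/longitude points. The choice of haversine, the mean Earth radius of 6371 km, and the optional road_factor parameter (a hypothetical multiplier for upscaling towards road distance) are assumptions made here, not details taken from the calculator.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def great_circle_km(lat1, lon1, lat2, lon2, road_factor=1.0):
    """Haversine distance in km between two points given in degrees.

    road_factor is a hypothetical multiplier to upscale the
    as-the-crow-flies distance towards an estimated road distance.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return road_factor * 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# A quarter of the equator: one quarter of the Earth's circumference.
print(round(great_circle_km(0.0, 0.0, 0.0, 90.0), 1))  # 10007.5
```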

Multiple-Centers-of-Gravity algorithm

Each single run of this algorithm does the following:

Multiple runs are done, each run starting with different random locations. The best solution out of those runs is presented as the final outcome. The higher the number of runs, the more likely it is that the optimal solution will be found. Usually, this optimal solution will then have been found multiple times as well.

The animation shows what happens during a single run of this algorithm: customers (circles) are assigned to the closest warehouse (triangles), warehouses move to their center-of-gravity, customers are reassigned, warehouses move, et cetera, until the final situation is reached where none of the assignments and warehouse positions change anymore.
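
The assign-and-move loop and the multiple runs can be sketched as below. This is an illustration under several assumptions, not the calculator's implementation: a flat plane, customers as hypothetical (x, y, demand) tuples, random start locations drawn from the customer positions, and a compact step-halving search (solve_single) for each warehouse. All function and parameter names are chosen for this sketch.

```python
import math
import random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def solve_single(points, min_step=1e-6, max_iter=100_000):
    """Step-halving search for the center-of-gravity of one customer group."""
    total = sum(d for _, _, d in points)
    cog = (sum(x * d for x, _, d in points) / total,
           sum(y * d for _, y, d in points) / total)
    cost = lambda c: sum(p[2] * dist(c, p) for p in points)
    best = cost(cog)
    step = max(dist(cog, p) for p in points)
    for _ in range(max_iter):
        if step <= min_step:
            break
        fx = fy = 0.0
        for x, y, d in points:
            r = dist(cog, (x, y))
            if r > 0.0:
                fx += d * (x - cog[0]) / r
                fy += d * (y - cog[1]) / r
        size = math.hypot(fx, fy)
        if size == 0.0:
            break
        cand = (cog[0] + step * fx / size, cog[1] + step * fy / size)
        value = cost(cand)
        if value < best:
            cog, best = cand, value
        else:
            step /= 2.0
    return cog

def single_run(customers, k, rng, max_rounds=1000):
    # Assumed start: k random customer positions as warehouse locations.
    warehouses = [(x, y) for x, y, _ in rng.sample(customers, k)]
    assignment = None
    for _ in range(max_rounds):
        # Assign every customer to its closest warehouse.
        new = [min(range(k), key=lambda j: dist(warehouses[j], c)) for c in customers]
        if new == assignment:
            break  # nothing changes anymore
        assignment = new
        # Move every warehouse to the center-of-gravity of its customers.
        for j in range(k):
            members = [c for c, a in zip(customers, assignment) if a == j]
            if members:
                warehouses[j] = solve_single(members)
    total = sum(c[2] * dist(warehouses[a], c) for c, a in zip(customers, assignment))
    return warehouses, total

def multiple_cog(customers, k, runs=20, seed=0):
    """Best (warehouses, goal value) out of several randomly started runs."""
    rng = random.Random(seed)
    return min((single_run(customers, k, rng) for _ in range(runs)),
               key=lambda result: result[1])

customers = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0),
             (100.0, 100.0, 1.0), (101.0, 100.0, 1.0), (100.0, 101.0, 1.0)]
warehouses, total = multiple_cog(customers, k=2)
print(warehouses, total)  # expected: one warehouse near each group of three customers
```

A single run may end in a local optimum that depends on the random start, which is why the best result over several runs is kept.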


On a side note, the problem and algorithm are comparable to k-means clustering, a cluster analysis technique used in data mining and machine learning.

Centers-of-Gravity analysis - Data preparation

  1. Collect customer data: addresses, demand (and number of shipments). And always validate your data!
    • Retrieve data from your ERP system: customer addresses and demand.
    • Express demand in the transport cost driver applicable for your business: weight, volume or volumetric weight.
    • Demand can either be derived from sales (item master based) or from shipments (carrier based). If available and trustworthy, then use shipment data. Item masters are not always '100% clean'. For example, some items accidentally got 'kilogram' instead of 'gram' as Unit of Measurement causing large kilogram errors. Of course, best is to compare both, check outliers, check if M3/KG make sense, check top-X customers, check on the map, et cetera: data validation!
    • Optional: incorporate number of shipments in the calculations (using a more advanced definition of 'weight' than simply 'demand'). This may add accuracy. However, its impact on Centers-of-Gravity will often be marginal. It is therefore considered optional.

  2. Optional: collect supplier data: addresses, supply (and number of shipments)
    • You may want to consider both demand and supply, as supply pulls a warehouse as well.
    • Take into account that supplier transport is usually more FTL-like (relatively less expensive) and customer transport more LTL-like, so reduce the supplier sizes (their weights in the calculation) accordingly.

      DEMAND AND SUPPLY: CoG has moved 165 kilometers up north

  3. Geocode addresses
    Geocode customer (and supplier) locations. Geocoding is the process of taking input text, such as an address or the name of a place, and returning a latitude/longitude location on the Earth's surface for that place. Those latitudes/longitudes are required for the distance calculations.

    The analysis is usually based on total demand per customer, not on individual shipments. You may even want to aggregate demand up to regional level. Within Europe a 2-digit postal code level is usually accurate enough. In general, data aggregation keeps models smaller and makes them run faster. It may also make a map clearer: a few larger dots are more easily compared to each other than tens of thousands of tiny dots. However, geocoders - such as this Geocoder - will run at (postal code) address level, not at 2-digit postal code level.

  4. Consider harbours
    If far-away customers are delivered via an exit harbour, then put the aggregated demand of those customers on top of that harbour, instead of on the customer country. The harbour may be located in the opposite direction of the demand country, seen from a warehouse perspective. A warehouse should be pulled in the right direction!

  5. Consider regional setups
    You may want to split your data set. For instance, if you already know that you will operate a warehouse in the UK to serve UK customers, then split your data into 'UK data' and 'mainland Europe data' and run the model for both sets separately.
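
The aggregation to 2-digit postal code level mentioned in step 3 can be sketched as follows. The record layout (country, postal code, demand) and the sample values are made up for illustration; real data would come from your ERP or shipment system.

```python
from collections import defaultdict

def aggregate_by_postal_prefix(records, digits=2):
    """Sum demand per (country, first 'digits' characters of the postal code).

    records: iterable of (country, postal_code, demand) tuples (assumed layout).
    """
    totals = defaultdict(float)
    for country, postal_code, demand in records:
        key = (country, postal_code.strip()[:digits])
        totals[key] += demand
    return dict(totals)

# Hypothetical sample records.
records = [("NL", "1012 AB", 5.0), ("NL", "1071 XX", 3.0), ("NL", "3011 AA", 2.0)]
print(aggregate_by_postal_prefix(records))
# {('NL', '10'): 8.0, ('NL', '30'): 2.0}
```

Each aggregated region then needs one representative location, for example by geocoding a central address within that postal code area.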