HECnet cost reflections

Introduction

This page is an attempt at looking at the cost settings in DECnet in relation to HECnet. Attempts in the past to come to any agreement on costs have not been successful, and I do not expect that this page will significantly change that. However, presenting some background and facts, along with some recommendations, might help some people who are seeking guidance. In the end, everyone is free to configure their nodes pretty much any way they want to, as long as it doesn't totally disrupt the operations on HECnet.

HECnet is a DECnet network that spans much of the world. In order for it to communicate with the best performance, it is important that packets travel through the optimal path from source to destination. It should be obvious that if you want to communicate between a machine in Uppsala, Sweden, and a machine in London, UK, hopping through a node in California, US, is not ideal.

Problem

A central "problem" in all this is that DECnet does not guarantee that packets from A to B take the same path as packets from B to A.

That is, DECnet is not symmetrical in its routing. This is an important point to understand.

This also means that you will probably never achieve what one person considers to be the absolute optimal routing, as there are multiple people's configuration choices involved, and in addition, some of the routing decisions DECnet makes are definitely not even close to optimal, no matter how you try to configure it.

So, any configuration that aims to achieve the optimal routing is guaranteed to fail. There is no point in going into deep detail about individual routes to find out whether they are suboptimal and then trying to improve them, because that will unconditionally cause some other route to become worse.

Why is this so? Well, DECnet routing acts at multiple levels, and routing is only optimized at one level, without regard to the others.

Let's say that you have an endnode that wants to talk to another endnode. They are not adjacent to each other, so no matter what, you need to go through some router. Assuming that your endnode has a direct path to two routers, your endnode will not pick the router based on which one is best for reaching the destination. It will always pick the designated router, which is selected by a simple algorithm within DECnet, and which is the same single router for all destinations. If this happens to be the "wrong" router, the packet will have to take one extra hop to the "right" router in order to get to the correct destination. Yes, you can tweak the configuration to make the other router the designated router. But then you have the same issue if you want to communicate with any node for which the old designated router was actually the right choice. Basically, you cannot win.

However, packets coming back from the other endnode will not ever go through the "wrong" router. No need. Both routers know how to reach your endnode, and will directly send the packets there.

So here you also see the uncontrolled asymmetry built into DECnet in action.
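The endnode case can be illustrated with a small Python sketch. It is a deliberately simplified model, not DECnet code: the node names E, R1, R2, D1 and D2 and the helper functions are hypothetical, and the only rules modelled are the two described above (the endnode always uses the designated router, and either router delivers directly to the endnode).

```python
# Hypothetical topology: endnode E has a direct path to routers R1 and R2.
# D1 is best reached through R1, and D2 is best reached through R2.
BEST_ROUTER = {"D1": "R1", "D2": "R2"}

# The endnode does not choose per destination; it always uses this one.
DESIGNATED_ROUTER = "R1"


def outbound_path(dest):
    """Path from E to dest: always via the designated router first."""
    path = ["E", DESIGNATED_ROUTER]
    best = BEST_ROUTER[dest]
    if best != DESIGNATED_ROUTER:
        path.append(best)  # the extra hop to the "right" router
    path.append(dest)
    return path


def return_path(src):
    """Path from src back to E: the router nearest src delivers directly."""
    return [src, BEST_ROUTER[src], "E"]


print(outbound_path("D2"))  # ['E', 'R1', 'R2', 'D2']
print(return_path("D2"))    # ['D2', 'R2', 'E']
```

Swapping `DESIGNATED_ROUTER` to "R2" fixes the path to D2 but moves the same extra hop onto traffic for D1.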

This is what is meant by routing being optimized at one level without regard to the others. For an endnode, the optimization is just about finding one router, without considering that, at the routing level, the right router to pick may differ depending on your destination. The same thing is true for area routing. If your packet is going outside your own area, your routers only know the path to the closest area router, and will forward packets there, even if a different area router would actually have been a better choice. And the return path can be a very different path.

Costs

So, what can and should be done?

Well, as described at the start, hopping through a node in California when communicating between two nearby nodes in Europe is undesirable, and we can make DECnet behave better. We only end up in that undesired situation if we do not honestly describe how costly a path is.

If we say that the costs of all links are equal, DECnet essentially just comes down to picking the routes with the fewest hops. But all links are not equal. Sweden-US-UK is way worse than Sweden-NL-UK, for example, even though it is the same number of hops.

However, DECnet does allow us to actually describe how costly a hop is, and not just do hop counting. This is the cost parameter of circuits.

If we were to say that the costs looked like this:

Link       Cost
Sweden-NL  2
Sweden-US  4
NL-Sweden  2
NL-UK      2
US-Sweden  4
US-UK      3
UK-NL      2
UK-US      3

Then the path Sweden-NL-UK would carry a cost of 4, and reverse would be the same. And Sweden-US-UK would carry a cost of 7. Which is way costlier, and would not be picked.
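The arithmetic can be checked with a small Python sketch. The `LINK_COST` dictionary simply restates the example costs from the table, and `path_cost` is a hypothetical helper name used for illustration, not anything DECnet provides.

```python
# Example circuit costs from the table above, keyed by (from, to).
# Each direction has its own entry, since each side sets its own circuit cost.
LINK_COST = {
    ("Sweden", "NL"): 2,
    ("Sweden", "US"): 4,
    ("NL", "Sweden"): 2,
    ("NL", "UK"): 2,
    ("US", "Sweden"): 4,
    ("US", "UK"): 3,
    ("UK", "NL"): 2,
    ("UK", "US"): 3,
}


def path_cost(path, costs=LINK_COST):
    """Sum the cost of every hop along a path given as a list of sites."""
    return sum(costs[(a, b)] for a, b in zip(path, path[1:]))


print(path_cost(["Sweden", "NL", "UK"]))  # 4
print(path_cost(["UK", "NL", "Sweden"]))  # 4
print(path_cost(["Sweden", "US", "UK"]))  # 7
```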

If we added a direct link NL-US with a cost of 4, NL-US traffic would pick the direct path with a cost of 4, over the path via the UK, with a cost of 5, and over the path via Sweden, with a cost of 6.

But what would happen if Sweden-US had a cost of 1 here?

Well, then the NL-US communication would suddenly go over Sweden, since the cost that way would only be 3, which is better than the direct link, which has a cost of 4.

But I think everyone would agree that this would be a bad lie from Sweden, which causes traffic to flow in bad ways, and makes the service worse for everyone. (Unless, of course, we actually only have a really crappy, slow link between the US and NL, in which case that cost should really be higher, but this should not be handled by lowering the cost of the Swedish links.)

Furthermore, if the US-Sweden link remained at 4, then traffic from US to NL would still go over the direct link, as from the US side, the cost of passing over Sweden would still be 6.
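A short Python sketch makes this scenario concrete. It uses Dijkstra's algorithm as a stand-in for finding the least-cost path; DECnet computes its routes differently, but the cheapest path for a given set of costs is the same. The sketch assumes the added NL-US link costs 4 in both directions, and `cheapest_path` is a hypothetical helper name.

```python
import heapq

# Directed link costs for the scenario above: the direct NL-US link is added
# (assumed to cost 4 each way), Sweden sets Sweden->US to 1, and the US side
# keeps US->Sweden at 4.
LINK_COST = {
    ("Sweden", "NL"): 2, ("NL", "Sweden"): 2,
    ("Sweden", "US"): 1, ("US", "Sweden"): 4,
    ("NL", "UK"): 2, ("UK", "NL"): 2,
    ("US", "UK"): 3, ("UK", "US"): 3,
    ("NL", "US"): 4, ("US", "NL"): 4,
}


def cheapest_path(src, dst, costs=LINK_COST):
    """Return (total_cost, path) for the least-cost route from src to dst."""
    queue = [(0, [src])]
    settled = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == dst:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        # Extend the path over every outgoing link of this node.
        for (a, b), c in costs.items():
            if a == node and b not in settled:
                heapq.heappush(queue, (cost + c, path + [b]))
    return None  # dst is unreachable


print(cheapest_path("NL", "US"))  # (3, ['NL', 'Sweden', 'US'])
print(cheapest_path("US", "NL"))  # (4, ['US', 'NL'])
```

The two directions end up on different paths purely because the two ends of the Sweden-US link were given different costs.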

This is another form of asymmetry, and this kind of asymmetry comes about because of our own configurations. It is unlikely that this makes much sense in general, but there may be occasions where this will happen, and it can be a legitimate configuration.

Proposal

So, how do we come up with costs?

My suggestion is that we use ping times as a measure of cost. It is not perfect, but it does give a fairly good suggestion of preferred paths.

Essentially, we start out with 20ms steps. Every started 20ms of ping time adds one point of cost.

In addition, we have a base cost of 2 for any node, for the cost of just routing through it.

That means the lowest cost any circuit could have would be 3.

This would give a good baseline to start setting costs on circuits. However, sometimes, for some reason, a person might think that a specific machine is not suitable for routing in general, and would prefer that communication took other paths. In such a situation adding 1 or 2 to the cost of links would be appropriate.

Ping time  Cost
0-20ms     3
20-40ms    4
40-60ms    5
60-80ms    6
80-100ms   7
100-120ms  8
120-140ms  9
...        ...
(it should be obvious how it continues...)
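
The rule can be written as a small Python function. This is a sketch under a few assumptions: a ping time exactly on a boundary (for example 20ms) is counted in the lower bucket, and the result is capped at 25, which is the upper limit I believe Phase IV network management accepts for a circuit cost (check the documentation for your own system). The name `ping_to_cost` is made up for this example.

```python
import math

BASE_COST = 2   # cost of just routing through a node
STEP_MS = 20    # every started 20 ms adds one point
MAX_COST = 25   # assumption: highest circuit cost the system will accept


def ping_to_cost(ping_ms):
    """Map a ping time in milliseconds to a suggested circuit cost."""
    if ping_ms < 0:
        raise ValueError("ping time cannot be negative")
    # At least one step always counts, so the lowest possible cost is 3.
    steps = max(1, math.ceil(ping_ms / STEP_MS))
    return min(BASE_COST + steps, MAX_COST)


for ms in (5, 20, 21, 75, 130):
    print(ms, ping_to_cost(ms))
```

If a machine is not well suited for routing, the extra 1 or 2 mentioned above would simply be added to the returned value.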

What about the bridge?

What about the bridge?

Well, this has been one of the biggest problems in the discussions in the past.

The thing is, the bridge extends the ethernet, and you can only set one cost for the whole ethernet. But with the bridge, the ethernet potentially connects you both to things that are very close and to things that are very far away, for which you would like to set very different costs.

There is no perfect solution for this one. Set it too high, and you might have nodes that are literally next to each other that pick something other than the direct ethernet link to communicate. Set it too low, and you might actually pass through that node in California when you want to talk to something only halfway there.

Try to be reasonable with ethernet. If you only have local ethernet nodes, then of course set the cost low. RSX defaults to 3 as the cost for ethernet, which fits in very well with the suggestions above. VMS defaults to 4, which also works out just fine.

If you mostly have nodes very far away on the ethernet, using the bridge, set the cost higher. It might even make sense to use the same algorithm as suggested above with ping times. But it's probably ok to be a bit on the low side with ethernet, as the cost of processing ethernet circuits is often a bit lower than for point-to-point links to start with.

If you are running RSX, there is one more thing that can be done about this specific problem. In RSX, you actually have two costs associated with each circuit. One is the cost for level 1 routing, and one is the cost for level 2 routing, and they can be different. So for an ethernet that contains a bridge, you might set a low cost for level 1, and a high cost for level 2. This means that for packets inside your area, the ethernet is a good choice, but for packets to other areas, the ethernet is not going to be a good choice.

This might sometimes be a good compromise, and an additional tool to help configure for a good performing DECnet.

However, VMS does not have the ability to set costs depending on the routing level, so this trick cannot be used for those systems.
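
The effect of the two RSX costs can be shown with a toy Python model. Everything in it is hypothetical: the circuit names, the cost values, and the simplification that both circuits lead equally far towards the destination, so that only the first-hop cost decides. It is not RSX configuration syntax.

```python
# Hypothetical circuits on one RSX router, each with separate costs for
# level 1 (inside the area) and level 2 (between areas) routing.
CIRCUITS = {
    "bridged-ethernet": {"level1": 3, "level2": 10},
    "point-to-point": {"level1": 5, "level2": 5},
}


def preferred_circuit(same_area, circuits=CIRCUITS):
    """Pick the cheapest circuit for the routing level that applies."""
    level = "level1" if same_area else "level2"
    return min(circuits, key=lambda name: circuits[name][level])


print(preferred_circuit(same_area=True))   # bridged-ethernet
print(preferred_circuit(same_area=False))  # point-to-point
```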

Final words

Please remember that even if this scheme is implemented by everyone, you will still see routes that are not optimal, neither in your view nor in the general sense. However, as pointed out above, it is not even possible to get things working optimally. DECnet was not designed for that. DECnet was designed to be predictable and consistent. And to work.

But we can, and should, configure things so that they do not make obviously bad choices when we can avoid them. And the way of doing that is to configure things so they, to the best of our ability, reflect the nature of each link, and then let DECnet work as well as it can.

And the more people and machines that apply the same rules, the better the network works. But even if people choose to use other rules or metrics, DECnet should continue working, unless someone goes really crazy. So do not see these rules as a strict requirement, but as a suggestion on how you could try and set your costs. MIM:: has now been changed to apply these rules to the best of my abilities and knowledge.

Also, remember, you set the cost of your links. Do not try to make others change their costs based on your views. This all works best if everyone applies the costs based on what things look like from their point of view. Trust DECnet to then figure out an acceptable routing solution from this.