遗传算法求解最短路径

AM2024-05-152024-05-16

遗传算法求解最短路径

遗传算法求解最短路径的可行性

遗传算法是一种启发式算法，通过模拟生物界中的自然选择，来实现优胜劣汰，最终得到一个较优的解。只要一个问题可以给出解的表示，并且存在一种评判机制，可以判断给出解的适应度，那么就存在使用遗传算法进行求解的可能性。比如求最短路径，我们可以通过深度优先搜索给出起点至终点的路径表示，而评判机制便是路径的距离，所以存在使用遗传算法进行求解的可行性。

任意路径的表示

如果想要通过遗传算法求解最短路径问题，首先需要想一个能表示任意路径的方法，因为如果不能表示任意路径的话，就说明在演化过程中某些路径永远不会被考虑，这是不合理的。于是，我们可以使用深度优先搜索(Depth First Search)寻找一条可行路径，搜索所有可行路径的算法有很多，在此使用DFS是因为我们只需要优先找出一条可行路径。

DFS搜索路径时，当一个结点有多个可达结点，对下一个结点的选择是随机的，那如何让其变为选择性的呢？在此引入一种优先级编码的方式，为每个结点分配一个优先级（各不相同），通过优先级序列来表示路径。不过，优先级序列并不能直接表示路径，需要对其进行解码才能得到真正的路径。当存在多个可达结点时，根据结点的优先级进行选择，这样选择变得不再随机，即对于一个确定优先级序列，路径是唯一确定的，对于不同的优先级序列，路径是不相同的，所以我们可以通过优先级序列来表示任意的路径。

下面举个例子

以a为起点，g为终点时，首先搜索a的可达结点，在可达结点b，c中选择优先级更高的b
再搜索b的可达结点，在可达结点d，e中选择优先级更高的e
最后搜索e的可达结点，为终点g，完成一次基于优先级编码的DFS，得到路径为：a-b-e-g

tips：可以用更小的数字表示更高的优先级，也可用更大的数字表示更高的优先级，只要确保一致即可；如果可达结点中有终点，就没必要比什么优先级了，直接选终点就是了。

评估函数的确定

得到任意路径后，我们就需要评估这条路径好不好，如果只是单纯的求最短距离，我们可以把最小化路径距离作为优化目标，把路径距离的计算作为评估函数，路径距离短的个体对应着高适应度，更容易在“自然选择”中存活下来。当然可以设置其他的评估函数，也可以设置多目标的评估，设置多目标评估时，评估函数的返回值设为多个即可，这也是相比于其他最短路径算法的灵活之处。比如，可以将最小化经过结点数量作为优化目标，结点数量的计算设为评估函数，便可求起点到终点经过结点最少的路径。

实际问题的应用

多说无益，不如看看实际的例子

问题描述

根据已开通的重庆轨道交通网络图，建立模型研究重庆大学(大学城校区)(v1)、西南大学(v2)、西南政法大学(渝北校区)(v3)、重庆交通大学(科学城校区)(v4)、重庆邮电大学(v5)、重庆医科大学(袁家岗校区)(v6)、重庆师范大学(大学城校区)(v7)、重庆工商大学(南岸校区)(v8)、四川外国语大学(v9)、四川美术学院(大学城校区）(v10)、重庆理工大学(花溪校区)(v11)、重庆科技大学(大学城校区)(v12)等重庆主城区 12 所学校之间通过轻轨到达的用时最短且费用较少的路径（部分轻轨不能直达的学校，到达临近的轨道站后再换乘公共汽车，假设轻轨换乘时间为5分钟）。

线路全

问题分析

对于问题，我们可以为每个不能直达的学校选择一个最近的地铁站点（剩余的路程之后再单独计算相关时间和费用），这样每所学校都会对应一个站点。地铁的换乘只能在换乘点进行，所以可以将轨道交通路线图简化为无向图，12所学校对应的站点和换乘点为无向图的结点，根据各结点之间的连接关系添加边，边的权值可以设置为时间和费用。

绘图1

上图为简化的无向图，v1、v7、v10、v12的最近地铁站为同一个，所以在分析问题时，可以将它们视为一个点v0。

12f5f6f4c6ddd876391c885bdd191fc8

得到无向图后，我们需要对其结点进行编号，以便于在求解时进行表示。

数据准备

QQ截图20240515094401

QQ截图20240515094626

官网可以查看相邻结点的时间，地图可以查看相邻结点的距离，费用我们将根据票价政策用距离来计算，例如，环线上的结点1和结点27之间的时间为12，距离为7.5，费用的计算应该在得到总距离后再进行计算（不使用地图的时间是因为其时间考虑了其他因素，如等待时间等）。

代码实现

我直接使用遗传算法的相关代码模板，在其基础上实现个性化需求，所以需要特别设计的便是路径选择函数和评估函数。

路径选择函数

路径选择函数为基于优先级编码的深度优先搜索。

nodeDict为字典，键为结点(str)，值为该结点的可达结点(list)

def dfs(ind, source, target):
    '''
    ind : 优先级序列
    source : 起点
    target : 终点
    '''
    path = [] # 记录路径
    visited = [] # 记录已访问结点
    stack = [source] # 利用栈结构进行深度优先搜索
    lens = [] # 记录结点未访问可达结点的数量
    while stack:
        cur = stack[-1] # 获取一个结点作为当前结点
        if cur not in visited:
            # 将当前结点添加到path和visited中，表示可为路径和已访问
            path.append(cur) 
            visited.append(cur)
            allow = np.array(nodeDict[str(cur)]) # 获取可达结点
            # 根据所有结点的优先级序列ind获取可达结点的优先级序列
            # 结点的标号是从1开始，而列表下标从0开始，所以需要-1
            priority_ls = np.asarray(ind)[np.asarray(allow)-1]
            # 按照优先级进行排列，并返回排序完成后的下标
            index = np.argsort(priority_ls) 
            # 将按照优先级排列完成的可达结点依序加入到栈中
            stack.extend(allow[index])
            # 记录当前结点的可达结点数量
            lens.append(len(allow[index])) 
            # 如果添加结点中包含终点，则搜索成功完成
            if target in stack:
                path.append(target)
                break
        else:
            stack.pop()
            lens[-1] -= 1 # 每访问一个可达结点便减一
            # 当所有可达结点都已访问进行回溯
            if lens[-1] == 0:
                lens.pop()
                path.pop()
    return path

评估函数

评估函数为路径对应的时间及费用。

edgeTime为字典，键为两个结点(str)，值为该两个结点之间的时间(float)
nodeLine为字典，键为结点(str)，值为该结点所属于的线路(list)

def eval(ind):
    path = dfs(ind, source, target)
    time = TIME(path)
    cost = COST(path)
    return (time), (cost)


def TIME(path):
    # 若路径上只有两个结点，直接计算路径第一段的时间
    time = edgeTime[str(path[0])+str(path[1])]
    # 当路径上的结点大于2时，进行一般性的时间计算
    if len(path) > 2:
        # 通过前三个结点可以唯一确定出发时的线路
        # 对三个结点的线路做交集，即可得到出发线路
        pre = set(nodeLine[str(path[0])]) & set(
            nodeLine[str(path[1])]) & set(nodeLine[str(path[2])])
        # 加上第二段的时间
        time += edgeTime[str(path[1])+str(path[2])]
        for i in range(2, len(path)-1):
            time += edgeTime[str(path[i])+str(path[i+1])]
            # 通过对当前结点线路和下一结点线路做交集，得到当前线路
            now = set(nodeLine[str(path[i])]) & set(nodeLine[str(path[i+1])])
            # 如果之前线路与当前线路不同，说明进行了换乘
            if not pre & now:
                pre = now
                time += T
    return time


def COST(path):
    dis = 0
    # 计算总距离
    for i in range(len(path)-1):
        dis += edgeCost[str(path[i])+str(path[i+1])]
    price = [2, 3, 4, 5, 6, 7] # 价目表
    rank = [6, 11, 17, 24, 32] # 分段标准
    # 判断总距离属于什么级别的收费
    rank.append(dis)
    rank.sort()
    cost = price[rank.index(dis)]
    return cost

源代码

设置好这两个函数后，就可以使用遗传算法进行求解了，下面给出完整的源代码。

import numpy as np
import matplotlib.pyplot as plt
import random
from deap import creator, tools, base


# 对于多目标规划分别设置两个优化目标的权值weights
# weights的正负表示最大化和最小化，绝对值的大小表示相对重要程度
creator.create('FitnessMin', base.Fitness, weights=(-1.0, 1.0))
creator.create('Individual', list, fitness=creator.FitnessMin)

gen_size = 54 # 因为无向图有54个结点，对应生成的优先级序列大小也为54
toolbox = base.Toolbox()
toolbox.register('Sequence', np.random.permutation, gen_size)
toolbox.register('Individual', tools.initIterate,
                 creator.Individual, toolbox.Sequence)

toolbox.register('Population', tools.initRepeat, list, toolbox.Individual)

nodeDict = {} # 结点的可达结点
with open("node.txt", 'r') as f:
    tmp = f.readlines()
    for t in tmp:
        t = t.split()
        nodeDict[t[0]] = list(map(int, t[1:]))

nodeLine = {} # 结点所属的线路
with open("line.txt", 'r') as f:
    tmp = f.readlines()
    for t in tmp:
        t = t.split()
        nodeLine[t[0]] = list(map(int, t[1:]))

edgeTime = {} # 相邻结点的时间
with open("time.txt", 'r') as f:
    tmp = f.readlines()
    for t in tmp:
        t = t.split()
        edgeTime[t[0]+t[1]] = float(t[2])
        edgeTime[t[1]+t[0]] = float(t[2])

edgeCost = {} # 相邻结点的距离
with open("dis.txt", 'r') as f:
    tmp = f.readlines()
    for t in tmp:
        t = t.split()
        edgeCost[t[0]+t[1]] = float(t[2])
        edgeCost[t[1]+t[0]] = float(t[2])

T = 5 # 换乘时间设置


def dfs(ind, source, target):
    path = []
    visited = []
    stack = [source]
    lens = []
    while stack:
        cur = stack[-1]
        if cur not in visited:
            path.append(cur)
            visited.append(cur)
            allow = np.array(nodeDict[str(cur)])
            priority_ls = np.asarray(ind)[np.asarray(allow)-1]
            index = np.argsort(priority_ls)
            stack.extend(allow[index])
            lens.append(len(allow[index]))
            if target in stack:
                path.append(target)
                break
        else:
            stack.pop()
            lens[-1] -= 1
            if lens[-1] == 0:
                lens.pop()
                path.pop()
    return path


def eval(ind):
    path = dfs(ind, source, target)
    time = TIME(path)
    cost = COST(path)
    return (time), (cost)


def TIME(path):
    time = edgeTime[str(path[0])+str(path[1])]
    if len(path) > 2:
        pre = set(nodeLine[str(path[0])]) & set(
            nodeLine[str(path[1])]) & set(nodeLine[str(path[2])])
        time += edgeTime[str(path[1])+str(path[2])]
        for i in range(2, len(path)-1):
            time += edgeTime[str(path[i])+str(path[i+1])]
            now = set(nodeLine[str(path[i])]) & set(nodeLine[str(path[i+1])])
            if not pre & now:
                pre = now
                time += T
    return time


def COST(path):
    dis = 0
    for i in range(len(path)-1):
        dis += edgeCost[str(path[i])+str(path[i+1])]
    price = [2, 3, 4, 5, 6, 7]
    rank = [6, 11, 17, 24, 32]
    rank.append(dis)
    rank.sort()
    cost = price[rank.index(dis)]
    return cost


for source in [54]:
    for target in range(53, 54):
        if target == source:
            continue
        toolbox.register('evaluate', eval)
        toolbox.register('select', tools.selTournament, tournsize=2)
        toolbox.register('mate', tools.cxOrdered)
        toolbox.register('mutate', tools.mutShuffleIndexes, indpb=0.5)
        stats = tools.Statistics(key=lambda ind: ind.fitness.values)
        stats.register('avg', np.mean)
        stats.register('std', np.std)
        stats.register('min', np.min)
        stats.register('max', np.max)
        logbook = tools.Logbook()
        logbook.header = 'gen', 'avg', 'std', 'min', 'max'

        pop_size = 100 # 初始种群大小
        N_GEN = 20 	   # 迭代次数
        CXPB = 0.8     # 交叉概率
        MUTPB = 0.2    # 变异概率

        # 生成种群
        pop = toolbox.Population(n=pop_size)
        # 评价初代种群
        invalid_ind = [ind for ind in pop if not ind.fitness.valid]
        fitnesses = list(map(toolbox.evaluate, invalid_ind))
        for ind, fit in zip(invalid_ind, fitnesses):
            ind.fitness.values = fit
        record = stats.compile(pop)
        logbook.record(gen=0, **record)
        # 遗传迭代
        for gen in range(1+N_GEN):
            # 配种选择
            selectTour = toolbox.select(pop, pop_size)
            # 复制
            selectInd = list(map(toolbox.clone, selectTour))
            # 交叉
            for child1, child2 in zip(selectInd[::2], selectInd[1::2]):
                if random.random() < CXPB:
                    toolbox.mate(child1, child2)
                    del child1.fitness.values
                    del child2.fitness.values
            # 变异
            for ind in selectInd:
                if random.random() < MUTPB:
                    toolbox.mutate(ind)
                    del ind.fitness.values
            # 对于被改变的个体，重新计算其适应度
            invalid_ind = [ind for ind in selectInd if not ind.fitness.valid]
            fitnesses = list(map(toolbox.evaluate, invalid_ind))
            for ind, fit in zip(invalid_ind, fitnesses):
                ind.fitness.values = fit
            # 精英育种-加快迭代
            combinedPop = pop + selectInd
            pop = tools.selBest(combinedPop, pop_size)
            # 记录
            record = stats.compile(pop)
            logbook.record(gen=gen, **record)
        print(logbook)
        bestInd = tools.selBest(pop, 1)[0]
        bestFit = bestInd.fitness.values
        print(f'{source} to {target}')
        print('最短耗时为:', bestFit[0])
        print('对应费用为:', bestFit[1])
        print('对应路径为:', dfs(bestInd, source, target))


min = logbook.select('min')
avg = logbook.select('avg')
gen = logbook.select('gen')
plt.plot(gen, min, 'b-', label='MIN_FITNESS')
plt.plot(gen, avg, 'r-', label='AVG_FITNESS')
plt.xlabel('gen')
plt.ylabel('fitness')
plt.legend(loc='best')
plt.tight_layout()
plt.show()

代码及相关数据文件

需要代码和相关数据文件可以在GitHub上获取

参考

https://zhuanlan.zhihu.com/p/81749290

https://blog.csdn.net/weixin_46649052/article/details/110324373