PythonPro #43: VRP Optimization with Genetic Algorithms, New Marimo 0.7.x SQL Features, and Django for Trojan Detection
Bite-sized actionable content, practical tutorials, and resources for Python programmers.
Welcome to a brand new issue of PythonPro!
In today’s Expert Insight we bring you an excerpt from the recently published book, Hands-On Genetic Algorithms with Python - Second Edition, which discusses how to solve the Vehicle Routing Problem (VRP) using genetic algorithms
News Highlights: Marimo 0.7.x introduces reactive SQL cells for querying various data sources and HFLM provides a simple Python interface for text generation and probability computation using Hugging Face models.
Here are my top 5 picks from our learning resources today:
Django - Detecting Trojans in Object Detection Models via Gaussian Focus Calibration🕵️♂️
Linting Excellence - How Black, isort, and Ruff Elevate Python Code Quality✨
And, in today’s Featured Study, we introduce a modular architecture that integrates GA-based routing with virtual servers to enhance the performance, scalability, and security of LANs.
Stay awesome!
Divya Anne Selvaraj
Editor-in-Chief
P.S.: This month’s survey is still live. Do take the opportunity to tell us what you think of PythonPro, request learning resources, and earn your one Packt Credit for this month.
🐍 Python in the Tech 💻 Jungle 🌳
🗞️News
Marimo 0.7.x reactive notebook for Python adds reactive SQL cells: The update allows users to query data from Pandas/Polars dataframes, Google Sheets, and database tables using SQL, with results returned as DataFrames.
Easily generate text and compute probabilities for any Hugging Face LLM: HFLM is a simple Python interface for generating text and computing log probabilities using any model from the 🤗 Transformers library.
💼Case Studies and Experiments🔬
Tests aren’t enough - Case study after adding type hints to urllib3: Outlines the challenges and best practices for integrating type hints incrementally and emphasizes the importance of type hints in improving software quality.
Django - Detecting Trojans in Object Detection Models via Gaussian Focus Calibration: Discusses the development of DJANGO (not to be confused with Django the web development framework), a novel framework for detecting backdoors in object detection models, which enhances detection accuracy.
📊Analysis
Lesser known parts of Python standard library: Covers advanced data structures in collections, such as Deque, Counter, and ChainMap, and tools for better handling of floating-point errors using decimal and fractions.
STORM - Stanford’s Revolutionary Research Tool Harnessing the Power of Agents and Agentic Workflows: STORM generates comprehensive, Wikipedia-style articles by using multiple AI agents to gather information, simulate multi-perspective questioning, and synthesize content.
🎓 Tutorials and Guides 🤓
The Walrus Operator - Python's Assignment Expressions: Covers the operator's syntax, use cases, benefits, and potential pitfalls. Read to understand how to use the operator to avoid repetitive code and improve coding style.
How to build a query language in Python: Guides readers through defining grammar, parsing, and evaluating queries using an Abstract Syntax Tree (AST). Read for insights into building custom query languages.
How to Use Amazon Bedrock APIs for Anthropic Claude 3.5 Sonnet in Python: Covers installing necessary libraries, setting up the Amazon Bedrock client with Boto3, adapting request structures, handling responses, and managing errors.
Dynamic Mode Decomposition using OpenFOAM and Python: Explores the use of Python and Dynamic Mode Decomposition (DMD) to analyze fluid dynamics simulations. Read to learn how to apply DMD to CFD data.
From Zero to Hero - Build a Face Recognition System with Python and Machine Learning in 30 Minutes: Covers prerequisites, setting up the development environment, and step-by-step instructions for detecting and recognizing faces in images.
Building an Efficient Semantic Cache with FAISS and Sentence Transformers for Fast Query Responses: Discusses building an efficient semantic cache to enhance the performance of applications, particularly those using LLMs and RAG systems.
Fast Resampling of Timeseries with ArcticDB: Introduces a new feature that allows efficient downsampling of timeseries data directly during the read process, significantly boosting performance. Read to learn how to process large datasets.
🔑Best Practices and Advice🔏
Python Classes - The Power of Object-Oriented Programming: Offers a comprehensive guide for defining, using, and managing classes. Read to learn how to effectively use Python classes to encapsulate data and behavior.
10 Ways to Write Better Python Codes: Discusses tips including using generators to save memory, setdefault() for managing dictionary keys, and replacing complex if-elif conditions with dictionaries.
Linting Excellence - How Black, isort, and Ruff Elevate Python Code Quality: Discusses how to improve Python code quality through formatting, import organization, and linting. Read to learn about each tool.
Data Analysis and Automation Using Python: Introduces the basics of data analysis and automation using Python, emphasizing techniques for efficiently processing large datasets through chunking. Read to learn through practical examples.
Coding Standards and Best Practices for System Design: Discusses coding standards and best practices in system design, highlighting their importance in ensuring consistency, maintainability, and collaboration in large projects.
🔍Featured Study: Genetic Algorithms and Virtualization for Enhanced LAN Performance💥
In “Optimization of Local Area Network Architecture: A Case Study on Genetic Algorithm-Based Routing and Virtualization,” Khan et al. explore enhancements to Local Area Network (LAN) architecture using Genetic Algorithms (GAs) for dynamic routing and virtualisation.
Context
Traditional static network configurations are becoming increasingly inadequate due to the surge in data traffic from IoT, cloud computing, and mobile applications. GAs are search algorithms based on the principles of natural selection and genetics, highly effective in solving optimization problems by generating high-quality solutions to complex issues. Virtualisation, on the other hand, allows for creating a virtual version of something, including virtual computer hardware platforms, storage devices, and computer network resources. The study proposes a modular architecture that integrates GA-based routing with virtual servers to enhance the performance, scalability, and security of LANs.
Key Features of the Optimised LAN Architecture
Genetic Algorithm for Routing Optimization: This feature dynamically optimizes routing using GAs, which iteratively evaluate and refine routing options based on real-time data to reduce latency and enhance throughput. This adaptive process improves the network’s response to varying traffic conditions.
Virtualization for Resource Management: Virtualization separates network operations from physical hardware, improving scalability and resource management. Virtual servers adjust resources dynamically, allowing the network to scale without extensive reconfiguration and supporting independent component management.
Advanced Security Measures: Integrated security protocols protect against internal and external threats:
DHCP Snooping: Prevents IP spoofing by validating DHCP messages.
Access Control Lists (ACLs): Regulates access to network resources based on IP addresses and protocols.
Network Address Translation (NAT): Enhances security by mapping private IP addresses to public ones.
What This Means for You
This research provides a blueprint for Python developers interested in network management and security solutions, offering a practical example of how Python can be interfaced with advanced technologies like genetic algorithms and virtualisation to tackle real-world problems.
Examining the Details
The study's methodology included the development of a simulated environment where the effectiveness of GA-based dynamic routing and virtual server use was tested. The researchers employed a series of simulations using Cisco Packet Tracer, NetworkX, and Python to validate their modular architecture.
There was a noticeable decrease in server load with Server1's average load reducing from 60% to 40%. Similarly, Server2 showed a comparable reduction, and Server3 experienced a decrease in response time from 220 milliseconds to 190 milliseconds. The network's throughput capacity significantly increased from 20 Gbps to 50 Gbps, indicating a substantial improvement in the network's ability to handle data. Resource utilization became more efficient with CPU usage dropping from 80% to 40%, and memory usage decreasing from 70% to 50%. The need for physical servers was also reduced from 10 to 3, emphasizing a reduction in hardware dependencies and operational costs. Networks with higher modularity levels showed reduced reaction times (from 250 milliseconds to 180 milliseconds) and error rates (from 14% to 6%), which suggests enhanced network stability and performance under diverse traffic scenarios.
You can learn more by reading the entire paper.
Take the Survey, Get a Packt Credit!
🧠 Expert insight 📚
Here’s an excerpt from “Chapter 4: Combinatorial Optimization” in the book, Hands-On Genetic Algorithms with Python - Second Edition, by Eyal Wirsansky, published in July 2024.
Solving the VRP
Imagine that you …manage a large… fulfillment center. You… need to deliver packages to a list of customers…(and) have a fleet of several vehicles at your disposal. What’s the best way to deliver the packages to the customers using these vehicles?
This is an example of the VRP, a generalization of the TSP described in the previous section. The basic VRP consists of the following three components:
The list of locations that need to be visited
The number of vehicles
The location of the depot, which is used as the starting and ending point for each of the vehicles
The problem has numerous variations, such as several depot locations, time-critical deliveries, different types of vehicles (varying capacity, varying fuel consumption), and many more.
The goal of the problem is to minimize the cost, which can also be defined in many different ways. Examples include minimizing the time it takes to deliver all the packages, minimizing the cost of the fuel, and minimizing the variation in travel time among the vehicles used.
An illustration of a VRP with three vehicles is shown here. The cities are marked with dark circles and the depot location with an empty square, while the routes of the three vehicles are marked with three different colors:
Figure 4.11: Example VRP with three vehicles
In our example, we will aim to optimize the time it takes to deliver all the packages. Since all the vehicles operate simultaneously, this measure is determined by the vehicle making the longest route. Therefore, we can make it our objective to minimize the length of the longest route among the participating vehicles’ routes. For example, if we have three vehicles, each solution consists of three routes. We will evaluate all three, and then only consider the longest one of them for scoring – the longer the route, the worse the score. This will inherently encourage all three routes to be shorter, as well as closer in size to each other.
Thanks to the similarity between the two problems, we can utilize the code we wrote previously to solve the TSP for solving the VRP.
To build on the solution we created for the TSP, we can represent vehicle routing as follows:
A TSP instance, namely a list of cities and their coordinates (or their mutual distances)
The depot location, which is selected out of the existing cities, and represented by the index of that city
The number of vehicles used
…
Python problem representation
To encapsulate the VRP problem, we created a Python class called VehicleRoutingProblem. This class is contained in the vrp.py file and can be found here.
The VehicleRoutingProblem class contains an instance of the TravelingSalesmanProblem class, which is used as the container for the city indices and their corresponding locations and distances. When creating an instance of the VehicleRoutingProblem class, the instance of the underlying TravelingSalesmanProblem is created internally and initialized.
The VehicleRoutingProblem class is initialized using the name of the underlying TravelingSalesmanProblem, as well as the depot location index and the number of vehicles.
In addition, The VehicleRoutingProblem class provides the following public methods:
getRoutes(indices): This breaks the list of given indices into separate routes by detecting the “separator” indices
getRouteDistance(indices): This calculates total the distance of the path that starts at the depot location and goes through the cities described by the given indices
getMaxDistance(indices): This calculates the max distance among the distances of the various paths described by the given indices, after breaking the indices to separate routes
getTotalDistance(indices): This calculates the combined distance of the various paths described by the given indices
plotData(indices): This breaks the list of indices into separate routes and plots each route in a different color
When executed as a standalone program, the main method of the class exercises these methods by creating an instance of VehicleRoutingProblem with the underlying TSP set to “bayg29” – the same problem we used in the previous section. The number of vehicles is set to 3, and the depot location index is set to 12 (which maps to a city with a central location). The following figure shows the locations of the cities (red dots) and the depot (green “x”):
Figure 4.14: A plot of the VRP based on the “bayg29” TSP.
Red dots mark the cities while the green “X” marks the depot.
The main method then generates a random solution, breaks it down into routes, and calculates the distances, as shown here:
random solution = [27, 23, 7, 18, 30, 14, 19, 3, 16, 2, 26, 9, 24, 22, 15, 17, 28, 11, 21, 12, 8, 4, 5, 13, 25, 6, 0, 29, 10, 1, 20]
route breakdown = [[27, 23, 7, 18], [14, 19, 3, 16, 2, 26, 9, 24, 22, 15, 17, 28, 11, 21, 8, 4, 5, 13, 25, 6, 0], [10, 1, 20]]
total distance = 26653.845703125
max distance = 21517.686
Note how the original list of indices of the random solution is broken down into separate routes using the separator indices (29 and 30). The plot for this random solution is shown here:
Figure 4.15: A plot of a random solution for the VRP with three vehicles
As we would expect from a random solution, it is far from optimal. This is evident from the inefficient order of cities along the long (green) route, as well as one route (green) being much longer than the other two (red and purple).
In the next subsection, we will attempt to produce good solutions using the genetic algorithms method.
Genetic algorithm solution
The genetic algorithm solution we created for the VRP resides in the 04-solve-vrp.py Python file located here.
Since we were able to build on top of the TSP and used a similar representation for the solution – an array of indices – we could use the same genetic approach we used in the previous section. We could also take advantage of elitism by reusing the elitist version that we created for the genetic flow. This makes our genetic algorithm solution very similar to the one we used for the TSP.
The following steps detail the main parts of our solution:
The program starts by creating an instance of theVehicleRoutingProblem class, using the “bayg29” TSP for its underlying data, and setting the depot location to 12 and the number of vehicles to 3:
TSP_NAME = "bayg29"
NUM_OF_VEHICLES = 3
DEPOT_LOCATION = 12
vrp = vrp.VehicleRoutingProblem(TSP_NAME, NUM_OF_VEHICLES,
DEPOT_LOCATION)
The fitness function is set to minimize the distance of the longest route among the three routes produced by each solution:
def vrpDistance(individual):
return vrp.getMaxDistance(individual),
toolbox.register("evaluate", vrpDistance)
For the genetic operators, we once again use tournament selection with a tournament size of 2, which is assisted by the elitist approach, and crossover and mutation operators that are specialized for ordered lists:
# Genetic operators:
toolbox.register("select", tools.selTournament, tournsize=2)
toolbox.register("mate", tools.cxUniformPartialyMatched, \
indpb=2.0/len(vrp))
toolbox.register("mutate", tools.mutShuffleIndexes, \
indpb=1.0/len(vrp))
As the VRP is inherently more difficult than TSP, we chose a larger population size and number of generations than before:
# Genetic Algorithm constants:
POPULATION_SIZE = 500
P_CROSSOVER = 0.9
P_MUTATION = 0.2
MAX_GENERATIONS = 1000
HALL_OF_FAME_SIZE = 30
And that’s it! We’re ready to run the program. The results that we obtain with these settings are shown here – three routes, with a maximum length of 3857:
-- Best Ever Individual = Individual('i', [0, 20, 17, 16, 13, 21, 10, 14, 3, 29, 15, 23, 7, 26, 12, 22, 6, 24, 18, 9, 19, 30, 27, 11, 5, 4, 8, 25, 2, 28, 1])
-- Best Ever Fitness = 3857.36376953125
-- Route Breakdown = [[0, 20, 17, 16, 13, 21, 10, 14, 3], [15, 23, 7, 26, 22, 6, 24, 18, 9, 19], [27, 11, 5, 4, 8, 25, 2, 28, 1]]
-- total distance = 11541.875
-- max distance = 3857.3638
Note, again, how the solution is broken down into three separate routes using the highest two indices (29, 30) as separators, and ignoring the depot location (12). We ended up with three routes, two of them covering nine cities each, and the third covering 10 cities.
Plotting the solution produces the following figure depicting the three resulting routes:
Figure 4.16: A plot of the best solution found by the program for the VRP with three vehicles
The following statistics plot shows that the algorithm did most of the optimization before reaching 300 generations. After, there are several small improvements:
Figure 4.17: Stats of the program solving the VRP with three vehicles
Packt library subscribers can continue reading the entire book for free. You can buy Hands-On Genetic Algorithms with Python - Second Edition, by Eyal Wirsansky, here.
And that’s a wrap.
We have an entire range of newsletters with focused content for tech pros. Subscribe to the ones you find the most useful here. The complete PythonPro archives can be found here.
If you have any suggestions or feedback, or would like us to find you a Python learning resource on a particular subject, take the survey or leave a comment below!












