<h1>Leveraging Generative AI for Effective Risk Management Part II: Actionable Strategies for Businesses</h1>
<p>Humberto STEIN SHIROMOTO, 2023-09-26, https://hsteinshiromoto.github.io/posts/2023/09/26/blog-post_generative_ai_for_risk_management</p>
<p>In today’s rapidly evolving business landscape, risk management is more critical than ever before. As industries grow increasingly complex and interconnected, the need for sophisticated strategies to identify, assess, and mitigate risks has become paramount. Enter generative artificial intelligence (AI), a revolutionary technology that has the potential to transform the way businesses design and implement risk controls. In this article, we’ll explore how businesses can harness the power of generative AI to enhance their risk management practices, backed by actionable tips and real-life case studies.</p>
<div class="notice--info">
<p>This post is part of a series. Part I can be found <a href="https://humberto.stein-shiromoto.net/posts/2023/09/01/introduction_to_business_risk">here</a>.</p>
</div>
<h2 id="understanding-generative-ai-in-risk-management">Understanding Generative AI in Risk Management</h2>
<p>Generative AI involves using algorithms to create new, original content based on patterns and data it has learned from. In the context of risk management, generative AI can be employed to simulate scenarios, model potential risks, and design effective controls. Here’s how businesses can put this innovative technology to work:</p>
<h3 id="1-identifying-emerging-risks">1. Identifying Emerging Risks</h3>
<p>Generative AI can analyze massive datasets from various sources to identify emerging risks. By recognizing subtle patterns and correlations, businesses can stay ahead of potential threats. For instance, a financial institution could use (generative) AI to analyze market trends, news sentiment, and economic indicators to predict potential financial crises and to detect policy infringements such as financial crime.</p>
<h3 id="2-scenario-modeling">2. Scenario Modeling</h3>
<p>Generating realistic risk scenarios is crucial for preparing effective risk controls. Generative AI can simulate a wide range of scenarios, helping businesses understand the possible impacts of different risks. This approach empowers businesses to design controls that are agile and responsive. For example, a financial company could use generative AI to simulate how the market might react to specific types of news and develop strategies to ensure stability.</p>
<h3 id="3-designing-tailored-controls">3. Designing Tailored Controls</h3>
<p>Generative AI can assist in designing controls that are tailored to a business’s unique risk profile. By considering multiple variables and data points, AI can suggest controls that are both effective and efficient. An insurance company could employ generative AI to create customized policies for its customers.</p>
<h2 id="actionable-tips-for-businesses">Actionable Tips for Businesses</h2>
<p>Implementing generative AI for risk management requires a strategic approach. Here are some actionable tips for businesses looking to harness its potential:</p>
<h3 id="1-data-quality-matters">1. Data Quality Matters</h3>
<p>Generative AI thrives on data. To ensure accurate risk assessments, gather high-quality, relevant data. Clean, comprehensive data sets will enhance the AI’s ability to generate meaningful insights.</p>
<h3 id="2-collaboration-is-key">2. Collaboration is Key</h3>
<p>Engage a cross-functional team to work with the AI system. Risk management involves input from various departments, each with unique insights. Collaborative efforts will lead to more robust risk controls.</p>
<h3 id="3-human-oversight-and-interpretation">3. Human Oversight and Interpretation</h3>
<p>While generative AI is powerful, human expertise is irreplaceable. Interpret AI-generated insights with the help of domain experts to make informed decisions.</p>
<h2 id="real-life-case-studies">Real-Life Case Studies</h2>
<p>Let’s examine how leading businesses have successfully integrated generative AI into their risk management strategies:</p>
<h3 id="case-study-1-proactive-supply-chain-management">Case Study 1: Proactive Supply Chain Management</h3>
<p>A global retailer employed generative AI to analyze supply chain data and identify potential disruptions. The AI-generated scenarios allowed the company to optimize inventory levels and establish backup suppliers, minimizing the impact of unforeseen events.</p>
<h3 id="case-study-2-healthcare-system-enhancement">Case Study 2: Healthcare System Enhancement</h3>
<p>A healthcare provider used generative AI to predict patient admission surges. By considering factors like weather, disease outbreaks, and historical admission data, the hospital developed staffing strategies to handle influxes efficiently, ensuring optimal patient care.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Generative AI is a game-changer in risk management, enabling businesses to anticipate, plan for, and mitigate a wide array of risks. By harnessing its capabilities, organizations can design tailored controls, model scenarios, and make informed decisions to secure their future. Embrace the power of generative AI, and elevate your risk management practices to new heights.</p>
<p>Remember, while generative AI offers incredible insights, it’s essential to balance its outputs with human judgment. The marriage of cutting-edge technology and human expertise will undoubtedly drive businesses toward more effective risk management in this dynamic world.</p>
<h1>Leveraging Generative AI for Effective Risk Management Part I: Introduction to Business Risk</h1>
<p>Humberto STEIN SHIROMOTO, 2023-09-01, https://hsteinshiromoto.github.io/posts/2023/09/01/blog-post_risk_management</p>
<p>In life, there are certainties like death and taxes, but there’s one more constant: risk. The COVID-19 pandemic starkly reminded us of this fact as we grappled with evaluating and reevaluating personal risks with each wave of the pandemic. Businesses face similar challenges, and their ability to manage and mitigate risk plays a crucial role in their success.</p>
<div class="notice--info">
<p>This post is part of a series. Part II can be found <a href="https://humberto.stein-shiromoto.net/posts/2023/09/26/generative_ai_for_risk_management">here</a>.</p>
</div>
<h2 id="the-origins-of-business-risk">The Origins of Business Risk</h2>
<p>Businesses encounter risk from both external and internal sources. External factors like inflation, supply chain disruptions, geopolitical shifts, climate-related disasters, competition, reputation issues, and cyberattacks can significantly impact an organization’s plans. Internally, poor leadership decisions or unauthorized disclosures of sensitive information can also pose risks. Yet, perhaps the most dangerous risk is missing opportunities for innovation and growth.</p>
<p>The modern era is marked by frequent shocks related to socioeconomic, economic, and climate factors. In 2019 alone, there were 40 weather-related disasters causing over $1 billion in damages each. To stay competitive, organizations must adopt flexible risk management strategies, which involve forecasting new threats, recognizing shifts in existing threats, and forming comprehensive response plans. While there’s no magic formula to navigate crises, a well-structured risk management strategy can shield an organization from critical disruptions.</p>
<h2 id="understanding-risk-management">Understanding Risk Management</h2>
<p>Risk management involves identifying, handling, and mitigating threats through various approaches and activities. After recognizing a risk, organizations develop measures to reduce its potential impact. While eliminating risk is ideal, other methods include loss mitigation (like insurance) and redundancy (using backup systems to prevent data loss during outages).</p>
<h2 id="the-three-key-elements-of-a-comprehensive-risk-management-strategy">The Three Key Elements of a Comprehensive Risk Management Strategy</h2>
<p>A proactive risk management plan comprises three critical components:</p>
<h3 id="1-detecting-risks-and-addressing-vulnerabilities">1. Detecting Risks and Addressing Vulnerabilities</h3>
<p>Organizations must maintain a proactive stance by analyzing how risks might evolve over time, handling systemic risks, and identifying new risks that may emerge.</p>
<h3 id="2-evaluating-risk-tolerance">2. Evaluating Risk Tolerance</h3>
<p>Companies should define risk tolerance levels that align with their values, strategies, capabilities, and competitive landscapes. This involves reassessing risk profiles, rejecting some risks unequivocally, and considering the effectiveness of control mechanisms.</p>
<h3 id="3-choosing-a-risk-management-approach">3. Choosing a Risk Management Approach</h3>
<p>Organizations must decide how to respond when confronted with new risks. This decision-making process should involve leaders from various departments and adapt to changing circumstances.</p>
<h2 id="developing-adaptable-risk-management">Developing Adaptable Risk Management</h2>
<p>Effective risk management is crucial for survival, especially during severe or abrupt risks. Here are five actions leaders can take:</p>
<ol>
<li>
<p><strong>Reframe the vision for risk management:</strong> Set clear goals, define risk levels, and engage in conversations with business leaders to foster well-informed decision-making regarding risk versus reward.</p>
</li>
<li>
<p><strong>Establish agile risk management procedures:</strong> Form cross-functional teams with the authority to make swift risk management decisions.</p>
</li>
<li>
<p><strong>Leverage data and analytics:</strong> Digital tools and data can enhance risk management efforts, providing better insights and predictions.</p>
</li>
<li>
<p><strong>Cultivate future-ready risk expertise:</strong> Equip risk managers with fresh competencies and knowledge to understand evolving risks.</p>
</li>
<li>
<p><strong>Strengthen risk culture:</strong> Foster an organizational mindset that responds swiftly to threats.</p>
</li>
</ol>
<h2 id="the-role-of-scenarios-in-grasping-uncertainty">The Role of Scenarios in Grasping Uncertainty</h2>
<p>Scenario planning helps leaders turn abstract hypotheses into narratives that depict plausible future scenarios. This offers advantages like expanding thinking, identifying likely futures, safeguarding against groupthink, and challenging conventional wisdom.</p>
<h2 id="insights-on-risk-in-financial-institutions">Insights on Risk in Financial Institutions</h2>
<p>According to chief risk officers (CROs), banks face heightened exposure to rapidly evolving market dynamics, climate change, and cybercrime. While the pandemic’s impact on nonfinancial risk is expected to diminish, climate change is anticipated to become a substantial concern. Cybercrime remains a top risk for financial institutions.</p>
<h2 id="understanding-cyber-risk">Understanding Cyber Risk</h2>
<p>Cyber risk encompasses potential digital losses, including financial, reputational, operational, productivity, and regulatory aspects. It can also manifest as physical world losses, such as damage to operational equipment. Cyber threats, like privilege escalation, vulnerability exploitation, and phishing, create the potential for cyber risk.</p>
<h2 id="a-risk-based-cybersecurity-approach">A Risk-Based Cybersecurity Approach</h2>
<p>A risk-based cybersecurity approach prioritizes risk reduction over achieving a specific level of cybersecurity maturity. It focuses on addressing the most critical vulnerabilities effectively. Steps include integrating cybersecurity into enterprise risk management, evaluating vulnerabilities across people, processes, and technology, comprehending threat actors, and monitoring risks against risk appetite.</p>
<h2 id="prudent-investments-in-risk-management">Prudent Investments in Risk Management</h2>
<p>To manage high-consequence, low-likelihood risks (or “big bets”), organizations must prioritize existential threats. A two-by-two risk grid assesses the impact of an event against the certainty of that impact. Investments aimed at safeguarding value propositions can enhance an organization’s resilience.</p>
<p>In a constantly changing world, managing risk effectively is not just about creating plans; it’s about regularly evaluating and updating them to remain relevant and resilient.</p>
<h1>The Traveling Salesman Problem</h1>
<p>Humberto STEIN SHIROMOTO, 2023-08-29, https://hsteinshiromoto.github.io/posts/2023/08/29/blog_post_traveling%20salesman</p>
<p>The Traveling Salesman Problem (TSP), a captivating conundrum in mathematical optimization, has seamlessly integrated itself into a multitude of industries, reshaping the way we approach efficiency and problem-solving.</p>
<p>The problem consists of finding the route that minimizes the total distance traveled between locations such that every location is visited exactly once.</p>
<p>In the realm of logistics, the TSP takes center stage, offering a beacon of optimization for supply chains and delivery routes. Giants like Amazon leverage TSP-inspired algorithms to orchestrate last-mile deliveries, aligning packages with real-time traffic data and delivery priorities. This dynamic approach shaves miles off routes, minimizes fuel consumption, and ensures prompt deliveries.</p>
<p>Manufacturing, too, bows to the TSP’s prowess. Within bustling factories, where precision and organization reign supreme, the TSP guides the sequencing of production steps. Whether in the assembly of intricate automobiles or the manufacturing of various goods, the TSP minimizes idle time, reduces bottlenecks, and ultimately enhances productivity.</p>
<p>Surprisingly, the TSP extends its influence beyond the realms of logistics and manufacturing. The world of DNA sequencing benefits from its route-optimizing abilities, accelerating genetic research by reducing sequencing time and costs. Additionally, the TSP finds its place in the intricate landscape of circuit design, optimizing signal propagation in integrated circuits and exemplifying the synergy between mathematics and engineering.</p>
<h2 id="modules">Modules</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="kn">import</span> <span class="nn">pulp</span>
<span class="kn">import</span> <span class="nn">networkx</span> <span class="k">as</span> <span class="n">nx</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">itertools</span>
<span class="o">%</span><span class="n">load_ext</span> <span class="n">watermark</span>
<span class="o">%</span><span class="n">watermark</span> <span class="o">-</span><span class="n">n</span> <span class="o">-</span><span class="n">u</span> <span class="o">-</span><span class="n">v</span> <span class="o">-</span><span class="n">iv</span> <span class="o">-</span><span class="n">w</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>Last updated: Mon Aug 28 2023
Python implementation: CPython
Python version : 3.11.1
IPython version : 8.12.0
matplotlib: 3.7.2
networkx : 3.1
numpy : 1.25.2
pulp : 2.7.0
Watermark: 2.3.1
</pre></td></tr></tbody></table></code></pre></div></div>
<h2 id="data">Data</h2>
<p>The data used for this exercise is a dictionary containing the name of the cities and distances between them.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">cities</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">'name'</span><span class="p">:</span> <span class="p">[]</span>
    <span class="p">,</span><span class="s">'distance'</span><span class="p">:</span> <span class="p">[]</span>
<span class="p">}</span>
<span class="c1"># Generate a random symmetric distance matrix for n_cities cities
</span><span class="n">n_cities</span> <span class="o">=</span> <span class="mi">4</span>
<span class="n">M</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">rand</span><span class="p">(</span><span class="n">n_cities</span><span class="p">,</span><span class="n">n_cities</span><span class="p">)</span>
<span class="n">cities</span><span class="p">[</span><span class="s">'distance'</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">M</span> <span class="o">+</span> <span class="n">M</span><span class="p">.</span><span class="n">T</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_cities</span><span class="p">):</span>
    <span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">].</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s">'city</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="s">'</span><span class="p">)</span>
    <span class="n">cities</span><span class="p">[</span><span class="s">'distance'</span><span class="p">][</span><span class="n">i</span><span class="p">,</span> <span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">cities</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>{'name': ['city0', 'city1', 'city2', 'city3'],
'distance': array([[0. , 0.80122558, 0.50666924, 0.13568748],
[0.80122558, 0. , 0.64279003, 0.31393952],
[0.50666924, 0.64279003, 0. , 0.51592309],
[0.13568748, 0.31393952, 0.51592309, 0. ]])}
</pre></td></tr></tbody></table></code></pre></div></div>
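<p>With only four cities, any solver's answer can be cross-checked by enumerating every tour directly. The sketch below is not part of the original notebook: it hard-codes the distance matrix printed above, and the names <code class="language-plaintext highlighter-rouge">dist</code> and <code class="language-plaintext highlighter-rouge">tour_length</code> are illustrative. It fixes city 0 as the start and tries every ordering of the remaining cities, which is only feasible for very small instances because the number of orderings grows factorially.</p>

```python
import itertools

# Distance matrix copied from the output above (city0..city3).
dist = [
    [0.0, 0.80122558, 0.50666924, 0.13568748],
    [0.80122558, 0.0, 0.64279003, 0.31393952],
    [0.50666924, 0.64279003, 0.0, 0.51592309],
    [0.13568748, 0.31393952, 0.51592309, 0.0],
]


def tour_length(tour, dist):
    """Length of the closed tour, including the leg back to the start."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


n = len(dist)
# Fix city 0 as the start so rotations of the same tour are not recounted.
candidates = [(0,) + perm for perm in itertools.permutations(range(1, n))]
best_tour = min(candidates, key=lambda t: tour_length(t, dist))
best_length = tour_length(best_tour, dist)

print("Best tour:", best_tour)
print("Length:", round(best_length, 5))
```

<p>For this matrix the shortest tour visits the cities in the order 0, 2, 1, 3 (or the same cycle reversed), with a length of about 1.599.</p>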
<p>We use <code class="language-plaintext highlighter-rouge">networkx</code> to plot the graph generated by the distance matrix.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
<span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="p">.</span><span class="n">DiGraph</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'distance'</span><span class="p">])</span>
<span class="n">pos</span> <span class="o">=</span> <span class="n">nx</span><span class="p">.</span><span class="n">spring_layout</span><span class="p">(</span><span class="n">G</span><span class="p">)</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_nodes</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">);</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_labels</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">);</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_edges</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">2</span><span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p><img src="images/2023-08-29_travelling_salesman/output_7_0.png" alt="png" /></p>
<h2 id="formulation">Formulation</h2>
<p>For each $i,j=1,\ldots,n$ with $i\neq j$, let $x_{ij}$ be a binary variable equal to 1 if the salesman travels directly from city $i$ to city $j$, and 0 otherwise.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">x</span> <span class="o">=</span> <span class="n">pulp</span><span class="p">.</span><span class="n">LpVariable</span><span class="p">.</span><span class="n">dicts</span><span class="p">(</span><span class="s">"x"</span><span class="p">,</span> <span class="p">[(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">]))</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">]))</span> <span class="k">if</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">j</span><span class="p">],</span> <span class="n">cat</span><span class="o">=</span><span class="s">'Binary'</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The objective is to minimize the total distance traveled, where $d_{ij}$ is the distance between cities $i$ and $j$:</p>
\[\min\sum_{i=1}^n\sum_{j=1,j\neq i}^n d_{ij}x_{ij}\]
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="c1"># Define the TSP problem
</span><span class="n">prob</span> <span class="o">=</span> <span class="n">pulp</span><span class="p">.</span><span class="n">LpProblem</span><span class="p">(</span><span class="s">"TSP"</span><span class="p">,</span> <span class="n">pulp</span><span class="p">.</span><span class="n">LpMinimize</span><span class="p">)</span>
<span class="c1"># Define the objective function
</span><span class="n">prob</span> <span class="o">+=</span> <span class="n">pulp</span><span class="p">.</span><span class="n">lpSum</span><span class="p">([</span><span class="n">cities</span><span class="p">[</span><span class="s">'distance'</span><span class="p">][</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">*</span> <span class="n">x</span><span class="p">[(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">]))</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">]))</span> <span class="k">if</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">j</span><span class="p">])</span>
</pre></td></tr></tbody></table></code></pre></div></div>
\[x_{ij}\in\{0,1\}\quad i,j=1,\ldots,n,\ i\neq j\]
\[u_i\in\mathbb{Z}\quad i=1,\ldots,n\]
<p>For each city $j$, the salesman must arrive exactly one time:</p>
\[\sum_{i=1,i\neq j}^n x_{ij}=1\quad j=1,\ldots,n\]
<p>For each city $i$, the salesman must leave exactly one time:</p>
\[\sum_{j=1,j\neq i}^n x_{ij}=1\quad i=1,\ldots,n\]
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">])):</span>
    <span class="c1"># The salesman leaves city i exactly once
</span>    <span class="n">prob</span> <span class="o">+=</span> <span class="n">pulp</span><span class="p">.</span><span class="n">lpSum</span><span class="p">([</span><span class="n">x</span><span class="p">[(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)]</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">]))</span> <span class="k">if</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">j</span><span class="p">])</span> <span class="o">==</span> <span class="mi">1</span>
    <span class="c1"># The salesman arrives at city i exactly once
</span>    <span class="n">prob</span> <span class="o">+=</span> <span class="n">pulp</span><span class="p">.</span><span class="n">lpSum</span><span class="p">([</span><span class="n">x</span><span class="p">[(</span><span class="n">j</span><span class="p">,</span> <span class="n">i</span><span class="p">)]</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">]))</span> <span class="k">if</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">j</span><span class="p">])</span> <span class="o">==</span> <span class="mi">1</span>
</pre></td></tr></tbody></table></code></pre></div></div>
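<p>The two degree constraints alone do not guarantee a single tour. As a minimal illustration (the helper <code class="language-plaintext highlighter-rouge">find_cycles</code> and the example arcs are hypothetical, not from the original notebook), the arc set below enters and leaves every city exactly once, yet it splits into two disconnected loops:</p>

```python
def find_cycles(arcs):
    """Split a set of arcs (i, j), one leaving each city, into its cycles."""
    successor = dict(arcs)  # city -> next city
    unvisited = set(successor)
    cycles = []
    while unvisited:
        city = min(unvisited)
        cycle = []
        # Follow successors until we return to a city already visited.
        while city in unvisited:
            unvisited.remove(city)
            cycle.append(city)
            city = successor[city]
        cycles.append(cycle)
    return cycles


# Every city has exactly one outgoing and one incoming arc ...
arcs = [(0, 1), (1, 0), (2, 3), (3, 2)]
out_degree = {i: sum(1 for a, _ in arcs if a == i) for i in range(4)}
in_degree = {j: sum(1 for _, b in arcs if b == j) for j in range(4)}
print(out_degree, in_degree)

# ... but the arcs form two subtours instead of one tour.
print(find_cycles(arcs))  # [[0, 1], [2, 3]]
```

<p>The same helper could be applied to a solved model, for example by keeping the pairs whose <code class="language-plaintext highlighter-rouge">x[(i, j)].varValue</code> is close to 1; a valid tour then comes back as a single cycle covering all cities.</p>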
<p>Subtour elimination constraint: no proper subset $Q$ of cities may contain a closed sub-tour, so the solution returned is a single tour and not the union of smaller tours:</p>
\[\sum_{i\in Q}\sum_{j\in Q,j\neq i} x_{ij}\leq|Q|-1\quad \forall Q\subsetneq\{1,\ldots,n\},|Q|\geq2\]
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">])):</span>
    <span class="k">for</span> <span class="n">S</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">])):</span>
        <span class="k">for</span> <span class="n">subset</span> <span class="ow">in</span> <span class="n">itertools</span><span class="p">.</span><span class="n">combinations</span><span class="p">([</span><span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">]))</span> <span class="k">if</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">k</span><span class="p">],</span> <span class="n">S</span><span class="p">):</span>
            <span class="n">prob</span> <span class="o">+=</span> <span class="n">pulp</span><span class="p">.</span><span class="n">lpSum</span><span class="p">([</span><span class="n">x</span><span class="p">[(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">subset</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">subset</span> <span class="k">if</span> <span class="n">i</span> <span class="o">!=</span> <span class="n">j</span><span class="p">])</span> <span class="o">&lt;=</span> <span class="nb">len</span><span class="p">(</span><span class="n">subset</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span>
</pre></td></tr></tbody></table></code></pre></div></div>
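<p>The number of subtour constraints grows exponentially with the number of cities, which is why this formulation is only practical for small instances. The sketch below (the function name <code class="language-plaintext highlighter-rouge">subtour_subsets</code> is an assumption, not part of the original notebook) enumerates each subset $Q$ with $2\leq|Q|\leq n-1$ exactly once and compares the count with the closed form $2^n-n-2$. The loop above revisits a subset once for every excluded city <code class="language-plaintext highlighter-rouge">k</code>, so it adds repeated copies of the same constraint; these are harmless but redundant.</p>

```python
import itertools


def subtour_subsets(n):
    """Yield each subset Q of {0, ..., n-1} with 2 <= |Q| <= n-1 exactly once."""
    for size in range(2, n):
        yield from itertools.combinations(range(n), size)


for n in (4, 10, 20):
    closed_form = 2**n - n - 2  # all subsets minus empty, singletons, full set
    print(f"n={n}: {closed_form} distinct subtour constraints")

# For the four-city example, the enumeration matches the closed form.
assert len(list(subtour_subsets(4))) == 2**4 - 4 - 2
```

<p>For larger instances, a common alternative is to add these constraints lazily, only for subtours that actually appear in an intermediate solution.</p>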
<h2 id="solver">Solver</h2>
<p>Here, we use the CBC solver bundled with PuLP to obtain a solution to the formulated problem.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="c1"># Solve the problem using the CBC solver
</span><span class="n">prob</span><span class="p">.</span><span class="n">solve</span><span class="p">(</span><span class="n">pulp</span><span class="p">.</span><span class="n">PULP_CBC_CMD</span><span class="p">())</span>
<span class="c1"># Print the status of the solution
</span><span class="k">print</span><span class="p">(</span><span class="s">"Status:"</span><span class="p">,</span> <span class="n">pulp</span><span class="p">.</span><span class="n">LpStatus</span><span class="p">[</span><span class="n">prob</span><span class="p">.</span><span class="n">status</span><span class="p">])</span>
<span class="c1"># Print the optimal objective value
</span><span class="k">print</span><span class="p">(</span><span class="s">"Total distance traveled:"</span><span class="p">,</span> <span class="n">pulp</span><span class="p">.</span><span class="n">value</span><span class="p">(</span><span class="n">prob</span><span class="p">.</span><span class="n">objective</span><span class="p">))</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>Welcome to the CBC MILP Solver
Version: 2.10.3
Build Date: Dec 15 2019
command line - /home/docker.datascience/.pyenv/versions/3.11.1/lib/python3.11/site-packages/pulp/solverdir/cbc/linux/64/cbc /tmp/111feaee2b8746bebc1aed7c856d7248-pulp.mps timeMode elapsed branch printingOptions all solution /tmp/111feaee2b8746bebc1aed7c856d7248-pulp.sol (default strategy 1)
At line 2 NAME MODEL
At line 3 ROWS
At line 29 COLUMNS
At line 138 RHS
At line 163 BOUNDS
At line 176 ENDATA
Problem MODEL has 24 rows, 12 columns and 72 elements
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Continuous objective value is 1.59909 - 0.02 seconds
Cgl0004I processed model has 18 rows, 12 columns (12 integer (12 of which binary)) and 60 elements
Cbc0038I Initial state - 0 integers unsatisfied sum - 0
Cbc0038I Solution found of 1.59909
Cbc0038I Before mini branch and bound, 12 integers at bound fixed and 0 continuous
Cbc0038I Mini branch and bound did not improve solution (0.06 seconds)
Cbc0038I After 0.06 seconds - Feasibility pump exiting with objective of 1.59909 - took 0.00 seconds
Cbc0012I Integer solution of 1.5990863 found by feasibility pump after 0 iterations and 0 nodes (0.06 seconds)
Cbc0001I Search completed - best objective 1.5990862673485, took 0 iterations and 0 nodes (0.06 seconds)
Cbc0035I Maximum depth 0, 0 variables fixed on reduced cost
Cuts at root node changed objective from 1.59909 to 1.59909
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
ZeroHalf was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value: 1.59908627
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.07
Time (Wallclock seconds): 0.07
Option for printingOptions changed from normal to all
Total time (CPU seconds): 0.08 (Wallclock seconds): 0.08
Status: Optimal
Total distance traveled: 1.5990862673485606
</pre></td></tr></tbody></table></code></pre></div></div>
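<p>For an instance this small, the solver's answer can be cross-checked by brute force. The sketch below is not part of the original notebook. It enumerates every tour over a hypothetical distance matrix (<code>dist</code> and <code>brute_force_tsp</code> are illustrative names, and the numbers are not the <code>cities['distance']</code> data used above) and keeps the shortest one. With $n$ cities there are $(n-1)!$ tours starting at city 0, so this is only practical for a handful of cities.</p>

```python
import itertools


def brute_force_tsp(dist):
    """Return (best_length, best_tour) by enumerating all tours that start at city 0."""
    n = len(dist)
    best_length, best_tour = float("inf"), None
    for perm in itertools.permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        # Sum the distance of each consecutive leg of the closed tour
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if best_length > length:
            best_length, best_tour = length, tour
    return best_length, best_tour


# Hypothetical 4-city distance matrix, for illustration only
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
best_length, best_tour = brute_force_tsp(dist)
print(best_length, best_tour)
```

<p>If the same distance matrix is fed to both approaches, the brute-force length should agree with the objective value reported by CBC up to floating-point tolerance.</p>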
<h2 id="solution-analysis">Solution Analysis</h2>
<p>Let’s look at what the solver returned.</p>
<p>The objective value is the length of the minimal tour, that is, the smallest total distance needed to visit every city exactly once and return to the starting city.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="k">print</span><span class="p">(</span><span class="s">"minimal tour: "</span><span class="p">,</span> <span class="n">prob</span><span class="p">.</span><span class="n">objective</span><span class="p">.</span><span class="n">value</span><span class="p">())</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>minimal tour: 1.5990862673485606
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The following code extracts the route of the optimal solution.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="c1"># Extract the solution by following the selected edges from the start city
</span><span class="n">solution</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">start_city</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">next_city</span> <span class="o">=</span> <span class="n">start_city</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">cities</span><span class="p">[</span><span class="s">'name'</span><span class="p">])):</span>
        <span class="c1"># Solver values are floats, so test against 0.5 instead of == 1
</span>        <span class="k">if</span> <span class="n">j</span> <span class="o">!=</span> <span class="n">next_city</span> <span class="ow">and</span> <span class="n">x</span><span class="p">[(</span><span class="n">next_city</span><span class="p">,</span> <span class="n">j</span><span class="p">)].</span><span class="n">value</span><span class="p">()</span> <span class="o">></span> <span class="mf">0.5</span><span class="p">:</span>
            <span class="n">solution</span><span class="p">.</span><span class="n">append</span><span class="p">((</span><span class="n">next_city</span><span class="p">,</span> <span class="n">j</span><span class="p">))</span>
            <span class="n">next_city</span> <span class="o">=</span> <span class="n">j</span>
            <span class="k">break</span>
    <span class="c1"># Stop once the tour has returned to the start city
</span>    <span class="k">if</span> <span class="n">next_city</span> <span class="o">==</span> <span class="n">start_city</span><span class="p">:</span>
        <span class="k">break</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="c1"># Print the solution
</span><span class="k">print</span><span class="p">(</span><span class="s">"Route:"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">solution</span><span class="p">)):</span>
    <span class="k">print</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">solution</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="mi">0</span><span class="p">])</span> <span class="o">+</span> <span class="s">" -> "</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">solution</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="mi">1</span><span class="p">]))</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>Route:
0 -> 2
2 -> 1
1 -> 3
3 -> 0
</pre></td></tr></tbody></table></code></pre></div></div>
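<p>The edge list printed above can also be collapsed into a single ordered sequence of cities and checked for validity. This is a minimal sketch rather than part of the original notebook: the helper names <code>edges_to_tour</code> and <code>is_hamiltonian_cycle</code> are hypothetical, and <code>solution</code> is hard-coded to the route shown above so that the snippet runs on its own.</p>

```python
def edges_to_tour(edges):
    """Turn [(0, 2), (2, 1), ...] into [0, 2, 1, ..., 0], checking continuity."""
    tour = [edges[0][0]]
    for origin, destination in edges:
        if origin != tour[-1]:
            raise ValueError(f"edge {(origin, destination)} does not continue the tour")
        tour.append(destination)
    return tour


def is_hamiltonian_cycle(tour, n_cities):
    """True if the tour ends where it starts and visits every city exactly once."""
    return tour[0] == tour[-1] and sorted(tour[:-1]) == list(range(n_cities))


# Route copied from the output above
solution = [(0, 2), (2, 1), (1, 3), (3, 0)]
tour = edges_to_tour(solution)
print(" -> ".join(str(city) for city in tour))
print("valid tour:", is_hamiltonian_cycle(tour, 4))
```

<p>A check like this is a cheap guard against subtours: if the subtour elimination constraints were missing, the extracted route would return to the start city before visiting every city, and the second function would return <code>False</code>.</p>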
<h2 id="plot-solution">Plot Solution</h2>
<p>Here, we plot the solution as a graph. We first build a matrix whose entry in row $i$ and column $j$ is the distance from city $i$ to city $j$ if that leg belongs to the optimal tour, and zero otherwise.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">sol_matrix</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="nb">len</span><span class="p">(</span><span class="n">solution</span><span class="p">),</span> <span class="nb">len</span><span class="p">(</span><span class="n">solution</span><span class="p">)))</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">solution</span><span class="p">)):</span>
    <span class="n">sol_matrix</span><span class="p">[</span><span class="n">solution</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span> <span class="o">=</span> <span class="n">cities</span><span class="p">[</span><span class="s">'distance'</span><span class="p">][</span><span class="n">solution</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span>
<span class="n">sol_matrix</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>array([[0.        , 0.        , 0.50666924, 0.        ],
       [0.        , 0.        , 0.        , 0.31393952],
       [0.        , 0.64279003, 0.        , 0.        ],
       [0.13568748, 0.        , 0.        , 0.        ]])
</pre></td></tr></tbody></table></code></pre></div></div>
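<p>As a sanity check, the non-zero entries of this matrix should add up to the objective value reported by the solver (1.59908627). The short sketch below copies the four distances from the printed matrix. The dictionary name <code>legs</code> is an illustrative choice, not something defined in the original notebook.</p>

```python
import math

# Distances of the four legs of the tour, copied from the matrix printed above
legs = {
    (0, 2): 0.50666924,
    (1, 3): 0.31393952,
    (2, 1): 0.64279003,
    (3, 0): 0.13568748,
}

total = sum(legs.values())
print("sum of legs:", round(total, 8))

# Compare against the objective value from the solver log, allowing for rounding
print("matches objective:", math.isclose(total, 1.59908627, abs_tol=1e-7))
```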
<p>The following class allows us to draw this graph, using the distances as weights on the edges.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="k">class</span> <span class="nc">Draw_Graph</span><span class="p">():</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">G</span><span class="p">,</span> <span class="n">pos</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">arc</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.25</span><span class="p">,</span> <span class="n">edge_attribute</span><span class="p">:</span> <span class="nb">str</span><span class="o">=</span><span class="s">'weight'</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">G</span> <span class="o">=</span> <span class="n">G</span>
<span class="bp">self</span><span class="p">.</span><span class="n">arc</span> <span class="o">=</span> <span class="n">arc</span>
<span class="bp">self</span><span class="p">.</span><span class="n">edge_attribute</span> <span class="o">=</span> <span class="n">edge_attribute</span>
<span class="bp">self</span><span class="p">.</span><span class="n">pos</span> <span class="o">=</span> <span class="n">pos</span>
<span class="k">def</span> <span class="nf">draw_edges</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">curved_edges</span> <span class="o">=</span> <span class="p">[</span><span class="n">edge</span> <span class="k">for</span> <span class="n">edge</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">.</span><span class="n">edges</span><span class="p">()</span> <span class="k">if</span> <span class="nb">reversed</span><span class="p">(</span><span class="n">edge</span><span class="p">)</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">.</span><span class="n">edges</span><span class="p">()]</span>
<span class="bp">self</span><span class="p">.</span><span class="n">straight_edges</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="nb">set</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">.</span><span class="n">edges</span><span class="p">())</span> <span class="o">-</span> <span class="nb">set</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">curved_edges</span><span class="p">))</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_edges</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">edgelist</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">straight_edges</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_edges</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">edgelist</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">curved_edges</span><span class="p">,</span> <span class="n">connectionstyle</span><span class="o">=</span><span class="sa">f</span><span class="s">'arc3, rad = </span><span class="si">{</span><span class="bp">self</span><span class="p">.</span><span class="n">arc</span><span class="si">}</span><span class="s">'</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">draw_labels</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">edge_weights</span> <span class="o">=</span> <span class="n">nx</span><span class="p">.</span><span class="n">get_edge_attributes</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">edge_attribute</span><span class="p">)</span>
<span class="bp">self</span><span class="p">.</span><span class="n">curved_edge_labels</span> <span class="o">=</span> <span class="p">{</span><span class="n">edge</span><span class="p">:</span> <span class="bp">self</span><span class="p">.</span><span class="n">edge_weights</span><span class="p">[</span><span class="n">edge</span><span class="p">]</span> <span class="k">for</span> <span class="n">edge</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">curved_edges</span><span class="p">}</span>
<span class="bp">self</span><span class="p">.</span><span class="n">straight_edge_labels</span> <span class="o">=</span> <span class="p">{</span><span class="n">edge</span><span class="p">:</span> <span class="bp">self</span><span class="p">.</span><span class="n">edge_weights</span><span class="p">[</span><span class="n">edge</span><span class="p">]</span> <span class="k">for</span> <span class="n">edge</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">straight_edges</span><span class="p">}</span>
<span class="bp">self</span><span class="p">.</span><span class="n">draw_networkx_edge_labels</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">edge_labels</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">curved_edge_labels</span><span class="p">,</span> <span class="n">rotate</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span><span class="n">rad</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">arc</span><span class="p">)</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_edge_labels</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">G</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">edge_labels</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">straight_edge_labels</span><span class="p">,</span> <span class="n">rotate</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="o">@</span><span class="nb">staticmethod</span>
<span class="k">def</span> <span class="nf">draw_networkx_edge_labels</span><span class="p">(</span>
<span class="n">G</span><span class="p">,</span>
<span class="n">pos</span><span class="p">,</span>
<span class="n">edge_labels</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
<span class="n">label_pos</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span>
<span class="n">font_size</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span>
<span class="n">font_color</span><span class="o">=</span><span class="s">"k"</span><span class="p">,</span>
<span class="n">font_family</span><span class="o">=</span><span class="s">"sans-serif"</span><span class="p">,</span>
<span class="n">font_weight</span><span class="o">=</span><span class="s">"normal"</span><span class="p">,</span>
<span class="n">alpha</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
<span class="n">bbox</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
<span class="n">horizontalalignment</span><span class="o">=</span><span class="s">"center"</span><span class="p">,</span>
<span class="n">verticalalignment</span><span class="o">=</span><span class="s">"center"</span><span class="p">,</span>
<span class="n">ax</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
<span class="n">rotate</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">clip_on</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">rad</span><span class="o">=</span><span class="mi">0</span>
<span class="p">):</span>
<span class="s">"""Draw edge labels.
Parameters
----------
G : graph
A networkx graph
pos : dictionary
A dictionary with nodes as keys and positions as values.
Positions should be sequences of length 2.
edge_labels : dictionary (default={})
Edge labels in a dictionary of labels keyed by edge two-tuple.
Only labels for the keys in the dictionary are drawn.
label_pos : float (default=0.5)
Position of edge label along edge (0=head, 0.5=center, 1=tail)
font_size : int (default=10)
Font size for text labels
font_color : string (default='k' black)
Font color string
font_weight : string (default='normal')
Font weight
font_family : string (default='sans-serif')
Font family
alpha : float or None (default=None)
The text transparency
bbox : Matplotlib bbox, optional
Specify text box properties (e.g. shape, color etc.) for edge labels.
Default is {boxstyle='round', ec=(1.0, 1.0, 1.0), fc=(1.0, 1.0, 1.0)}.
horizontalalignment : string (default='center')
Horizontal alignment {'center', 'right', 'left'}
verticalalignment : string (default='center')
Vertical alignment {'center', 'top', 'bottom', 'baseline', 'center_baseline'}
ax : Matplotlib Axes object, optional
Draw the graph in the specified Matplotlib axes.
rotate : bool (default=True)
Rotate edge labels to lie parallel to edges
clip_on : bool (default=True)
Turn on clipping of edge labels at axis boundaries
Returns
-------
dict
`dict` of labels keyed by edge
Examples
--------
>>> G = nx.dodecahedral_graph()
>>> edge_labels = nx.draw_networkx_edge_labels(G, pos=nx.spring_layout(G))
Also see the NetworkX drawing examples at
https://networkx.org/documentation/latest/auto_examples/index.html
See Also
--------
draw
draw_networkx
draw_networkx_nodes
draw_networkx_edges
draw_networkx_labels
"""</span>
<span class="k">if</span> <span class="n">ax</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
<span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">gca</span><span class="p">()</span>
<span class="k">if</span> <span class="n">edge_labels</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
<span class="n">labels</span> <span class="o">=</span> <span class="p">{(</span><span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">):</span> <span class="n">d</span> <span class="k">for</span> <span class="n">u</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">G</span><span class="p">.</span><span class="n">edges</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="bp">True</span><span class="p">)}</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">labels</span> <span class="o">=</span> <span class="n">edge_labels</span>
<span class="n">text_items</span> <span class="o">=</span> <span class="p">{}</span>
<span class="k">for</span> <span class="p">(</span><span class="n">n1</span><span class="p">,</span> <span class="n">n2</span><span class="p">),</span> <span class="n">label</span> <span class="ow">in</span> <span class="n">labels</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
<span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">y1</span><span class="p">)</span> <span class="o">=</span> <span class="n">pos</span><span class="p">[</span><span class="n">n1</span><span class="p">]</span>
<span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">y2</span><span class="p">)</span> <span class="o">=</span> <span class="n">pos</span><span class="p">[</span><span class="n">n2</span><span class="p">]</span>
<span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="o">=</span> <span class="p">(</span>
<span class="n">x1</span> <span class="o">*</span> <span class="n">label_pos</span> <span class="o">+</span> <span class="n">x2</span> <span class="o">*</span> <span class="p">(</span><span class="mf">1.0</span> <span class="o">-</span> <span class="n">label_pos</span><span class="p">),</span>
<span class="n">y1</span> <span class="o">*</span> <span class="n">label_pos</span> <span class="o">+</span> <span class="n">y2</span> <span class="o">*</span> <span class="p">(</span><span class="mf">1.0</span> <span class="o">-</span> <span class="n">label_pos</span><span class="p">),</span>
<span class="p">)</span>
<span class="n">pos_1</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">transData</span><span class="p">.</span><span class="n">transform</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">pos</span><span class="p">[</span><span class="n">n1</span><span class="p">]))</span>
<span class="n">pos_2</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">transData</span><span class="p">.</span><span class="n">transform</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">pos</span><span class="p">[</span><span class="n">n2</span><span class="p">]))</span>
<span class="n">linear_mid</span> <span class="o">=</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">pos_1</span> <span class="o">+</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">pos_2</span>
<span class="n">d_pos</span> <span class="o">=</span> <span class="n">pos_2</span> <span class="o">-</span> <span class="n">pos_1</span>
<span class="n">rotation_matrix</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([(</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">),</span> <span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">)])</span>
<span class="n">ctrl_1</span> <span class="o">=</span> <span class="n">linear_mid</span> <span class="o">+</span> <span class="n">rad</span><span class="o">*</span><span class="n">rotation_matrix</span><span class="o">@</span><span class="n">d_pos</span>
<span class="n">ctrl_mid_1</span> <span class="o">=</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">pos_1</span> <span class="o">+</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">ctrl_1</span>
<span class="n">ctrl_mid_2</span> <span class="o">=</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">pos_2</span> <span class="o">+</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">ctrl_1</span>
<span class="n">bezier_mid</span> <span class="o">=</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">ctrl_mid_1</span> <span class="o">+</span> <span class="mf">0.5</span><span class="o">*</span><span class="n">ctrl_mid_2</span>
<span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">transData</span><span class="p">.</span><span class="n">inverted</span><span class="p">().</span><span class="n">transform</span><span class="p">(</span><span class="n">bezier_mid</span><span class="p">)</span>
<span class="k">if</span> <span class="n">rotate</span><span class="p">:</span>
<span class="c1"># in degrees
</span> <span class="n">angle</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">arctan2</span><span class="p">(</span><span class="n">y2</span> <span class="o">-</span> <span class="n">y1</span><span class="p">,</span> <span class="n">x2</span> <span class="o">-</span> <span class="n">x1</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="mf">2.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">pi</span><span class="p">)</span> <span class="o">*</span> <span class="mi">360</span>
<span class="c1"># make label orientation "right-side-up"
</span> <span class="k">if</span> <span class="n">angle</span> <span class="o">></span> <span class="mi">90</span><span class="p">:</span>
<span class="n">angle</span> <span class="o">-=</span> <span class="mi">180</span>
<span class="k">if</span> <span class="n">angle</span> <span class="o"><</span> <span class="o">-</span><span class="mi">90</span><span class="p">:</span>
<span class="n">angle</span> <span class="o">+=</span> <span class="mi">180</span>
<span class="c1"># transform data coordinate angle to screen coordinate angle
</span> <span class="n">xy</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">))</span>
<span class="n">trans_angle</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">transData</span><span class="p">.</span><span class="n">transform_angles</span><span class="p">(</span>
<span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">((</span><span class="n">angle</span><span class="p">,)),</span> <span class="n">xy</span><span class="p">.</span><span class="n">reshape</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>
<span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">trans_angle</span> <span class="o">=</span> <span class="mf">0.0</span>
<span class="c1"># use default box of white with white border
</span> <span class="k">if</span> <span class="n">bbox</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
<span class="n">bbox</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">boxstyle</span><span class="o">=</span><span class="s">"round"</span><span class="p">,</span> <span class="n">ec</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">),</span> <span class="n">fc</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">))</span>
<span class="k">if</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">label</span><span class="p">,</span> <span class="nb">str</span><span class="p">):</span>
<span class="n">label</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">label</span><span class="p">:</span><span class="mf">0.2</span><span class="n">f</span><span class="si">}</span><span class="s">"</span> <span class="c1"># this makes "1" and 1 labeled the same
</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">text</span><span class="p">(</span>
<span class="n">x</span><span class="p">,</span>
<span class="n">y</span><span class="p">,</span>
<span class="n">label</span><span class="p">,</span>
<span class="n">size</span><span class="o">=</span><span class="n">font_size</span><span class="p">,</span>
<span class="n">color</span><span class="o">=</span><span class="n">font_color</span><span class="p">,</span>
<span class="n">family</span><span class="o">=</span><span class="n">font_family</span><span class="p">,</span>
<span class="n">weight</span><span class="o">=</span><span class="n">font_weight</span><span class="p">,</span>
<span class="n">alpha</span><span class="o">=</span><span class="n">alpha</span><span class="p">,</span>
<span class="n">horizontalalignment</span><span class="o">=</span><span class="n">horizontalalignment</span><span class="p">,</span>
<span class="n">verticalalignment</span><span class="o">=</span><span class="n">verticalalignment</span><span class="p">,</span>
<span class="n">rotation</span><span class="o">=</span><span class="n">trans_angle</span><span class="p">,</span>
<span class="n">transform</span><span class="o">=</span><span class="n">ax</span><span class="p">.</span><span class="n">transData</span><span class="p">,</span>
<span class="n">bbox</span><span class="o">=</span><span class="n">bbox</span><span class="p">,</span>
<span class="n">zorder</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">clip_on</span><span class="o">=</span><span class="n">clip_on</span><span class="p">,</span>
<span class="p">)</span>
<span class="n">text_items</span><span class="p">[(</span><span class="n">n1</span><span class="p">,</span> <span class="n">n2</span><span class="p">)]</span> <span class="o">=</span> <span class="n">t</span>
<span class="n">ax</span><span class="p">.</span><span class="n">tick_params</span><span class="p">(</span>
<span class="n">axis</span><span class="o">=</span><span class="s">"both"</span><span class="p">,</span>
<span class="n">which</span><span class="o">=</span><span class="s">"both"</span><span class="p">,</span>
<span class="n">bottom</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">left</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">labelbottom</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">labelleft</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="p">)</span>
<span class="k">return</span> <span class="n">text_items</span>
<span class="k">def</span> <span class="nf">plot</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">__call__</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">draw_edges</span><span class="p">()</span>
<span class="bp">self</span><span class="p">.</span><span class="n">draw_labels</span><span class="p">()</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The resulting graph is shown below.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">G</span> <span class="o">=</span> <span class="n">nx</span><span class="p">.</span><span class="n">DiGraph</span><span class="p">(</span><span class="n">sol_matrix</span><span class="p">)</span>
<span class="n">pos</span> <span class="o">=</span> <span class="n">nx</span><span class="p">.</span><span class="n">spring_layout</span><span class="p">(</span><span class="n">G</span><span class="p">)</span>
<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">Draw_Graph</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">pos</span><span class="p">)</span>
<span class="n">p</span><span class="p">.</span><span class="n">plot</span><span class="p">()</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_nodes</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">);</span>
<span class="n">nx</span><span class="p">.</span><span class="n">draw_networkx_labels</span><span class="p">(</span><span class="n">G</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p><img src="images/2023-08-29_travelling_salesman/output_33_0.png" alt="png" /></p>
<h2 id="references">References</h2>
<p>[1] https://soumenatta.medium.com/solving-the-traveling-salesman-problem-using-pulp-in-python-edd23a6aee4d</p>
<p>[2] https://towardsdatascience.com/solving-geographic-travelling-salesman-problems-using-python-e57284b14cd7</p>
<p>[3] https://soumenatta.medium.com/solving-the-p-median-problem-using-pulp-in-python-31d9bc13cc2d</p>Humberto STEIN SHIROMOTOThe Traveling Salesman Problem (TSP), a captivating conundrum in mathematical optimization, has seamlessly integrated itself into a multitude of industries, reshaping the way we approach efficiency and problem-solving.Are Model Performance Metrics Enough?2023-07-05T00:00:00+10:002023-07-05T00:00:00+10:00https://hsteinshiromoto.github.io/posts/2023/07/05/blog-post_are_model_performance_metrics_enough<p>My model has a “good enough” performance, is this sufficient for deployment? This post highlights the limitations of relying solely on performance metrics when assessing the readiness of a machine learning model for deployment. It emphasizes the importance of considering both correctness and performance as separate components in the evaluation process. By visualizing the relationship between correctness and performance in different regions, the blog post illustrates the need for critical evaluation and avoiding overconfidence in performance metrics. Furthermore, it emphasizes the impact of business assumptions on correctness and stresses the significance of scientifically-based decision-making.</p>
<h1 id="introduction">Introduction</h1>
<p>In general, we are inclined to assume that if a model’s performance metric is above a certain threshold, then the model is ready to move to the next stage of development. As I show in this blog post, we must be more critical of this criterion, because a performance metric measures only part of a data science problem.</p>
<p>While working on a project whose main deliverable is a machine learning model, the data scientist frequently needs to answer the question of whether the model performance is good enough. Implicitly, it is assumed that the metric chosen to measure the model performance also captures all the information about the process that generates the data and its characteristics, such as distribution, dependencies, etc. This assumption can be understood as a causal relationship: if the performance metric is high, then the model is correct.</p>
<p>An illustration of this implicit assumption is shown in the figure below.</p>
<p><img src="https://raw.githubusercontent.com/hsteinshiromoto/blog/dev/images/are_model_performance_metrics_enough/correctnessvsperformance.svg" alt="Causal relationship between model performance and correctness" /></p>
<p>If such a causal relationship were true, then a high performance would necessarily imply that the predictions obtained from the model correspond to reality. To see why this assumption is strong, note that a high performance metric is not a sufficient condition for a good prediction; otherwise we would never face problems such as target leakage and overfitting.</p>
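<p>As a rough illustration of why a high metric alone proves little, the sketch below trains the same classifier twice on synthetic data: once on legitimate features and once with a leaked copy of the target added. The dataset, the <code>leak</code> feature and the helper <code>holdout_accuracy</code> are assumptions made up for this example; they are not part of any real project.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, noisy binary problem (flip_y randomly reassigns a share of the labels).
X, y = make_classification(n_samples=1000, n_features=10, flip_y=0.2, random_state=0)

# Hypothetical leaked feature: the target itself plus a little noise.
rng = np.random.default_rng(0)
leak = y + rng.normal(scale=0.01, size=y.shape)
X_leaky = np.column_stack([X, leak])


def holdout_accuracy(features, target):
    """Fit a logistic regression and return its accuracy on a held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, target, test_size=0.25, random_state=0
    )
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return model.score(X_te, y_te)


clean_acc = holdout_accuracy(X, y)
leaky_acc = holdout_accuracy(X_leaky, y)
print(f"accuracy without leak: {clean_acc:.2f}")
print(f"accuracy with leak:    {leaky_acc:.2f}")
```

<p>The second model scores higher, yet it would be useless in production, where the leaked column is not available at prediction time. The metric improved while the methodology became less correct.</p>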
<h1 id="perfomance-x-correcteness">Performance x Correctness</h1>
<p>Since we know that the model performance metric does not encapsulate all the information necessary to decide whether a model is good enough for deployment, we need to understand what we can and cannot decide based on the performance metric, and what information is missing to make that decision. I propose to decompose the question “Is the model performance metric good enough for deployment?” into two components:</p>
<ol>
<li>Correctness: How correct is the scientific methodology used to build the model?</li>
<li>Performance: How can I improve the model performance?</li>
</ol>
<p>We also need to keep in mind that the performance metric does not necessarily increase as the model methodology becomes more correct. For example, IT/coding issues can lead to poor model performance even when the methodology is sound.</p>
<p>Based on the above, we can visualize the relationship between correctness and performance in the Cartesian plane below.</p>
<p><img src="https://raw.githubusercontent.com/hsteinshiromoto/blog/dev/images/are_model_performance_metrics_enough/correctnessvsperformance_regions.svg" alt="Regions divisions" /></p>
<p>The regions can be described as follows:</p>
<ul>
<li>
<p><strong>Start Region</strong>. This is where the data science work normally starts. With poor knowledge of the data or the modelling methodology, the data scientist will make incorrect decisions, and the resulting model tends to perform poorly.</p>
</li>
<li>
<p><strong>Learning Region</strong>. In this region, the data scientist starts to learn about the problem, the data, and the modelling methodology. Here, one can see a large number of hypotheses being formulated and tested. Many of these will guide the work towards a more correct methodology. The improvements in performance will come with the correct implementation of these methodologies.</p>
</li>
<li>
<p><strong>Deployment Region</strong>. Here, the work has enough quality to be put into operation, when evaluated in terms of both correctness and performance metrics.</p>
</li>
<li>
<p><strong>Fool’s Region</strong>. In this quadrant, a lack of care about the correctness of the methodology leads one to think that the model is ready for deployment, because the performance metric is better than the defined threshold.</p>
</li>
</ul>
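<p>The four regions above can be sketched as a small lookup. This is only an illustration: it assumes that correctness and performance have each been summarized as a score between 0 and 1, that the Learning Region is the quadrant with sound methodology but low performance, and that the thresholds (0.5 here) are placeholders. The function name <code>classify_region</code> is hypothetical.</p>

```python
def classify_region(
    correctness: float,
    performance: float,
    correctness_threshold: float = 0.5,
    performance_threshold: float = 0.5,
) -> str:
    """Map a (correctness, performance) pair to one of the four regions.

    Both scores and both thresholds are hypothetical; in practice correctness
    is judged by reviewing the methodology, not read off a single number.
    """
    is_correct = correctness >= correctness_threshold
    is_performant = performance >= performance_threshold
    if is_correct and is_performant:
        return "Deployment Region"
    if is_correct:
        return "Learning Region"
    if is_performant:
        return "Fool's Region"
    return "Start Region"


# A model with an impressive metric but a flawed methodology.
print(classify_region(correctness=0.2, performance=0.9))
```

<p>The point of the sketch is that the performance score alone cannot separate the Deployment Region from the Fool’s Region; the correctness axis has to be assessed independently.</p>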
<h2 id="the-dunning-krueger-effect">The Dunning-Kruger Effect</h2>
<p>The division into four regions allows us to see similarities with the <a href="https://doi.org/10.1037/0022-3514.77.6.1121">Dunning-Kruger</a> effect, a cognitive bias that explains the difference between one’s perceived knowledge and one’s actual knowledge of a subject.</p>
<p>In a nutshell, the Dunning-Kruger effect states that our perception of our own knowledge does not grow linearly with the knowledge actually acquired: when we have “very limited knowledge” of a subject, we are inclined to think that we are “highly skilled” in it. For example, the majority of drivers believe that their driving skills are above average [citation needed]. The following figure illustrates the relationship between perceived knowledge and actual knowledge.</p>
<p><img src="https://raw.githubusercontent.com/hsteinshiromoto/blog/dev/images/are_model_performance_metrics_enough/dunning_kruger.svg" alt="Dunning Kruger" /></p>
<p>In our case, the Fool’s Region coincides with the peak of the graph, where perceived knowledge is high but actual knowledge is low.</p>
<h2 id="the-role-of-business-assumptions">The role of business assumptions</h2>
<p>It is worth noting that many business assumptions and decisions tend to have a negative impact on model correctness, as they are often not scientifically based.</p>
<p><img src="https://raw.githubusercontent.com/hsteinshiromoto/blog/dev/images/are_model_performance_metrics_enough/correctnessvsperformance_regions_bas.svg" alt="Regions divisions" /></p>
<p>One commonly overlooked business assumption is the dependence of the prediction on time. In a marketing campaign setting, a machine learning model is often used to identify the most suitable customers. Frequently, the model predicting customer behaviour does not take into account the customers’ previous reactions to similar campaigns.</p>
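<p>One way to bring that history into the model is to derive features from each customer’s earlier campaign outcomes. The sketch below uses a made-up table; the column names <code>customer_id</code>, <code>campaign_date</code> and <code>responded</code> are assumptions chosen for illustration.</p>

```python
import pandas as pd

# Hypothetical campaign history: one row per (customer, campaign).
campaigns = pd.DataFrame(
    {
        "customer_id": [1, 1, 1, 2, 2],
        "campaign_date": pd.to_datetime(
            ["2023-01-01", "2023-02-01", "2023-03-01", "2023-01-01", "2023-03-01"]
        ),
        "responded": [0, 1, 1, 0, 0],
    }
)

# Order each customer's rows in time so that "previous" is well defined.
campaigns = campaigns.sort_values(["customer_id", "campaign_date"])

# Response to the immediately preceding campaign (NaN for the first contact).
campaigns["prev_responded"] = campaigns.groupby("customer_id")["responded"].shift(1)

# Number of positive responses strictly before the current campaign.
campaigns["n_prev_responses"] = (
    campaigns.groupby("customer_id")["responded"].cumsum() - campaigns["responded"]
)

print(campaigns)
```

<p>Both features only look backwards in time, so they can be computed at prediction time without leaking the outcome of the current campaign.</p>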
<h1 id="conclusion">Conclusion</h1>
<p>In conclusion, relying solely on performance metrics to determine model readiness for deployment is insufficient. Evaluating correctness and performance as separate components is essential. The relationship between correctness and performance, visualized in different regions, emphasizes the need for critical evaluation and avoiding the “Fool’s Region” of overconfidence. Additionally, considering business assumptions and decisions that impact correctness is crucial. A comprehensive assessment of both correctness and performance metrics is necessary to make informed deployment decisions and ensure the model’s validity.</p>Humberto STEIN SHIROMOTOMy model has a “good enough” performance, is this sufficient for deployment? This post highlights the limitations of relying solely on performance metrics when assessing the readiness of a machine learning model for deployment. It emphasizes the importance of considering both correctness and performance as separate components in the evaluation process. By visualizing the relationship between correctness and performance in different regions, the blog post illustrates the need for critical evaluation and avoiding overconfidence in performance metrics. Furthermore, it emphasizes the impact of business assumptions on correctness and stresses the significance of scientifically-based decision-making.Using a Cost Functional to Optimize Hyperparameters Using Cross Validation2023-04-20T00:00:00+10:002023-04-20T00:00:00+10:00https://hsteinshiromoto.github.io/posts/2023/04/20/blog-post_blog-post_using_a_cost_functional_to_optimize_hyperparameters_using_cross_validation<p>This blog post discusses the importance of cost functions in mathematical optimization and how it applies to machine learning problems. 
The author argues that formulating the optimization of the performance metrics of a machine learning classifier in terms of a cost function is better than optimizing a single metric because it provides a more comprehensive and flexible framework for optimization, can help to address the trade-off between model complexity and performance, and can lead to better performance and generalization. An example of this formulation is provided for a binary classifier.</p>
<h2 id="what-is-a-cost-function">What is a cost function?</h2>
<p>In mathematical optimization, a cost function is a mathematical function that represents the cost or objective to be minimized or maximized in a given optimization problem. The cost function defines the relationship between the input variables and the output values of the problem. In optimization problems, the goal is to find the input values that minimize or maximize the cost function, subject to certain constraints. The cost function plays a crucial role in defining the optimization problem and in guiding the search for the optimal solution. The choice of the cost function depends on the nature of the problem and the desired optimization criteria. In many real-world problems, the cost function may be a complex, nonlinear function that requires advanced mathematical tools and techniques to be analyzed and optimized.</p>
<h2 id="advantages-of-using-a-cost-functional-to-optimize-hyperparameters-using-cross-validation">Advantages of Using a Cost Functional to Optimize Hyperparameters Using Cross Validation</h2>
<p>Formulating the optimization of the performance metrics of a machine learning classifier in terms of a cost function is better than optimizing a single metric for several reasons.</p>
<p>Firstly, machine learning problems often involve multiple metrics that need to be optimized simultaneously, such as accuracy, precision, recall, F1-score, and others. However, optimizing a single metric in isolation may not necessarily lead to the best overall performance of the classifier. For example, optimizing only for accuracy may lead to a model that performs poorly on a specific class or in a specific context.</p>
<p>Secondly, a cost function can provide a more comprehensive and flexible framework for optimization. By defining a cost function that combines multiple metrics and incorporates domain-specific constraints and preferences, we can optimize the model for a specific task and context in a more principled and efficient way.</p>
<p>Thirdly, a cost function can also help to address the trade-off between model complexity and performance. By penalizing complex models that are prone to overfitting, we can ensure that the model is not only accurate but also robust and generalizable.</p>
<p>Overall, formulating the optimization of the performance metrics of a machine learning classifier in terms of a cost function provides a more principled, flexible, and effective approach to model optimization that can lead to better performance and generalization.</p>
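<p>One possible way to combine several metrics into a single cost is to measure the distance between the vector of metric values and the vector of their optimal values, weighted by a positive definite matrix <code>M</code>: <code>cost = sqrt((m - m_opt)^T M (m - m_opt))</code>. The sketch below assumes this quadratic form; the function name <code>quadratic_cost</code> and the numbers used are illustrative only.</p>

```python
import numpy as np


def quadratic_cost(metric_values, optimal_values, M=None):
    """Weighted distance between achieved and optimal metric values.

    M is assumed to be a positive definite matrix; the identity is used
    when it is omitted, which weighs every metric equally.
    """
    d = np.asarray(metric_values, dtype=float) - np.asarray(optimal_values, dtype=float)
    M = np.identity(d.size) if M is None else np.asarray(M, dtype=float)
    return float(np.sqrt(d @ M @ d))


# Hypothetical values: F1 = 0.8 and ROC AUC = 0.9, both with optimum 1.0.
cost_equal = quadratic_cost([0.8, 0.9], [1.0, 1.0])

# Same metrics, but a shortfall in F1 is penalized four times as heavily.
cost_weighted = quadratic_cost([0.8, 0.9], [1.0, 1.0], M=np.diag([4.0, 1.0]))

print(cost_equal, cost_weighted)
```

<p>The diagonal of <code>M</code> expresses how much each metric matters, and off-diagonal entries could encode interactions between metrics. A cost of zero is reached only when every metric equals its optimal value.</p>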
<h2 id="example">Example</h2>
<p>An implementation of this formulation is shown below for a binary classifier.</p>
<p>Import the required libraries.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="kn">import</span> <span class="nn">sklearn.metrics</span> <span class="k">as</span> <span class="n">sm</span>
<span class="kn">from</span> <span class="nn">sklearn.datasets</span> <span class="kn">import</span> <span class="n">make_classification</span>
<span class="kn">from</span> <span class="nn">sklearn.model_selection</span> <span class="kn">import</span> <span class="n">GridSearchCV</span>
<span class="kn">from</span> <span class="nn">sklearn.linear_model</span> <span class="kn">import</span> <span class="n">LogisticRegression</span>
<span class="kn">from</span> <span class="nn">collections.abc</span> <span class="kn">import</span> <span class="n">Iterable</span><span class="p">,</span> <span class="n">Callable</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">from</span> <span class="nn">abc</span> <span class="kn">import</span> <span class="n">ABC</span><span class="p">,</span> <span class="n">abstractmethod</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>Define an abstract cost function.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="k">class</span> <span class="nc">CostFunction</span><span class="p">(</span><span class="n">ABC</span><span class="p">):</span>
<span class="s">"""Abstract class for cost functions"""</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">metrics</span><span class="p">:</span> <span class="n">Iterable</span><span class="p">[</span><span class="nb">str</span><span class="p">],</span> <span class="n">M</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span><span class="p">)</span> <span class="o">-></span> <span class="bp">None</span><span class="p">:</span>
<span class="s">"""Store the metric names and the weight matrix of the cost functional.
Args:
metrics (Iterable[str]): Iterable of strings of the form (metric_name).
M (np.ndarray[float]): Positive definite matrix of size len(metrics). The identity matrix is used when M is None.
Raises:
ValueError: If M is not positive definite.
"""</span>
<span class="bp">self</span><span class="p">.</span><span class="n">metrics</span> <span class="o">=</span> <span class="n">metrics</span>
<span class="bp">self</span><span class="p">.</span><span class="n">M</span> <span class="o">=</span> <span class="n">M</span> <span class="k">if</span> <span class="n">M</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="k">else</span> <span class="n">np</span><span class="p">.</span><span class="n">identity</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">metrics</span><span class="p">))</span> <span class="c1"># type: ignore
</span> <span class="bp">self</span><span class="p">.</span><span class="n">_check_positive_definite</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">M</span><span class="p">)</span>
<span class="o">@</span><span class="n">abstractmethod</span>
<span class="k">def</span> <span class="nf">functional</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">y_true</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
<span class="s">"""Compute the value of the cost functional.
Args:
y_true (np.ndarray[float]): Array-like of true labels of length N.
y_pred (np.ndarray[float]): Array-like of predicted labels of length N.
"""</span>
<span class="k">pass</span>
<span class="o">@</span><span class="nb">staticmethod</span>
<span class="k">def</span> <span class="nf">_to_array</span><span class="p">(</span><span class="n">y</span><span class="p">:</span> <span class="n">Iterable</span><span class="p">[</span><span class="nb">float</span><span class="p">])</span> <span class="o">-></span> <span class="s">'np.ndarray[float]'</span><span class="p">:</span>
<span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="n">fromiter</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="nb">float</span><span class="p">)</span>
<span class="o">@</span><span class="nb">staticmethod</span>
<span class="k">def</span> <span class="nf">_check_positive_definite</span><span class="p">(</span><span class="n">M</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span><span class="p">)</span> <span class="o">-></span> <span class="bp">None</span><span class="p">:</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">np</span><span class="p">.</span><span class="nb">all</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">linalg</span><span class="p">.</span><span class="n">eigvals</span><span class="p">(</span><span class="n">M</span><span class="p">)</span> <span class="o">></span> <span class="mi">0</span><span class="p">):</span>
<span class="k">raise</span> <span class="nb">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="s">'Matrix </span><span class="si">{</span><span class="n">M</span><span class="si">}</span><span class="s"> is not positive definite'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">make_scorer</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">Callable</span><span class="p">:</span>
<span class="k">return</span> <span class="n">sm</span><span class="p">.</span><span class="n">make_scorer</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">functional</span><span class="p">,</span> <span class="n">greater_is_better</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">y_true</span><span class="p">:</span> <span class="n">Iterable</span><span class="p">[</span><span class="nb">float</span><span class="p">],</span> <span class="n">y_pred</span><span class="p">:</span> <span class="n">Iterable</span><span class="p">[</span><span class="nb">float</span><span class="p">])</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
<span class="n">y_pred_array</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">_to_array</span><span class="p">(</span><span class="n">y_pred</span><span class="p">)</span>
<span class="n">y_true_array</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">_to_array</span><span class="p">(</span><span class="n">y_true</span><span class="p">)</span>
<span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">functional</span><span class="p">(</span><span class="n">y_true_array</span><span class="p">,</span> <span class="n">y_pred_array</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>
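<p>Because the scorer is built with <code>greater_is_better=False</code>, scikit-learn negates the value returned by the cost function, so that model selection can always maximize the score. The sketch below shows this sign flip with a stand-in cost; <code>error_rate</code> is a hypothetical function used only for this example and is not the cost functional defined in this post.</p>

```python
import numpy as np
import sklearn.metrics as sm
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def error_rate(y_true, y_pred):
    """Stand-in cost: fraction of misclassified samples (lower is better)."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))


X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap the cost as a scorer; greater_is_better=False flips its sign.
scorer = sm.make_scorer(error_rate, greater_is_better=False)

raw_cost = error_rate(y, model.predict(X))
score = scorer(model, X, y)

print(f"cost: {raw_cost:.3f}, scorer output: {score:.3f}")
```

<p>When reading cross-validation results produced with such a scorer, the reported scores are therefore the negative of the cost, and the best hyperparameters are those whose score is closest to zero.</p>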
<p>Define a specific cost function for a classifier. The default performance metrics to optimize for are accuracy, F1, log loss, precision, recall, and ROC AUC.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="k">class</span> <span class="nc">ClassificationCostFunction</span><span class="p">(</span><span class="n">CostFunction</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">metrics</span><span class="p">:</span> <span class="n">Iterable</span><span class="p">[</span><span class="nb">str</span><span class="p">],</span> <span class="n">M</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span> <span class="o">=</span> <span class="bp">None</span><span class="p">,</span> <span class="n">metric_class_opt_val_map</span><span class="p">:</span> <span class="nb">dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">tuple</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">float</span><span class="p">]]</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">proba_threshold</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.5</span><span class="p">):</span>
<span class="s">"""Defines cost functional for optimization of multiple metrics.
Since this is defined as a loss function, cross validation returns the negative of the score [1].
Args:
metrics (Iterable[str]): Iterable of strings of the form (metric_name).
M (np.ndarray[float]): Positive definite matrix of size len(metrics).
metric_class_opt_val_map (dict[str, tuple[str, float]], optional): Dictionary mapping each metric to its input type and optimal value, of the form {'metric': ('class' or 'proba', optimal_value)}. Defaults to None, in which case a built-in map is used.
proba_threshold (float, optional): Probability threshold used to convert probabilities into classes. Defaults to 0.5.
References:
[1] https://github.com/scikit-learn/scikit-learn/issues/2439
Example:
>>> y_true = [0, 0, 0, 1, 1]
>>> y_pred = [0.46, 0.6, 0.29, 0.25, 0.012]
>>> threshold = 0.5
>>> metrics = ["f1_score", "roc_auc_score"]
>>> cf = ClassificationCostFunction(metrics)
>>> np.isclose(cf(y_true, y_pred), 1.41, rtol=1e-01, atol=1e-01)
True
>>> X, y = make_classification()
>>> model = LogisticRegression().fit(X, y)
>>> y_proba = model.predict_proba(X)[:, 1]
>>> cost = cf(y, y_proba)
>>> f1 = getattr(sm, "f1_score")
>>> roc_auc = getattr(sm, "roc_auc_score")
>>> y_pred = np.where(y_proba > 0.5, 1, 0)
>>> scorer_output = np.sqrt((f1(y, y_pred) - 1.0)**2 + (roc_auc(y, y_proba) - 1.0)**2)
>>> np.isclose(cost, scorer_output)
True
"""</span>
<span class="nb">super</span><span class="p">().</span><span class="n">__init__</span><span class="p">(</span><span class="n">metrics</span><span class="p">,</span> <span class="n">M</span><span class="p">)</span>
<span class="bp">self</span><span class="p">.</span><span class="n">proba_threshold</span> <span class="o">=</span> <span class="n">proba_threshold</span>
<span class="bp">self</span><span class="p">.</span><span class="n">metric_class_opt_val_map</span> <span class="o">=</span> <span class="n">metric_class_opt_val_map</span> <span class="ow">or</span> <span class="p">{</span>
<span class="s">"accuracy_score"</span><span class="p">:</span> <span class="p">(</span><span class="s">"class"</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s">"f1_score"</span><span class="p">:</span> <span class="p">(</span><span class="s">"class"</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s">"log_loss"</span><span class="p">:</span> <span class="p">(</span><span class="s">"proba"</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span>
<span class="s">"precision_score"</span><span class="p">:</span> <span class="p">(</span><span class="s">"class"</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s">"recall_score"</span><span class="p">:</span> <span class="p">(</span><span class="s">"class"</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s">"roc_auc_score"</span><span class="p">:</span> <span class="p">(</span><span class="s">"proba"</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="p">}</span>
<span class="k">def</span> <span class="nf">_to_class</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">array</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span><span class="p">,</span> <span class="n">metric</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-></span> <span class="s">'np.ndarray[float]'</span><span class="p">:</span>
<span class="c1"># sourcery skip: inline-immediately-returned-variable
</span> <span class="n">output</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">array</span> <span class="o">></span> <span class="bp">self</span><span class="p">.</span><span class="n">proba_threshold</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="bp">self</span><span class="p">.</span><span class="n">metric_class_opt_val_map</span><span class="p">[</span><span class="n">metric</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s">"class"</span> <span class="k">else</span> <span class="n">array</span>
<span class="k">return</span> <span class="n">output</span>
<span class="k">def</span> <span class="nf">functional</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">y_true</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">:</span> <span class="s">'np.ndarray[float]'</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
<span class="bp">self</span><span class="p">.</span><span class="n">_check_positive_definite</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">M</span><span class="p">)</span>
<span class="n">opt_values</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="bp">self</span><span class="p">.</span><span class="n">metric_class_opt_val_map</span><span class="p">[</span><span class="n">metric</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="k">for</span> <span class="n">metric</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">metrics</span><span class="p">])</span>
<span class="n">metric_values</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="nb">getattr</span><span class="p">(</span><span class="n">sm</span><span class="p">,</span> <span class="n">metric</span><span class="p">)(</span><span class="n">y_true</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">_to_class</span><span class="p">(</span><span class="n">y_pred</span><span class="p">,</span> <span class="n">metric</span><span class="p">))</span> <span class="k">for</span> <span class="n">metric</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">metrics</span><span class="p">])</span>
<span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">dot</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">dot</span><span class="p">(</span><span class="n">metric_values</span> <span class="o">-</span> <span class="n">opt_values</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">M</span><span class="p">),</span> <span class="n">metric_values</span> <span class="o">-</span> <span class="n">opt_values</span><span class="p">))</span>
</pre></td></tr></tbody></table></code></pre></div></div>
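<p>The value returned by <code class="language-plaintext highlighter-rouge">functional</code> is a weighted distance between the vector of metric values and the vector of their optimal values. The sketch below reproduces that computation in plain Python so the arithmetic is easy to follow. The function name <code class="language-plaintext highlighter-rouge">weighted_distance</code>, the identity weight matrix, and the metric values are illustrative assumptions, not part of the class above.</p>

```python
import math


def weighted_distance(metric_values, opt_values, weights):
    """Return sqrt(d^T W d), where d = metric_values - opt_values."""
    diff = [m - o for m, o in zip(metric_values, opt_values)]
    n = len(diff)
    # Quadratic form d^T W d, written out as a double sum.
    quadratic_form = sum(
        diff[i] * weights[i][j] * diff[j] for i in range(n) for j in range(n)
    )
    return math.sqrt(quadratic_form)


# Hypothetical example: accuracy 0.9 and recall 0.8, both with optimum 1.
identity = [[1.0, 0.0], [0.0, 1.0]]
cost = weighted_distance([0.9, 0.8], [1.0, 1.0], identity)
print(round(cost, 4))  # 0.2236
```

<p>With the identity matrix this reduces to the Euclidean distance; a different positive definite matrix would weight some metrics more heavily than others.</p>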
<p>The cost function can then be used as the scoring function in a grid search:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
</pre></td><td class="rouge-code"><pre><span class="n">metrics</span> <span class="o">=</span> <span class="p">[</span>
<span class="s">"accuracy_score"</span><span class="p">,</span>
<span class="s">"f1_score"</span><span class="p">,</span>
<span class="s">"log_loss"</span><span class="p">,</span>
<span class="s">"precision_score"</span><span class="p">,</span>
<span class="s">"recall_score"</span><span class="p">,</span>
<span class="s">"roc_auc_score"</span>
<span class="p">]</span>
<span class="n">param_grid</span> <span class="o">=</span> <span class="p">{</span><span class="s">"C"</span><span class="p">:</span> <span class="p">[</span><span class="mf">0.5</span><span class="p">,</span> <span class="mi">1</span><span class="p">]}</span>
<span class="n">scorer</span> <span class="o">=</span> <span class="n">ClassificationCostFunction</span><span class="p">(</span><span class="n">metrics</span><span class="p">,</span> <span class="n">proba_threshold</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
<span class="n">cv</span> <span class="o">=</span> <span class="n">GridSearchCV</span><span class="p">(</span><span class="n">LogisticRegression</span><span class="p">(),</span> <span class="n">param_grid</span><span class="p">,</span> <span class="n">scoring</span><span class="o">=</span><span class="n">scorer</span><span class="p">.</span><span class="n">make_scorer</span><span class="p">())</span>
<span class="n">X</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">make_classification</span><span class="p">()</span>
<span class="n">cv</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">.</span><span class="n">from_dict</span><span class="p">(</span><span class="n">cv</span><span class="p">.</span><span class="n">cv_results_</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<table>
<thead>
<tr>
<th> </th>
<th>mean_fit_time</th>
<th>std_fit_time</th>
<th>mean_score_time</th>
<th>std_score_time</th>
<th>param_C</th>
<th>params</th>
<th>split0_test_score</th>
<th>split1_test_score</th>
<th>split2_test_score</th>
<th>split3_test_score</th>
<th>split4_test_score</th>
<th>mean_test_score</th>
<th>std_test_score</th>
<th>rank_test_score</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.009353</td>
<td>0.003661</td>
<td>0.008929</td>
<td>0.002612</td>
<td>0.5</td>
<td>{'C': 0.5}</td>
<td>-1.732076</td>
<td>-6.922296</td>
<td>-1.732076</td>
<td>-3.464335</td>
<td>-3.461615</td>
<td>-3.462480</td>
<td>1.895201</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>0.006416</td>
<td>0.000833</td>
<td>0.006427</td>
<td>0.000340</td>
<td>1</td>
<td>{'C': 1}</td>
<td>-1.732076</td>
<td>-8.654072</td>
<td>-1.732076</td>
<td>-3.464335</td>
<td>-3.461615</td>
<td>-3.808835</td>
<td>2.543282</td>
<td>2</td>
</tr>
</tbody>
</table>
<h2 id="conclusion">Conclusion</h2>
<p>Cost functions guide the search for an optimal solution in mathematical optimization. In machine learning, where several performance metrics often need to be optimized at once, combining them into a single cost function gives a more principled and efficient way to tune a model for a specific task and context than optimizing one metric alone. It can also help balance model complexity against performance, so that the model is not only accurate but also robust and generalizable.</p>
Humberto STEIN SHIROMOTO
This blog post discusses the importance of cost functions in mathematical optimization and how they apply to machine learning problems. The author argues that formulating the optimization of the performance metrics of a machine learning classifier in terms of a cost function is better than optimizing a single metric because it provides a more comprehensive and flexible framework for optimization, can help to address the trade-off between model complexity and performance, and can lead to better performance and generalization. An example of this formulation is provided for a binary classifier.
Adding a Dark/Light Theme Switcher to Minimal Mistakes
2022-09-27T00:00:00+10:00
https://hsteinshiromoto.github.io/posts/2022/09/27/blog-post_adding_theme_switch_to_minimal_mistakes
<p>This is how I added a switcher to toggle between the light and dark modes of the Minimal Mistakes theme.</p>
<p>I followed the instructions posted by sohamsaha99 in <a href="https://github.com/mmistakes/minimal-mistakes/discussions/2033">this GitHub thread</a>, reproduced here:</p>
<ol>
<li>Edit <code class="language-plaintext highlighter-rouge">_config.yml</code>: There are going to be two skins. The first one is declared as usual. For the second one, we create a new entry called <code class="language-plaintext highlighter-rouge">minimal_mistakes_skin2</code>. So, add the following lines:</li>
</ol>
<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="na">minimal_mistakes_skin</span><span class="pi">:</span> <span class="s2">"</span><span class="s">default"</span>
<span class="na">minimal_mistakes_skin2</span><span class="pi">:</span> <span class="s2">"</span><span class="s">dark"</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<ol start="2">
<li>Create the file <code class="language-plaintext highlighter-rouge">assets/css/theme2.scss</code> in your project directory with the following content:</li>
</ol>
<div class="language-scss highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="nt">---</span>
<span class="nn">#</span> <span class="nt">Only</span> <span class="nt">the</span> <span class="nt">main</span> <span class="nt">Sass</span> <span class="nt">file</span> <span class="nt">needs</span> <span class="nt">front</span> <span class="nt">matter</span> <span class="o">(</span><span class="nt">the</span> <span class="nt">dashes</span> <span class="nt">are</span> <span class="nt">enough</span><span class="o">)</span>
<span class="nt">---</span>
<span class="o">@</span><span class="nt">charset</span> <span class="s2">"utf-8"</span><span class="p">;</span>
<span class="k">@import</span> <span class="s2">"minimal-mistakes/skins/dark"</span><span class="p">;</span> <span class="c1">// skin</span>
<span class="k">@import</span> <span class="s2">"minimal-mistakes"</span><span class="p">;</span> <span class="c1">// main partials</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<ol start="3">
<li>In the file <code class="language-plaintext highlighter-rouge">_includes/head.html</code>, change the following line from:</li>
</ol>
<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nt"><link</span> <span class="na">rel=</span><span class="s">"stylesheet"</span> <span class="na">href=</span><span class="s">"/assets/css/main.css"</span><span class="nt">></span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>to</p>
<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nt"><link</span> <span class="na">rel=</span><span class="s">"stylesheet"</span> <span class="na">href=</span><span class="s">"/assets/css/main.css"</span> <span class="na">id=</span><span class="s">"theme_source"</span><span class="nt">></span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>and just after that line, add the code:</p>
<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
</pre></td><td class="rouge-code"><pre>
<span class="nt"><link</span> <span class="na">rel=</span><span class="s">"stylesheet alternate"</span> <span class="na">href=</span><span class="s">"/assets/css/theme2.css"</span> <span class="na">id=</span><span class="s">"theme_source_2"</span><span class="nt">></span>
<span class="nt"><script></span>
<span class="kd">let</span> <span class="nx">theme</span> <span class="o">=</span> <span class="nx">sessionStorage</span><span class="p">.</span><span class="nx">getItem</span><span class="p">(</span><span class="dl">'</span><span class="s1">theme</span><span class="dl">'</span><span class="p">);</span>
<span class="k">if</span><span class="p">(</span><span class="nx">theme</span> <span class="o">===</span> <span class="dl">"</span><span class="s2">dark</span><span class="dl">"</span><span class="p">)</span>
<span class="p">{</span>
<span class="nx">sessionStorage</span><span class="p">.</span><span class="nx">setItem</span><span class="p">(</span><span class="dl">'</span><span class="s1">theme</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">dark</span><span class="dl">'</span><span class="p">);</span>
<span class="nx">node1</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nx">getElementById</span><span class="p">(</span><span class="dl">'</span><span class="s1">theme_source</span><span class="dl">'</span><span class="p">);</span>
<span class="nx">node2</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nx">getElementById</span><span class="p">(</span><span class="dl">'</span><span class="s1">theme_source_2</span><span class="dl">'</span><span class="p">);</span>
<span class="nx">node1</span><span class="p">.</span><span class="nx">setAttribute</span><span class="p">(</span><span class="dl">'</span><span class="s1">rel</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">stylesheet alternate</span><span class="dl">'</span><span class="p">);</span>
<span class="nx">node2</span><span class="p">.</span><span class="nx">setAttribute</span><span class="p">(</span><span class="dl">'</span><span class="s1">rel</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">stylesheet</span><span class="dl">'</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span>
<span class="p">{</span>
<span class="nx">sessionStorage</span><span class="p">.</span><span class="nx">setItem</span><span class="p">(</span><span class="dl">'</span><span class="s1">theme</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">light</span><span class="dl">'</span><span class="p">);</span>
<span class="p">}</span>
<span class="nt"></script></span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The names <code class="language-plaintext highlighter-rouge">light</code> and <code class="language-plaintext highlighter-rouge">dark</code> are generic labels for <code class="language-plaintext highlighter-rouge">skin1</code> and <code class="language-plaintext highlighter-rouge">skin2</code>. These strings have nothing to do with the actual skin names.</p>
<ol start="4">
<li>Add an icon next to the navigation. In <code class="language-plaintext highlighter-rouge">_includes/masthead.html</code>, find <code class="language-plaintext highlighter-rouge">{ % if site.search == true % }</code> and add the following above it:</li>
</ol>
<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>
<span class="nt"><i</span> <span class="na">class=</span><span class="s">"fas fa-fw fa-sun"</span> <span class="na">aria-hidden=</span><span class="s">"true"</span> <span class="na">onclick=</span><span class="s">"node1=document.getElementById('theme_source');node2=document.getElementById('theme_source_2');if(node1.getAttribute('rel')=='stylesheet'){node1.setAttribute('rel', 'stylesheet alternate'); node2.setAttribute('rel', 'stylesheet');sessionStorage.setItem('theme', 'dark');}else{node2.setAttribute('rel', 'stylesheet alternate'); node1.setAttribute('rel', 'stylesheet');sessionStorage.setItem('theme', 'light');} return false;"</span><span class="nt">></i></span>
</pre></td></tr></tbody></table></code></pre></div></div>
Humberto STEIN SHIROMOTO
My RegEx Cheatsheet
2022-09-27T00:00:00+10:00
https://hsteinshiromoto.github.io/posts/2022/10/01/blog_post_regex_cheatsheet
<p>In this post, I compile a cheatsheet of the main regexes that I use in my projects.</p>
<p>A regular expression (regex) is a sequence of characters that specifies a search pattern. String-searching algorithms use such patterns to find, or find and replace, text and to validate input. The underlying techniques come from theoretical computer science and formal language theory.</p>
<p>Regexes are built into many text editors, programming languages, and command-line utilities, where they are used to search for, match, or replace strings based on specific patterns.</p>
<p>Regexes are powerful because they allow you to define complex search patterns using a compact and concise syntax. For example, you can use a regex to search for all the email addresses in a document, or to find and replace all instances of a particular word in a piece of text.</p>
<p>There are many different flavors of regexes, with different syntax and capabilities. One of the most widely used flavors follows the syntax of the Perl programming language and is known as “Perl-compatible regular expressions”, or PCRE.</p>
<p>Regexes can be used for a wide range of text processing tasks, such as:</p>
<ul>
<li>Searching for specific patterns of characters in a body of text</li>
<li>Validating that a string matches a specific pattern (e.g. to ensure that a password meets certain criteria)</li>
<li>Extracting specific substrings from a larger string (e.g. to extract all the email addresses from a document)</li>
<li>Finding and replacing strings based on specific patterns (e.g. to replace all instances of a particular word in a piece of text)</li>
</ul>
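<p>The four tasks above can be sketched with Python’s built-in <code class="language-plaintext highlighter-rouge">re</code> module. The sample text, the password rule (at least eight characters including a digit), and the simplified email pattern are assumptions made for illustration only.</p>

```python
import re

text = "Contact alice@example.com or bob@example.org about the colour of the logo."

# 1. Search: is the word "logo" present anywhere in the text?
has_logo = re.search(r"logo", text) is not None


# 2. Validate: hypothetical rule of at least 8 characters with one digit.
def is_valid_password(candidate):
    return re.fullmatch(r"(?=.*\d).{8,}", candidate) is not None


# 3. Extract: a deliberately simplified email pattern.
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)

# 4. Replace: swap one spelling for another, whole words only.
american = re.sub(r"\bcolour\b", "color", text)

print(has_logo)                       # True
print(is_valid_password("hunter22"))  # True
print(emails)                         # ['alice@example.com', 'bob@example.org']
```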
<p>Regular expressions are used in the search-and-replace features of text editors, in text-processing utilities such as <code class="language-plaintext highlighter-rouge">sed</code> and <code class="language-plaintext highlighter-rouge">AWK</code>, and in lexical analysis. Most general-purpose programming languages, including Python, C, C++, Java, and JavaScript, support regexes either natively or through libraries.</p>
<p>For example, to locate a word that is spelled two different ways, the regular expression <code class="language-plaintext highlighter-rouge">seriali[sz]e</code> matches both “serialise” and “serialize”.</p>
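<p>As a quick check in Python, the character set <code class="language-plaintext highlighter-rouge">[sz]</code> accepts either letter at that position. The sample sentence is made up for illustration.</p>

```python
import re

text = "We serialise in London and serialize in New York."

# [sz] matches exactly one character: either "s" or "z".
matches = re.findall(r"seriali[sz]e", text)
print(matches)  # ['serialise', 'serialize']
```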
<h2 id="table-of-contents">Table of Contents</h2>
<ul>
<li><a href="#table-of-contents">Table of Contents</a></li>
<li><a href="#character-classes">Character Classes</a></li>
<li><a href="#pythons-regex-module">Python’s regex module</a>
<ul>
<li><a href="#refindall"><code class="language-plaintext highlighter-rouge">re.findall</code></a></li>
<li><a href="#refinditer"><code class="language-plaintext highlighter-rouge">re.finditer</code></a></li>
<li><a href="#research"><code class="language-plaintext highlighter-rouge">re.search</code></a></li>
<li><a href="#resplit"><code class="language-plaintext highlighter-rouge">re.split</code></a></li>
<li><a href="#resub"><code class="language-plaintext highlighter-rouge">re.sub</code></a></li>
<li><a href="#recompile"><code class="language-plaintext highlighter-rouge">re.compile</code></a></li>
<li><a href="#reescape"><code class="language-plaintext highlighter-rouge">re.escape</code></a></li>
<li><a href="#flags">Flags</a></li>
</ul>
</li>
<li><a href="#cookbook">Cookbook</a>
<ul>
<li><a href="#select-everything-between-the-keywods-start-and-end">Select everything between the keywords <code class="language-plaintext highlighter-rouge">start</code> and <code class="language-plaintext highlighter-rouge">end</code></a></li>
<li><a href="#select-email-addresses">Select email addresses</a></li>
</ul>
</li>
<li><a href="#references">References:</a></li>
</ul>
<h2 id="character-classes">Character Classes</h2>
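<p>Characters can be grouped into the classes shown in the table below. These are POSIX bracket expressions; Python’s built-in <code class="language-plaintext highlighter-rouge">re</code> module does not support them, so use the equivalent from the “Same as” column there.</p>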
<table>
<thead>
<tr>
<th>Character Class</th>
<th>Same as</th>
<th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:alnum:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[0-9A-Za-z]</code></td>
<td>Letters and digits</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:alpha:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[A-Za-z]</code></td>
<td>Letters</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:ascii:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[\x00-\x7F]</code></td>
<td>ASCII codes 0-127</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:blank:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[\t ] </code></td>
<td>Space or tab only</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:cntrl:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[\x00-\x1F\x7F]</code></td>
<td>Control characters</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:digit:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[0-9]</code></td>
<td>Decimal digits</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:graph:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[[:alnum:][:punct:]] </code></td>
<td>Visible characters (not space)</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:lower:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[a-z]</code></td>
<td>Lowercase letters</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:print:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[ -~] == [ [:graph:]] </code></td>
<td>Visible characters and the space</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:punct:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]</code></td>
<td>Visible punctuation characters</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:space:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[\t\n\v\f\r ] </code></td>
<td>Whitespace</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:upper:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[A-Z]</code></td>
<td>Uppercase letters</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:word:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[0-9A-Za-z_]</code></td>
<td>Word characters</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:xdigit:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">[0-9A-Fa-f]</code></td>
<td>Hexadecimal digits</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:<:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">\b(?=\w)</code></td>
<td>Start of word</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">[[:>:]]</code></td>
<td><code class="language-plaintext highlighter-rouge">\b(?<=\w)</code></td>
<td>End of word</td>
</tr>
</tbody>
</table>
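<p>Since Python’s <code class="language-plaintext highlighter-rouge">re</code> module does not understand the POSIX names, the explicit equivalents from the “Same as” column can be used instead. The following is a minimal sketch with made-up sample strings.</p>

```python
import re

# [[:digit:]] written as [0-9]
digits = re.findall(r"[0-9]+", "Room 12, floor 3")

# [[:word:]] written as [0-9A-Za-z_]
words = re.findall(r"[0-9A-Za-z_]+", "snake_case and kebab-case")

# [[:space:]] written as [\t\n\v\f\r ]
fields = re.split(r"[\t\n\v\f\r ]+", "a\tb  c\nd")

# Start of word, written as \b(?=\w): grab the first letter of each word.
initials = re.findall(r"\b(?=\w)\w", "start of word")

print(digits)    # ['12', '3']
print(words)     # ['snake_case', 'and', 'kebab', 'case']
print(fields)    # ['a', 'b', 'c', 'd']
print(initials)  # ['s', 'o', 'w']
```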
<h2 id="pythons-regex-module">Python’s regex module</h2>
<p>Python’s built-in regular expression module, <code class="language-plaintext highlighter-rouge">re</code>, is imported with:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="nn">re</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>It provides the following functions.</p>
<h3 id="refindall"><code class="language-plaintext highlighter-rouge">re.findall</code></h3>
<p>Returns a list containing all matches:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">re</span><span class="p">.</span><span class="n">findall</span><span class="p">(</span><span class="sa">r</span><span class="s">'\bs?pare?\b'</span><span class="p">,</span> <span class="s">'par spar apparent spare part pare'</span><span class="p">)</span>
<span class="p">[</span><span class="s">'par'</span><span class="p">,</span> <span class="s">'spar'</span><span class="p">,</span> <span class="s">'spare'</span><span class="p">,</span> <span class="s">'pare'</span><span class="p">]</span>
<span class="o">>>></span> <span class="n">re</span><span class="p">.</span><span class="n">findall</span><span class="p">(</span><span class="sa">r</span><span class="s">'\b0*[1-9]\d{2,}\b'</span><span class="p">,</span> <span class="s">'0501 035 154 12 26 98234'</span><span class="p">)</span>
<span class="p">[</span><span class="s">'0501'</span><span class="p">,</span> <span class="s">'154'</span><span class="p">,</span> <span class="s">'98234'</span><span class="p">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h3 id="refinditer"><code class="language-plaintext highlighter-rouge">re.finditer</code></h3>
<p>Returns an iterator of match objects (one for each match):</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">m_iter</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">finditer</span><span class="p">(</span><span class="sa">r</span><span class="s">'[0-9]+'</span><span class="p">,</span> <span class="s">'45 349 651 593 4 204'</span><span class="p">)</span>
<span class="o">>>></span> <span class="p">[</span><span class="n">m</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">m</span> <span class="ow">in</span> <span class="n">m_iter</span> <span class="k">if</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="o"><</span> <span class="mi">350</span><span class="p">]</span>
<span class="p">[</span><span class="s">'45'</span><span class="p">,</span> <span class="s">'349'</span><span class="p">,</span> <span class="s">'4'</span><span class="p">,</span> <span class="s">'204'</span><span class="p">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h3 id="research"><code class="language-plaintext highlighter-rouge">re.search</code></h3>
<p>Returns a Match object if there is a match anywhere in the string, and <code class="language-plaintext highlighter-rouge">None</code> otherwise:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">sentence</span> <span class="o">=</span> <span class="s">'This is a sample string'</span>
<span class="o">>>></span> <span class="nb">bool</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">search</span><span class="p">(</span><span class="sa">r</span><span class="s">'this'</span><span class="p">,</span> <span class="n">sentence</span><span class="p">,</span> <span class="n">flags</span><span class="o">=</span><span class="n">re</span><span class="p">.</span><span class="n">I</span><span class="p">))</span>
<span class="bp">True</span>
<span class="o">>>></span> <span class="nb">bool</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">search</span><span class="p">(</span><span class="sa">r</span><span class="s">'xyz'</span><span class="p">,</span> <span class="n">sentence</span><span class="p">))</span>
<span class="bp">False</span>
</pre></td></tr></tbody></table></code></pre></div></div>
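<p>Beyond a true/false check, the returned Match object also carries the matched text and its position. A short sketch, reusing the sample sentence from above with a hypothetical pattern:</p>

```python
import re

sentence = 'This is a sample string'

match = re.search(r'sam\w+', sentence)
if match:
    # group() is the matched text, span() its (start, end) indices.
    print(match.group(), match.span())  # sample (10, 16)

# When nothing matches, re.search returns None rather than raising.
no_match = re.search(r'xyz', sentence)
print(no_match)  # None
```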
<h3 id="resplit"><code class="language-plaintext highlighter-rouge">re.split</code></h3>
<p>Returns a list where the string has been split at each match:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">re</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="sa">r</span><span class="s">'\d+'</span><span class="p">,</span> <span class="s">'Sample123string42with777numbers'</span><span class="p">)</span>
<span class="p">[</span><span class="s">'Sample'</span><span class="p">,</span> <span class="s">'string'</span><span class="p">,</span> <span class="s">'with'</span><span class="p">,</span> <span class="s">'numbers'</span><span class="p">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h3 id="resub"><code class="language-plaintext highlighter-rouge">re.sub</code></h3>
<p>Replaces every match with a replacement string:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">ip_lines</span> <span class="o">=</span> <span class="s">"catapults</span><span class="se">\n</span><span class="s">concatenate</span><span class="se">\n</span><span class="s">cat"</span>
<span class="o">>>></span> <span class="k">print</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">sub</span><span class="p">(</span><span class="sa">r</span><span class="s">'^'</span><span class="p">,</span> <span class="sa">r</span><span class="s">'* '</span><span class="p">,</span> <span class="n">ip_lines</span><span class="p">,</span> <span class="n">flags</span><span class="o">=</span><span class="n">re</span><span class="p">.</span><span class="n">M</span><span class="p">))</span>
<span class="o">*</span> <span class="n">catapults</span>
<span class="o">*</span> <span class="n">concatenate</span>
<span class="o">*</span> <span class="n">cat</span>
</pre></td></tr></tbody></table></code></pre></div></div>
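<p class="notice--info text-justify">Tip: For simple literal replacements, you can also use string methods such as <code class="language-plaintext highlighter-rouge">str.replace</code>.</p>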
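<p>To illustrate the tip, the sketch below compares a plain string method with <code class="language-plaintext highlighter-rouge">re.sub</code>. The sample line is made up; the point is that <code class="language-plaintext highlighter-rouge">str.replace</code> is enough for literal text, while a regex is needed once the match depends on context such as word boundaries.</p>

```python
import re

line = "cat catalog concat cat"

# Literal replacement: every occurrence of "cat", even inside other words.
literal = line.replace("cat", "dog")

# Regex replacement: only "cat" as a whole word.
whole_words = re.sub(r"\bcat\b", "dog", line)

print(literal)      # dog dogalog condog dog
print(whole_words)  # dog catalog concat dog
```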
<h3 id="recompile"><code class="language-plaintext highlighter-rouge">re.compile</code></h3>
<p>Compiles a regular expression pattern for later use:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">pet</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="nb">compile</span><span class="p">(</span><span class="sa">r</span><span class="s">'dog'</span><span class="p">)</span>
<span class="o">>>></span> <span class="nb">type</span><span class="p">(</span><span class="n">pet</span><span class="p">)</span>
<span class="s">&lt;class 're.Pattern'&gt;</span>
<span class="o">>>></span> <span class="nb">bool</span><span class="p">(</span><span class="n">pet</span><span class="p">.</span><span class="n">search</span><span class="p">(</span><span class="s">'They bought a dog'</span><span class="p">))</span>
<span class="bp">True</span>
<span class="o">>>></span> <span class="nb">bool</span><span class="p">(</span><span class="n">pet</span><span class="p">.</span><span class="n">search</span><span class="p">(</span><span class="s">'A cat crossed their path'</span><span class="p">))</span>
<span class="bp">False</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h3 id="reescape"><code class="language-plaintext highlighter-rouge">re.escape</code></h3>
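<p>Escapes the regex metacharacters in a string so that the string is matched literally. The snippet below is a minimal sketch; the sample strings are illustrative assumptions and not taken from a real project.</p>

```python
import re

# Hypothetical text that happens to contain regex metacharacters.
literal = "a.b*c"

# re.escape backslash-escapes the metacharacters so they match literally.
pattern = re.escape(literal)

# The escaped pattern only matches the exact text "a.b*c".
matches_literal = bool(re.search(pattern, "x a.b*c y"))
# "aXbbbc" would match the unescaped pattern "a.b*c", but not the escaped one.
matches_other = bool(re.search(pattern, "aXbbbc"))

print(pattern)
print(matches_literal, matches_other)
```

<p>This is mostly useful when part of a pattern comes from user input or from data, and should not be interpreted as regex syntax.</p>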
<h3 id="flags">Flags</h3>
<table>
<thead>
<tr>
<th>code (short)</th>
<th>code (long)</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="language-plaintext highlighter-rouge">re.I</code></td>
<td><code class="language-plaintext highlighter-rouge">re.IGNORECASE</code></td>
<td>Ignore case</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">re.M</code></td>
<td><code class="language-plaintext highlighter-rouge">re.MULTILINE</code></td>
<td>Multiline</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">re.L</code></td>
<td><code class="language-plaintext highlighter-rouge">re.LOCALE</code></td>
<td>Make <code class="language-plaintext highlighter-rouge">\w</code>, <code class="language-plaintext highlighter-rouge">\b</code>, <code class="language-plaintext highlighter-rouge">\s</code> locale dependent</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">re.S</code></td>
<td><code class="language-plaintext highlighter-rouge">re.DOTALL</code></td>
<td>Dot matches all (including newline)</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">re.U</code></td>
<td><code class="language-plaintext highlighter-rouge">re.UNICODE</code></td>
<td>Make <code class="language-plaintext highlighter-rouge">\w</code>, <code class="language-plaintext highlighter-rouge">\b</code>, <code class="language-plaintext highlighter-rouge">\d</code>, <code class="language-plaintext highlighter-rouge">\s</code> unicode dependent</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">re.X</code></td>
<td><code class="language-plaintext highlighter-rouge">re.VERBOSE</code></td>
<td>Readable style</td>
</tr>
</tbody>
</table>
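<p>As a rough illustration of the table above, the sketch below combines a few of the flags. The sample text and the pattern for numbers are assumptions made up for this example.</p>

```python
import re

text = "Cat\ncatapult\nconcatenate"

# re.I ignores case; re.M lets ^ match at the start of every line.
starts = re.findall(r"^cat", text, flags=re.I | re.M)

# re.S lets the dot match the newline between the first two lines.
with_dotall = bool(re.search(r"Cat.catapult", text, flags=re.S))
without_dotall = bool(re.search(r"Cat.catapult", text))

# re.X ignores whitespace and allows comments inside the pattern.
number = re.compile(
    r"""
    \d+        # integer part
    (\.\d+)?   # optional fractional part
    """,
    flags=re.X,
)

print(starts, with_dotall, without_dotall)
print(bool(number.fullmatch("3.14")), bool(number.fullmatch("abc")))
```

<p>Flags are combined with the bitwise OR operator, as in <code class="language-plaintext highlighter-rouge">re.I | re.M</code>.</p>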
<h2 id="cookbook">Cookbook</h2>
<p>Suppose we have the following two paragraphs:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">paragraph</span> <span class="o">=</span> <span class="s">"""
Start:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Sodales ut eu sem integer vitae
justo eget magna.
Tincidunt praesent semper feugiat nibh sed pulvinar proin
gravida. Praesent semper feugiat nibh sed. Mi proin sed libero enim sed faucibus
turpis. Tortor pretium viverra suspendisse potenti nullam ac. end
"""</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h3 id="select-everything-between-the-keywods-start-and-end">Select everything between the keywords <code class="language-plaintext highlighter-rouge">Start:</code> and <code class="language-plaintext highlighter-rouge">end</code></h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">result</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">search</span><span class="p">(</span><span class="sa">r</span><span class="s">"(?<=Start:)((.|\n)*)(?=end)"</span><span class="p">,</span> <span class="n">paragraph</span><span class="p">).</span><span class="n">group</span><span class="p">()</span>
<span class="o">>>></span> <span class="k">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="n">Lorem</span> <span class="n">ipsum</span> <span class="n">dolor</span> <span class="n">sit</span> <span class="n">amet</span><span class="p">,</span> <span class="n">consectetur</span> <span class="n">adipiscing</span> <span class="n">elit</span><span class="p">,</span> <span class="n">sed</span> <span class="n">do</span> <span class="n">eiusmod</span> <span class="n">tempor</span>
<span class="n">incididunt</span> <span class="n">ut</span> <span class="n">labore</span> <span class="n">et</span> <span class="n">dolore</span> <span class="n">magna</span> <span class="n">aliqua</span><span class="p">.</span> <span class="n">Sodales</span> <span class="n">ut</span> <span class="n">eu</span> <span class="n">sem</span> <span class="n">integer</span> <span class="n">vitae</span>
<span class="n">justo</span> <span class="n">eget</span> <span class="n">magna</span><span class="p">.</span>
<span class="n">Tincidunt</span> <span class="n">praesent</span> <span class="n">semper</span> <span class="n">feugiat</span> <span class="n">nibh</span> <span class="n">sed</span> <span class="n">pulvinar</span> <span class="n">proin</span>
<span class="n">gravida</span><span class="p">.</span> <span class="n">Praesent</span> <span class="n">semper</span> <span class="n">feugiat</span> <span class="n">nibh</span> <span class="n">sed</span><span class="p">.</span> <span class="n">Mi</span> <span class="n">proin</span> <span class="n">sed</span> <span class="n">libero</span> <span class="n">enim</span> <span class="n">sed</span> <span class="n">faucibus</span>
<span class="n">turpis</span><span class="p">.</span> <span class="n">Tortor</span> <span class="n">pretium</span> <span class="n">viverra</span> <span class="n">suspendisse</span> <span class="n">potenti</span> <span class="n">nullam</span> <span class="n">ac</span><span class="p">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>
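<p>A possible variant, shown here only as a sketch, uses the <code class="language-plaintext highlighter-rouge">re.S</code> flag with a lazy quantifier and a capture group instead of <code class="language-plaintext highlighter-rouge">(.|\n)*</code>. The paragraph below is a shortened stand-in assumed for this example. The word boundaries around <code class="language-plaintext highlighter-rouge">end</code> matter with a lazy match, because words such as "suspendisse" or "pending" contain "end".</p>

```python
import re

# Assumption: a shortened stand-in for the paragraph used above.
paragraph = """
Start:
first line, still pending
second line end
"""

# re.S lets the dot match newlines; .*? stops at the first standalone "end".
match = re.search(r"Start:(.*?)\bend\b", paragraph, flags=re.S)
between = match.group(1) if match else None

print(repr(between))
```

<p>Checking that <code class="language-plaintext highlighter-rouge">match</code> is not <code class="language-plaintext highlighter-rouge">None</code> before calling <code class="language-plaintext highlighter-rouge">group</code> avoids an <code class="language-plaintext highlighter-rouge">AttributeError</code> when the keywords are missing.</p>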
<h3 id="select-email-addresses">Select email addresses</h3>
<p>Suppose we want to extract the emails contained in the following paragraph:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">paragraph</span> <span class="o">=</span> <span class="s">"""
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Sodales ut eu sem integer vitae
justo eget magna. John Silva Doe <john.silva_3.doe@email.com>
Josh Tree Done 'jpsj_3@gmail.com'
Jane Doe <jane_doe4@email.com>
Malesuada fames ac turpis egestas integer eget. Cras semper auctor neque vitae
tempus. Sed adipiscing diam donec adipiscing tristique risus nec.
"""</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="o">>>></span> <span class="n">result</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">findall</span><span class="p">(</span><span class="sa">r</span><span class="s">"[\w.+-]+@[\w.-]+\.[a-zA-Z]{2,4}\b"</span><span class="p">,</span> <span class="n">paragraph</span><span class="p">)</span>
<span class="o">>>></span> <span class="n">result</span>
<span class="p">[</span><span class="s">'john.silva_3.doe@email.com'</span><span class="p">,</span> <span class="s">'jpsj_3@gmail.com'</span><span class="p">,</span> <span class="s">'jane_doe4@email.com'</span><span class="p">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h2 id="references">References:</h2>
<ul>
<li>[1] https://www.regexr.com</li>
<li>[2] https://quickref.me/regex</li>
<li>[3] https://www.regex101.com</li>
</ul>
Humberto STEIN SHIROMOTO
In this post, I compile a cheatsheet of the main regexes that I use in my projects.
Optimization References
2021-12-23T00:00:00+11:00
https://hsteinshiromoto.github.io/posts/2021/12/23/blog-post_optimization_references
<p>My list of reference materials for mathematical optimisation, based on a <a href="https://www.quora.com/What-are-some-good-resources-to-learn-about-optimization">Quora</a> thread.</p>
<h1 id="lecture-notes">Lecture notes</h1>
<p>Highly recommended: the video lectures by Prof. S. Boyd at Stanford. This is a rare case where watching the lectures is better than reading a book.</p>
<ul>
<li>
<p>EE263: Introduction to Linear Dynamical Systems (video): http://www.stanford.edu/~boyd/ee263/videos.html</p>
</li>
<li>
<p>EE363: Linear Dynamical Systems: http://www.stanford.edu/class/ee363/</p>
</li>
<li>
<p>EE364a: Convex Optimization I (video): http://www.stanford.edu/class/ee364a/videos.html</p>
</li>
<li>
<p>EE364b: Convex Optimization II (video): http://www.stanford.edu/class/ee364b/videos.html</p>
</li>
<li>
<p>6.253: Convex analysis and optimization: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-253-convex-analysis-and-optimization-spring-2010/lecture-notes/</p>
</li>
<li>
<p>Optimization courses at MIT: http://optimization.mit.edu/classes.php</p>
</li>
<li>
<p>CMU 10-725: Optimization (Fall 2012)</p>
</li>
</ul>
<h1 id="books">Books</h1>
<ul>
<li>
<p>S. Bubeck, “Convex Optimization: Algorithms and Complexity”, <a href="https://arxiv.org/abs/1405.4980">arXiv:1405.4980</a>, 2015</p>
</li>
<li>
<p>F. Clarke, “Functional Analysis, Calculus of Variations and Optimal Control”, Springer, 2013</p>
</li>
<li>
<p>Liberzon, D., “Calculus of Variations and Optimal Control Theory - A Concise Introduction”, Princeton University Press, 2012</p>
</li>
<li>
<p>S. Boyd and L. Vandenberghe, “Convex Optimization”, Cambridge University Press, 2004</p>
</li>
<li>
<p>G. Calafiore and L. El Ghaoui, “Optimization Models”, Cambridge University Press, 2014</p>
</li>
<li>
<p>R. T. Rockafellar and R. J. B. Wets, “Variational Analysis”, Springer, 1998</p>
</li>
<li>
<p>D. G. Luenberger and Y. Ye, “Linear and Nonlinear Programming”, 4th ed., Springer, 2016</p>
</li>
<li>
<p>J. Frédéric Bonnans, J. Charles Gilbert, C. Lemaréchal and C. A. Sagastizábal, “Numerical Optimization”, 2nd ed., Springer, 2006</p>
</li>
<li>
<p>Papadimitriou & Steiglitz, Combinatorial Optimization: Algorithms and Complexity: http://www.amazon.com/Combinatorial-Optimization-Algorithms-Christos-Papadimitriou/dp/0486402584</p>
</li>
<li>
<p>Lawson & Hanson, Solving Least Squares Problems: http://books.google.com/books/about/Solving_Least_Squares_Problems.html?id=ROw4hU85nz8C</p>
</li>
<li>
<p>Bellman, Dynamic Programming: http://www.amazon.com/Dynamic-Programming-Richard-Bellman/dp/0486428095/</p>
</li>
<li>
<p>Bellman, Applied Dynamic Programming: http://www.amazon.com/Applied-Dynamic-Programming-Richard-Bellman/dp/B0000CLNVK</p>
</li>
<li>
<p>Bellman, Adaptive Control Processes: http://www.amazon.com/Adaptive-Control-Processes-Bellman/dp/0691079013</p>
</li>
<li>
<p>Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning: http://www.amazon.com/Genetic-Algorithms-Optimization-Machine-Learning/dp/0201157675</p>
</li>
<li>Gill, Murray, Wright, Practical Optimization: http://www.amazon.com/Practical-Optimization-Philip-Gill/dp/0122839528</li>
<li>
<p>Ben-Tal & Nemirovski, Lectures on Modern Convex Optimization: http://www.amazon.com/Lectures-Modern-Convex-Optimization-Applications/dp/0898714915</p>
</li>
<li>
<p>Bertsimas & Tsitsiklis, Introduction to Linear Optimization: http://www.amazon.com/Introduction-Linear-Optimization-Scientific-Computation/dp/1886529191</p>
</li>
<li>
<p>Bertsekas, Convex Analysis and Optimization: http://www.amazon.com/Convex-Analysis-Optimization-Dimitri-Bertsekas/dp/1886529450</p>
</li>
<li>
<p>Bertsekas, Nonlinear programming: http://www.amazon.com/Nonlinear-Programming-Dimitri-P-Bertsekas/dp/1886529000/</p>
</li>
<li>
<p>Bertsekas, Dynamic Programming and Optimal Control: http://www.amazon.com/Dynamic-Programming-Optimal-Control-Vol/dp/1886529086</p>
</li>
<li>
<p>Rockafellar, Convex Analysis: http://www.amazon.com/Analysis-Princeton-Landmarks-Mathematics-Physics/dp/0691015864/</p>
</li>
<li>
<p>Nesterov, Introductory Lectures on Convex Optimization: A Basic Course: http://www.amazon.com/Introductory-Lectures-Convex-Optimization-Applied/dp/1402075537</p>
</li>
<li>
<p>Ruszczynski, Nonlinear Optimization: http://www.amazon.com/Nonlinear-Optimization-Andrzej-Ruszczynski/dp/0691119155/</p>
</li>
<li>
<p>Fletcher, Practical Methods of Optimization: http://www.amazon.com/Practical-Methods-Optimization-R-Fletcher/dp/0471494631</p>
</li>
<li>
<p>Nocedal & Wright, Numerical Optimization: http://www.amazon.com/Numerical-Optimization-Operations-Financial-Engineering/dp/0387303030/</p>
</li>
<li>
<p>Press et al., Numerical Recipes: http://www.amazon.com/Numerical-Recipes-3rd-Scientific-Computing/dp/0521880688</p>
</li>
<li>
<p>Dennis & Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations: http://www.amazon.com/Numerical-Unconstrained-Optimization-Nonlinear-Mathematics/dp/0898713641</p>
</li>
<li>
<p>Cornuejols & Tütüncü, Optimization Methods in Finance:
http://www.amazon.com/Optimization-Methods-Finance-Mathematics-Risk/dp/0521861705/</p>
</li>
<li>
<p>Stengel, Optimal Control and Estimation: http://www.amazon.com/Optimal-Control-Estimation-Advanced-Mathematics/dp/0486682005/</p>
</li>
<li>
<p>Kirk, Optimal Control Theory: http://www.amazon.com/Optimal-Control-Theory-Donald-Kirk/dp/0486434842/</p>
</li>
<li>
<p>Spall, Introduction to Stochastic Search and
Optimization: http://www.amazon.com/Introduction-Stochastic-Search-Optimization-James/dp/0471330523/</p>
</li>
<li>
<p>Lasdon, Optimization Theory for Large Systems: http://www.amazon.com/Optimization-Theory-Large-Systems-Lasdon/dp/0486419991</p>
</li>
<li>
<p>Deb, Multi-Objective Optimization Using Evolutionary Algorithms: http://www.amazon.com/Multi-Objective-Optimization-Using-Evolutionary-Algorithms/dp/047187339X</p>
</li>
<li>
<p>Minoux, Mathematical Programming: http://www.amazon.com/Mathematical-Programming-Wiley-Interscience-mathematics-optimization/dp/0471901709</p>
</li>
<li>
<p>Camacho & Bordons, Model Predictive Control: http://www.amazon.com/Predictive-Control-Advanced-Textbooks-Processing/dp/1852336943</p>
</li>
<li>
<p>Hillier, Introduction to Operations Research: http://www.amazon.com/Introduction-Operations-Research-Student-Access/dp/0077298349/</p>
</li>
<li>
<p>Puterman, Markov Decision Processes: http://www.amazon.com/Markov-Decision-Processes-Programming-Probability/dp/0471727822</p>
</li>
<li>Powell, Approximate Dynamic Programming: http://www.amazon.com/Approximate-Dynamic-Programming-Dimensionality-Probability/dp/0470171553/</li>
</ul>
<h1 id="other">Other</h1>
<ul>
<li>
<p>Grešovnik, Optimization Links: http://www2.arnes.si/~ljc3m2/igor/links.html</p>
</li>
<li>
<p>Arsham, Intro to Modeling and Optimization: http://home.ubalt.edu/ntsbarsh/opre640a/partviii.htm</p>
</li>
<li>
<p>Matlab Optimization Toolbox resources: http://www.mathworks.com/help/toolbox/optim/</p>
</li>
<li>
<p>Bennett et al., The Interplay of Optimization and Machine Learning
Research: http://jmlr.csail.mit.edu/papers/volume7/MLOPT-intro06a/MLOPT-intro06a.pdf</p>
</li>
<li>
<p>Evolutionary algorithms chapter in Jason Brownlee’s book: http://www.cleveralgorithms.com/nature-inspired/evolution.html</p>
</li>
<li>
<p>Brent, Algorithms for Minimization without Derivatives: http://maths-people.anu.edu.au/~brent/pub/pub011.html</p>
</li>
</ul>
Humberto STEIN SHIROMOTO
Optimizing Marketing Campaigns Part 1: Clustering
2021-12-15T00:00:00+11:00
https://hsteinshiromoto.github.io/posts/2021/12/15/blog-post_optimizing_marketing_campaigns_part_1
<p>In this series of posts, we analyze how to maximize the profit of marketing campaigns using mathematical optimization techniques. In the first part, we optimize the profit of a campaign over clusters of customers. To do this, we model the profit and cost of the campaigns for two products. The constraints on the maximum number of offers, the budget, and the return on investment are also modelled and taken into account when maximizing the profit.</p>
<ul>
<li><a href="#1-modelling-the-profit">1. Modelling the Profit</a></li>
<li><a href="#2-modelling-the-constraints">2. Modelling the Constraints</a>
<ul>
<li><a href="#21-maximum-number-of-offers-for-each-cluster">2.1. Maximum Number of Offers for each Cluster</a></li>
<li><a href="#22-maximum-budget">2.2. Maximum Budget</a></li>
<li><a href="#23-minimum-number-of-offers-of-each-product">2.3. Minimum Number of Offers of each Product</a></li>
<li><a href="#24-minimum-roi">2.4. Minimum ROI</a></li>
<li><a href="#25-recap-of-the-optimization-model">2.5. Recap of the Optimization Model</a></li>
</ul>
</li>
<li><a href="#3-data">3. Data</a></li>
<li><a href="#4-python-implementation">4. Python Implementation</a></li>
<li><a href="#5-in-the-next-post">5. In the Next Post</a></li>
<li><a href="#6-references">6. References</a></li>
</ul>
<p>The estimated individual expected profit can be determined with machine learning models. For example, a response model such as an uplift model can be used to estimate it.</p>
<p>The key idea is to cluster the estimated individual expected profits and then consider the cluster centroids as representative of the data for all the individual customers within a single cluster. This aggregation enables the problem to be formulated as a linear programming problem so that rather than assigning offers to individual customers, the model identifies proportions within each cluster for each product offer that maximizes the marketing campaign return on investment while considering the business constraints.</p>
<p>From the technical viewpoint, the model is formulated as a <a href="https://hsteinshiromoto.github.io/posts/2021/12/08/blog-post_mixed_integer_programming">mixed-integer linear programming</a> problem.</p>
<h1 id="1-modelling-the-profit">1. Modelling the Profit</h1>
<p>The objective is to maximize the total expected profit from the marketing campaign while heavily penalizing any correction to the budget. Let $K$ be the set of clusters and $J$ the set of products. The profit function is defined as</p>
<p>\(\max_{y,z} \sum_{k \in K} \sum_{j \in J} \pi_{k,j} \cdot y_{k,j} - M \cdot z\;
\tag{Profit}\)
where</p>
<ul>
<li>
<p>$\pi_{k,j}$: is the expected profit to the bank from the offer of product $j \in J$ to an average customer of cluster $k \in K$.</p>
</li>
<li>
<p>$y_{k,j} \geq 0$: is the number of customers in cluster $k \in K$ that are offered product $j \in J$.</p>
</li>
<li>
<p>$M$: Big M penalty. This penalty is associated with corrections on the budget that are necessary to satisfy other business constraints.</p>
</li>
<li>
<p>$z \geq 0$: Increase in budget in order to have a feasible campaign.</p>
</li>
</ul>
<h1 id="2-modelling-the-constraints">2. Modelling the Constraints</h1>
<h2 id="21-maximum-number-of-offers-for-each-cluster">2.1. Maximum Number of Offers for each Cluster</h2>
<p>The number of offers made to each cluster, summed over all products, is limited by the number of customers in that cluster.</p>
<p>\(\sum_{j \in J} y_{k,j} \leq N_{k} \quad \forall k \in K\;,
\tag{Max Number of Offers}\)
where $N_k$ is the number of customers in cluster $k \in K$.</p>
<h2 id="22-maximum-budget">2.2. Maximum Budget</h2>
<p>The budget constraint enforces that the total cost of the campaign does not exceed the campaign budget. The budget may be increased by $z$ to keep the model feasible; for example, meeting the minimum number of offers for every product may require such an increase.</p>
<p>\(\sum_{k \in K} \sum_{j \in J} \nu_{k,j} \cdot y_{k,j} \leq B + z\;,
\tag{Max Budget}\)
where $\nu_{k,j}$ is the average variable cost associated with the offer of product $j \in J$ to an average customer of cluster $k \in K$ and $B$ is the marketing campaign budget.</p>
<h2 id="23-minimum-number-of-offers-of-each-product">2.3. Minimum Number of Offers of each Product</h2>
<p>Each product must be offered at least a minimum number of times across all clusters.</p>
<p>\(\sum_{k \in K} y_{k,j} \geq Q_{j} \quad \forall j \in J\;,
\tag{Min Number of Offers}\)
where $Q_j$ is the minimum number of offers of product $j \in J$ to be made.</p>
<h2 id="24-minimum-roi">2.4. Minimum ROI</h2>
<p>The minimum ROI constraint ensures that the ratio of total profits over cost is at least one plus the corporate hurdle rate.</p>
<p>\(\sum_{k \in K} \sum_{j \in J} \pi_{k,j} \cdot y_{k,j} \geq (1+R) \cdot \sum_{k \in K} \sum_{j \in J} \nu_{k,j} \cdot y_{k,j}\;,
\tag{Minimum ROI}\)
where $R$ is the corporate hurdle rate. This hurdle rate is used for the ROI calculation of the marketing campaign.</p>
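<p>As a short derivation, and assuming the total campaign cost is strictly positive, the linear form above follows from the ratio form of the requirement:</p>
\[\frac{\sum_{k \in K} \sum_{j \in J} \pi_{k,j} \cdot y_{k,j}}{\sum_{k \in K} \sum_{j \in J} \nu_{k,j} \cdot y_{k,j}} \geq 1+R
\quad\Longleftrightarrow\quad
\sum_{k \in K} \sum_{j \in J} \pi_{k,j} \cdot y_{k,j} \geq (1+R) \cdot \sum_{k \in K} \sum_{j \in J} \nu_{k,j} \cdot y_{k,j}\]
<p>Multiplying both sides by the positive total cost removes the fraction, which keeps the constraint linear in $y$.</p>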
<h2 id="25-recap-of-the-optimization-model">2.5. Recap of the Optimization Model</h2>
<p>The optimization model is formulated as a mixed-integer linear programming problem. The objective function is defined as the maximum expected profit from the marketing campaign. The constraints are defined as the maximum number of offers of each product for each cluster, the budget, the minimum ROI, and the minimum number of offers of each product.</p>
\[\begin{array}{rlr}
\max_{y,z}&\sum_{k \in K} \sum_{j \in J} \pi_{k,j} \cdot y_{k,j} - M \cdot z&\text{(Profit)}\\
\text{s.t.}&\sum_{j \in J} y_{k,j} \leq N_{k} \quad \forall k \in K&\text{(Max Number of Offers)}\\
&\sum_{k \in K} \sum_{j \in J} \nu_{k,j} \cdot y_{k,j} \leq B + z&\text{(Max Budget)}\\
&\sum_{k \in K} y_{k,j} \geq Q_{j} \quad \forall j \in J&\text{(Min Number of Offers)}\\
&\sum_{k \in K} \sum_{j \in J} \pi_{k,j} \cdot y_{k,j} \geq (1+R) \cdot \sum_{k \in K} \sum_{j \in J} \nu_{k,j} \cdot y_{k,j}&\text{(Minimum ROI)}
\end{array}\]
<h1 id="3-data">3. Data</h1>
<p>We consider two products, ten customers, and two clusters of customers. The corporate hurdle-rate is twenty percent.</p>
<p>The following table defines the expected profit of an average customer in each cluster when offered a product.</p>
<table>
<thead>
<tr>
<th><i></i></th>
<th>Product 1</th>
<th>Product 2</th>
</tr>
</thead>
<tbody>
<tr>
<td>cluster 1</td>
<td>$2 000</td>
<td>$1 000</td>
</tr>
<tr>
<td>cluster 2</td>
<td>$3 000</td>
<td>$2 000</td>
</tr>
</tbody>
</table>
<p>The expected cost of offering a product to an average customer in a cluster is determined by the following table.</p>
<table>
<thead>
<tr>
<th><i></i></th>
<th>Product 1</th>
<th>Product 2</th>
</tr>
</thead>
<tbody>
<tr>
<td>cluster 1</td>
<td>$200</td>
<td>$100</td>
</tr>
<tr>
<td>cluster 2</td>
<td>$300</td>
<td>$200</td>
</tr>
</tbody>
</table>
<p>The budget available for the marketing campaign is $200.</p>
<p>The number of customers in each cluster is given by the following table.</p>
<table>
<thead>
<tr>
<th><i></i></th>
<th>Num. Customers</th>
</tr>
</thead>
<tbody>
<tr>
<td>cluster 1</td>
<td>5</td>
</tr>
<tr>
<td>cluster 2</td>
<td>5</td>
</tr>
</tbody>
</table>
<p>The minimum number of offers of each product is provided in the following table.</p>
<table>
<thead>
<tr>
<th><i></i></th>
<th>Min Offers</th>
</tr>
</thead>
<tbody>
<tr>
<td>product 1</td>
<td>2</td>
</tr>
<tr>
<td>product 2</td>
<td>2</td>
</tr>
</tbody>
</table>
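<p>With data this small, the model can be checked by brute force before turning to a solver. The sketch below is not the notebook implementation: it assumes integer offer counts, and the Big M value of 10 000 is an arbitrary choice made for this example, since no value is stated above. All variable names are hypothetical.</p>

```python
from itertools import product

clusters, products = (1, 2), (1, 2)
profit = {(1, 1): 2000, (1, 2): 1000, (2, 1): 3000, (2, 2): 2000}  # pi[k, j]
cost = {(1, 1): 200, (1, 2): 100, (2, 1): 300, (2, 2): 200}        # nu[k, j]
customers = {1: 5, 2: 5}    # N[k]
min_offers = {1: 2, 2: 2}   # Q[j]
budget, hurdle = 200, 0.20  # B, R
BIG_M = 10_000              # assumption: penalty value chosen for this sketch

pairs = [(k, j) for k in clusters for j in products]
best = None
# Enumerate every integer assignment with 0..5 offers per (cluster, product).
for values in product(range(6), repeat=len(pairs)):
    y = dict(zip(pairs, values))
    # (Max Number of Offers) per cluster
    if any(sum(y[k, j] for j in products) > customers[k] for k in clusters):
        continue
    # (Min Number of Offers) per product
    if any(sum(y[k, j] for k in clusters) < min_offers[j] for j in products):
        continue
    total_profit = sum(profit[p] * y[p] for p in pairs)
    total_cost = sum(cost[p] * y[p] for p in pairs)
    # (Minimum ROI)
    if total_profit < (1 + hurdle) * total_cost:
        continue
    # (Max Budget): smallest budget increase z that makes y affordable
    z = max(0, total_cost - budget)
    value = total_profit - BIG_M * z
    if best is None or value > best[0]:
        best = (value, y, z)

best_value, best_y, best_z = best
print(best_y, best_z, best_value)
```

<p>Under these assumptions the cheapest way to meet the minimum offers costs $600 (two offers of each product to cluster 1), so the budget of $200 has to be increased by $z = 400$. This shows why the correction variable $z$ is needed for the model to be feasible with this data.</p>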
<h1 id="4-python-implementation">4. Python Implementation</h1>
<div class="fluidMedia" style="height: 100vh;">
<iframe src="https://nbviewer.org/github/hsteinshiromoto/blog/blob/master/posts/2021-12-31-blog-post_optimizing_marketing_campaigns_part_1/2021-12-31-blog-post_optimizing_marketing_campaigns_part_1.ipynb" style="height: 100%; width: 100%;" frameborder="0" id="iframe"> </iframe>
</div>
<h1 id="5-in-the-next-post">5. In the Next Post</h1>
<p>We will learn how to optimize the campaigns at an individual customer level.</p>
<h1 id="6-references">6. References</h1>
<ul>
<li>
<p>https://gurobi.github.io/modeling-examples/marketing_campaign_optimization/marketing_campaign_optimization.html</p>
</li>
<li>
<p>M.-D. Cohen, “<em>Exploiting response models—optimizing cross-sell and up-sell opportunities in banking</em>”, Information Systems 29 (2004) 327–341</p>
</li>
</ul>
Humberto STEIN SHIROMOTO
Mixed Integer Programming
2021-12-08T00:00:00+11:00
https://hsteinshiromoto.github.io/posts/2021/12/08/blog-post_mixed_integer_programming
<p>Mixed Integer Programming (MIP) is a form of optimization whose formulation combines continuous and discrete decision variables.</p>
<p>MIPs typically appear when one or more decision variables are boolean, i.e., take the value 0 or 1. This type of optimization problem is formulated as follows: find $x\in\mathbb{R}^n$ such that</p>
\[\begin{array}{rll}
\min& x^T Q x + q^Tx\\
\text{subject to}& l \leq x \leq u & (\text{bound constraints})\\
&x^T Q x + q^T x \leq b & (\text{quadratic constraints})\\
&\exists i\in[1,n]\subset\mathbb{N}\text{ such that } x_i\in\mathbb{Z} &(\text{integrality constraints}),
\end{array}\]
<p>where $x=(x_1,\ldots,x_n)$ is the vector of decision variables, $Q\in\mathbb{R}^{n\times n}$ is the matrix of coefficients of the quadratic part of the objective function, $q\in\mathbb{R}^n$ is the vector of coefficients of the linear part of the objective function, $l\in\mathbb{R}^n$ is the vector of lower bounds, $u\in\mathbb{R}^n$ is the vector of upper bounds, and $b\in\mathbb{R}$ is the right-hand side of the quadratic constraint.</p>
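<p>To make the formulation concrete, here is a tiny hand-made instance solved without a solver. It is only a sketch under stated assumptions: $Q$ is the identity, so the objective is separable; $x_1$ is continuous and $x_2$ is an integer; the quadratic constraint is left out; and the data values are invented for this example.</p>

```python
# Hypothetical data: objective x1^2 + x2^2 + q1*x1 + q2*x2 (Q is the identity).
q = (-2.4, -5.2)
lower = (0.0, 0)   # l: lower bounds for (x1, x2)
upper = (3.0, 3)   # u: upper bounds for (x1, x2)


def objective(x1, x2):
    return x1**2 + x2**2 + q[0] * x1 + q[1] * x2


def best_continuous_x1():
    # The objective is separable and convex in x1, so the best x1 is the
    # unconstrained minimiser -q1/2 clipped to the bound constraints.
    return min(max(-q[0] / 2, lower[0]), upper[0])


# Enumerate the integer variable and solve the continuous part in closed form.
candidates = [(best_continuous_x1(), x2) for x2 in range(lower[1], upper[1] + 1)]
x_opt = min(candidates, key=lambda x: objective(*x))

print(x_opt, objective(*x_opt))
```

<p>Enumerating the integer variables and optimising the continuous ones for each choice only scales to very small problems; real solvers rely on techniques such as branch and bound instead.</p>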
<p>An example of the implementation of the above formulation is shown in the notebook below.</p>
<script src="https://gist.github.com/20e5d8fa03362a6e12dd5d8cdc4165df.js"> </script>
Humberto STEIN SHIROMOTO