In [1], we looked at the following problem:
Mathematical Model |
---|
\[ \begin{align} \min& \sum_{i,j} \color{darkblue}a_{i,j} \cdot \color{darkred}x_{i,j} \\ & \sum_j \color{darkblue}a_{i,j}\cdot \color{darkred}x_{i,j} \ge \color{darkblue}r_i && \forall i \\ & \sum_i \color{darkblue}a_{i,j}\cdot \color{darkred}x_{i,j} \ge \color{darkblue}c_j && \forall j \\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align}\] |
This is a network problem, so instead of solving it as a MIP, we can solve it as a pure LP or as a min-cost flow network problem.
Here we try to use some open-source network codes. And thus, we need to formulate the problem as a pure network. This would be relatively straightforward if the algorithm supports lower bounds on the flows. However, most open-source network algorithms don't support these. (I think the C++ library Lemon has a code that supports lower bounds). Here is a workaround when lower bounds are not supported:
Assume we have an edge \(i \rightarrow j\) with bounds \(f_{i,j} \in [\ell,u]\). We can reformulate this as:
- demand in node \(i\) is \(\ell\)
- supply in node \(j\) is \(\ell\)
- capacity of the edge \(i \rightarrow j\) becomes \(u-\ell\).
A streamlined version of this is used in my implementations below.
I will use the following network for our problem:
The orange nodes are supply nodes, while the blue ones are demand nodes.
- The source node src has a supply of \(\sum_{i,j} \color{darkblue}a_{i,j}-\sum_i \color{darkblue}r_i\). We subtract \(\sum_i \color{darkblue}r_i\) to simulate the lower bounds on the flows \(\mathit{src}\rightarrow i_k\). Part of the output of the source node flows to the assignment network; the rest is unused and goes directly to the sink node snk.
- The nodes \(i\) have a supply of \(\color{darkblue}r_i\). This is to make sure each \(i\) receives at least \(\color{darkblue}r_i\).
- The nodes \(j\) have a demand of \(\color{darkblue}c_j\). This is to make sure we obey the lower bound of \(\color{darkblue}c_j\).
- Finally, the node snk has a demand of \(\sum_{i,j} \color{darkblue}a_{i,j}-\sum_j \color{darkblue}c_j\).
- You can verify that the total supply is equal to the total demand.
- The assignment arcs \(i \rightarrow j\) have a cost of 1 and a capacity of 1.
- All other arcs have no cost and have unlimited capacity. I used \(\sum_{i,j} \color{darkblue}a_{i,j}\) to represent "unlimited capacity" where needed.
We get the same objective function value as reported in [1]. The performance is not something to write home about.
Here the network setup is actually more expensive than solving it. One reason for better performance is that the networkx algorithm seems to be written in Python, while ortools uses C++. The underlying algorithms are also different (network simplex vs. scaling push-relabel method).
Formulate a min-cost flow model and solve as LP
Min-Cost Flow Model |
---|
\[ \begin{align} \min& \sum_{E(v,w)} \color{darkblue}c_{v,w} \cdot \color{darkred}f_{v,w} \\ & \sum_{w|E(w,v)} \color{darkred}f_{w,v} + \color{darkblue}{\mathit{supply}}_v = \sum_{w|E(v,w)} \color{darkred}f_{v,w} && \forall v \\ & \color{darkred}f_{v,w} \in [\color{darkblue}\ell_{v,w},\color{darkblue}u_{v,w}] && \forall E(v,w) \end{align}\] |
Here \(v,w\) are nodes (corresponding to constraints in the LP model), and \(E(v,w)\) are the edges. The lower bounds allow us to simply say that the src node has a supply \( \color{darkblue}a_{i,j}\), and the snk node has a demand of the same size. The edges \(\mathit{src}\rightarrow i\) and \(j \rightarrow \mathit{snk}\) have a lower bound of \(\color{darkblue}r_i\) and \(\color{darkblue}c_j\). When we solve this we see:
Optimal solution found
Objective: 504711.000000
Conclusions
References
Appendix: Min-cost flow model networkx implementation
import networkx as nx import numpy as np import pickle #--------------------------------------------------------# import data#-------------------------------------------------------- data = pickle.load(open("c:/tmp/test/arc.pck","rb")) a = data['a'] # 2d numpy array r = data['r'] # list of floats c = data['c'] # list of floats I = range(len(r)) J = range(len(c)) # these are all integers represented as double precision numbers# convert to integers a = a.astype(int) r = [int(r[i]) for i in I] c = [int(c[j]) for j in J] suma = np.sum(a) sumr = sum(r) sumc = sum(c) #--------------------------------------------------------# form network#-------------------------------------------------------- G = nx.DiGraph() ## x(i,j) form the bipartite part#for i in I: G.add_node(f'i{i}',demand=-r[i]) for j in J: G.add_node(f'j{j}',demand=c[j]) for i in I: for j in J: if a[i,j]==1: G.add_edge(f'i{i}',f'j{j}',weight=1,capacity=1) ## add src and snk node# G.add_node('src',demand=sumr-suma) G.add_node('snk',demand=suma-sumc) for i in I: G.add_edge('src',f'i{i}') for j in J: G.add_edge(f'j{j}','snk') G.add_edge('src','snk') #--------------------------------------------------------# solve#-------------------------------------------------------- print(G) flowCost, flowDict = nx.network_simplex(G) print(f'objective:{flowCost}')
Appendix: min-cost flow model ortools implementation
import pickle import numpy as np from ortools.graph.python import min_cost_flow #--------------------------------------------------------# import data#-------------------------------------------------------- data = pickle.load(open("c:/tmp/test/arc.pck","rb")) a = data['a'] # 2d numpy array r = data['r'] # list of floats c = data['c'] # list of floats I = range(len(r)) J = range(len(c)) # these are all integers represented as double precision numbers# convert to integers a = a.astype(int) r = [int(r[i]) for i in I] c = [int(c[j]) for j in J] suma = np.sum(a) # also functions as largest possible capacity sumr = sum(r) sumc = sum(c) #--------------------------------------------------------# form network#-------------------------------------------------------- mcf = min_cost_flow.SimpleMinCostFlow() # numbering scheme for the nodes:# i = 0..len(r)-1# j = len(r)..len(r)+len(c)-1# src = len(r)+len(c)# snk = len(r)+len(c)+1 src = len(r)+len(c) snk = src + 1 lenr = len(r) # assignment part of the graph, i->jfor i in I: for j in J: if a[i,j]==1: mcf.add_arcs_with_capacity_and_unit_cost(i,lenr+j,1,1) # src -> ifor i in I: mcf.add_arcs_with_capacity_and_unit_cost(src,i,suma,0) # j -> snkfor j in J: mcf.add_arcs_with_capacity_and_unit_cost(lenr+j,snk,suma,0) # src -> snk mcf.add_arcs_with_capacity_and_unit_cost(src,snk,suma,0) # suppliesfor i in I: mcf.set_node_supply(i,r[i]) for j in J: mcf.set_node_supply(lenr+j,-c[j]) mcf.set_node_supply(src,suma-sumr) mcf.set_node_supply(snk,sumc-suma) #--------------------------------------------------------# solve#-------------------------------------------------------- print(f'nodes:{mcf.num_nodes()}, arcs:{mcf.num_arcs()}') status=mcf.solve() print(status) print(f'objective:{mcf.optimal_cost()}')
Appendix: min-cost flow GAMS implementation
*----------------------------------------------------------------------------------- * network *-----------------------------------------------------------------------------------
sets v 'nodes' /i1*i20000,j1*j500,src,snk/ e(v,v) 'edges' ;
* e is populated below
*----------------------------------------------------------------------------------- * Original data *-----------------------------------------------------------------------------------
sets i(v) /i1*i20000/ j(v) /j1*j500/ ;
set a(i,j) 'randomly filled with 10% ones'; a(i,j)$(uniform(0,1)<0.1) = yes;
* calculate rows and column sums parameter rowsum(i), colsum(j); rowsum(i) = sum(a(i,j),1); colsum(j) = sum(a(i,j),1);
parameter r(i),c(j); r(i) = ceil(rowsum(i)/2); c(j) = ceil(colsum(j)/2);
*----------------------------------------------------------------------------------- * populate arcs in network *-----------------------------------------------------------------------------------
e(a) = yes; e('src',i) = yes; e(j,'snk') = yes; e('src','snk') = yes;
parameter supply(v) 'supply at nodes (demand is negative supply)'; supply('src') = card(a); supply('snk') = -card(a);
*----------------------------------------------------------------------------------- * min cost flow *-----------------------------------------------------------------------------------
alias (v,w); positive variable f(v,w);
* non-default bounds on arcs f.up(i,j) = 1; f.lo('src',i) = r(i); f.lo(j,'snk') = c(j);
variable z 'objective';
equations obj 'objective' nodebal(v) 'node balance' ;
obj.. z =e= sum(a,f(a)); nodebal(v).. sum(e(w,v),f(e)) + supply(v) =e= sum(e(v,w),f(e));
model m /all/; m.solprint = %solprint.Silent%; option threads = 0; solve m minimizing z using lp; display z.l; |