Team:UESTC-China/Model4

description

Problem Restatement

For the marketization, the deinkers need to be deployed in the city. However, where should our deinkers be deployed and how many units should be deployed in one place is the key point, which can help reduce plenty of waste. That's something we really need to think about. So we did mathematical modeling for this problem. Combined with our survey data in the city, we hope to get a more reasonable decision-making model of deinkers in the city.

Data Requirements and Acquisition

Since deinkers need to be applied to cities in the future, the data we use for simulation must be close to reality. In this way, our model can be applied quickly and the number of parameters that need to be adjusted can greatly reduce if we deploy it in cities in the future. Therefore, our first requirement for data is the authenticity, accuracy and practicability.

Secondly, the object of our arrangement is the city. For any city, there are many different types of buildings, such as hospitals, schools, factories and so on. In these buildings people do not have exactly the same occupation. Therefore, for the different groups, their willingness to use our deinkers is not the same. For example, compared with the staff who sits in the office all day doing scientific research and the workers who work on the assembly line, the former's willingness to use the deinkers is obviously stronger than the latter. With this in mind, the data we capture must cover as many occupation objects and different types of buildings as possible. Combined with the willingness of people of different professions to use our deinkers, our model can be applied to a wider range of places, and will provide certain help for the model migration in the future.

According to the above requirements and considering the authoritative data sources certified by the authorities, we used the API of Baidu Map to obtain the relevant data map of a certain city in China, including the distribution map of 17 kinds of POI. The data is detailed and realistic, which perfectly meets our needs. In terms of being realistic, the model completes the first step [1]. [17 kinds of POI: Food, hotels, shopping, life services, beauty, tourist attractions, leisure and entertainment, sports and fitness, Education and Training, culture and media, Medical care, car services, transportation facilities, real estate, companies, government agencies, natural features.]

Figure1. Distribution map of 17 kinds of POI in cities

Considering the large amount of data in the urban scatter map containing 17 kinds of POI, we only selected some representative areas for trial site decision in the later stage. Due to the extensibility and portability of the model, part represents the whole in consideration of the computing cost and time cost under big data.

Algorithm Description

After we get the data, our first task is to classify the areas on the map. We divided the original data graph into several rectangular areas of the same size. We listed as many topological structures as possible including 17 kinds of POI according to the actual situation, and then compared the topological structure with all rectangles in the original data graph. In fact, this step is to take each topology graph as a regional slide window, similar to the convolution kernel, and conduct convolution operation on the data graph to obtain the graph of similarity. After doing this, each region in the data graph has a similarity matrix compared to the sample topology. At this point, the preprocessing is complete. Below we discuss the distribution of deinkers in different topological structure diagram. In this way, each topology becomes the basis for our calculations. Only linear operation is needed to get the distribution of each rectangular area in the data graph, and after expansion, the distribution of the city can be easily obtained.

Figure2. A distribution map containing 17 kinds of POI, where the direct connection of each POI is omitted for easy observation

Let's describe the pseudo-code of the algorithm we designed and adopted. Our goal is for people in any topology structure to be able to find the nearest deinker within a certain radius. Under the condition of reasonable resource scheduling, the usage frequency of one deinker will not be significantly lower than other deinkers.

I'm going to expand the description algorithm in two parts. In the first part, we optimize the layout algorithm inside the topology structure. In the second part, we will combine the layout results of multiple topology structures to build the whole city's layout network. The theoretical basis for the perfect connection of these two parts is that each topological structure can become the basis of the sampling rectangle [2].

In this part of the algorithm, we mainly solve the problem of placing deinkers in a specific topology. As for the design of related parameters, such as the maximum number of times that the deinker can be used per unit time and the willingness of people of different professions to use the deinker, we complished them by referring to the survey data of human practices. By replacing the relevant parameters, our location decision model can be applied to a large area covering more occupations, or to a specific office building. As a result, the model will be highly extensible.

In this part of the algorithm, we combine many topologies and optimize the parts connected by different topologies. We analyze sites in particular locations individually and find the best matches for them. The special point mentioned here can be a point on the boundary of a block or a location where there are multiple locally optimal points. We optimize for the above specific areas and extend the base topology to the whole area. In this way, each sample rectangle can be represented as a linear combination between different bases, greatly enhancing the mobility of the model. When we consider the application of the model to other cities in China, as long as they have similar geomorphic features and urban building distribution, we can make a suitable layout decision.

At this point, we improve the algorithm to optimize the convergence speed. The major change is that we set the final decision result to an acceptable probabilistic form. This can not only accelerate the convergence speed of the algorithm, but also improve the accuracy [3].

However, we can consider combining algorithms that do not use probability models with algorithms that use probability models to build a simple new computing framework [4].

Next, we carry out theoretical optimization of algorithm iteration. When we find the minimum value of the correlation function, we first consider using the Newton method, and then using the fastest descent when solving the parameters. The purpose of this step is to avoid the starting point is too far away and the algorithm will iterate too slowly. If the steepest descent is adopted, the algorithm can proceed to the optimum or local optimum at a faster speed at the beginning. The iteration expression is as follows.

$$x^{(k+1)}=x^{(k)}-α_kF^{-1}(x^{(k)})g^{(k)}$$

$$α_k=arg\,\min_{α>0}f(x^{(k)}-αF^{-1}(x^{(k)})g^{(k)})$$

However, after further modification, we found that the function F was not positive definite, so it was obvious that the matrix could not be directly inverted. So we need to construct a positive definite function that is homologous to F. We consider using the identity matrix and related parameters for construction.

$$F(x^{(k)})+μ_kI$$

Obviously, the above function must be positive definite. So we're going to use it to replace F and iterate. And we can easily get a new iteration formula.

$$x^{(k+1)}=x^{(k)}-(F(x^{(k)})+μ_kI)^{-1}g^{(k)}$$

However, in practical calculation, we find that sometimes the function is too large to solve the inverse of the Hessian matrix. After all, the algorithm complexity brought by solving the inverse matrix is very high, so we consider using another function to approximate the inverse matrix of the Hessian matrix. On the basis of rank 1 modification, we construct a new matrix H conforming to the conditions.

$$rank(z^{(k)}z^{(k)T})=1$$

So, our new iteration process is as follows:

$$d^{(k)}=-H_kg^{(k)}$$

$$x^{(k+1)}=x^{(k)}+α_kg^{(k)}$$

$$α_k=arg\,\min_{α>0}f(x^{(k)}+αd^{(k)})$$

$$H_{(k+1)}=H_k+\frac {(Δx^{(k)}-H_kΔg^{(k)})(Δx^{(k)}-H_kΔg^{(k)})^T}{Δg^{(k)T}(Δx^{(k)}-H_kΔg^{(k)})}$$

At the same time, aiming at the multi-parameter programming problem, we fixed other elements and optimized them in one direction at a time through coordinate descent. The optimal solution is obtained by distributed iteration. (Here we mainly use the SMO algorithm in SVM)

$$ \begin{aligned} \mathop{min}\limits_{\alpha_i,\alpha_j}\quad &\frac12(\alpha_i^2y_i^2\phi(x_i)^T\phi(x_i)+\alpha_j^2y_j^2\phi(x_j)^T\phi(x_j)\\ &+2\alpha_i\alpha_jy_iy_j\phi(x_i)^T\phi(x_j))-(\alpha_i+\alpha_j)\\\\ s.t.\quad &\alpha_iy_i+\alpha_jy_j=c,\\ &c=-\sum_{k\neq i,j}\alpha_k y_k,\\ &0\leq \alpha_i\leq \xi_i,\\ &0\leq \alpha_j\leq \xi_j \end{aligned} $$

When we completed comparative iteration on the POI distribution with multiple topological structures, we found that there were still some rectangular regions with low probability of similarity to multiple topologies, and some points on the block boundary were still not well solved.

By observing the distribution of these points, we found that most of them are at the junction of urban buildings and natural features (such as mountains, lakes, etc.). We count similar regions within multiple blocks and consider these data separately. Due to the terrain, our idea is to develop a specific deinker layout decision for these specific areas within a certain range. In turn, resources can be saved to avoid waste.

We tried to see if the data could be pieced together before making decisions. In other words, does the data exist in some unique way on the map? Thinking of this, we first decided to use the Gaussian Mixture model to try to fit.

We first assume that there are K Gaussian distributions in the data. In the equation, $ \sum_{k=1}^{K}\pi_k=1 $ and $ 0 \leq \pi_k \leq 1 $.

$$p(x)=\sum_{k=1}^{K}\pi_kp(x|\mu_k,\Sigma_k)$$

We introduce a K dimensional random varibles $ {z}=[z_1,z_2,...,z_k] $. $ z_k\in \{0,1\} $ and $ \sum_{k=1}^{K}z_k=1 $. $ z_k=1 $ express that sample ${x}$ is from part model k. Varible $ {z} $ is called latent varible, and $ ({x},{z}) $ is called complete data. So, we have:

$$p(x|z_k=1)=p(x|\mu_k,\Sigma_k)$$

According to Bayes'theorem, we can easily list equation:

$$ \begin{split} p(z_k=1|x) &= \frac{p(z_k=1)p(x|z_k=1)}{p(x)}\\ &= \frac{\pi_kp(x|\mu_k,\Sigma_k)}{\sum_{j=1}^{K}\pi_jp(x|\mu_j,\Sigma_j)} \end{split} $$

At this time, we write $ \gamma(z_k)=p(z_k=1|x) $, and it is called reponsibility from part model k to $ x $. As for sample set $ X=\{x_1,x_2,...,x_n\} $, the latent varibles of $ x_n $ is $ z_n=[z_{n1},z_{n2},...,z_{nK}], n=1,2,..,K$. The responsiblity from part model k to $ x $ is as follows.

$$\gamma_{nk}=p(z_{nk}=1|x)=\frac{\pi_kp(x_n|\mu_k,\Sigma_k)}{\sum_{j=1}^{K}\pi_jp(x_n|\mu_j,\Sigma_j)}$$

The likilihood function of sample set:

$$p(X|\pi,\mu,\Sigma)=\prod_{n=1}^{N}[\sum_{k=1}^{K}\pi_kp(x_n|\mu_k,\Sigma_k)]$$

Actually, we use MLE to solve the mixed Gaussian distributions. And the problem is transformed into an optimization problem.

We take Lagrange mutilplier method to solve the optimization problem.

$$\mathcal{L}(\pi,\mu,\Sigma)=\ln p(X|\pi,\mu,\Sigma)-\lambda(\sum_{k=1}^{K}\pi_k-1)$$

Then we take partial derivation of $\pi,\mu,\Sigma$ separately. We get the results.

$$ \begin{aligned} \pi_k&= \frac{N_k}{N}\\ \mu_k&= \frac{1}{N_k}\sum_{n=1}^{N}[\gamma(z_{nk})x_n]\\ \Sigma_k&= \frac{1}{N_k}\sum_{n=1}^{N}\gamma(z_{nk})(x_m-\mu_k)(x_m-\mu_k)^T \end{aligned} $$

In the equations, $ N_k=\sum_{n=1}^{N}\gamma(z_{nk}) $.

After completing the theoretical derivation, we extract the scatters to be processed from the original scatters to conduct Mixture Gaussian Clustering. We found that it was obvious that the data could be clustered well, so that we performed a separate point analysis for each category based on the clustering results [5][6].

References

[1] RichieBao. https://github.com/richieBao/python-urbanPlanning, 2020.

[2] Si, Shoukui, and Xijing Sun. "Mathematical modeling algorithm and application." National defense industry press (2015): 425-428.

[3] Qiyuan, Jiang. "Mathematical Experiments and Mathematical Modeling." Mathematics in Practice and Theory 5 (2001): 102-106.

[4] Wei, Lijun, et al. "A simulated annealing algorithm for the capacitated vehicle routing problem with two-dimensional loading constraints." European Journal of Operational Research 265.3 (2018): 843-859.

[5] G. McLachlan and K. Basford, Mixture Models: Interference and Applications to Clustering. New York: Marcel Dekker, 1988.

[6] R. Singh, B. C. Pal and R. A. Jabr, "Statistical Representation of Distribution System Loads Using Gaussian Mixture Model," in IEEE Transactions on Power Systems, vol. 25, no. 1, pp. 29-37, Feb. 2010, doi: 10.1109/TPWRS.2009.2030271.

footer