VIII. Cluster Analysis
Cluster analysis aims to group similar entities into clusters. It classifies individuals into groups as homogeneous as possible based on observed variables.
The cluster analysis was performed for all 127 countries according to their values in LP, PPR and IPR. Additionally, we included illustrative variables, which do not influence the formation of the clusters but help to describe them. These are the variables we used to calculate correlations (chapter VII), and they serve mainly to expose the conditions or features found in the resulting clusters.
To capture the variability in the data, given the large differences among the countries in the IPRI, we used Ward's Method with squared Euclidean distance, which groups countries so as to minimize the loss of inertia at each merge.
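The merge rule behind Ward's Method can be sketched in a few lines. This is only a minimal, dependency-free illustration, not the SPAD procedure used for the report: the function names (`ward_cost`, `ward_clusters`) and the nine score triples are made up for the example.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))

def ward_cost(a, b):
    """Increase in within-cluster inertia if clusters a and b are merged.

    Squared Euclidean distance between the two centroids, weighted by
    the cluster sizes: n_a * n_b / (n_a + n_b) * ||c_a - c_b||^2.
    """
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(points, k):
    """Start from singletons and merge the cheapest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

# Hypothetical (LP, PPR, IPR) scores for nine made-up countries.
scores = [(2.1, 3.0, 2.5), (2.4, 3.2, 2.2), (2.0, 2.8, 2.7),
          (5.0, 5.5, 5.1), (5.2, 5.1, 4.9), (4.8, 5.4, 5.3),
          (8.1, 7.9, 8.3), (7.8, 8.2, 8.0), (8.4, 8.0, 7.7)]
groups = ward_clusters(scores, 3)
```

The naive search over all pairs is cubic in the number of countries, which is acceptable for a sample of this size but would be replaced by a library routine in practice.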
As a first step, a Principal Component Analysis (PCA) was applied in order to work with factors rather than the original variables, given the high correlation among them. The PCA results show that the three components of the IPRI (LP, PPR, IPR) define a single dimension, which we called IPRI and which accounts for 85.90% of the inertia. The second and third factors, with 9.64% and 4.46% of the inertia respectively, account for the remainder. The countries that shape these factors do not contribute to the inertia of the first factor and generally lie very close to its origin. They can be subdivided into a group more closely associated with the PPR dimension, which defines the second factor, and a group more closely associated with LP and IPR, which define the third factor.
Next, we used the mobile centers algorithm to obtain the inertia within groups and the criteria for deciding the optimal number of classes, or clusters (see Table 15).
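To illustrate what "share of inertia" means here, the sketch below computes the share captured by the first factor of a normalized PCA on three strongly correlated variables. It is an illustration under stated assumptions: the data are synthetic (a common latent score plus noise), the helper names are hypothetical, and power iteration stands in for the full eigendecomposition a statistical package would perform.

```python
import math
import random

def standardize(rows):
    """Center each column and scale it to unit variance (population SD)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]

def first_factor_share(rows, iters=200):
    """Largest eigenvalue of the correlation matrix divided by its trace."""
    corr = correlation_matrix(standardize(rows))
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):  # power iteration towards the leading eigenvector
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * corr[i][j] * v[j] for i in range(p) for j in range(p))
    total_inertia = sum(corr[i][i] for i in range(p))  # equals p when standardized
    return lam / total_inertia

# Synthetic stand-in for 127 countries: one latent score drives all three variables.
random.seed(0)
data = []
for _ in range(127):
    latent = random.gauss(0, 1)
    data.append([latent + random.gauss(0, 0.3) for _ in range(3)])
share = first_factor_share(data)
```

When the variables are this strongly correlated, the first factor absorbs most of the inertia, which is the situation the text reports for LP, PPR and IPR.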
Table 15. Cluster Analysis
The analysis showed that three clusters were sufficient to explain the grouping of countries; more specifically, this is the partition in which the inertia observed within groups does not exceed the inertia between groups. The resulting clusters are shown in Table 16 and illustrated in Figure 26.
Although the first factor contains 85.90% of the inertia, which is enough to illustrate the formation of the clusters, Fig. 26 plots Factors 1 and 2 together with the three cluster centroids (yellow). Cluster 1 (red) comprises countries located on the negative side of the first factor, with low values of LP, PPR and IPR. Cluster 2 (green) includes countries placed very close to the origin, showing average values of LP, PPR and IPR. Cluster 3 (blue) contains countries located on the positive side of the first factor, whose members are linked to high values of LP, PPR and IPR. The second factor is shaped mostly by countries in Cluster 2, whose scores are very close to the average; these include the countries on the border between Cluster 2 and Cluster 1 as well as those on the border between Cluster 2 and Cluster 3. Cluster 1 and Cluster 3 are outright opposites, and their members are not directly associated with each other.
It is important to emphasize that, comparing this year’s clusters with those of the previous edition (IPRI 2016), we find a significant shift of most countries to an improved position (see also Fig. 16). The cluster centroids were therefore expected to move to the right, as has occurred in this edition of the IPRI. This explains why some countries that were in Cluster 3 in the 2016 IPRI now appear in Cluster 2 despite similar or even improved scores: their improvement was smaller than the cluster average. Clear examples are Chile, Czech Rep., Malta, Portugal and South Africa, which belonged to Cluster 3 last year and belong to Cluster 2 this year, even though all of them improved their IPRI scores.
Besides the clusters, Figure 26 also shows each country's contribution to the inertia gathered by the factors: the bigger the dot representing a country, the higher its contribution. Countries plotted close together are similar, and they differ more as the distance between them increases.
The central circle contains those countries with no statistically significant contribution to the definition of the factors; as already mentioned, they are close to the average and are mostly members of Cluster 2. In addition, the arrows represent the three dimensions of the IPRI, and their direction indicates the relationship with the individuals: the more closely a country lies in the direction of a vector, the closer its relationship with that dimension; as a country's direction diverges from the vector, the relationship weakens, to the point of being contrary to it. This can be exemplified by Brunei Darussalam, which lies directly opposite the PPR vector, consistent with its low score in this sub-index.
Subsequently, cluster composition by income, population, participation in economic and regional integration agreements, and regional and development criteria is shown in Fig. 27a-27d, where font size represents the frequency of each grouping in the cluster.
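The comparison of within-group and between-group inertia can be made concrete with a small sketch. It assumes that "mobile centers" refers to the usual moving-centers (k-means-type) iteration; the factor coordinates, the starting centers and the function names are hypothetical and are not taken from Table 15.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))

def moving_centers(points, centers, max_iter=100):
    """Assign each point to its nearest center, recompute centers, repeat."""
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:  # converged: centers are the group means
            break
        centers = new_centers
    return labels, centers

def inertia_split(points, labels, centers):
    """Within-group, between-group and total inertia (unit weights)."""
    g = mean(points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(labels.count(c) * sq_dist(centers[c], g)
                  for c in range(len(centers)))
    total = sum(sq_dist(p, g) for p in points)
    return within, between, total

# Hypothetical (Factor 1, Factor 2) coordinates for nine made-up countries.
points = [(-3.1, 0.2), (-2.8, -0.4), (-3.4, 0.1),
          (0.1, 0.5), (-0.2, -0.3), (0.3, 0.0),
          (2.9, 0.3), (3.2, -0.2), (3.0, 0.1)]
labels, centers = moving_centers(points, [points[0], points[3], points[6]])
within, between, total = inertia_split(points, labels, centers)
```

Once the iteration has converged, within-group and between-group inertia add up to the total inertia, so a partition is judged by how small the within-group part is relative to the between-group part.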
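For reference, the contribution of an individual to a factor is commonly defined as follows in this kind of factor analysis. The notation is an assumption introduced here, not taken from the report: $F_k(i)$ is the coordinate of country $i$ on factor $k$, $\lambda_k$ is the inertia (eigenvalue) of that factor, and $m_i$ is the weight of the country, taken as $1/127$ if all countries are weighted equally.

```latex
\mathrm{CTR}_k(i) = \frac{m_i \, F_k(i)^2}{\lambda_k},
\qquad
\sum_{i=1}^{127} \mathrm{CTR}_k(i) = 1 .
```

Under this definition, countries far from the origin along a factor contribute most to it, while countries near the origin contribute almost nothing.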
The analysis of each cluster can describe the internal characteristics of the countries within it. In this regard Table 16 exhibits the features that are statistically significant in each group. Additional statistics are shown in Table 17 and Appendix IV.
Figure 26. Clusters’ Members and Centroids. Factor 1 and Factor 2.
Figure 27a. Cluster Composition by Income Classification
Figure 27b. Cluster Composition by Regional and Development Criteria
Figure 27c. Cluster Composition and Population weight (thousands)
Figure 27d. Cluster Composition by Economic and Regional Integration Agreements
Table 16. Cluster Statistics
Table 17. Illustrative Variables. Averages by Clusters
Table 18. Regional Integration Agreements and Cluster
VIII.1. Cluster Description
Cluster 1 is composed of 59 countries with a population of more than 1.9 billion people. The country closest to its centroid is Algeria, followed by Egypt, Macedonia, Kazakhstan and Argentina. Cyprus is by far the most remote country of the Cluster, followed by Yemen, Brunei Darussalam, Bangladesh, Moldova and Venezuela.
A close look at Cluster 1 and the country coordinates reveals that Tunisia and Tanzania are the closest to the Cluster 2 centroid. Considering Cluster 1 and Cluster 2 together, the closest pair of countries is Tunisia (Cluster 1) and Mexico (Cluster 2), which indicates similar conditions (see Fig. 26).
For countries in Cluster 1, the LP, PPR and IPR components are statistically significant, with low scores in each. The same is true for the Gender component and the IPRI-GE. Cluster 1 countries also show low levels in all the dimensions we analyzed; that is, they perform poorly in Economic outcomes, Human Capabilities, Social Capital, Research and Innovation, Ecological Performance and Liberties. We may hypothesize that this results from a lack of policies to improve key elements such as entrepreneurship, social opportunities, levels of liberty, social capital, or research and development.
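Statements such as "closest to the centroid" and "closest pair between two clusters" reduce to simple distance comparisons on the factor coordinates. The sketch below shows one way to compute them; the country labels and coordinates are placeholders and do not reproduce the values behind Fig. 26.

```python
import math

def centroid(coords):
    """Component-wise mean of the coordinate tuples in a cluster."""
    pts = list(coords.values())
    return tuple(sum(p[j] for p in pts) / len(pts) for j in range(len(pts[0])))

def rank_by_distance(coords, center):
    """Member names ordered from nearest to farthest from the given center."""
    return sorted(coords, key=lambda name: math.dist(coords[name], center))

def closest_pair(cluster_a, cluster_b):
    """The pair (one member from each cluster) with the smallest distance."""
    return min(((a, b) for a in cluster_a for b in cluster_b),
               key=lambda ab: math.dist(cluster_a[ab[0]], cluster_b[ab[1]]))

# Placeholder (Factor 1, Factor 2) coordinates for two made-up clusters.
cluster_1 = {"Country A": (-2.0, 0.1), "Country B": (-2.5, -0.3),
             "Country C": (-1.2, 0.4), "Country D": (-4.0, 1.5)}
cluster_2 = {"Country E": (-0.6, 0.2), "Country F": (0.3, -0.1)}

ranking = rank_by_distance(cluster_1, centroid(cluster_1))
nearest_to_other = rank_by_distance(cluster_1, centroid(cluster_2))[0]
pair = closest_pair(cluster_1, cluster_2)
```

The first name in `ranking` is the member most typical of its cluster and the last is the most remote; `pair` identifies the two countries sitting on the boundary between the clusters.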
Under the regional and development classifications of the IMF and the income groupings of the World Bank, the Sub-Saharan Africa group and the Upper-Middle-Income, Lower-Middle-Income and Low-Income groups are highly represented in this cluster.
The Southern African Development Community (7/10 members) and the Economic Community of West African States (7/8 members) have most of their members in this cluster, followed by the Organization of the Petroleum Exporting Countries (6/10 members) and the Commonwealth of Independent States (all members).
Cluster 2 is composed of 43 countries with a population of more than 4.1 billion people. The country closest to its centroid is Jamaica, followed by Poland, Morocco, Saudi Arabia and China. South Africa is the farthest country from the centroid, followed by Israel, Guatemala, Indonesia and Greece. It is important to note that the most populous countries in the world, China and India, are included in this cluster, both very close to its centroid. Figure 26 also shows that Brazil is the Cluster 2 country closest to the centroid of Cluster 1, while those closest to Cluster 3 are Israel, Chile, Malta and Czech Republic. Chile (Cluster 2) and Estonia (Cluster 3) are the closest pair of countries across these two clusters.
As the centroid of Cluster 2 lies very near the origin of the factor axes (its distance from the origin along the first factor is 0.38237), most variables yield non-significant results, because most values are very close to the average.
Under the regional and development criteria of the IMF, Latin America and the Caribbean, and Advanced economies are highly represented in this cluster; whereas by the income criteria of the World Bank, the High-Income and Upper-Middle-Income countries exhibit the highest frequency in the cluster. From the perspective of economic and regional integration agreements, the OECD (13/35 members) and the European Union (11/28 members) have the highest frequency in Cluster 2. With a lower frequency we find the countries of the Pacific Alliance (all members).
Cluster 3 is composed of 25 countries with a total population of more than 848 million people. The country closest to its centroid is Austria, followed by Australia, Canada, United Kingdom and the Netherlands. The farthest country from the centroid is Taiwan, followed by Qatar, France, Estonia and the United Arab Emirates. Estonia is the closest country to Cluster 2.
Compared to Cluster 1, countries belonging to Cluster 3 exhibit opposite results: all the variables are significant, but with positive and high values, showing good performances in Economic outcomes, Human Capabilities, Liberties, Social Capital, Research and Innovation, and Ecological performance, with positive results in human development, liberties and opportunities for their citizens.
Using the regional and development criteria of the IMF, the Advanced Economies group is highly represented in this cluster. By the income criteria of the World Bank, the High-Income group is the only one represented in this cluster. Looking at economic and regional integration agreements, the OECD (20/35 members) and the European Union (12/28 members) are highly represented in Cluster 3, followed by the Trans-Pacific Partnership (6/12 members).
Regarding economic and regional integration agreements, the following should be noted: of the 127 countries included in the IPRI-2017 selection, 13 do not belong to any of the agreements chosen, 58 belong to only one agreement, 50 are members of two, 5 are members of three, and one is part of four. There is also great disparity in the number of countries that are part of each agreement: some have many members (the OECD has 35 and the EU has 28), others just a few.
The Organization for Economic Co-operation and Development, European Union, Association of Southeast Asian Nations, Organization of the Petroleum Exporting Countries and the Trans-Pacific Partnership have members in all three clusters. The members of the Central African Economic and Monetary Community, Pacific Alliance, Commonwealth of Independent States, Caribbean Community and European Free Trade Association belong to only one cluster. The rest of the agreements have members in two clusters, in different proportions.
The data suggest that most of the chosen integration agreements show some level of heterogeneity in the strength of the property right systems among their members. In the presence of homogeneity it would be easier for an integration agreement to promote common policies to enhance the strength of property rights. Heterogeneity could also be seen as an advantage, as policies could be targeted to specific members of the agreement.
On the other hand, the integration agreements showing members in just one cluster reveal homogeneity amongst their countries’ property right systems. Even those agreements participating in two clusters show members in cluster boundaries and could be seen as a possible transition from one cluster to the other.
In conclusion of the cluster analysis we find that:
We used the statistical software SPAD®, which allows the inclusion of illustrative variables in the analysis.
Ward’s Method joins cases so as to minimize the variance within each group, creating homogeneous groups. First, it calculates the mean of all variables in each cluster; then it computes the distance between each case and its cluster mean, and these distances are summed. Subsequently, clusters are merged in the way that minimizes the increase in the sum of distances within each cluster.