Explaining the dynamics of relatedness: The role of co‐location and complexity

Agglomeration and Social Networks Lendület Research Group, Centre for Economic and Regional Studies, Hungarian Academy of Sciences, Budapest, Hungary; Laboratory for Networks, Technology and Innovations, Corvinus Institute for Advanced Studies, Corvinus University of Budapest, Budapest, Hungary; University of Stavanger Business School, Stavanger, Norway; CRIE, Center of Regional and Innovation Economics, University of Bremen, Bremen, Germany; Department of Human Geography and Planning, Faculty of Geosciences, Utrecht University, Utrecht, the Netherlands

However, few studies have analysed relatedness as a dependent variable, or examined how relatedness changes and evolves over time. In fact, in most empirical studies, relatedness is treated either as time-invariant or as an exogenously given component. While one might argue that relatedness is rather stable in the short run, this is less likely to be the case in the long run (Balland, Boschma, & Frenken, 2015; Boschma, 2017; Broekel, 2015). Consequently, we still know little about how (technological) relatedness comes into existence and how it develops over time (Menzel, 2015). This gap motivates the present paper, which studies the emergence and evolution of technological relatedness. More precisely, we evaluate how two dimensions, co-location and complexity, relate to the emergence of technological relatedness and its intensification over time. Making use of patent data, our empirical findings suggest that the co-location of technology pairs supported the emergence of relatedness in Europe during the period 1980-2010. Moreover, both co-location and the level of complexity are correlated with the intensification of relatedness over time.
The paper is organized as follows. Section 2 provides a short introduction to the concept of relatedness and its possible interplay with co-location and complexity. Section 3 introduces the data, key variables, and modelling approaches. Section 4 presents and discusses the empirical findings, while Section 5 contains several robustness checks. The paper concludes with a discussion that outlines limitations, implications, and possibilities for future research.

| DYNAMICS OF RELATEDNESS
Relatedness has become a key input to outline possible technological and economic re-combination and diversification opportunities in particular. There is a consensus that the probability of firms, regions and countries entering new and specific activities is a function of the number of related activities they are specialized in (Boschma, Balland, & Kogler, 2015; Hidalgo et al., 2018; Neffke et al., 2011). Similarly, regions and countries with access to related variety tend to outperform economically those that are either highly specialized or overly diversified (Fornahl et al., 2011; Frenken et al., 2007; Hidalgo et al., 2007). Here, relatedness refers to two activities (e.g., technologies, industries, occupations, research fields) being based on similar underlying knowledge, skills or other inputs (Boschma, 2017; Hidalgo et al., 2018). The present paper focuses on technological relatedness, which is defined as the degree to which two technologies are proximate in a technological or cognitive dimension (Breschi, Lissoni, & Malerba, 2003; Kogler, Rigby, & Tucker, 2013; Rigby, 2015).
In the quickly growing literature using (technological) relatedness as a key explanatory variable, most studies put little emphasis on relatedness as an evolving property. In fact, we know surprisingly little about how relatedness comes into existence and how it develops over time (Menzel, 2015). In most studies on (related) diversification, relatedness is treated as time-invariant or exogenously given. For instance, relatedness is frequently added to empirical models as an independent variable that is constant over the observed time period (e.g., Breschi et al., 2003; Frenken et al., 2007; Neffke et al., 2011). This assumption seems reasonable when considering short time periods, but it is much less likely to hold over the long run (Boschma, 2017; Cowan, Jonard, & Zimmermann, 2007), in particular when technological paradigms shift (Dosi, 1982) and thereby reshuffle the technology space (Rigby, 2015). This is exemplified by the rise of the electronics and biotech industries in recent years (Krafft, Quatraro, & Saviotti, 2011). Notably, in a number of studies, relatedness is modelled as time-variant, that is, it may change between time periods (e.g., Boschma et al., 2015; Mewes & Broekel, 2020; Rigby, 2015). However, none of these studies systematically discusses how relatedness emerges and evolves over time. Nevertheless, several ideas hint at the factors and conditions that might be important for the emergence and development of relatedness.
Geographical proximity is frequently highlighted as enhancing knowledge spillovers between economic agents (Audretsch & Feldman, 1996; Jaffe, Trajtenberg, & Henderson, 1993). Geographical co-location of actors facilitates localized learning processes and lays the foundation for technological (re-)combinations. Consequently, co-located actors with expertise in different technologies are more likely to engage in knowledge exchange.
It increases the likelihood that they learn from each other, which may eventually (in some instances) even lead to the (re-)combination of distinct technological knowledge. In the latter case, their geographical proximity may have helped in growing the relatedness of their respective fields of expertise (technologies). These arguments fit well with those of Jacobs (1969), who argues that the diversity of ideas, knowledge, and technologies in cities (i.e., within geographic vicinity) enables and triggers cross-fertilization and knowledge re-combination. Put differently, the co-location of experts with specific knowledge on technologies increases the chances that connections are discovered between previously unrelated technologies and that complementarities are identified, which in turn creates or intensifies relatedness (Boschma, 2017; Broekel & Brachert, 2015; Desrochers & Leppälä, 2011). This is summarized in the following two hypotheses:
Hypothesis 1A: Co-location enhances the emergence of relatedness between technologies.
Hypothesis 1B: Co-location enhances the intensification of relatedness between technologies.
In addition to geographical proximity, we argue that another dimension of relevance in this context is complexity, as it is likely to influence knowledge combinatorial processes. The complexity of a technology is often understood as a function of the number of its (sub-)components and their interdependencies (Fleming & Sorenson, 2001; Simon, 1962). Combining two technologies requires a basic understanding and the mastering of both. Gaining such an understanding is more difficult the more complex technologies are (Balland & Rigby, 2017; Fleming & Sorenson, 2001; Hidalgo & Hausmann, 2009; Sorenson & Fleming, 2004). When combining multiple technologies, these difficulties will scale up with the number of technologies.
It can also be expected that it is more difficult to combine two complex technologies than it is to combine a simple technology with a complex one. One reason is that to combine multiple technologies, mastery in all involved is required, implying that the necessary learning efforts are cumulative. Another reason is that the result of the combinatorial activity, that is, an innovation or new technology, is most probably at least as complex as the most complex technology it is based upon. Consequently, combining two technologies and mastering the result thereof goes beyond the sum of the efforts required to obtain mastery in its constitutive technological elements. Significantly, this will hold even when the combinatorial process is conducted by a team. Members of the team must have a basic understanding of the technologies involved, which, as argued above, is more difficult in the case of more complex technologies. Moreover, complex technologies are made of heterogeneous and relatively unrelated (sub-)components, implying that they are characterized by a greater knowledge diversity (Broekel, 2019). This implies that larger cognitive distances need to be bridged within the team. In turn, it makes co-ordination, communication and learning harder, which translates into greater efforts and higher costs of collaboration (Gross & McMullen, 1982; Lundvall, 1993; von Hippel, 1987). Consequently, due to its "cost-and-efforts" driving nature, the complexity of technologies will have a negative impact on the likelihood and frequency of the technologies being combined.
Yet, this relation might be countered by economic reasons. Technologies have a value that reflects their supply and demand. Technologies that are simple to learn can diffuse easily between economic agents through re-invention, reverse engineering, or copying. Hence, they offer little potential for competitive advantage and consequently have relatively little economic value (Maskell & Malmberg, 1999). In contrast, technologies that are difficult to replicate and imitate, which particularly applies to complex technologies, will not diffuse easily. Complex technologies therefore offer a high potential for competitive advantage and thus a higher level of socio-economic value (Fleming & Sorenson, 2001). In addition, Hidalgo and Hausmann (2009) argue that complex technologies are crucial stepping stones for acquiring additional technological competences in the future. This adds further to their value.
Accordingly, while complex technologies and their combinations are more likely to be associated with higher economic rewards (Dalmazzo, 2002; Hidalgo & Hausmann, 2009), they are also characterized by greater difficulties in being understood and combined. Considering this, we expect complexity to act as an (initial) barrier to technological combinations and thereby to the emergence of relatedness. When two technologies have not previously been combined, the (economic) potential of their combination is unknown, which will lower the actors' willingness and likelihood to explore it. This will change once this barrier has been overcome and the hidden potential becomes clearer. In this case, the potentially higher socio-economic benefits associated with combinations of complex technologies should attract the interest of the actors and motivate them to invest. Consequently, there will be greater exploration of such combinations and quicker convergence of the two technologies in terms of their relatedness.
In sum, complexity represents a dimension of technologies that is likely to impact the likelihood and frequency of their (re-)combination and consequently to shape the evolution of their relatedness. Our second set of hypotheses summarizes this:
Hypothesis 2A: More complex technologies are less likely to become related.
Hypothesis 2B: Once they have become related, the relatedness of two complex technologies is more likely to intensify than that between two simple technologies.
As pointed out above, despite the great interest in relatedness and complexity, little is known about how the two are related and how this relation develops over time. The following empirical investigations seek to shed some light on this for the first time.

| Dataset and relatedness of technology pairs
Following much of the literature, we rely on patent data as our primary data source (Breschi et al., 2003). More precisely, we use the OECD REGPAT dataset (version 2018) covering patents registered by the European Patent Office (EPO). It contains detailed information on patents' application year, technology classes, inventors and inventor locations since 1976. The pros and cons of patent data have been discussed extensively, so we refrain from this and refer to the relevant literature (Desrochers, 1998; Griliches, 1990).
We focus on the period 1976-2010, as this period offers reliable patent information (Figure 1). We split it into seven five-year periods: the first covers 1976-1980, the second 1981-1985, and so on. By pooling the data of multiple years, we increase the stability of our measures, as patent numbers tend to fluctuate strongly between years. This is particularly the case when they are aggregated at the regional or technological level (Buerger, Broekel, & Coad, 2012).
To empirically represent technological relatedness, we first need to define the technologies. We follow existing studies and consider technologies as being represented by four-digit CPC classes (Broekel, 2019).
We exclude technologies starting with the letter Y, indicating cross-technological patents, that is, patents that could not be clearly assigned to other classes (the 'Y' CPC class is only a technical category with significant overlap with other classes). This leaves 645 distinct technologies. Figure 1(b) illustrates that almost all of these technologies contributed patents to the data set since 1980. Accordingly, four-digit CPC classes provide a consistent representation of technologies over this period.
There are many ways in which technological proximity can be estimated. Many studies use the information on technologies jointly appearing on patents, so-called co-occurrences, as an indication of combinatorial innovation processes and technological distance (Breschi et al., 2003). We follow this literature and concentrate on the joint appearance of technologies (four-digit CPC classes) on patents.
The 645 technologies translate into 207,690 potential technological combinations (excluding self-combinations), which serve as units of observation. For each of these pairs, we count the number of occurrences on patent documents within each of the seven time periods with no geographical restriction (including all the patents in OECD REGPAT from all over the world). More precisely, we count how frequently different four-digit CPC classes are combined on patents during the given period, thus abstracting from the same combinations of four-digit classes on the same patent for the purpose of simplification. In general, each patent represents a possibility of two technologies co-occurring in the data. Consequently, the likelihood of two technologies jointly appearing on patents is primarily determined by the size of the two technologies, that is, by the number of patents that are assigned to these technologies (jointly and independently).
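The counting step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `count_cooccurrences`, the input layout (one list of four-digit class codes per patent) and the toy patents are all hypothetical. Each unordered pair is counted once per patent, and classes starting with 'Y' are skipped.

```python
from collections import Counter
from itertools import combinations
from math import comb


def count_cooccurrences(patents):
    """Count, per unordered pair of four-digit classes, the patents listing both.

    `patents` is an iterable with one list of class codes per patent
    (a hypothetical input layout chosen for this sketch).
    """
    counts = Counter()
    for classes in patents:
        # Deduplicate so a pair counts once per patent; skip 'Y' classes.
        distinct = sorted({c for c in classes if not c.startswith("Y")})
        for pair in combinations(distinct, 2):
            counts[pair] += 1
    return counts


# Toy data: hypothetical patents with their four-digit CPC classes.
toy_patents = [
    ["A61K", "C07D", "A61K"],  # a repeated class on one patent counts once
    ["A61K", "C07D"],
    ["H04L", "G06F", "Y02E"],  # the 'Y' class is ignored
    ["G06F"],                  # single-class patents create no pair
]
cooc = count_cooccurrences(toy_patents)

# Number of unordered pairs among 645 technologies, without self-combinations.
n_pairs = comb(645, 2)
```

Running the function separately on the patents of each five-year period would give the period-specific counts used as the raw relatedness signal.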
In the literature, measures of relatedness are often constructed in such a way that they account for this size effect. For instance, some studies employ the Cosine index (Breschi et al., 2003; Ejermo, 2003) or alternative measures of co-occurrence standardization (Kogler et al., 2013; Mewes, 2019). However, specifying the correct standardization/normalization is far from straightforward (van Eck & Waltman, 2009). We are not so much interested in which technologies are significantly related in statistical terms and which are not. We simply seek to approximate the technological distance between two technologies in a continuous manner. Therefore, we account for this size effect in the later regression analysis, following roughly the idea of Neffke (2009) and Neffke et al. (2011).

| Co-location of technologies
As previously discussed, we expect technologies to become more related when they frequently co-locate. Empirically, this corresponds to the overlap of their spatial distributions. We first assign patents to the NUTS 2 regions of their inventors' addresses. Accordingly, patents with multiple inventors are assigned to multiple regions, without fractional counting. Next, we obtain the spatial distribution of a technology's patents by counting the number of its patents assigned to each region. On this basis, we assess the overlap in two technologies' distributions with the technology-pair-level co-agglomeration measure proposed by Ellison, Glaeser, and Kerr (2010). For two technologies $i$ and $j$, the measure looks as follows:

$$\mathrm{COAGGLOM}_{ij} = \frac{\sum_{r=1}^{R} (s_{ir} - x_r)(s_{jr} - x_r)}{1 - \sum_{r=1}^{R} x_r^{2}} \qquad (1)$$

where $P_{ir}$ is the number of patents in technology $i$ in region $r$ and $s_{ir} = P_{ir} / \sum_{r'=1}^{R} P_{ir'}$ is the share of technology $i$ in region $r$ (and for $j$ accordingly). $x_r$ is the mean of these shares in region $r$ across all technologies. The index has been used in multiple studies investigating industrial co-agglomeration (Diodato, Neffke, & O'Clery, 2018; Ellison et al., 2010). Its main advantage is that it is largely invariant to the distribution of inventors, applicants, and the type of spatial units considered (Diodato et al., 2018).
We calculate the co-location of the 207,690 technology pairs in all seven periods. To reduce the noise in the data, we set the patent counts to zero for all regions with fewer than ten patents in a particular technology. Moreover, we exclusively consider the 316 EU and EFTA NUTS 2 regions with sufficiently large patent numbers and the patents associated with these. To make the co-location score comparable across technologies, we calculate its z-score.
Detailed descriptive statistics of the COAGGLOM variable can be found in Table 1.
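The co-location measure can be illustrated with a short NumPy sketch. It assumes the standard form of the Ellison, Glaeser, and Kerr (2010) pairwise index with shares defined as in the text; the function names, the toy count matrix, and the choice to standardize over all unordered pairs of one period are assumptions made for illustration only.

```python
import numpy as np


def coagglomeration(P, min_patents=10):
    """Pairwise co-agglomeration for a (technologies x regions) patent count matrix.

    Counts below `min_patents` are set to zero, mirroring the noise filter
    described in the text.
    """
    P = np.asarray(P, dtype=float).copy()
    P[P < min_patents] = 0.0
    totals = P.sum(axis=1, keepdims=True)
    if np.any(totals == 0):
        raise ValueError("each technology needs at least one region above the threshold")
    s = P / totals            # s[i, r]: share of technology i located in region r
    x = s.mean(axis=0)        # x[r]: mean share of region r across technologies
    dev = s - x               # deviations of each technology from the mean shares
    return (dev @ dev.T) / (1.0 - np.sum(x ** 2))


def pair_zscores(gamma):
    """Standardize the index over all unordered pairs i < j (assumed convention)."""
    upper = np.triu_indices_from(gamma, k=1)
    values = gamma[upper]
    return (values - values.mean()) / values.std()


# Toy example: three hypothetical technologies in three hypothetical regions.
toy_counts = [[50, 10, 0],
              [40, 20, 0],
              [0, 10, 60]]
gamma = coagglomeration(toy_counts)
z = pair_zscores(gamma)
```

In the toy data, the first two technologies concentrate in the same region and obtain a positive score, while the first and third are located apart and obtain a negative one.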

| Technological complexity
Empirically approximating technological complexity is a challenge, as the literature offers few approaches that are applicable to the data at hand. Balland and Rigby (2017) transferred the idea of the economic complexity index of Hidalgo and Hausmann (2009) to patent data and calculated an index of technological complexity. In essence, the method utilizes information on the extent to which technologies are non-ubiquitous, that is, few regions specialize in them, and on how diversified regions are, that is, how many technologies they specialize in. By means of an iterative procedure, the index relates these two aspects to each other and measures the complexity of technologies by their tendency to co-concentrate with other non-ubiquitous technologies.
Earlier, Fleming and Sorenson (2001) employed an N/K model (N: number of components; K: degree of interdependence between components) to derive a technological complexity index based on the frequencies with which patent subclasses co-occur on patents. The underlying argument is that more complex combinations are harder to realize and thereby less likely to be observed. The difficulty of (re-)combining two classes is empirically assessed by evaluating the currently observed co-occurrence of patent subclasses on patents in relation to the cumulative frequency of their co-occurrence in previous years.
While both approaches have their merits, Broekel (2019) argues that some of the core assumptions underlying these measures are problematic. Moreover, he shows that both indices fail to reproduce several stylized facts commonly associated with technological complexity. More importantly, in our analyses both indices carry the danger of being very closely related to key variables in the model. The complexity index of Balland and Rigby (2017) is based on the spatial co-location of technologies, which is at the heart of the measure of co-agglomeration. Fleming and Sorenson's (2001) approach utilizes the frequency of patent class co-occurrences on patents as a central ingredient, which is the basis of our technological relatedness measure. To avoid the potential of (artificial) empirical overlap, we turn to an alternative measure of complexity that does not raise these concerns.
We adopt the structural complexity index developed by Broekel (2019). It measures technological complexity by modeling technologies (four-digit CPC classes) as combinatorial networks of (knowledge) components. Although this index also uses co-occurrence information of patent classes on patents, the networks are dichotomized, which eliminates large portions of the frequency information that is central to our measure of technological distance.
Broekel (2019) argues that the combinatorial networks of complex technologies are characterized by a greater diversity of (sub-)network topologies. Network science shows that a greater diversity implies a larger information content of these networks. Consequently, it can be argued that this demands greater R&D efforts for the invention of such technologies and also represents greater obstacles to them being learned or copied. Both aspects make complex technologies more exclusive and therefore potentially more economically valuable (Broekel, 2019).
To capture this diversity, Broekel (2019) proposes the network diversity score of Emmert-Streib and Dehmer (2012) as an empirical approximation of the structural diversity of technologies, that is, as an empirical measure of technological complexity.
This score is estimated as follows. For each technology $T$ (four-digit CPC class), its combinatorial network is constructed in period $t$ by extracting all patents with at least one patent subclass (ten-digit CPC subclass) belonging to technology $T$. Next, the co-occurrence matrix of all patent subclasses appearing on these patents is created and dichotomized, with all positive entries being one and all others remaining zero. This matrix represents the combinatorial network of technology $T$, in which complexity is assessed by means of the network diversity score (NDS). In the estimation of the NDS, multiple subsamples are drawn from the network's main component. For each subsample $i$, the share of modules ($\alpha_{\text{module}}$), the variability of module sizes ($v_{\text{module}}$), the variability of the Laplacian matrix ($v_{\text{Laplacian}}$), and the relation of graphlets of size three and four ($r_{\text{graphlets}}$) are calculated and the individual NDS score is estimated as:

$$iNDS_{i} = \frac{\alpha_{\text{module}} \cdot r_{\text{graphlets}}}{v_{\text{module}} \cdot v_{\text{Laplacian}}} \qquad (2)$$

The iNDS is subsequently averaged over the sample networks, giving the NDS for this technology's combinatorial network. It is log-transformed and multiplied by −1 to obtain the final complexity measure of structural diversity, which signals higher complexity with larger values. For a more detailed introduction to the measure see Broekel (2019).
In practice, we use the values of structural complexity given in Broekel (2019). These are annual complexity values for each of the 645 four-digit CPC classes. As structural complexity scores are relatively stable over short time spans (Broekel, 2019), we use the complexity values of technologies given at the end of each period.
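The bookkeeping around the NDS can be sketched as follows. This is a simplified illustration, not the implementation of Broekel (2019): it only covers the dichotomization of the subclass co-occurrence matrix and the final aggregation, and it takes the four per-sample ingredients as given inputs rather than computing them from a network. The ratio used in `individual_nds` follows our reading of Emmert-Streib and Dehmer (2012) and should be treated as an assumption; all function names and the toy subclass codes are hypothetical.

```python
import math

import numpy as np


def binary_cooccurrence(patent_subclasses):
    """Dichotomized subclass co-occurrence matrix for one technology.

    `patent_subclasses` holds one list of subclass codes per patent.
    """
    labels = sorted({s for patent in patent_subclasses for s in patent})
    index = {s: k for k, s in enumerate(labels)}
    A = np.zeros((len(labels), len(labels)), dtype=int)
    for patent in patent_subclasses:
        ids = sorted({index[s] for s in patent})
        for a in ids:
            for b in ids:
                if a != b:
                    A[a, b] = 1  # any positive co-occurrence count becomes one
    return labels, A


def individual_nds(alpha_module, v_module, v_laplacian, r_graphlets):
    """Per-sample score from its four ingredients (assumed functional form)."""
    return (alpha_module * r_graphlets) / (v_module * v_laplacian)


def structural_complexity(sample_scores):
    """Average the per-sample scores, log-transform, and multiply by -1."""
    return -math.log(sum(sample_scores) / len(sample_scores))


# Toy example with hypothetical subclass codes on two patents.
toy = [["A61K31/00", "A61K9/00"],
       ["A61K31/00", "A61K9/00", "A61P35/00"]]
labels, A = binary_cooccurrence(toy)
```

Because of the sign flip, a lower averaged NDS translates into a higher complexity value, which matches the interpretation given in the text.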
As discussed in Section 2, we expect complexity to influence the likelihood and the frequency of two technologies being combined, that is, co-occurring on patents. Crucially, this influence may be negative, as it is more difficult to combine complex technologies. However, complexity may also have a positive impact by signalling the potential of greater rewards associated with the combination of two complex technologies. In any case, it is less the individual complexity values of technologies that matter, but rather their joint configuration. We model this with a variable representing the sum of the structural complexity values of two technologies (COMPLEX_SUM). Figure 2(a) gives a first impression of this by visualizing how the complexity values of all potential technology pairs align. The figures suggest that most of the realized combinations are pairs with medium or high complexity values, which supports our theoretical argument that complexity correlates positively with the (economic and technological) potential of combinations of technologies. To control for the possibility that the sum of two technologies' complexity may be relatively high as a result of a combination of a highly complex and a rather simple technology, we also include the absolute difference of two technologies' complexity values (COMPLEX_ABSDIFF) in the models. Descriptive statistics on both complexity variables can be found in Table 1.

| Empirical model
As we are interested in both the emergence and intensification of relatedness between technology pairs, we use two different estimation strategies. First, to observe the relationship between the first appearance of a technological combination and the level of co-location and the level of technological complexity, we rely on logit models similar to the entry models frequently used in economic geography and related fields to study diversification processes (Boschma et al., 2015).
For this exercise we filter the dataset to those technology pairs that are not present in every period and become related at some point during the observed time. In other words, we exclude pairs with zero relatedness in every period as well as pairs with higher than zero relatedness in every period, which results in an unbalanced panel setting.
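The two pair-level complexity variables and the sample restriction for the entry models can be illustrated with a small Python sketch. The function names and the toy data are hypothetical, and the input layout (a mapping from each technology pair to its co-occurrence counts per period) is an assumption made for this example.

```python
def pair_complexity(c_i, c_j):
    """Return (COMPLEX_SUM, COMPLEX_ABSDIFF) for two technologies' complexity values."""
    return c_i + c_j, abs(c_i - c_j)


def entry_sample(counts_by_pair):
    """Keep pairs that are related in at least one period but not in all periods.

    `counts_by_pair` maps a technology pair to its per-period co-occurrence counts.
    """
    kept = {}
    for pair, counts in counts_by_pair.items():
        related = [c > 0 for c in counts]
        if any(related) and not all(related):
            kept[pair] = counts
    return kept


# Toy panel with three periods and hypothetical technology codes.
toy_panel = {
    ("A61K", "C07D"): [4, 6, 9],   # related in every period: dropped
    ("G06F", "H04L"): [0, 0, 2],   # becomes related: kept
    ("A01B", "H04L"): [0, 0, 0],   # never related: dropped
}
sample = entry_sample(toy_panel)
complex_sum, complex_absdiff = pair_complexity(2.0, 5.0)
```

A pair made of one complex and one simple technology can reach the same sum as two moderately complex ones, which is why the absolute difference is carried along as a separate variable.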
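Our model specification for the logit regression is the following:

$$\Pr(y_{ijt} = 1) = \Lambda\left(\beta_1 \mathrm{COAGGLOM}_{ijt} + \beta_2 \mathrm{COMPLEX\_SUM}_{ijt} + \beta_3 \mathrm{COMPLEX\_ABSDIFF}_{ijt} + \beta_4 N_{it} + \beta_5 N_{jt} + \tau_{ij} + \eta_t\right) \qquad (3)$$

where $\Lambda$ is the logistic function and our dependent variable $y_{ijt}$ takes the value of zero when there is no observed co-occurrence of technology $i$ and technology $j$, and one when the two technologies co-occur for the first time. COAGGLOM, COMPLEX_SUM and COMPLEX_ABSDIFF denote the characteristics of the technology pair $i$ and $j$, $N_{it}$ and $N_{jt}$ are the numbers of patents in technology $i$ and $j$, $\tau_{ij}$ denotes technology field effects constructed by the concatenation of the first letter of the two technology codes combined, and $\eta_t$ is a period fixed effect. Detailed descriptive statistics regarding this specific model setting in comparison to the entire dataset are presented in Table 1. Correlations between the main variables are presented in Table A1 in the Appendix.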
Second, we explore the relationship between the changes in the strength of technological relatedness and the colocation and complexity of technologies. To do so we use panel OLS models on the growth rates of technological relatedness. For this second exercise our focus is only on technology pairs with higher than zero combinations at time t.
For the growth models, we use the corresponding specification:

$$\Delta y_{ijt} = \beta_0 \log(y_{ijt}) + \beta_1 \mathrm{COAGGLOM}_{ijt} + \beta_2 \mathrm{COMPLEX\_SUM}_{ijt} + \beta_3 \mathrm{COMPLEX\_ABSDIFF}_{ijt} + \beta_4 N_{it} + \beta_5 N_{jt} + \tau_{ij} + \delta_t + \varepsilon_{ijt} \qquad (4)$$

where $\Delta y_{ijt}$ is the rate of relatedness growth from $t$ to $t + 1$, $\log(y_{ijt})$ is the level of relatedness at time $t$, and COAGGLOM, COMPLEX_SUM and COMPLEX_ABSDIFF denote the characteristics of the technology pair $i$ and $j$ in period $t$. $N_{it}$ and $N_{jt}$ are the numbers of patents in technology $i$ and $j$ at time $t$, as in gravity model specifications. $\tau_{ij}$ denotes technology field effects constructed by the concatenation of the first letter of the two technology codes combined and $\delta_t$ is a period fixed effect. We applied several further estimations to assess the robustness of our results. We summarize these robustness checks in Section 5.

| RESULTS
Notes: The regressions are based on all the combinations that appear at least once during the 1980-2010 period and are not observed in each period. 'Technology fields FE' refers to a control variable constructed by the concatenation of the first letter of the two technology codes combined. We also clustered standard errors by these groups. Standard errors in parentheses, * p < 0.1; ** p < 0.05; *** p < 0.01.
Among the control variables, the coefficients of the number of patents of the considered technologies (No. Patents CPC1, CPC2) are significantly positive and relatively large in all four models. The finding reflects that technologies characterized by many patents have more opportunities to co-occur on patents.
The first key variable is COAGGLOM, which approximates the extent to which concentrations of technology 1 tend to appear in the same regions as those of technology 2. The positive and significant coefficient for COAGGLOM in model (1) suggests that the more two technologies are co-located, the more likely they are to become related. In model (2) COMPLEX_SUM is introduced and its coefficient is insignificant. Consequently, the joint complexity of two technologies does not explain whether they will become related over time. In model (3) COMPLEX_ABSDIFF is added. In contrast to COMPLEX_SUM, its coefficient is significant, but negative. Accordingly, the more two technologies differ in their levels of complexity, the less likely it is that they will become related. Model (4) contains all of our main explanatory variables. It shows that their signs and significance levels do not change when they are included simultaneously. These findings support Hypothesis 1A: geographical co-location of technologies facilitates the emergence of relatedness. In contrast, model (4) suggests rejecting Hypothesis 2A: complexity does not seem to be crucial for the emergence of technological relatedness. What is striking about the results is that the two significant coefficients, COAGGLOM and COMPLEX_ABSDIFF, are of similar magnitude, implying that both dimensions are of roughly equal empirical importance in this respect, with co-agglomeration being of somewhat higher relevance.
The second set of models focuses on the explanation of the intensification of relatedness, that is, its development after the first connection between technologies has been made. Table 3 presents the results of the growth model specified in Equation 4. Again, we include our main variables step by step, from model (5) to model (8). The results for COAGGLOM give empirical support to Hypothesis 1B, according to which the relatedness between technologies becomes stronger when they are strongly co-located. In model (7), COMPLEX_SUM is added, and in model (8), COMPLEX_ABSDIFF. The significantly positive coefficient of COMPLEX_SUM suggests that the complexity of two technologies correlates significantly with the growth of their relatedness. This holds even after controlling for their differences in terms of complexity (COMPLEX_ABSDIFF). It thereby shows that technologies at the upper end of the complexity spectrum are more likely to intensify their relatedness than those at the lower end. The finding backs Hypothesis 2B and provides some support for our argument that the larger economic rewards associated with the combination of complex technologies, in contrast to less complex ones, lead to more activities being (successfully) undertaken in that respect.

| ROBUSTNESS CHECKS
We applied a variety of robustness checks to validate our results. As the emergence of relations between technologies is a binary variable, logit regression with a large number of fixed effects and few time periods can lead to biased results because of the well-known incidental parameters problem (Mewes & Broekel, 2020; Neyman & Scott, 1948). Therefore, we re-estimate the binary (entry) regression models as linear probability models (LPM). The results are shown in Table A2 in the Appendix. By and large, they support our findings.
Our logistic models only focus on the emergence of relatedness (first co-occurrence). To validate the findings, we also estimate models for the disappearance of technological relatedness, i.e. no more co-occurrences (see Appendix Table A3). To identify this disappearance event, we exclude all technology pairs with strictly positive co-occurrence counts in all periods (RELATEDNESS > 0) and those that never occur (RELATEDNESS = 0) during the examined periods. The disappearance dummy is then constructed such that it is one if in any of the periods we observe a positive number of co-occurrences and, in a subsequent period, none. In these models, we find the coefficient of COAGGLOM to be significantly positive. However, its level of significance is lower than in the case of the first-co-occurrence model. More importantly, the coefficient of COMPLEX_SUM is significantly negative. This suggests that the more complex two technologies are, the less likely it is that their combinations will disappear. This is in line with our argument that once complex technologies are combined and their relatedness is discovered, they represent economically valuable technology combinations that tend to be used continuously in the future.
In an alternative specification, we combine the two models (emergence and intensification of relatedness) into a zero-inflated negative binomial (ZINB) regression model, which addresses the issue of the excess of zeros in our dependent variable (Figure 1(c)). The results of the ZINB regression are shown in Table A4 in the Appendix. As expected, they confirm the earlier findings.
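The construction of the disappearance dummy can be sketched as follows. This is an illustrative reading of the rule in the text, with one explicit assumption: "a subsequent period" is interpreted as any later period, not only the immediately following one. The function name and the toy series are hypothetical.

```python
def disappears(counts):
    """Return True if a positive co-occurrence count is followed by a zero later on.

    `counts` lists one technology pair's co-occurrence counts by period.
    Assumption: any later zero counts, not only the next period.
    """
    seen_positive = False
    for c in counts:
        if c > 0:
            seen_positive = True
        elif seen_positive:
            return True
    return False


# Toy series of per-period co-occurrence counts.
examples = {
    "vanishes": [0, 2, 0],
    "emerges_only": [0, 0, 3],
    "always_related": [1, 1, 1],
    "interrupted": [2, 0, 4],
}
flags = {name: disappears(series) for name, series in examples.items()}
```

Pairs that are related in every period or in none never trigger the flag, which is consistent with their exclusion from this robustness check.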
Even though our main variables and estimations are based on different (temporal) subsets of the patent database, endogeneity is a crucial concern. Frequently, this is addressed by means of an instrumental variable regression.
However, our empirical observations are based on technology pairs, which complicates the identification of proper instrumental variables. Nevertheless, to gain a better understanding of the potential causal directions of the observed relations, we follow the approach of Broekel (2015), who applies an empirical strategy in a similar setting that is popular in the firm growth literature (see, e.g., Coad, 2009). In Table A5 in the Appendix, we transform our central variables (RELATEDNESS, COAGGLOM, COMPLEX_SUM and COMPLEX_ABSDIFF) into annual growth rates and use these as explanatory and dependent variables in different OLS regression settings. More precisely, we regress a focal variable's growth in t on its level in t and on the growth rates and the respective levels of all the other variables in t. We find that while RELATEDNESS explains COAGGLOM and both complexity variables in all models, co-location and the complexity of technology pairs do not seem to be linked in most models. Notably, none of the models contradicts our previous findings, which highlights their robustness. However, they contain some additional insights, which we will address in more detail in the concluding section.

CONCLUSION
This paper seeks to explore the evolution of technological relatedness over time and to identify which factors contribute to it. We thereby shed light on an issue that has so far received little attention in studies on relatedness in economic geography and innovation studies: relatedness is usually treated as an independent, almost exogenous factor shaping the co-location of technologies (e.g., Kogler et al., 2013;Rigby, 2015), rather than an endogenous structure that requires explanation in itself.
We argue that geographical co-concentration of technologies provides mechanisms that impact the evolution of their relatedness. We provide empirical support for this and show that the more the spatial distributions of two technologies overlap, the more likely they are to become related, and the more this relatedness tends to grow in the future. This suggests that mechanisms such as local knowledge spillovers enhance the emergence and intensification of technological relatedness. This finding is important for the interpretation of outcomes in studies on related diversification, which suggest an impact of relatedness on the location of industrial, technological, and economic activities (Kogler et al., 2013;Neffke et al., 2011). The present study emphasizes that this process is most likely bi-directional: co-location drives the development of relatedness, and vice versa. Knowledge spillovers stimulated by co-location contribute to (re-)combinatorial invention processes, which facilitate technologies converging in terms of relatedness.
Technological complexity seems to play a somewhat different role in this context. In contrast to geographic co-location, we do not find indications of its influence on the emergence of relatedness. However, once complementarity between technologies has been achieved (i.e., a minimum level of relatedness has been reached), we show that complex technologies are more likely to be combined subsequently. We interpret this as being driven by the higher economic incentives associated with the combination of complex technologies.
Our paper has several implications for future research. First, while our study takes a first step towards understanding how technological relatedness emerges and evolves over time, more research is needed to fully comprehend the evolution of technologies and their relations (Castaldi, Frenken, & Los, 2015;Desrochers & Leppälä, 2011). This includes gaining more knowledge about where and when re-combinatorial processes create new patterns of relatedness and what types of benefits are related to this (Pinheiro, Alshamsi, Hartmann, Boschma, & Hidalgo, 2018).
Second, our study has shown that differences in technologies' levels of complexity can act as barriers to the development of technological relatedness, while two highly complex technologies are more likely to intensify their level of relatedness over time as compared to two less complex ones. Still, these are early insights requiring further validation in future research. For instance, it might be the case that simple and complex technologies represent different underlying knowledge infrastructures and human capital, which restrict their complementarity and consequently their integration.
In this sense, complexity might reflect fundamental differences between technologies. Put differently, in the extremes, it might be another way of identifying unrelatedness. This certainly calls for additional research.
Third, our study points towards endogenous relations shaping the interplay of relatedness, co-location, and complexity of technologies. For instance, our results indicate that the relation between co-location and relatedness is bidirectional: co-location is linked to the emergence and intensification of relatedness, and vice versa. Our study thereby complements the rich literature on relatedness that has identified relatedness as being a crucial driver behind the spatial distribution of industries and technologies (e.g., Hidalgo et al., 2007;Neffke et al., 2011;Rigby, 2015). Our results suggest that this relationship is more complex and characterized by the existence of coevolutionary dynamics between the spatial concentration, complexity, and relatedness of different technologies. This has hardly been addressed in the literature. However, the empirical approaches employed do not provide causal evidence on which dimension is driving which. To establish this, future research needs a more elaborate approach utilizing causal modelling strategies. While we have made an attempt to gain first insights into the magnitude and direction of these relations, we are far from disentangling the interplay of complexity, co-location, and relatedness. Undoubtedly, such endeavours are crucial, as all these dimensions are applied and used to an increasing extent in current regional innovation policies.
Finally, while the paper provides insights on the evolution of relatedness and explores the role of several factors therein, it leaves a universe of other dimensions unexplored. The consideration of other factors such as the role of institutions or policy interventions will surely add to our understanding of the underlying mechanisms. Moreover, our empirical study relies exclusively on patent data. Despite the many advantages of this approach, it is undoubtedly advisable to repeat and extend the analysis using other types of data in the future (Boschma, 2017).

ACKNOWLEDGEMENT
Sándor Juhász acknowledges financial help from the Hungarian Scientific Research Fund (OTKA K-129207).
[Correction added on 7 October 2020, after first online publication: funding information has been added to this article.]