Axon and dendrite geography predict the specificity of synaptic connections in a functioning spinal cord network

Background: How specific are the synaptic connections formed as neuronal networks develop, and can simple rules account for the formation of functioning circuits? These questions are assessed in the spinal circuits controlling swimming in hatchling frog tadpoles. This is possible because detailed information is now available on the identity and synaptic connections of the main types of neuron.

Results: The probabilities of synapses between 7 types of identified spinal neuron were measured directly by making electrical recordings from 500 pairs of neurons. For the same neuron types, the dorso-ventral distributions of axons and dendrites were measured and then used to calculate the probabilities that axons would encounter particular dendrites and so potentially form synaptic connections. Surprisingly, synapses were found between all types of neuron, but contact probabilities could be predicted simply by the anatomical overlap of their axons and dendrites. These results suggested that synapse formation may not require axons to recognise specific, correct dendrites. To test the plausibility of simpler hypotheses, we first made computational models that were able to generate longitudinal axon growth paths and reproduce the axon distribution patterns and synaptic contact probabilities found in the spinal cord. To test whether probabilistic rules could produce functioning spinal networks, we then made realistic computational models of spinal cord neurons, giving them established cell-specific properties and connecting them into networks using the contact probabilities we had determined. A majority of these networks produced robust swimming activity.

Conclusion: Simple factors such as morphogen gradients controlling dorso-ventral soma, dendrite and axon positions may sufficiently constrain the synaptic connections made between different types of neuron as the spinal cord first develops and allow functional networks to form.
Our analysis implies that detailed cellular recognition between spinal neuron types may not be necessary for the reliable formation of functional networks to generate early behaviour like swimming.


Table showing distribution of axons and dendrites of different neuron types.
The cord is divided dorso-ventrally into 10% bins. The table then shows the probability of axons and dendrites of neurons of each type occupying each of these bands.

2) Modelling axon growth

A. Introduction
The traditional approach to modelling axon growth is based on the idea that the growth cone follows different molecular gradients (Goodhill et al., 2004; Krottje and van Ooyen, 2007). Here we do not consider the details of growth cone navigation in steep and shallow chemical gradients; instead we build a simple computational model reflecting several key attraction and repulsion processes guiding axon development. As specified in the main text, the model is described by a dynamical system with a random variable (see equations (1), (2) and (4)) and includes four parameters.
In this Appendix we describe the optimization procedure which allows us to find a set of optimal parameter values. As usual, optimality means minimising a cost function. Thus, starting from a discussion of the parameters, we will: define the cost function, which measures the similarity between experimental axon measurements and model-generated axons; describe the optimization procedure, which minimises the cost function and provides the best similarity between experimental and model-generated axons; report the optimal parameter values for each neuron type; and demonstrate that the optimal parameter values allow us to generate model axons which are similar enough to the biological axons measured in experiments (testing procedure).

B. Model explanation
Experimental measurements have been made for different tadpoles whose spinal cord heights vary around 100 µm. To reflect this in the model, we consider axon growth in one side of the spinal cord, represented as a rectangle with height H randomly chosen from the range of admissible biological height values (about 100 µm) and length W = 1000 µm (this length is the same for all model simulations; therefore, when an experimental axon is longer than W, it was cut to length exactly W = 1000 µm). Thus, we consider the rectangle H × W, and growing axons are allocated inside this field. To start an axon simulation we need to choose the initial position and initial angle of the axon; after that the process of axon generation is governed by the model equations (1), (2) and (4) described in the main text.
Developing the model equations, we implicitly assume that the chemical gradients experienced by the growth cone are exponential, which for a single gradient would produce a constant rate of turning independent of the location within the gradient (but not independent of the current growth angle). The dependence of axon growth angle on dorso-ventral position (note that the dorso-ventral axis corresponds to the vertical axis (height) of the rectangle and the longitudinal location lies along the horizontal axis (length)) is assumed to be the consequence of interaction between at least two gradient-following processes: the noise in the axon growth angle and the tendency to grow towards some particular location. The noise component describes a random deviation of the current angle from the deterministic component (see equation (4)): it is a random variable uniformly distributed in the interval (-α, α), where the parameter α defines the boundary for the angle deviation (Fig. S1). Thus, the noise is modelled by a uniformly distributed random variable with mean equal to zero and variance equal to α²/3.
Fig. S1. The deterministic direction of growth is shown by the black line connecting point A and the yellow dot. The angle α specifies the boundaries of random deviation. The red line shows the chosen direction of growth for the next time step.
The parameter γ represents the tendency of the axon to turn towards an angle of 0 degrees, in other words the tendency of the growth cone to orient towards longitudinal growth. If γ is zero (see equation (3) in the main text) then the deterministic part of the growth angle is not changed at each time step, and the random deviation applies to this direction. When 0 < γ < 1, the deterministic component of the growth angle decays to zero. This part of the model can be justified by experimental findings which show that this orienting process towards zero angle depends on the current deviation from longitudinal growth: the steeper the current growth angle, the stronger the tendency to straighten towards horizontal growth. The effect of parameter γ can also be interpreted as the consequence of a longitudinal gradient-following process, which would be expected to produce the same dependence on growth cone angle.
The parameter y represents the dorso-ventral position of an attractor towards which axon trajectories are drawn, with a strength described by the parameter µ (see equation (4)). Thus, the parameters µ and y characterise the interaction between two opposing gradient-following processes. The parameter y is the dorso-ventral position at which these processes effectively cancel each other out, and the parameter µ represents the strength of the net attraction towards y. The effects of these parameters can be interpreted as a system with two repulsive gradients, one pushing from the ventral side in the dorsal direction (we know that there is some drive here at least for the commissural neurons) and one pushing from the dorsal side in the ventral direction. The relative sensitivity of the axon to these two gradients would determine the value of the parameter y, and the absolute sensitivity of the growth cone to the ligands would determine the value of the parameter µ.
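The interplay of the three processes above (noise bounded by α, decay of the deterministic growth angle controlled by γ, and attraction of strength µ towards the dorso-ventral position y) can be sketched in code. The exact equations (1), (2) and (4) are given in the main text and are not reproduced here; the update rule below is an illustrative reconstruction from the descriptions in this appendix, and the function and argument names are our own.

```python
import math
import random

def grow_axon(x0, y0, theta0, length, alpha, gamma, mu, y_attr, step=1.0):
    """Illustrative reconstruction of the stochastic growth rule:
    the deterministic angle decays towards 0 (rate gamma), is biased
    towards the dorso-ventral attractor y_attr (strength mu), and a
    uniform noise term in (-alpha, alpha) perturbs each step."""
    xs, ys = [x0], [y0]
    theta = theta0  # deterministic component of the growth angle
    for _ in range(int(length / step)):
        # decay towards longitudinal growth plus attraction towards y_attr
        theta = (1 - gamma) * theta + mu * (y_attr - ys[-1])
        # uniform noise with mean 0 and variance alpha**2 / 3
        ang = theta + random.uniform(-alpha, alpha)
        xs.append(xs[-1] + step * math.cos(ang))
        ys.append(ys[-1] + step * math.sin(ang))
    return xs, ys
```

With γ = 0 the deterministic angle is never straightened, and with µ = 0 there is no dorso-ventral attraction, matching the limiting cases discussed above.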

C. Preparation for optimization
Our goal is to simulate an axon growth process which can generate axons that are similar to the biological axons measured experimentally. Similarity is measured using a cost function with two components: 1) the similarity of the distributions of axon projections in the dorso-ventral dimension and 2) the similarity of tortuosities. Thus, the optimization procedure looks for a set of values of the four model parameters which minimises the cost function. Below we formulate the optimization procedure in detail.
The available experimental data provide measurements of spinal cord axons for each neuron type in both descending and ascending directions, when both are present. We will consider aIN ascending axons as an example to demonstrate how the optimization procedure works. The available experimental data provide measurements of 10 axons from different tadpoles. The longitudinal dimension in the model was always 1000 µm; however, axons can be shorter or longer than this full length. Experimental measurements of dorso-ventral axon position (in micrometers) were made every 50 µm. An example of a measured axon is shown in Fig. S2. The horizontal red line (y = 94.7 µm) shows the dorso-ventral height boundary of the spinal cord and the vertical red line shows the longitudinal boundary of the considered spinal cord measurements. The next step is the projection of the axon measurements in the dorso-ventral direction and the calculation of the distribution of these projections. Because the dorso-ventral height of the rectangles varies for axons extracted from tadpoles with spinal cords of different sizes, we normalise axon coordinates before projecting them. Thus, we normalise both axes by dividing both the vertical and horizontal axon coordinates by the dorso-ventral height of the spinal cord (in this example the height H = 94.7 µm). Dividing both coordinates allows us to keep the angle structure unchanged. Of course, after this transformation a step along the horizontal axis will differ from 50 µm, but the image representation will be the same. After normalization we project all aIN ascending axon data onto the vertical axis and repeat this procedure for each axon. The total number of axon measurements is n_e = 167, and these data represent the dorso-ventral distribution of axon measurements in the interval 0-100.
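The normalise-and-bin step described above can be sketched as follows (a minimal illustration assuming NumPy is available; the function name is ours):

```python
import numpy as np

def dv_distribution(ys, H, n_bins=10):
    """Normalise dorso-ventral coordinates by the cord height H (giving
    the 0-100 interval) and bin them into n_bins equal bands, returning
    the probability of finding an axon measurement in each bin."""
    norm = 100.0 * np.asarray(ys, dtype=float) / H
    counts, _ = np.histogram(norm, bins=n_bins, range=(0, 100))
    return counts / counts.sum()
```

Applying this to all measured coordinates of all axons of one type yields the experimental dorso-ventral distribution used in the cost function.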
We divide this interval into 10 bins, count the number of measurements in each bin, and normalise by the total number n_e to get the probability of finding an axon measurement in each bin. The resulting distribution is shown in Fig. S3. We would also like the cost function to take into account the extent to which the path of the axon is circuitous rather than direct. Results from multiple model simulations suggest that tortuosity (total path length divided by the straight-line distance between start and end points) is an appropriate measure for this purpose. Thus, we calculate the tortuosity of each axon using the following formula:

T = ( Σ_{i=1..k-1} √((x_{i+1} - x_i)² + (y_{i+1} - y_i)²) ) / √((x_k - x_1)² + (y_k - y_1)²),

where (x_i, y_i) are the measured coordinates of the axon and k is the number of measurements for the axon. After that we calculate the average tortuosity of the experimental axons, T_e.
Now we describe the process of axon generation. Suppose that the values of the four model parameters are known; then we can start the process of axon generation described by equations (1), (2) and (4) in the main text. For that we need to choose initial values for the variables of the dynamical system, i.e. the coordinates of the starting point of the axon (x_0, y_0) and the initial growth angle θ_0. We also need to fix the length of the generated axon.
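The tortuosity measure (total path length divided by the straight-line distance between the first and last measured points) can be computed directly:

```python
import math

def tortuosity(points):
    """Total path length along the measured points divided by the
    straight-line distance between the first and last points.
    A perfectly straight axon has tortuosity exactly 1."""
    path = sum(math.dist(points[i], points[i + 1])
               for i in range(len(points) - 1))
    straight = math.dist(points[0], points[-1])
    return path / straight
```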
For the generation of all axons we choose the same initial point on the horizontal axis. To choose the initial value of the vertical coordinate, we first calculate the sample distribution (10 bins for the 0-100 interval) of the normalised initial vertical coordinates of all experimentally measured axons and generate a random number ran according to this distribution; thus, y_0 = ran. After that, to choose the initial angle, we consider the bin of the distribution containing ran and sample from the initial angles of the experimental axons starting in that bin. Similarly, for the axon length we build the distribution of experimental axon lengths and generate a random number according to this distribution. We also use the same procedure to generate the dorso-ventral height of the spinal cord: we build the distribution of experimentally measured dorso-ventral heights and generate a random number (ran_height) according to this distribution; thus, we have chosen the rectangle (ran_height x 1000) in which all model-generated axons will be allocated, i.e. we use the same rectangle to generate several axons and allocate them all to it.
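Drawing initial coordinates, lengths and cord heights from binned empirical distributions, as described above, can be sketched with a hypothetical helper: pick a bin with its empirical probability, then a uniform value within that bin.

```python
import random

def sample_from_histogram(probs, bin_edges):
    """Draw a random value according to a binned empirical distribution:
    choose a bin with probability probs[i], then return a uniform value
    inside that bin's edges. (Helper name and interface are ours.)"""
    i = random.choices(range(len(probs)), weights=probs)[0]
    return random.uniform(bin_edges[i], bin_edges[i + 1])
```

The same helper serves for y_0, the axon length and ran_height, each with its own empirical histogram.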
After fixing all initial values and the axon length we run the system of difference equations (1), (2), (4) in the main text and generate an axon. For axon generation we use the step ∆ = 1 µm. To get generated axon data similar to the experimental measurements we sample the model axon coordinates every 50 µm and use these sampled data for the following procedures: projection of axons, building the D-V distribution, calculation of tortuosity, etc. Fig. S4 shows an example of a generated model axon for the optimal parameter values fitted to the aIN ascending experimental measurements. The green line shows the generated model axon with 1 µm steps; markers show measurements at 50 µm steps along the horizontal axis, the same sampling as in experiments. The lower panel of Fig. S4 shows the same generated axon at the "correct" scale, where the vertical and horizontal axes are proportional and angles are not distorted. It is important to note that the procedure for choosing initial values and axon length involves generating random numbers. This means that repetition of the same procedure will result in the generation of a different axon with different initial values and a different length. Thus, we repeat this procedure r times (r = 70), generate r axons allocated inside the same rectangle, and calculate the dorso-ventral distribution (10 bins for the 0-100 interval) of all vertical coordinates of all generated axons; this model distribution is compared with the experimental distribution in the first term of the cost function. It is known in statistics that the upper boundary of the 95% confidence interval for hypothesis testing about the similarity of two distributions is 3.9. Thus, this value can serve as guidance for understanding the scale of cost function values and judging the quality of the optimization process.
The second term of the cost function is the squared difference between the average experimental tortuosity T_e and the average model tortuosity T_m. The two terms of the cost function have very different scales, and to balance them we introduce a weight coefficient w which makes the terms consistent, with values in the same interval. Thus, the final expression for the cost function combines the two terms: F = D + w (T_e - T_m)², where D is the discrepancy between the experimental and model dorso-ventral distributions described above. It is worth noting that the cost function includes a stochastic component; therefore, repeated calculation for the same parameter values will always result in different values of the cost function. Thus, gradient-based methods are not appropriate for optimization, because they usually require the cost function to be smooth, which it is not in our case. We use the Nelder-Mead simplex method, which can deal with cost functions that are non-smooth, even if they include a stochastic component.
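As a sketch, the two terms could be combined as below. The exact discrepancy measure used for the distribution term is not reproduced in this appendix; the chi-square-like form here is our assumption for illustration, and the function name is ours.

```python
def cost(p_exp, p_mod, T_exp, T_mod, w):
    """Two-term cost: a distribution-discrepancy term plus a weighted
    squared tortuosity difference. The chi-square-like discrepancy is
    an illustrative assumption, not the paper's exact expression."""
    dist_term = sum((pe - pm) ** 2 / pe
                    for pe, pm in zip(p_exp, p_mod) if pe > 0)
    return dist_term + w * (T_exp - T_mod) ** 2
```

When the model distribution and tortuosity match the experimental ones exactly, the cost is zero regardless of the weight w.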

D. Numerical algorithm for optimization: The Nelder-Mead simplex method
To solve the optimization problem and find parameter values which provide a minimum cost function, we use the MATLAB procedure "fminsearch" which is based on the Nelder-Mead simplex method (Lagarias et al., 1998).
Let us formulate the idea of the simplex method. For two variables the simplex is a triangle, and the method is a pattern search that compares function values at the three vertices of the triangle. The worst vertex, where the function value is largest, is rejected and replaced with a new vertex. A new triangle is formed and the search continues. The process generates a sequence of triangles (for which the function values at the vertices get smaller and smaller). The size of the triangles is reduced and the coordinates of the minimum point are found. The algorithm is stated using the term simplex (a generalised triangle in N dimensions) and finds the minimum of a function of N variables. It is effective and computationally compact.

Let f(x, y) be the function that is to be minimised. To start we are given 3 vertices of a triangle, ordered by function value as the best vertex B, the good vertex G and the worst vertex W, so that f(B) ≤ f(G) ≤ f(W). Mid-point of the good side: M = (B + G)/2. Reflection using point M: R = M + (M - W) = 2M - W. Logical decisions: if the reflected point improves on the current vertices it replaces W (possibly after a further expansion step beyond R); otherwise the simplex is contracted towards its better vertices, shrinking the triangle around the minimum.

E. Results of optimization and testing the optimization quality
It is important to note that the result of the optimization procedure is a random variable. This means that if we have found a set of optimal parameter values and use them to calculate the cost function several times, we will get different cost function values, because the random number generator will start from different initial values, resulting in the generation of different axons. Thus, we test the result of the optimization by studying the distribution of the cost function values generated for the set of optimal parameter values.
The optimization procedure was run for each cell type and separately for their descending and ascending axons. The best values of the model parameters and the quality of optimization are summarised in Table 1. To characterise the quality of optimization quantitatively, we define the measure Q in the following way. One trial of the testing procedure includes the generation of 300 axons for the optimal parameter values and calculation of the cost function. We repeat this procedure 100 times, generate 100 values of the cost function, and build a histogram which we call the testing histogram (examples are given in Section 6). We denote by Q the value of the cost function such that the interval (0, Q) corresponds to 90% of the area of the histogram. We consider the optimisation procedure to be: very good quality if Q is less than 4; good quality if 4 < Q < 8.5; and poor quality if Q > 8.5. The last two columns of Table 1 show the quantitative and qualitative characteristics of the optimization procedure for each cell type.
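The measure Q, the cost value below which 90% of the testing histogram's area lies, is effectively the 90th percentile of the sampled cost values (a one-line sketch assuming NumPy; the function name is ours):

```python
import numpy as np

def quality_Q(cost_values, coverage=0.9):
    """Q such that the interval (0, Q) covers `coverage` of the area of
    the testing histogram, i.e. the 90th percentile of the cost samples."""
    return float(np.quantile(cost_values, coverage))
```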

Sensitivity of optimal parameter values to parameter variation
Following the optimization process, we tested the sensitivity of the optimal parameter values to small variations. We consider 3 levels of variation: 5%, 10% and 20% (correspondingly, the fractions of variation are Fr = 0.05, Fr = 0.1 and Fr = 0.2), and for each level (e.g. 5%) we consider three cases for each of the four parameters: 1) the parameter value decreases by 5% (index = -1); 2) the parameter value is not disturbed (index = 0); 3) the parameter value increases by 5% (index = +1). This gives 3^4 = 81 combinations per level. Table 2 shows the performance of the model for different degrees of parameter variation in terms of the measure Q (described above), for the case of cIN ascending axons. The first column of the table shows the sequential number of the test case (1-81). The table indicates that for 75% of these cases at the 5% variation level, the value of Q remains below 8.5, which is our criterion for good quality results. At the 10% variation level, 42% of cases produce good quality results, and even at the 20% variation level, about 21% of cases give good quality results. In this particular case of the cIN ascending axons, the poorest results occur when y is reduced (made more ventral), especially when µ is decreased at the same time, as in these cases the axons do not reach a sufficiently dorsal position to adequately match the experimental data set (see Fig. S4.3 below, which shows the dorso-ventral distribution of these axons).
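Enumerating the perturbation cases (each of the four parameters decreased, unchanged or increased by a fraction Fr, giving 3^4 = 81 combinations per level) can be sketched as follows (function name is ours):

```python
from itertools import product

def perturbation_cases(params, fr):
    """Enumerate the 3**len(params) cases where each parameter is
    decreased (index -1), left unchanged (index 0) or increased
    (index +1) by the fraction fr."""
    return [[p * (1 + i * fr) for p, i in zip(params, idx)]
            for idx in product((-1, 0, 1), repeat=len(params))]
```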

F. Optimization results for each neuron type and examples of generated model axons compared with measured axons
The following 10 pages contain three figures for each neuron type. The first figure shows the test histograms and the caption of this figure indicates the quality of the optimization. The second figure shows model axons (upper panel) and experimental axons (lower panel); the caption gives both experimental and model axon tortuosities. The third figure shows distributions of dorso-ventral coordinates of experimental and generated axons.