LINEAGE ANALYSIS USING RETROVIRAL VECTORS: INTRODUCTION AND METHODS

Constance L. Cepko *

Elizabeth Ryder 1

Christopher Austin 2

Jeffrey Golden 3

Shawn Fields-Berry *

John Lin 4

Chris Walsh 5

Santiago Rompani*

 

 

*            Department of Genetics, Harvard Medical School and Howard Hughes Medical Institute, Boston, MA 02115

1            Department of Biology and Biotechnology, Worcester Polytechnic Institute, Worcester, MA 01609

2            NHGRI, NIH, Bethesda, Md. 20892

3. Department of Pathology, Children's Hospital of Philadelphia, Philadelphia, PA 19104

4. Rinat Neuroscience Corp. Palo Alto, Ca.

5. Department of Genetics, Children’s Hospital, Boston, Ma. 02115

 

INTRODUCTION

Knowledge of the genealogical relationships of cells during development can provide insight into when and where developmental decisions are made. Genealogical relationships can be revealed by a variety of methods, all of which involve marking a progenitor cell and/or a group of cells and then following the progeny. We have used retroviral vectors to mark individual progenitor cells with genes that can be detected histochemically, allowing for identification of the progeny of a single progenitor cell. An overview of the relevant aspects of the retroviral life cycle, and the strategies and current methods in use in our laboratory, are described. One of our current projects is also summarized below.

TRANSDUCTION OF GENES VIA RETROVIRUS VECTORS

               A retrovirus vector is an infectious virus that transduces a nonviral gene into mitotic cells in vivo or in vitro [1]. These vectors use the same efficient and precise integration machinery as naturally occurring retroviruses to produce a single copy of the viral genome stably integrated into the host chromosome. Those that are useful for lineage analysis have been modified so that they are replication incompetent and thus cannot spread from one infected cell to another. They are, however, faithfully passed on to all daughter cells of the originally infected progenitor cell, making them ideal for lineage analysis.

               Retroviruses use RNA as their genome, which is packaged into a membrane-bound protein capsid. They produce a DNA copy of their genome immediately after infection via reverse transcriptase, a product of the viral pol gene which is included in the viral particle. The DNA copy is integrated into the host cell genome and is thereafter referred to as a "provirus". Integration of the genome of most retroviruses requires that the cell go through an M phase [2], and thus only mitotic cells will serve successfully as hosts for integration of most retroviruses. (However, a more recent generation of retrovirus vectors, based upon the lentivirus HIV [3], can integrate into postmitotic cells. As lineage analysis is designed to ask about the fate of daughter cells, infection of postmitotic cells is not desirable. However, delivery of a lentiviral vector to tissue where the majority of cells are mitotic is a reasonable strategy and one that we have taken.) Most vectors began as proviruses that were cloned from cells infected with a naturally occurring retrovirus. Although extensive deletions of proviruses were made, vectors retain the cis-acting viral sequences necessary for the viral life cycle. These include the packaging sequence (necessary for recognition of the viral RNA for encapsidation into the viral particle), reverse transcription signals, integration signals, viral promoter, enhancer, and polyadenylation sequences. The elements for viral transcription can be modified such that the integrated virus does not have active transcription from the viral promoter, and instead uses an internal cellular promoter. A cDNA can be expressed in a vector using the transcription regulatory sequences provided by the virus, or the aforementioned cellular promoter elements.

Since replication-incompetent retrovirus vectors usually do not encode the structural genes whose products comprise the viral particle, these proteins must be supplied through complementation. The products of the gag, pro, pol, and env genes are typically supplied by "packaging" cell lines or by cotransfection with packaging constructs into highly transfectable cell lines (for review see Cepko and Pear in Ausubel et al., 1997 [4]). Packaging cell lines are stable lines that contain the gag, pro, pol, and env genes as a result of the introduction of these genes by transfection. However, these lines do not contain the packaging sequence on the viral RNA that encodes the structural proteins. Thus, the packaging lines, or cells transfected with packaging constructs, make viral particles that do not contain the genes gag, pro, pol, or env. Alternatively, as plasmid transfection into lines such as 293 cells is so efficient, viral stocks can be made by cotransfection of the vector plasmid, along with plasmids encoding the packaging genes.

               Retrovirus vector particles are essentially identical to naturally occurring retrovirus particles. They enter the host cell via interaction of a viral envelope glycoprotein (a product of the viral env gene) with a host cell receptor. The murine viruses have several classes of env glycoprotein which interact with different host cell receptors. The most useful class for lineage analysis of rodents is the ecotropic class. The ecotropic env glycoprotein allows entry only into rat and mouse cells via the ecotropic receptor on these species. It does not allow infection of humans, and thus is considered relatively safe for gene transfer experiments. The first packaging line in common use was the ψ2 line [5]. It encodes the ecotropic env gene and makes high titers of vectors. However, it can also lead to the production of helper virus (discussed below). A second generation of ecotropic packaging lines, ψCRE [6], GP+E-86 [7], and ψE [8], has not been reported to lead to production of helper virus to date. A third generation of "helper-free" packaging lines, exemplified by the ecotropic lines Bosc23 [9] and Phoenix (Gary Nolan, Stanford University), was made in 293T cells, and has the advantage over the earlier lines of giving high titer stocks transiently after transfection. Similarly, cotransfection of 293T cells with packaging constructs and vectors can lead to the transient production of high titer stocks [10]. The first two generations of packaging lines, which are based upon mouse fibroblasts, require production of stably transduced lines for production of high titer stocks. They are typically no longer used, given the ease and efficiency of transient transfection protocols using 293 cells.

               For infection of non-rodent species, an envelope glycoprotein other than the ecotropic glycoprotein must be used to allow entry into the host cells. The one that endows the greatest host range is the VSV G glycoprotein, which allows infection of most species, including fish [11]. The G protein apparently also makes for a more stable particle, which allows for greater concentration of the virus preparations. For lineage analysis of avian species, packaging lines and vectors based upon avian retroviruses are available [12-14]. In addition, we recently found that avian retroviruses with the VSV G protein on their surface were more efficient at infecting chick embryos (Figure 1 and [15]). Such virions gave the same titer as those with the avian A type env protein when they were titered on avian cells in vitro. However, when injected in vivo, the VSV G-carrying particles gave an approximately 350-fold more efficient infection, as judged by the number of clones in the retina, than the particles carrying the avian A env protein. Similar increases in efficiency were noted throughout the embryo. We interpret these data to mean that cells within the avian embryo do not express high enough levels of the receptor for the A type env to be readily infected, but are not limited with respect to the ubiquitous phospholipid receptor for the VSV G protein [16]. We did not see this effect of increased efficiency of infection using murine vectors and murine embryos, but it is possible that different mouse strains vary in this regard. Gaiano et al. found that murine virions carrying VSV G gave a slightly different spectrum of cell types following infection of the early mouse brain than virions carrying the murine ecotropic env [17].

               Two other parameters to be considered when choosing a vector for lineage analysis are the reporter gene and the promoter that drives its expression. The reporters that have been used include cytoplasmic lacZ [18], nuclear lacZ [19], human placental alkaline phosphatase (PLAP) [20], and avian gag [21]. More recently, vectors have become available that encode green fluorescent protein (GFP) [22] and many of its derivatives, and/or other fluorescent proteins, such as dsRed. We have found advantages and disadvantages in the use of each of these reporters. When deciding which reporter to use, we first consider the background activities in the tissue of interest. Although the lacZ gene gives a stable, reliable, and specific signal in most cells using Xgal detection [23], there is problematic β-galactosidase background in a few tissues. Control staining with Xgal thus should be done to determine if it is a problem in an area of interest. Changes in the fixation and staining conditions can reduce β-galactosidase background [24]. Similarly, PLAP staining is reliable and stable, with heat treatment of the infected tissue rendering most endogenous alkaline phosphatases inactive. However, in some tissues residual alkaline phosphatases cause a background problem. In such cases, inhibitors of endogenous alkaline phosphatases may solve the problem [25]. When using GFP, the signal from the introduced GFP can be weak, and thus background fluorescence in the tissue can be a problem. We have found that tissue sections prepared on a vibratome yield bright GFP+ cells, but the same tissue sectioned on a cryostat gives much dimmer and more ill-defined GFP+ cells. The situation can be remedied by the use of the excellent anti-GFP antibodies that are commercially available.

The second issue to consider is how one wishes to define the cell types expressing the reporter gene. In some cases, the morphology of the infected cells indicates their identity. In such cases, we recommend that one use lacZ or PLAP with histochemical detection, which is the simplest and most rapid way to find the infected cells. Moreover, the Xgal precipitate formed by β-galactosidase and the X-P/NBT product of PLAP are stable for months of storage, allowing one time to analyze many sections. However, there are differences between lacZ and PLAP that will direct the choice of which one to use for morphological identification. LacZ typically does not fill the cell bodies of large cells, such as neurons, as completely as PLAP. Thus, when it is desirable to characterize cells via their morphology, PLAP, which associates with the plasma membrane, is superior to lacZ. PLAP is also the most sensitive of the reporters that we have used, most likely due to the fact that it is a very stable enzyme. However, PLAP can produce such a dense stain that, when clonally related cells are close together, we have been unable to count the number of cells in a clone or distinguish the morphologies of individual cells (see [26] for an example). In such cases, nuclear lacZ is very useful. GFP and its many alleles, including the membrane-bound form, are excellent alternatives that offer the advantage that imaging with the highest-resolution fluorescence microscopes reveals excellent morphological details. In some cases, GFP is as sensitive as PLAP. However, in many cases, we have been forced to use PLAP as it remains more sensitive.

If one cannot use morphological criteria to identify the types of cells carrying the reporter gene, one option is to use immunohistochemistry to detect defining cellular antigens. The Xgal product of lacZ and the reaction product of PLAP make it difficult to detect a fluorescent immunohistochemical product as they absorb fluorescence. Moreover, the Xgal product and the X-P/NBT precipitate produced by PLAP often are too dark to allow simultaneous detection of another colored precipitate produced by immunohistochemical detection of a cellular antigen. However, occasionally, this will work (e.g. see [15, 27]). One can use double immunohistochemical procedures by employing antisera to detect PLAP or lacZ. However, double immunohistochemical procedures are much more time consuming than histochemical procedures, and immunohistochemical detection of lacZ and PLAP is not always as sensitive as the histochemical procedures. For all of these reasons, GFP is a better choice. GFP allows simultaneous detection of the GFP reporter and an immunohistochemical signal (e.g. see [28]). However, as mentioned above, GFP expression is sometimes weak and, in addition, the GFP signal is not preserved when sections are stored over a long period of time (i.e. months). The newest reporter to be described, β-lactamase [29], may offer advantages over GFP in terms of sensitivity, but it has not yet been tested in vivo, where it may suffer from leakage of the product from infected cells. If this is a problem, future substrates might overcome this limitation.

               The choice of the promoter to drive expression of a reporter gene also requires consideration. In order to see all the progeny of infected cells, a constitutive promoter should be used. We have had success using the LTR of Moloney Murine Leukemia Virus (Mo-MLV) for work in rats and mice and the LTR of Avian Leukosis Virus for work in chicks. We compared several alternative promoters located internal to the Mo-MLV LTR, in the context of a wild type LTR promoter and in the context of an LTR promoter with an enhancer deletion, using infections of murine tissue in vivo [30, 31]. The LTR promoter performed the best of several promoters tested, including the human histone 4, chicken β-actin, CMV immediate early, and SV40 early promoters. However, Gaiano et al. [17] reported that the Mo-MLV promoter was relatively inactive in early (E8.5 to E9.5) progenitor cells of the CNS. This problem is reminiscent of the failure of the LTR to express in embryonic stem cells and preimplantation embryos, which appears to be due to inhibition of the LTR in stem cells. Gaiano et al. found that an internal promoter of EF1α or CMV/β-actin resulted in more expression in early progenitor cells as well as in later neurons. These findings suggest that one should test for stable expression in the area of interest, using the infection time and site that will be used for future experiments, before choosing the promoter. However, even after performing such preliminary experiments, and even with the choice of an apparently constitutive promoter, it is important to restrict one's conclusions about lineal relationships to cells that are marked and not to make assumptions about their relationships to cells that are unmarked. We have found, in some clones, that not all cells express a reporter gene. This is true even in control situations, such as in clones of NIH-3T3 fibroblasts infected with either a lacZ or PLAP vector in vitro. This observation has been made using several different promoters and vector designs. We currently are using several vectors and markers, including our original Mo-MLV vectors encoding lacZ, PLAP, or fluorescent proteins. Newer alternatives are the Clontech pQC vectors, as well as derivatives of the lentiviral vector FUGW of Lois et al.

              

DETERMINATION OF SIBLING RELATIONSHIPS

               When performing lineage analysis, it is critical to unambiguously define cells as descendants of the same progenitor. This can be relatively straightforward when sibling cells remain rather tightly, and reproducibly, grouped. An example of such a straightforward case is the rodent retina, where the descendants of a single progenitor migrate to form a coherent radial array [31, 32]. The 2 analyses described below were applied to the rodent retina, and are applicable in any system where clones are arranged simply and reproducibly. The first assay is to perform a standard virological titration in which a particular viral inoculum is serially diluted and applied to tissue. In the retina, the number of radial arrays, their average size, and their cellular composition were analyzed in a series of animals infected with dilutions that covered a 3 log range. The number of arrays was found to be linearly related to the inoculum size, while the size and composition were unchanged. Such results indicate that the working definition of a clone, in this case a radial array, fulfilled the statistical criteria expected of a single hit event.
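One way to check the linear relationship described above is to regress the logarithm of the array count on the logarithm of the relative inoculum; a slope near 1 is what a single-hit process would be expected to give. The following is only a minimal sketch: the function name `loglog_slope` and the counts are hypothetical illustrations, not data or code from the study, and every count must be greater than zero for the logarithm to be defined.

```python
import math

def loglog_slope(relative_inoculum, array_counts):
    """Least-squares slope of log10(count) against log10(inoculum).

    A slope close to 1 means the number of arrays scales linearly with
    the amount of virus applied, as expected for single-hit events.
    """
    xs = [math.log10(x) for x in relative_inoculum]
    ys = [math.log10(y) for y in array_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical counts over a 3 log range of dilutions (illustration only).
dilutions = [1.0, 0.1, 0.01, 0.001]
counts = [820, 95, 8, 1]
print(f"log-log slope = {loglog_slope(dilutions, counts):.2f}")
```

A slope well above 1 would instead suggest that more than one infection event contributes to each scored array.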

               The second assay is to perform a mixed infection using 2 different retroviruses in which the histochemical reporter genes are distinctive. Two such viruses might encode cytoplasmically localized vs. nuclear-localized β-gal. This can work when the cytoplasmically localized β-gal is easily distinguished from the nuclear-localized β-gal [19, 33]. We have found that this is not the case in rodent nervous system cells, as the cytoplasmically localized β-gal quite often is restricted to neuronal cell bodies and is therefore difficult to distinguish from nuclear-localized β-gal. In order to overcome this problem, we created the PLAP-encoding DAP virus [20], which is distinctive from the lacZ-encoding BAG virus. A stock containing BAG and DAP was produced by ψ2 producer cells grown on the same dish. The resulting supernatant was concentrated and used to infect rodent retina. The tissue was then analyzed histochemically for the presence of blue (due to BAG infection) and purple (due to DAP infection) radial arrays. If radial arrays were truly clonal, then each one should be only 1 color. Analysis of approximately 1100 arrays indicated that most were clonal. However, 5 comprised both blue and purple cells. This value is an underestimate of the true frequency of incorrect assignment of clonal boundaries, as sometimes 2 BAG or 2 DAP virions will infect adjacent cells and thus not lead to formation of bi-colored arrays. A closer approximation of the true frequency can be obtained by using the following formula (for derivation, see Fields-Berry et al. 1992 [20]).

               \[ \%\ \text{errors} = \frac{\#\ \text{bi-colored arrays}}{\#\ \text{total arrays}} \times \frac{(a + b)^2}{2ab} \times 100 \]

where a and b are the relative titers comprising the virus stock. The relative titer of BAG and DAP used in the co-infection was 3:1 and thus the value for percent errors in clonal assignments was 1.2%.
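The arithmetic behind the 1.2% figure can be reproduced with a few lines of Python. This is a sketch of the formula as stated in the text; the function name `percent_clonal_errors` is an illustrative choice, and the array total is taken as exactly 1100 although the text gives it as approximate.

```python
def percent_clonal_errors(n_bicolored, n_total, a, b):
    """Estimated percentage of arrays wrongly scored as single clones.

    n_bicolored / n_total is the observed fraction of two-colored arrays.
    The factor (a + b)**2 / (2 * a * b) corrects for double infections by
    two virions of the same color, which cannot be seen directly; a and b
    are the relative titers of the two viruses in the stock.
    """
    if n_total <= 0 or a <= 0 or b <= 0:
        raise ValueError("array total and titers must be positive")
    observed_fraction = n_bicolored / n_total
    correction = (a + b) ** 2 / (2 * a * b)
    return 100 * observed_fraction * correction

# Values from the text: 5 bi-colored arrays of ~1100, BAG:DAP = 3:1.
print(round(percent_clonal_errors(5, 1100, 3, 1), 1))  # 1.2
```

Note that the correction factor is smallest (2) when the two viruses are present at equal titer and grows as the ratio becomes more skewed, so a balanced stock gives the most sensitive test.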

               The value of 1.2% for errors in assignment of clonal boundaries includes errors due to both aggregation and independent virions (e.g. perhaps due to helper-virus mediated spread) infecting adjacent progenitors. The percent errors in other areas of an animal will depend upon the particular circumstances of the injection site, and upon the multiplicity of infection (MOI, the ratio of infectious virions to target cells). Most of the time MOI will be quite low (e.g. in the retina it was approximately 0.01 at the highest concentration of virus injected). Concerning the injection site, injection into a lumen, such as the lateral ventricles, should not promote aggregation nor high local MOI, but injection into solid tissue in which the majority of the inoculum has access to a limited number of cells at the inoculation site, could present problems. By co-injecting BAG and DAP, one can monitor the frequency of these events and thus determine if clonal analysis is feasible.

               An error rate as small as 1.2% does not affect the interpretation of "clones" that are frequently found in a large data set. However, as with any experimental procedure that relies in some way on statistical analysis, rare associations of cell types must be interpreted with some caution and conclusions cannot be drawn independently of other data.

               The above analysis was performed using viruses that were produced on the same dish and concentrated together. This was done as we felt that the most likely way that 2 adjacent progenitors might become infected would be through small aggregates of virions. Aggregation most likely occurs during the concentration step as one often can see macroscopic aggregates after resuspending pellets of virions. Thus, when the 2-marker approach is used to analyze clonal relationships, it is best to co-concentrate the 2 vectors together in order for the assay to be sensitive to aggregation due to this aspect of the procedure. (Although aggregation of virions may frequently occur during concentration, it apparently does not frequently lead to problems in lineage analysis presumably due to the high ratio of non-infectious particles to infectious particles found in most retrovirus stocks. It is estimated that only 0.1–1.0 % of the particles will generate a successful infection. Moreover, most aggregates are probably not efficient as infectious units; it must be difficult for the rare infectious particle(s) within such a clump to gain access to the viral receptors on a target cell.)

METHODS FOR DETERMINING LINEAGE ANALYSIS

               In order to determine the ratio of 2 genomes present in a mixed virus stock (e.g. BAG plus DAP), there are several methods that can be used. The first 2 methods are performed in vitro, and are simply an extension of a titration assay. Any virus stock is normally titered on NIH 3T3 cells to determine the amount of virus to inject. The infected NIH 3T3 cells are then either selected for the expression of a selectable marker when the virus encodes such a gene (e.g. neo in BAG and DAP), or are stained directly, histochemically, for β-gal or PLAP activity without prior selection in drugs. If no selection is used, the relative ratio of the two markers can be scored directly by evaluating the number of clones of each color on a dish. Alternatively, selected G418-resistant colonies can be stained histochemically for both enzyme activities and the relative ratio of blue vs. purple G418-resistant colonies computed. A third method of evaluating the ratio of the 2 genomes is to use the values observed from in vivo infections. After animals are infected and processed for both histochemical stains, the ratio of the 2 genomes can be compared by counting the number of clones, or infected cells, of each color. When all the above methods were applied to lineage analysis in mouse retina [20] and rat striatum [34], the value obtained for the ratio of G418-resistant colonies scored histochemically was almost identical to the ratio observed in vivo. Directly scoring histochemically stained, non-G418-selected NIH 3T3 cells led to an underestimate of the number of BAG-infected colonies, presumably as such cells often are only faint blue, while DAP-infected cells are usually an intense purple. In vivo, this is not generally the case as BAG-infected cells are usually deep blue.
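Whichever scoring method is used, the colony or clone counts reduce to two relative titers, which then feed the correction factor (a + b)^2 / 2ab used in the error formula. A minimal sketch follows; the function names and the counts of 150 blue and 50 purple colonies are hypothetical, chosen only to give a 3:1 ratio.

```python
def relative_titers(n_blue, n_purple):
    """Convert colony (or clone) counts of each color into relative titers."""
    total = n_blue + n_purple
    if total == 0:
        raise ValueError("no colonies were scored")
    return n_blue / total, n_purple / total

def same_color_correction(a, b):
    """Correction factor (a + b)**2 / (2 * a * b) for unseen same-color pairs."""
    return (a + b) ** 2 / (2 * a * b)

# Hypothetical counts: 150 blue (BAG) and 50 purple (DAP) colonies.
a, b = relative_titers(150, 50)
print(a, b)                                   # 0.75 0.25
print(round(same_color_correction(a, b), 3))  # 2.667
```

Because only the ratio of a to b matters in the correction factor, normalized fractions and raw ratios such as 3:1 give the same result.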

               The method of injecting 2 distinctive viruses is a straightforward and feasible method of assessing clonal boundaries when they are fairly easy to define. This method does not require circumstances where there is a wide range of dilutions that can be injected to give countable numbers of events, which is required for a reliable dilution analysis. Moreover, it does not rely as critically on controlling the exact volume of injection, as is required in the dilution analysis. The use of a small number of vectors that encode distinctive histochemical products for definition of sibling relationships is only appropriate when there is very little migration of sibling cells. In these cases, the arrangement of cells that will be used to identify clonal relationships can be defined, and then this definition can be tested as described above. The fit of the definition with true clonal relationships will be revealed by the percentage of defined "clones" that are of more than one color. When the error is too great, one can re-evaluate the criteria, make a new definition, and again test it by looking for "clones" that are more than one color. Through trial and error, an accurate definition of sibling relations should be possible when migration is not too great.

               When one cannot accurately define clonal relationships with a few distinctive viruses, a much greater number of vectors must be used. One can employ a library of retroviral vectors, each member of which is tagged with a unique small insert of an irrelevant DNA. Each vector is scored using the polymerase chain reaction (PCR). The library/PCR method is tedious but extremely worthwhile when dealing with problematic areas.

               Regardless of which method is used to score sibling relationships, one further recommendation to aid in the assignments is to choose an injection site which will allow the inoculum to spread. If one injects into a packed tissue, the viral inoculum will most likely infect cells within the injection tract and it will be very difficult to sort out sibling relationships. For example, a lumen, such as the neural tube, provides an ideal site for injection. Regardless of site, one must inject such that the virus has clear access to the target population; the virus will bind to cells at the injection site and will not gain access to cells that are not directly adjoining that

PREPARATION AND USE OF A RETROVIRAL LIBRARY FOR LINEAGE ANALYSIS USING PCR AND SEQUENCING

               We developed a direct approach to address lumping and splitting errors [41] by constructing a library of viruses that was analyzed by PCR. In our first libraries, each virus of the library carried one member from a pool of approximately 100 DNA fragments from Arabidopsis thaliana DNA, in addition to the lacZ or PLAP gene. Infected cells, recognized by their enzyme activity, were mapped and the positive cells cut from cryosections. The A. thaliana DNA was amplified by PCR and characterized by size and restriction enzyme digestion patterns. If the size and restriction digestion pattern of the PCR product from two or more cells was the same, they were considered siblings with a probability calculated on the basis of the number of infections in that brain and the complexity of the library [41, 42]. Lineage analysis using such libraries revealed novel lineal relationships in the rat cerebral cortex [41, 43, 44] and chick diencephalon [45]. However, the limited number of unique members in the library made from A. thaliana DNA restricted the analysis to tissues with low infection rates.

               More data could be acquired with each experiment and additional questions could be addressed in the central nervous system and other tissues with a more complex library containing a greater number of DNA tags. We therefore constructed several retroviral vectors, of which CHAPOL (chick alkaline phosphatase with oligonucleotide library) is the prototype [46], that include degenerate oligonucleotides with a theoretical complexity of 1.7 x 10^7. Studies in the developing nervous system of the chick have been successfully completed using CHAPOL [47, 48].

               A summary of the production of CHAPOL and BOLAP (an oligo library in a murine vector) will be given here; a detailed description of the construction of CHAPOL can be found elsewhere [46]. For either avian or murine retroviruses, the overall strategy is the same. A population of double-stranded DNA molecules that includes a short degenerate region, [(G or C)(A or T)]_12 (the two-base motif repeated 12 times), is generated by PCR amplification of a chemically constructed single-stranded oligonucleotide population of the same sequence. The oligo preparation is ligated into a retrovirus vector and a preparation of highly competent E. coli is transformed. The library is then grown as a pool and a preparation of plasmids from the pool is made. The DNA of the pool is transfected into an avian or mammalian packaging cell line to produce a library of virus particles. The library is injected into an area to be mapped. Infected cells are detected histochemically and each infected cell is recovered for PCR amplification. Each PCR product is then sequenced. Two cells with the same sequence are considered siblings, again with a probability derived from an analysis of the frequency of recovery of each genome (see Walsh et al. 1992 [42]).
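The theoretical complexity of the degenerate region follows directly from its design: each of the 24 positions can take one of two bases, giving 2^24 possible tags. The sketch below computes that number and draws one random tag for illustration; the names `library_complexity` and `random_tag` are hypothetical, and uniform sampling is an assumption that real oligonucleotide synthesis need not satisfy.

```python
import random

FIRST = "GC"    # choices at the first position of each two-base motif
SECOND = "AT"   # choices at the second position of each two-base motif
REPEATS = 12    # the motif is repeated 12 times (24 bases in total)

def library_complexity(repeats=REPEATS):
    """Number of distinct sequences the degenerate region can encode."""
    return (len(FIRST) * len(SECOND)) ** repeats

def random_tag(rng, repeats=REPEATS):
    """Draw one tag, assuming every choice is equally likely."""
    return "".join(rng.choice(FIRST) + rng.choice(SECOND) for _ in range(repeats))

print(library_complexity())  # 16777216, i.e. about 1.7 x 10^7
print(random_tag(random.Random(0)))
```

The number of tags actually present in a virus stock can be lower than this theoretical value, since it is also limited by the number of independent bacterial transformants and transfected cells.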

PREPARATION OF CHAPOL

               The avian replication-incompetent virus CHAP [49, 50], encoding the human placental alkaline phosphatase (PLAP) gene, was modified to accept the oligo inserts. CHAP was linearized, purified, and mixed with the degenerate oligonucleotides in the presence of ligase, and aliquots of the resulting ligation products were used to transform E. coli DH5α. Following transformation, all aliquots were pooled. One hundred µl of the pool was plated at varying dilutions on plates containing ampicillin. The remainder of the pool was divided and added to eight 2 L flasks, each containing 1 L of LB medium with 50 µg/ml ampicillin. The cultures were shaken overnight at 37°C. Plasmid DNA was extracted from these cultures by the Triton lysis procedure and purified on CsCl gradients [51].

               CHAPOL DNA was transfected into the avian virus packaging line Q2bn [14], and the transiently produced virus was collected and concentrated. Aliquots of CaPO4 precipitates of 100 µg CHAPOL DNA were made in 10 ml of HBS. The precipitate in each aliquot was then distributed equally onto ten 10-cm plates of Q2bn, and glycerol shock was carried out for 90 seconds at room temperature 4 hours later. At 24 hours post glycerol shock, the supernatants were collected and pooled. This was repeated at 48 hours. The supernatants from the 24 and 48 hour harvests were pooled and the titer calculated by infection of QT6 cells and assay of the PLAP activity as described (Cepko and Pear in Ausubel et al. 1997 [4]). The stock was filtered through a 0.45 µm filter and concentrated by centrifugation in an SW27 rotor at 4°C, 20K, for 2 hours. The concentrated stock was titered on QT6 cells and tested for helper virus, which proved negative. The titer of CHAPOL was determined to be 1.1 x 10^7 CFU/ml. The same stock has been used for all experiments conducted over a 5 year period and many aliquots remain. We recommend making large stocks and storing them as small aliquots.

USE OF CHAPOL

               CHAPOL was used to infect the developing brain of chick embryos using the procedures outlined above. At various times later, the tissue was harvested and stained for AP activity. The outline of each section was drawn by camera lucida and the location and type of cells labeled on each section were recorded. A single cell or cluster of cells with a small group of surrounding cells were removed using a heat pulled glass micropipette (figure 2) and transferred to a 96-well PCR plate for Proteinase K digestion (as below). Following digestion, nested PCR was performed (as below). The product of each PCR was run on a 1.5% agarose gel to determine if a product of the appropriate size had been amplified. The recovery of a PCR product of the proper size occurred from PCR of 30-85% of the picks (the frequency varied depending upon the batch of PCRs and the tissue being studied) using CHAPOL. Sequencing of the oligonucleotide insert (as below) was performed on all reactions which gave the expected product on the agarose gel analysis (e.g. figure 3) and was successful approximately 75% of the time. All sequences were stored in the software program GCG (1991). All common sequences were pulled from the database created in GCG and the corresponding cells labeled. Sections were then aligned to determine the three dimensional boundaries of clonal expansion. Each type of cell (e.g. neuron, glia) was also recorded to determine the variety of cells which can arise from a single progenitor.
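The step of pulling common sequences from the database amounts to grouping picks by their tag sequence. The authors used GCG for this; the sketch below is a generic Python equivalent, not their procedure, and the function name, pick identifiers, and shortened tag sequences are all hypothetical.

```python
from collections import defaultdict

def group_picks_by_tag(picks):
    """Group pick identifiers by tag sequence.

    picks is an iterable of (pick_id, tag_sequence) pairs. Only tags
    recovered from two or more picks are returned, since those are the
    candidate sibling groups.
    """
    by_tag = defaultdict(list)
    for pick_id, tag in picks:
        by_tag[tag.upper()].append(pick_id)
    return {tag: ids for tag, ids in by_tag.items() if len(ids) > 1}

# Hypothetical picks: (section-and-cell label, sequenced tag, shortened here).
picks = [
    ("s12_c1", "GACTGTCA"),
    ("s12_c2", "CAGTCTGA"),
    ("s14_c1", "GACTGTCA"),
    ("s15_c3", "gactgtca"),
]
print(group_picks_by_tag(picks))
# {'GACTGTCA': ['s12_c1', 's14_c1', 's15_c3']}
```

Exact string matching is the simplest criterion; in practice one may need to decide how to treat ambiguous base calls before grouping.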

The value of using a complex library of vectors is illustrated by the view of more heavily infected brains (figure 4) where closely aggregated AP+ cells would have been lumped into a single clone based on proximity. In addition, even lightly infected brains could give rise to lumping errors (also see Walsh and Cepko, 1992 [41]).

               Two issues are important for determining the value of this type of library of DNA markers: the number of unique members in the library and the distribution of the library members [42]. If only 2 members exist in the library, for example, then there is a one in two chance that the same tag will be selected in two consecutive picks (if they are present in equal concentrations). If 100 members exist in the library at equal concentrations, the chance that two picks come up with the same member is reduced to 10^-2. The second important variable determining the quality of the library is the distribution of the members within the library. This can be illustrated as follows. Consider a library composed of 10^6 members, with 50% of the library composed of one member. If two neighboring or distant cells are found to carry the overrepresented insert, the probability that the two cells arose from separate clones is still 0.5. CHAPOL was found to have an equal distribution in that each of the inserts picked to date (n > 500) has occurred independently only once. One further issue to consider is the level of difficulty in using the library. We have found in practice that this method of tag identification is in fact easier than our previous method based upon the analysis of the size and restriction digestion pattern of A. thaliana DNA.
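The arithmetic in this paragraph can be checked with a short Python sketch. It assumes a simple model that the text implies but does not state formally: two picks are independent draws from the library, so the chance that they carry the same tag is the sum of the squared member frequencies. The function name and the way the skewed library is built are illustrative choices only.

```python
import math

def coincidence_probability(weights):
    """Chance that two independent picks carry the same tag.

    `weights` are relative abundances of the library members; they are
    normalized to frequencies p_i and the result is sum(p_i ** 2).
    """
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)

# Equal representation: the probability is 1 / (number of members).
two_members = coincidence_probability([1, 1])
hundred_members = coincidence_probability([1] * 100)

# Skewed library: 10^6 members, one of which makes up 50% of the library.
n = 10**6
skewed = coincidence_probability([n - 1] + [1] * (n - 1))

print(two_members, hundred_members, skewed)
```

Under this model the skewed library behaves, for coincidence purposes, like a library of only about four equally represented members, despite having 10^6 distinct tags. This is why the distribution matters as much as the member count.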

               These libraries should be useful for application in a wide range of tissues and species. The host range has been expanded to previously uninfectable hosts [11], and infection of non-neural tissue with CHAPOL has been observed, as one would expect based upon experience with avian and murine retroviruses.

 

PCR AND SEQUENCING FROM CHAPOL

Proteinase K digestion: The coverslips were removed from slides by immersion in sterile H2O. Single cells or small clusters of cells containing purple NBT precipitate with surrounding unlabeled tissue (approximately 0.5 to 2 mm tissue fragments) were scraped from the slide using a heat pulled glass micropipette (figure 2). The cells were transferred to a 96 well PCR plate (Hybaid) containing 10 µl of a proteinase K solution (50 mM KCl, 10 mM Tris-HCl pH 7.5, 2.5 mM MgCl2, 0.02% Tween 20, 200 µg/ml proteinase K). Each well was overlaid with 1 drop of light mineral oil (Sigma) and the plates were heated to 60°C for 2 hours, 85°C for 20 minutes, and 95°C for 10 minutes in a Hybaid OmniGene thermocycler.

Nested PCR: The first PCR was accomplished by adding 0.15 µl Taq polymerase (Boehringer Mannheim), 0.15 µl dNTP mix (Boehringer Mannheim), 0.75 µl each of 10 µM oligonucleotide 0 (5'TGTGGCTGCCTGCACCCCAGGAAAG3') and 10 µM oligonucleotide 5 (5'GTGTGCTGTCGAGCCGCCTTCAATG3'), 2 µl PCR buffer with MgCl2 (Boehringer Mannheim) and 16.2 µl of H2O to each well of the 10 µl proteinase K solution (final volume 30 µl). This was cycled at 93°C x 2.5 minutes; [(94°C x 45 seconds)(72°C x 2 minutes)] x 40 cycles; 72°C x 5 minutes.

The second PCR was performed with 1 µl of reaction product from the first PCR added to 0.25 µl Taq polymerase (Boehringer Mannheim), 0.25 µl dNTP mix (Boehringer Mannheim), 1 µl each of 10 µM oligonucleotide 2 (5'GCCACCACCTACAGCCCAGTGG3') and 10 µM oligonucleotide 3 (5'GAGAGAGTGCCGCGGTAATGGG3'), 2 µl PCR buffer with MgCl2 (Boehringer Mannheim) and 14.5 µl of H2O (final volume 30 µl). The reaction was thermocycled at 93°C x 2.5 minutes; [(94°C x 45 seconds)(70°C x 2 minutes)] x 30 cycles; 72°C x 5 minutes. An aliquot of DNA was run on a 1.5% agarose gel (0.75% SeaKem, 0.75% NuSieve) to ensure that the appropriate insert was present (figure 2).
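When many picks are processed on one 96 well plate, the per-well additions for the first-round PCR are usually combined into a master mix. The sketch below scales the first-round recipe given above, reading all small volumes as microliters (an interpretation of the printed units) and adding a 10% pipetting overage, which is an assumption of this example and not part of the published protocol.

```python
# First-round additions per well, in microliters, as listed in the protocol.
FIRST_ROUND_UL = {
    "Taq polymerase": 0.15,
    "dNTP mix": 0.15,
    "oligonucleotide 0 (10 uM)": 0.75,
    "oligonucleotide 5 (10 uM)": 0.75,
    "PCR buffer with MgCl2": 2.0,
    "water": 16.2,
}
DIGEST_UL = 10.0  # proteinase K digest already in each well

def per_well_total(recipe, template_ul):
    """Final reaction volume per well: added reagents plus the digest."""
    return sum(recipe.values()) + template_ul

def master_mix(recipe, n_wells, overage=0.10):
    """Scale each reagent to n_wells, with a fractional overage (assumed 10%)."""
    factor = n_wells * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in recipe.items()}

print(f"final volume per well: {per_well_total(FIRST_ROUND_UL, DIGEST_UL):.1f} ul")
for name, vol in master_mix(FIRST_ROUND_UL, 96).items():
    print(f"{name:28s} {vol:8.2f} ul")
```

Summing the recipe is also a quick consistency check: the first-round additions total 20 µl, which together with the 10 µl digest gives the stated 30 µl final volume.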

Sequencing: Sequencing was performed using the Cyclist™ Exo- Pfu DNA sequencing kit from Stratagene. Briefly, 5 µl of each d/ddNTP mix was added to four wells on a 96 well Hybaid plate. To each of these wells 25% of the following mixture was added: 1 µl of the nested PCR product, 1 µl of 10 µM Oligo 3, 3 µl 10 x sequencing buffer, 1 µl Exo- Pfu, 0.75 µl 35S (10 µCi), 4 µl DMSO, and 11.25 µl H2O. This was cycled at 95°C x 5 minutes; [(95°C x 30 seconds)(60°C x 30 seconds)(72°C x 1 minute)] x 30 cycles. Sequencing reactions were analyzed on a 6% acrylamide denaturing gel (figure 3).

NOTE: Reagents, instruments, and glass microscope slides should be handled with scrupulous technique, and UV-irradiated when needed to destroy contaminating DNA.

CREATION OF BOLAP

The BOLAP library was created in a murine retrovirus vector, pBABE [8], into which the P-ALP1 gene was inserted to create pBABE-AP (Fields-Berry and Cepko, unpublished: see Figure 5). BABE-AP-X was generated by inserting PCR amplified DPL2 into the AscI and BglII sites of BABE-AP. After the ligation, BABE-AP-X was digested with AscI and XhoI, phenol/chloroform extracted, and ethanol precipitated. The details are supplied here as an example of how to make such a library. BOLAP is available upon request.

DPL1 was prepared by PCR amplification of the following reaction mix:

1 µl 1/100 diluted DPL (0.6 mg/ml), 1 µl DPLP (0.44 mg/ml), 1 µl DPLP5 (0.2 mg/ml), 5 µl 2.5 mM dNTP mix, 10 µl 10 X PCR buffer (Boehringer Mannheim), 79 µl water, 2 µl Taq DNA polymerase (Boehringer Mannheim)

The amplification program was:

93°C for 2.5 min., [94°C for 45 sec, 70°C for 2 min] X 30, 72°C for 5 min.

The product was phenol/chloroform extracted, ethanol precipitated, digested with AscI and XhoI, gel purified, and ethanol precipitated. Approximately 3 µg of AscI/XhoI digested BABE-AP-X was ligated to 25 ng of AscI/XhoI digested DPL1 in a 100 µl ligation reaction at 16°C overnight. The product was phenol/chloroform extracted, ethanol precipitated, and resuspended in 50 µl TE. Twenty µl of ligation product was used to transform ElectroMAX DH10B (Gibco/BRL) competent cells according to the manufacturer's instructions. The transformed cells were pooled. One ml of the pooled cells was serially diluted and plated on LB/amp plates. The remainder of the pool was split into five 1-liter cultures of TB containing ampicillin and kanamycin sulfate (each at 50 µg/ml) and shaken overnight at 37°C. 3.7 mg of BOLAP plasmid was extracted from the overnight cultures by Triton lysis, followed by purification via CsCl gradient. In the case of our preparation of BOLAP, the number of colonies on the LB/amp plates projected the total number of transformants to be 1.28 x 10^7. A control ligation containing no DPL1 projected the background of vector without DPL1 insert to be 0.9%.

 

BOLAP VIRUS PRODUCTION AND EVALUATION

Eleven confluent 10 cm dishes of Bosc23 cells [9] were split into fifty 10 cm dishes. The next morning, 350 µg of BOLAP DNA was combined with 1.5 ml Lipofectamine (Gibco/BRL) in 25 ml of Opti-MEM (Gibco/BRL). The mixture was incubated for 20 minutes at room temperature and then added to 200 ml DME. The fifty plates were washed with DME and 5 ml of the DNA/Lipofectamine/DME mixture was added to each plate. The plates were incubated for 5 hrs at 37°C. Five ml of 20% fetal bovine serum in DME was then added to each plate and the plates were returned to the incubator overnight. The next morning the supernatant was harvested and replaced with 5 ml of 10% FCS in DME. The following morning, this second supernatant was harvested. The supernatants were pooled, filtered through a 0.45 µm filter unit and concentrated by centrifugation at 20K rpm for 2 hr at 4°C. The final titer of the concentrated viral supernatant was 1 x 10^8 cfu/ml.

               To test the complexity of the library, the protocols of [46] were followed. One ml of viral supernatant was diluted into 30 ml DME with 10% calf serum. One to two ml of this diluted viral stock was used to infect a 30-50% confluent 6 cm dish of NIH3T3 cells. Five to six hours later, the infected cells were trypsinized, diluted ten fold, and plated on 96 well dishes. Three days later, the plates were washed with PBS, fixed with 4% paraformaldehyde for 10 minutes, washed three times with PBS for 5 minutes each, heated to 65°C for 25 minutes, and then stained overnight for AP activity with X-Phos and NBT. The following morning, each well was examined for the presence of a single discrete grouping of AP+ cells. The cells in chosen wells were washed with PBS. Ten µl of 400 µg/ml proteinase K solution (50 mM KCl, 10 mM Tris-HCl, pH 7.5, 2.5 mM MgCl2, 0.02% Tween-20) was added and the cells and solution were scraped/suctioned off and placed in individual wells in a 96 well dish. A drop of mineral oil was placed over each well and the plate was heated to 65°C for 2 hrs, 85°C for 20 min, and 95°C for 10 min.

               A nested PCR was performed as follows. To each well was added 20 µl of the following reaction mix:

2 µl 10 X PCR buffer with Mg (Boehringer Mannheim)

0.15 µl BOLAPO 5 and 6 (0.6 mg/ml)

0.15 µl dNTP mixture (25 mM) (Boehringer Mannheim)

17.4 µl water

0.4 µl Taq DNA polymerase (Boehringer Mannheim)

The reactions were cycled as follows:

93°C for 2.5 min, [94°C for 45 sec, 67°C for 2 min, 72°C for 2 min] X 33, 72°C for 5 min.

One µl of the above reaction was added to 20 µl of the same reaction mix, substituting BOLAPO 7 and 8 for BOLAPO 5 and 6. The amplification program was:

93°C for 2.5 min., [94°C for 45 sec, 72°C for 2 min] X 30, 72°C for 5 min.

Eight µl of the second reaction mix was added to 2.5 µl of gel loading buffer and then fractionated on a 3% NuSieve GTG/1% SeaKem ME agarose gel. Amplifications which yielded a DNA product of the appropriate molecular weight (bp) were sequenced using the Exo- Pfu Cyclist kit (Stratagene). One µl of nested PCR product was added to the following reaction mix:

0.15 µl BOLAPO 7 (0.6 mg/ml)

3 µl 10 X Sequencing buffer

0.75 µl 35S-(alpha)dATP

4 µl DMSO

12.1 µl water

1 µl Exo- Pfu DNA polymerase

 

This reaction was mixed and 5 µl portions were added separately to 5 µl of each of the four dNTP/ddNTP mixtures. The reactions were overlaid with oil and cycled as follows:

95°C for 5 min [95°C for 30 sec, 60°C for 30 sec, 72°C for 1 min] X 30

The reactions were terminated by the addition of 5 µl of stop buffer, and then fractionated by 6% acrylamide gel electrophoresis.

At this point, 98 individual clones have been sequenced in our laboratory. All carry a unique DPL1 insert. Of those, the degenerate oligo region was shorter than 24 bases in seven clones and longer in one clone. Occasional variations in the (GC)(AT) sequence were also seen.
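What 98 unique inserts out of 98 sequenced clones can and cannot establish may be gauged with a birthday-problem calculation. The sketch below assumes that all library members are equally represented, which is an idealization, and reads the projected number of transformants reported above as 1.28 x 10^7. The 5% threshold in the second function is an arbitrary illustrative choice.

```python
import math

def prob_all_unique(n_members, n_picks):
    """Probability that n_picks independent picks from n_members equally
    represented tags are all different (birthday problem)."""
    log_p = sum(math.log1p(-i / n_members) for i in range(n_picks))
    return math.exp(log_p)

def min_members_consistent(n_picks, alpha=0.05):
    """Smallest equal-frequency library size for which observing no
    duplicates in n_picks picks has probability of at least alpha."""
    lo = hi = n_picks
    while prob_all_unique(hi, n_picks) < alpha:
        hi *= 2
    while lo < hi:  # binary search; the probability rises with library size
        mid = (lo + hi) // 2
        if prob_all_unique(mid, n_picks) >= alpha:
            hi = mid
        else:
            lo = mid + 1
    return lo

p_projected = prob_all_unique(12_800_000, 98)
lower_bound = min_members_consistent(98)
print(f"P(98 unique | 1.28e7 members) = {p_projected:.5f}")
print(f"smallest library size not rejected at 5%: {lower_bound}")
```

Under these assumptions, 98 unique picks are fully expected for a library near the projected complexity, but on their own they exclude only libraries with fewer than a couple of thousand equally represented members. The colony count remains the main evidence for the higher figure.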

 

SEQUENCES OF OLIGONUCLEOTIDES REFERENCED ABOVE

DPL2 - 5'-TAGGAGGCGCGCCTTT-[(GC)(AT)]12-GTTCTCGAGGACACCTGACTGGCTGAGGGCTTCCGCGACCCGAGATCTCAGCTTCC-3'

DPL1 - 5'-TAGGAGGCGCGCCTTT-[(GC)(AT)]12-GTTACGCGTTAATTAACTCGAGATCTCAGCTTC-3'

DPLP- 5'-GAAGCTGAGATCTCGAGTTA-3'

DPLP5 - 5'-TAGGAGGCGCGCCTTT-3'

BOLAPO 5 - 5'-CCAGGGACTGCAGGTTGTGCCCTGT-3'

BOLAPO 6 - 5'-AGACACACATTCCACAGGGTCGAAG-3'

BOLAPO 7 - 5'-GGCTGCCTGCACCCCAGGAAAGGAG-3'

BOLAPO 8 - 5'-GGTCTCGGAAGCCCTCAGCCCAGTC-3'
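The degenerate region of DPL1 and DPL2 can be read as twelve repeats of (G or C) followed by (A or T), i.e. 24 positions with two choices each. That reading is an interpretation of the notation [(GC)(AT)]12, supported by the 24 base length mentioned in the text. A short Python sketch under that assumption gives the theoretical complexity and a check that a sequenced tag fits the design. The function names are illustrative.

```python
import random
import re

REPEATS = 12  # [(GC)(AT)]12, read as 12 x (G or C)(A or T)
TAG_PATTERN = re.compile(r"(?:[GC][AT]){12}")

def theoretical_complexity(repeats=REPEATS):
    """Number of distinct tags: two choices at each of 2 * repeats positions."""
    return 2 ** (2 * repeats)

def random_tag(rng, repeats=REPEATS):
    """Draw one tag uniformly from the designed degenerate region."""
    return "".join(rng.choice("GC") + rng.choice("AT") for _ in range(repeats))

def is_valid_tag(seq):
    """True if seq is exactly 24 bases alternating (G/C)(A/T)."""
    return TAG_PATTERN.fullmatch(seq.upper()) is not None

rng = random.Random(0)  # fixed seed so the example is repeatable
tag = random_tag(rng)
print(theoretical_complexity(), tag, is_valid_tag(tag))
```

A tag that fails `is_valid_tag` is not necessarily a sequencing error: the text reports clones with shorter or longer degenerate regions and occasional departures from the (GC)(AT) alternation, so such tags are better flagged for inspection than discarded.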

 

FIGURE LEGENDS

Figure 1:

The VSV G protein endows avian retroviral vectors with a high efficiency of infection in vivo. Comparable amounts of an avian replication-incompetent vector, RIA-AP [15], encoding PLAP were injected into developing chick embryos. RIA-AP (G) virion particles contained VSV G protein on their surface and RIA-AP (A) virions contained the avian retroviral A env protein on their surface. Embryos were injected at stage 18 into the eye (A and B), stage 10 into the neural tube (E and F), or stage 18 into the limb bud or heart regions (C, D, G, H). Embryos injected with RIA-AP (A) are shown in panels A, C, and E, and those injected with RIA-AP (G) in panels B, D, F, and H. Sections of infected limbs are shown in G and H. Approximately 3 days postinfection, the embryos were stained to reveal PLAP activity; red arrowheads indicate the limb regions and blue arrowheads the heart.

Figure 2:

Picking of AP+ cells from tissue sections following infection with CHAPOL. (A) Several AP+ cells are present in a chick cerebellar section, including a Purkinje cell (arrow) and many glial cells. (B) After removal of the Purkinje cell using a glass micropipette.

Figure 3:

The sequence reactions from the PCR products of four representative samples (samples 1-4) are shown. Sequencing of PCR products 1 and 2 each revealed the presence of a unique sequence. Sequencing of PCR products 2 and 4 yielded the same sequence. Sequencing of sample 3 showed the presence of more than one species.

Figure 4:

A 60 µm parasagittal section of a chick cerebellum from a brain infected with CHAPOL and analyzed at P14. The AP+ cells were removed, subjected to PCR, and the PCR products were sequenced. Each pick that yielded a PCR product is labeled by an arrow. The sequence identity of each PCR product is color-coded. The cells of the clone indicated by the magenta arrows were not closely clustered, and would have resulted in a splitting error if certain geometric criteria were used in clonal definition. The clones indicated by the blue and the green arrows were located very close to each other, and would have resulted in a lumping error if certain geometric criteria of clonal assignment were used.

Figure 5:

BOLAP, a murine retroviral preparation encoding an oligonucleotide library. A degenerate oligonucleotide pool (DPL1, see text) was inserted into the BABE-Neo plasmid encoding P-ALP1. The positions and orientations of the oligonucleotides BOLAPO 5-8 used to amplify each insert and for sequencing are indicated by the arrows. LTR, long terminal repeat; gag, amino terminus of MMLV gag gene; SV40, simian virus 40 early promoter; neo, Tn5 neomycin resistance gene; pUC ori, bacterial origin of replication.

REFERENCES

1.           Coffin, J.M., S.H. Hughes and H.E. Varmus, Retroviruses. 1997, Plainview, NY: Cold Spring Harbor Laboratory Press.

2.           Roe, T.Y., T.C. Reynolds, G. Yu and P.O. Brown, EMBO J., 1993. 12: p. 2099-2108.

3.           Naldini, L., U. Blomer, P. Gallay, D. Ory, R. Mulligan, F.H. Gage, I.M. Verma and D. Trono, Science, 1996. 272(5259): p. 263-7.

4.           Ausubel, F.M., R. Brent, R.E. Kingston, D.D. Moore, J.G. Seidman, J.A. Smith and K. Struhl, Current Protocols in Molecular Biology. 1997, New York, NY: Greene Publishing Associates.

5.           Mann, R., R.C. Mulligan and D. Baltimore, Cell, 1983. 33(1): p. 153-9.

6.           Danos, O. and R.C. Mulligan, Proc Natl Acad Sci U S A, 1988. 85(17): p. 6460-4.

7.           Markowitz, D., S. Goff and A. Bank, J Virol, 1988. 62(4): p. 1120-4.

8.           Morgenstern, J.P. and H. Land, Nucleic Acids Res, 1990. 18(12): p. 3587-96.

9.           Pear, W.S., G.P. Nolan, M.L. Scott and D. Baltimore, Proc Natl Acad Sci U S A, 1993. 90(18): p. 8392-6.

10.        Soneoka, Y., P.M. Cannon, E.E. Ramsdale, J.C. Griffiths, G. Romano, S.M. Kingsman and A.J. Kingsman, Nucleic Acids Res, 1995. 23(4): p. 628-33.

11.        Yee, J.K., T. Friedmann and J.C. Burns, Methods Cell Biol, 1994. 43(Pt A): p. 99-112.

12.        Boerkoel, C.F., M.J. Federspiel, D.W. Salter, W. Payne, L.B. Crittenden, H.J. Kung and S.H. Hughes, Virology, 1993. 195(2): p. 669-79.

13.        Cosset, F.L., C. Legras, Y. Chebloune, P. Savatier, P. Thoraval, J.L. Thomas, J. Samarut, V.M. Nigon and G. Verdier, J Virol, 1990. 64(3): p. 1070-8.

14.        Stoker, A.W. and M.J. Bissell, J Virol, 1988. 62(3): p. 1008-15.

15.        Chen, C.-M.A., D.M. Smith, M.A. Peters, M.E.S. Samson, J. Zitz, C.J. Tabin and C.L. Cepko, Dev. Biol., 1999. 214: p. 370-384.

16.        Schlegel, R., T.S. Tralka, M.C. Willingham and I. Pastan, Cell, 1983. 32(2): p. 639-46.

17.        Gaiano, N., J.D. Kohtz, D.H. Turnbull and G. Fishell, Nat Neurosci, 1999. 2(9): p. 812-9.

18.        Price, J., D. Turner and C. Cepko, Proc Natl Acad Sci U S A, 1987. 84(1): p. 156-60.

19.        Galileo, D.S., G.E. Gray, G.C. Owens, J. Majors and J.R. Sanes, Proc Natl Acad Sci U S A, 1990. 87(1): p. 458-62.

20.        Fields-Berry, S.C., A.L. Halliday and C.L. Cepko, Proc Natl Acad Sci U S A, 1992. 89(2): p. 693-7.

21.        Fekete, D.M., J. Perez-Miguelsanz, E.F. Ryder and C.L. Cepko, Dev Biol, 1994. 166(2): p. 666-82.

22.        Chalfie, M., Y. Tu, G. Euskirchen, W.W. Ward and D.C. Prasher, Science, 1994. 263(5148): p. 802-5.

23.        Cepko, C.L., S. Bruhn and D. Fekete, Methods in Cell Biology, .

24.        Rosenberg, W.S., X.O. Breakefield, C. DeAntonio and O. Isacson, Brain Res Mol Brain Res, 1992. 16(3-4): p. 311-5.

25.        Zoellner, H.F. and N. Hunter, J Histochem Cytochem, 1989. 37(12): p. 1893-8.

26.        Bao, Z.Z. and C.L. Cepko, J Neurosci, 1997. 17(4): p. 1425-34.

27.        Snyder, E.Y., D.L. Deitcher, C. Walsh, S. Arnold-Aldea, E.A. Hartwieg and C.L. Cepko, Cell, 1992. 68(1): p. 33-51.

28.        Moriyoshi, K., L.J. Richards, C. Akazawa, D.D. O'Leary and S. Nakanishi, Neuron, 1996. 16(2): p. 255-60.

29.        Zlokarnik, G., P.A. Negulescu, T.E. Knapp, L. Mere, N. Burres, L. Feng, M. Whitney, K. Roemer and R.Y. Tsien, Science, 1998. 279(5347): p. 84-8.

30.        Cepko, C.L., C.P. Austin, C. Walsh, E.F. Ryder, A. Halliday and S. Fields-Berry, Cold Spring Harb Symp Quant Biol, 1990. 55: p. 265-78.

31.        Turner, D.L., E.Y. Snyder and C.L. Cepko, Neuron, 1990. 4(6): p. 833-45.

32.        Turner, D.L. and C.L. Cepko, Nature, 1987. 328(6126): p. 131-6.

33.        Hughes, S.M. and H.M. Blau, Nature, 1990. 345(6273): p. 350-3.

34.        Halliday, A.L. and C.L. Cepko, Neuron, 1992. 9(1): p. 15-26.

35.        Muneoka, K., N. Wanek and S.V. Bryant, J Exp Zool, 1986. 239(2): p. 289-93.

36.        Morgan, B.A. and D.M. Fekete, Methods Cell Biol, 1996. 51: p. 185-218.

37.        Astrin, S.M., E.G. Buss and W.S. Hayward, Nature, 1979. 282(5736): p. 339-41.

38.        Hamburger, V. and H.L. Hamilton, J Morphol, 1951. 88: p. 49-92.

39.        Fekete, D.M. and C.L. Cepko, Mol Cell Biol, 1993. 13(4): p. 2604-13.

40.        Spector, D.L., R.D. Goldman and L.A. Leinwand, Cells: a laboratory manual. 1997, Cold Spring Harbor, NY: Cold Spring Harbor Laboratories Press.

41.        Walsh, C. and C.L. Cepko, Science, 1992. 255(5043): p. 434-40.

42.        Walsh, C., C.L. Cepko, E.F. Ryder, G.M. Church and C. Tabin, Science, 1992. 258: p. 317-320.

43.        Walsh, C. and C.L. Cepko, Nature, 1993. 362(6421): p. 632-5.

44.        Reid, C.B., I. Liang and C. Walsh, Neuron, 1995. 15(2): p. 299-310.

45.        Arnold-Aldea, S.A. and C.L. Cepko, Dev Biol, 1996. 173(1): p. 148-61.

46.        Golden, J.A., S.C. Fields-Berry and C.L. Cepko, Proc Natl Acad Sci U S A, 1995. 92(12): p. 5704-8.

47.        Golden, J.A. and C.L. Cepko, Development, 1996. 122(1): p. 65-78.

48.        Szele, F.G. and C.L. Cepko, Curr Biol, 1996. 6(12): p. 1685-90.

49.        Fields-Berry, S.C., in Genetics. 1990, Harvard University: Cambridge, MA. p. 141.

50.        Ryder, E.F. and C.L. Cepko, Neuron, 1994. 12(5): p. 1011-28.

51.        Sambrook, J., E.F. Fritsch and T. Maniatis, Molecular cloning: a laboratory manual. 1989, Cold Spring Harbor, NY: Cold Spring Harbor Laboratories Press.