Wednesday, February 25, 2015

Extract rRNA and tRNA features from UCSC Browser

Credit to Matthew Speir.


Extract tRNA (ref1)
In the Table Browser, you can use the following steps to get the coordinates for tRNA genes, with all of the tRNA pseudogenes filtered out:

1. Select your assembly and tracks

    clade: Mammal
    genome: Human
    assembly: Feb. 2009 (GRCh37/hg19)
    group: Genes and Gene Predictions Tracks
    track: tRNA
    table: tRNAs
    output: GTF - gene transfer format
    output file: enter a file name to save your results to a file, or leave blank to display results in the browser

2. Click 'Filter'.

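3. Enter 'Pseudo' into the aa field and set its condition to "doesn't".
    The "aa" line should read: aa doesn't match Pseudo

4. Click 'Submit'.

5. After you return to the main Table Browser page, click 'get output'.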
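
If you already have the full tRNAs table saved as tab-separated text with a header line, the same pseudogene filter can be approximated offline. This is only a minimal sketch: the 'aa' field and the 'Pseudo' value come from the steps above, while the other column names, the function name drop_pseudo_trnas, and the sample rows are assumptions made up for illustration.

```python
import csv
import io


def drop_pseudo_trnas(table_text, aa_column="aa"):
    """Return rows whose amino-acid field is not 'Pseudo'.

    Mirrors the Table Browser filter "aa doesn't match Pseudo",
    assuming pseudogenes carry exactly 'Pseudo' in that field.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return [row for row in reader if row[aa_column] != "Pseudo"]


# Made-up rows for illustration only; these are not real tRNA coordinates.
SAMPLE = (
    "chrom\tchromStart\tchromEnd\tname\tstrand\taa\n"
    "chr1\t100\t172\ttRNA-example-1\t+\tGly\n"
    "chr1\t500\t573\ttRNA-example-2\t-\tPseudo\n"
    "chr6\t900\t972\ttRNA-example-3\t+\tAla\n"
)

kept = drop_pseudo_trnas(SAMPLE)
for row in kept:
    print(row["chrom"], row["chromStart"], row["chromEnd"], row["name"], row["aa"])
```

Check the header of your own download before relying on the column names used here.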

Extract rRNA (ref2)
The GENCODE v19 track,
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeGencodeV19,
contains genomic coordinates for human ribosomal RNA, snRNA, and 5S
ribosomal RNA. You can use the following steps to access this
information, and get the output in BED format:

1. Navigate to the table browser, http://genome.ucsc.edu/cgi-bin/hgTables.

2. Select your assembly and tracks

    clade: Mammal
    genome: Human
    assembly: Feb. 2009 (GRCh37/hg19)
    group: Genes and Gene Predictions Tracks
    track: GENCODE Genes V19
    table: Basic (wgEncodeGencodeBasicV19)
    output: BED - browser extensible data
    output file: enter a file name to save your results to a file, or leave blank to display results in the browser

3. Click 'Filter'.

4. Check the box for wgEncodeGencodeAttrsV19 in the 'Linked Tables' section.

5. Click 'allow filtering using fields in checked tables'.

6. This step depends on whether you want the coordinates for the rRNA or the snRNA genes.
    6.1 For rRNA, type 'rRNA' in the 'geneType' and 'transcriptType' fields of the hg19.wgEncodeGencodeAttrsV19 based filters section.
        The "geneType" line should read: geneType does match rRNA
        The "transcriptType" line should read: transcriptType does match rRNA
    6.2 For snRNA, type 'snRNA' in the 'geneType' and 'transcriptType' fields of the hg19.wgEncodeGencodeAttrsV19 based filters section.
        The "geneType" line should read: geneType does match snRNA
        The "transcriptType" line should read: transcriptType does match snRNA

7. Click 'Submit'.

8. After you return to the main Table Browser page, click 'get output'.

Many of the 5S rRNA positions in this table are pseudogenes, and you
may need to try different filtering parameters to exclude these from
the output.
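
The linked-table filter above amounts to a join: pick the transcripts whose geneType and transcriptType both equal the wanted value in the attributes table, then keep only the BED lines with those names. Here is a rough sketch of that idea for files you have downloaded separately. The geneType and transcriptType fields come from the steps above. The 'transcriptId' column name, the assumption that the BED name column holds the transcript ID, the helper names, and all sample rows are illustrative guesses, not confirmed details of the UCSC tables.

```python
import csv
import io


def transcripts_of_type(attrs_text, wanted):
    """Collect transcript IDs whose geneType and transcriptType both equal `wanted`."""
    reader = csv.DictReader(io.StringIO(attrs_text), delimiter="\t")
    return {
        row["transcriptId"]  # assumed column name
        for row in reader
        if row["geneType"] == wanted and row["transcriptType"] == wanted
    }


def filter_bed(bed_text, keep_ids):
    """Keep BED lines whose name field (4th column) is in keep_ids."""
    kept = []
    for line in bed_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 4 and fields[3] in keep_ids:
            kept.append(line)
    return kept


# Made-up attribute rows and BED lines, for illustration only.
ATTRS = (
    "geneId\tgeneType\ttranscriptId\ttranscriptType\n"
    "G1\trRNA\tT1\trRNA\n"
    "G2\tsnRNA\tT2\tsnRNA\n"
    "G3\tprotein_coding\tT3\tprotein_coding\n"
)
BED = (
    "chr1\t100\t219\tT1\t0\t+\n"
    "chr2\t300\t407\tT2\t0\t-\n"
    "chr3\t500\t900\tT3\t0\t+\n"
)

rrna_lines = filter_bed(BED, transcripts_of_type(ATTRS, "rRNA"))
snrna_lines = filter_bed(BED, transcripts_of_type(ATTRS, "snRNA"))
print(rrna_lines)
print(snrna_lines)
```

Swapping "rRNA" for "snRNA" in the call corresponds to choosing step 6.2 instead of 6.1.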

Extract piRNA (ref2)
The coordinates for piRNA are contained in the UCSC Genes track,
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=knownGene. You can
get this information from the Table Browser using steps similar to
those described above:

1. Select your assembly and tracks

    clade: Mammal
    genome: Human
    assembly: Feb. 2009 (GRCh37/hg19)
    group: Genes and Gene Predictions Tracks
    track: UCSC Genes
    table: knownGene
    output: BED - browser extensible data
    output file: enter a file name to save your results to a file, or leave blank to display results in the browser

2. Click 'Filter'.

3. Type '*piRNA*' in the 'description' field of the hg19.kgXref based filters section.
    The "description" line should read: description does match *piRNA*

4. Click 'Submit'.

5. After you return to the main Table Browser page, click 'get output'.
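
The asterisks in '*piRNA*' are wildcards, so the filter keeps any description that contains 'piRNA' anywhere in the text rather than only descriptions equal to 'piRNA'. The short sketch below shows the difference using Python's standard fnmatch module. The description strings are made up, and the match here is case-sensitive, which may not be how the Table Browser treats case.

```python
from fnmatch import fnmatchcase


def select(descriptions, pattern):
    """Return the descriptions matching a shell-style wildcard pattern."""
    return [d for d in descriptions if fnmatchcase(d, pattern)]


# Made-up description strings, for illustration only.
DESCRIPTIONS = [
    "example piRNA cluster transcript",
    "example protein-coding gene",
    "piRNA",
]

with_wildcards = select(DESCRIPTIONS, "*piRNA*")
without_wildcards = select(DESCRIPTIONS, "piRNA")
print(with_wildcards)
print(without_wildcards)
```

Without the wildcards only the third string would be kept, which is why the asterisks matter for a free-text field such as 'description'.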

Extract precursor miRNA (ref2)
Lastly, precursor miRNA coordinates can be found in the sno/miRNA
track, http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgRna.
Again, you can get this information using the Table Browser and steps
similar to those described above:

1. Select your assembly and tracks

    clade: Mammal
    genome: Human
    assembly: Feb. 2009 (GRCh37/hg19)
    group: Genes and Gene Predictions Tracks
    track: sno/miRNA
    table: wgRna
    output: BED - browser extensible data
    output file: enter a file name to save your results to a file, or leave blank to display results in the browser

2. Click 'Filter'.

3. Enter '*miRNA*' into the type field.
    The "type" line should read: type does match *miRNA*

4. Click 'Submit'.

5. After you return to the main Table Browser page, click 'get output'.
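
As with the other tracks, the type filter can be approximated offline if you have the whole wgRna table as tab-separated text with a header. This sketch keeps rows of one type and writes them as six-column BED lines. Only the 'type' field and the 'miRNA' value come from the steps above. The remaining column names, the function name rows_to_bed6, and the sample rows (including the second type value) are assumptions for illustration.

```python
import csv
import io

BED6_COLUMNS = ["chrom", "chromStart", "chromEnd", "name", "score", "strand"]


def rows_to_bed6(table_text, wanted_type="miRNA"):
    """Keep rows of the wanted type and format them as BED6 lines."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    bed_lines = []
    for row in reader:
        if row["type"] != wanted_type:
            continue
        bed_lines.append("\t".join(row[col] for col in BED6_COLUMNS))
    return bed_lines


# Made-up rows for illustration only; not real coordinates or names.
SAMPLE = (
    "chrom\tchromStart\tchromEnd\tname\tscore\tstrand\ttype\n"
    "chr1\t1000\t1085\texample-mir-1\t0\t+\tmiRNA\n"
    "chr1\t2000\t2130\texample-sno-1\t0\t-\tother\n"
    "chrX\t3000\t3090\texample-mir-2\t0\t-\tmiRNA\n"
)

mirna_bed = rows_to_bed6(SAMPLE)
for line in mirna_bed:
    print(line)
```

BED coordinates are 0-based and half-open, so the length of each feature is chromEnd minus chromStart.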





References
1. https://groups.google.com/a/soe.ucsc.edu/forum/#!topic/genome/NWDhuxc360w
2. https://groups.google.com/a/soe.ucsc.edu/forum/#!msg/genome/jSAY8w1JVVo/P6lk4OJzDNEJ
