Dec 25, 2013

Stress gene linked to heart attacks


By Helen Briggs, BBC News

A stress gene has been linked to having a higher risk of dying from a heart attack or heart disease.

Heart patients with the genetic change had a 38% increased risk of heart attack or death, say US researchers.

Personalised medicine may lead to better targeting of psychological or drug treatment to those most at risk, they report in PLOS ONE.

The study adds to evidence stress may directly increase heart disease risk, says the British Heart Foundation.

A team at Duke University School of Medicine studied a single DNA letter change in the human genome, which has been linked to being more vulnerable to the effects of stress.

They found that heart patients with the genetic change had a 38% increased risk of heart attack or death from heart disease after seven years of follow-up, compared with those without it, even after taking into account factors like age, obesity and smoking.
This suggests that stress management techniques and drug therapies could reduce deaths and disability from heart attacks, they say.

Dr Redford Williams, director of the Behavioural Medicine Research Center at Duke University School of Medicine, said the work is the first step towards finding genetic variants that identify people at higher risk of cardiovascular disease.

"This is one step towards the day when we will be able to identify people on the basis of this genotype who are at higher risk of developing heart disease in the first place," he told BBC News.

"That's a step in the direction of personalised medicine for cardiovascular disease."
Identifying people with the genetic change could lead to early interventions for heart patients who are at high risk of dying or having a heart attack, say the researchers.
About one in 10 men and 3% of women in the group of 6,000 heart patients studied had the genetic change associated with handling emotional stress badly.

Commenting on the study, Prof Jeremy Pearson, associate medical director at the British Heart Foundation, said the results provided further evidence that stress may directly increase heart disease risk.

"By finding a possible mechanism behind this relationship, these researchers have suggested tackling the problem either by changing behaviour or, if needed, with existing medicines," he said.

"There are positive lifestyle changes you can make to help you cope with stress. A balanced diet and regular physical activity will help you feel better able to cope with life's demands.

"If you often feel anxious and you're worried about your stress levels, make an appointment to talk it through with your doctor."

Fossil of "most successful mammal" on Earth unearthed in China


A nearly complete skeleton that belongs to the oldest ancestor of "the most evolutionarily successful and long-lived mammal lineage" on Earth has been unearthed in northeastern China, researchers from China and the United States said Thursday.
Dubbed Rugosodon eurasiaticus, the newly discovered species looked a bit like a small rat or a chipmunk. It lived 160 million years ago and was an early member of the group of mammals known as multituberculates, which flourished across the planet from about 170 million to 35 million years ago.
"The new mammal is called Rugosodon after the rugose teeth ornamented by numerous tiny ridges and grooves and pits, indicating that it was an omnivore that fed on leaves and seeds of ferns and gymnosperm plants, plus worms and insects," an international team of scientists from the Chinese Academy of Geological Sciences, the Beijing Museum of Natural History and the University of Chicago said in a statement.
Writing in the U.S. journal Science, the researchers described Rugosodon's ankle bones as being "surprisingly mobile and flexible," a feature that suggests Rugosodon was a fast-running and agile mammal.
"The later multituberculates of the Cretaceous era and the Paleocene epoch are extremely functionally diverse: Some could jump, some could burrow, others could climb trees and many more lived on the ground. The tree-climbing multituberculates and the jumping multituberculates had the most interesting ankle bones, capable of 'hyper-back-rotation' of the hind feet," said ZheXi Luo, professor at the University of Chicago and co-author of the study. "What is surprising about this discovery is that these ankle features were already present in Rugosodon."
The researchers also said Rugosodon was a nocturnal mammal that lived in a temperate climate on the lakeshores of what is now Jianchang County in Liaoning province, northeastern China. At that time, the creature shared the land with the feathered dinosaur Anchiornis, the pterosaur Darwinopterus, abundant arthropods and several other mammals.
The discovery of Rugosodon extends the distribution of certain multituberculates from Europe to Asia during the Late Jurassic period, the researchers said.
"This new fossil from eastern China is very similar to the Late Jurassic fossil teeth of multituberculates from Portugal in western Europe," explained Luo. "This suggests that Rugosodon and its closely related multituberculates had a broad paleogeographic distribution and dispersals back-and-forth across the entire Eurasian continent."
Multituberculates arose in the Jurassic period and went extinct in the Oligocene epoch, occupying a diverse range of habitats for more than 100 million years before they were out-competed by more modern rodents.
Scientists believe that by the end of their run on the planet, multituberculates had evolved complex teeth that allowed them to enjoy vegetarian diets and unique locomotive skills that enabled them to traverse treetops. Both adaptations helped them thrive in the shadows of dinosaurs and survive through their mass extinction 65 million years ago. They became the most abundant mammals of the Mesozoic Era and constitute almost half of all mammal species that lived in the Jurassic and Cretaceous.

Nanoparticles could help sperm defects probe


A way of using nanoparticles to investigate the mechanisms underlying infertility has been developed by British scientists, according to a new report published in a scientific journal on Friday.
The technique, published in Nanomedicine: Nanotechnology, Biology and Medicine, could help researchers discover the causes behind cases of unexplained infertility, and develop treatments for affected couples.
The method involves loading porous silica nanoparticle "envelopes" with compounds to identify, diagnose or treat the causes of infertility. Researchers demonstrated that the nanoparticles could be attached to boar sperm with no detrimental effects on their function.
"An attractive feature of nanoparticles is that they are like an empty envelope that can be loaded with a variety of compounds and inserted into cells," Natalia Barkalina, lead author of the study from Oxford University, said.
"The nanoparticles we use don't appear to interfere with the sperm, making them a perfect delivery vessel."
According to the researchers, sperm are difficult to study because of their small size, unusual shape and short lifetime outside the body, yet studying them is a vital part of infertility research. Previous methods involved complicated procedures in animals and introduced months of delays before the sperm could be used.
The new technique lets researchers expose sperm to nanoparticles in a petri dish, quickly enough for the sperm to survive unharmed.
"We will start with compounds to investigate the biology of infertility, and within a few years may be able to explain or even diagnose rare cases in patients. In future we could even deliver treatments in a similar way," study co-author Celine Jones said.

DNA study reveals nearly 100 potential cancer triggers


U.S. and British researchers said Thursday they have identified a new source of cancer triggers in the little-explored regions of the genome previously considered "junk."
Researchers from Yale University, the Wellcome Trust Sanger Institute and other institutions brought together data from two large-scale genome analysis projects, the 1000 Genomes Project and the ENCODE project, to study non-coding DNA regions and their relation to disease risk.
Unlike the coding region of the genome, where 23,000 protein-coding genes lie, the non-coding region, which makes up 98 percent of our genome, is poorly understood.
The team found that some non-coding DNA regions showed almost the same low levels of variation as protein-coding genes, and called these "ultrasensitive" regions.
Within the ultrasensitive regions, they looked at specific single DNA letters that, when altered, caused the greatest disturbance to many genes, resulting in disease.
They integrated all this information to develop a computer system called FunSeq, which prioritizes genetic variants in the non-coding regions based on their predicted impact on human disease.
The team applied FunSeq to 90 cancer genomes including breast cancer, prostate cancer and brain tumors, and found nearly 100 potential non-coding cancer drivers.
Among the new discoveries was a single DNA letter change that seems to have great impact on the development of breast cancer. The single letter change occurs in an ultrasensitive region central to a network of many related genes, the researchers said.
"Our technique allows scientists to focus in on the most functionally important parts of the non-coding regions of the genome," Professor Mark Gerstein, senior author from Yale University, said in a statement. "This is not just beneficial for cancer research, but can be extended to other genetic diseases too."
The findings were published in the U.S. journal Science.

DNA and RNA Databases


A collection of genomics, functional genomics, and genetics studies and links to their resulting datasets. This resource describes project scope, material, and objectives and provides a mechanism to retrieve datasets that are often difficult to find due to inconsistent annotation, multiple independent submissions, and the varied nature of diverse data types which are often stored in different databases.

The BioSample database contains descriptions of biological source materials used in experimental assays.

A collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality.

A division of GenBank that contains short single-pass reads of cDNA (transcript) sequences. dbEST can be searched directly through the Nucleotide EST Database.

A division of GenBank that contains short single-pass reads of genomic DNA. dbGSS can be searched directly through the Nucleotide GSS Database.

Includes single nucleotide variations, microsatellites, and small-scale insertions and deletions. dbSNP contains population-specific frequency and genotype data, experimental conditions, molecular context, and mapping information for both neutral variations and clinical mutations.

The NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data on a daily basis. GenBank consists of several divisions, most of which can be accessed through the Nucleotide database. The exceptions are the EST and GSS divisions, which are accessed through the Nucleotide EST and Nucleotide GSS databases, respectively.

A compilation of data from the NIAID Influenza Genome Sequencing Project and GenBank.  It provides tools for flu sequence analysis, annotation and submission to GenBank. This resource also has links to other flu sequence resources, and publications and general information about flu viruses.

A collection of nucleotide sequences from several sources, including GenBank, RefSeq, the Third Party Annotation (TPA) database, and PDB. Searching the Nucleotide Database will yield available results from each of its component databases.

Database of related DNA sequences that originate from comparative studies: phylogenetic, population, environmental and, to a lesser degree, mutational. Each record in the database is a set of DNA sequences. For example, a population set provides information on genetic variation within an organism, while a phylogenetic set may contain sequences, and their alignment, of a single gene obtained from several related organisms.

A public registry of nucleic acid reagents designed for use in a wide variety of biomedical research applications, together with information on reagent distributors, probe effectiveness, and computed sequence similarities.

A collection of human gene-specific reference genomic sequences. RefSeqGene is a subset of NCBI’s RefSeq database, and its records are defined based on review from curators of locus-specific databases and the genetic testing community. They form a stable foundation for reporting mutations, for establishing consistent intron and exon numbering conventions, and for defining the coordinates of other biologically significant variation. RefSeqGene is a part of the Locus Reference Genomic (LRG) Collaboration.

A collection of curated, non-redundant genomic DNA, transcript (RNA), and protein sequences produced by NCBI. RefSeqs provide a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. The RefSeq collection is accessed through the Nucleotide and Protein databases.

The Sequence Read Archive (SRA) stores sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Life Technologies AB SOLiD System®, Helicos Biosciences Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®.

A database that contains sequences built from the existing primary sequence data in GenBank. The sequences and corresponding annotations are experimentally supported and have been published in a peer-reviewed scientific journal. TPA records are retrieved through the Nucleotide Database.

A repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects.

A database that provides sets of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.

This database contains libraries of Expressed Sequence Tags (ESTs) organized by organism, tissue type and developmental stage.

A comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information, such as genomic position, genes, and sequences.

DNA and RNA Resource Download


BLAST executables for local use are provided for Solaris, Linux, Windows, and Mac OS X systems. See the README file in the FTP directory for more information. Pre-formatted databases for BLAST nucleotide, protein, and translated searches are also available for download under the db subdirectory.

Sequence databases for use with the stand-alone BLAST programs. The files in this directory are pre-formatted databases that are ready to use with BLAST.

Sequence databases in FASTA format for use with the stand-alone BLAST programs. These databases must be formatted using formatdb before they can be used with BLAST.

This site contains files for all sequence records in GenBank in the default flat file format. The files are organized by GenBank division, and the full contents are described in the README.genbank file.

This site contains all nucleotide and protein sequence records in the Reference Sequence (RefSeq) collection. The "release" directory contains the most current release of the complete collection, while data for selected organisms (such as human, mouse and rat) are available in separate directories. Data are available in FASTA and flat file formats. See the README file for details.

This site contains next-generation sequencing data organized by the submitted sequencing project.

This site contains the trace chromatogram data organized by species. Data include chromatogram, quality scores, FASTA sequences from automatic base calls, and other ancillary information in tab-delimited text as well as XML formats. See the README file for details.

This site contains individual directories for each organism with data in UniGene. The data for each species includes the unique sequence for each UniGene cluster, all sequences in each cluster in FASTA format and library information for the cluster. See the README file for further details.

This site contains the UniVec and UniVec_Core databases in FASTA format. See the README.uv file for details.

This site contains whole genome shotgun sequence data organized by the 4-digit project code. Data include GenBank and GenPept flat files, quality scores and summary statistics. See the README.genbank.wgs file for more information.

Submissions

An online form that provides an interface for researchers, consortia and organizations to register their BioProjects. This serves as the starting point for the submission of genomic and genetic data for the study. The data does not need to be submitted at the time of BioProject registration.

A web-based sequence submission tool for one or a few submissions to the GenBank database, designed to make the submission process quick and easy.

Tool for submission to the GenBank database of Barcode short nucleotide sequences from a standard genetic locus for use in species identification.

A stand-alone software tool developed by the NCBI for submitting and updating entries to public sequence databases (GenBank, EMBL, or DDBJ). It is capable of handling simple submissions that contain a single short mRNA sequence, complex submissions containing long sequences, multiple annotations, segmented sets of DNA, as well as sequences from phylogenetic and population studies with alignments. For simple submission, use the online submission tool BankIt instead.

A command-line program that automates the creation of sequence records for submission to GenBank using many of the same functions as Sequin. It is used primarily for submission of complete genomes and large batches of sequences.

This link describes how submitters of SRA data can obtain a secure NCBI FTP site for their data, and also describes the allowed data formats and directory structures.

A single entry point for submitters to link to and find information about all of the data submission processes at NCBI. Currently, this serves as an interface for the registration of BioProjects and BioSamples and submission of data for WGS and GTR. Future additions to this site are planned.

This link describes how submitters of trace data can obtain a secure NCBI FTP site for their data, and also describes the allowed data formats and directory structures.

DNA and RNA Tools


Finds regions of local similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as to help identify members of gene families.

Allows you to retrieve records from many Entrez databases by uploading a file of GI or accession numbers from the Nucleotide or Protein databases, or a file of unique identifiers from other Entrez databases. Search results can be saved in various formats directly to a local file on your computer.

Tools that provide access to data within NCBI's Entrez system outside of the regular web query interface. They provide a method of automating Entrez tasks within software applications. Each utility performs a specialized retrieval task, and can be used simply by writing a specially formatted URL.
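
As a rough illustration of what "writing a specially formatted URL" can look like, the sketch below assembles two request URLs in Python without sending them. The base address and the `esearch`/`efetch` utility names follow NCBI's published E-utilities conventions; the helper name `build_eutils_url`, the search term, and the example accession are assumptions chosen only for illustration.

```python
from urllib.parse import urlencode

# Base address of the E-utilities as documented by NCBI.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_eutils_url(utility, **params):
    """Return the request URL for one utility (hypothetical helper).

    `utility` is a name such as "esearch" or "efetch"; the keyword
    arguments become the URL query string.
    """
    return f"{EUTILS_BASE}/{utility}.fcgi?{urlencode(params)}"


# Search the Nucleotide database for up to five matching records.
search_url = build_eutils_url(
    "esearch",
    db="nucleotide",
    term="BRCA1[Gene Name] AND human[Organism]",  # illustrative query
    retmax=5,
)

# Fetch one record as FASTA text (the accession is only an example).
fetch_url = build_eutils_url(
    "efetch",
    db="nucleotide",
    id="NM_007294",
    rettype="fasta",
    retmode="text",
)

print(search_url)
print(fetch_url)
```

The URLs are only built here, not requested; an application would pass them to an HTTP client of its choice and should follow NCBI's usage guidelines on request rates.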

This tool compares nucleotide or protein sequences to genomic sequence databases and calculates the statistical significance of matches using the Basic Local Alignment Search Tool (BLAST) algorithm.

NCBI's Remap tool allows users to project annotation data and convert locations of features from one genomic assembly to another or to RefSeqGene sequences through a base-by-base analysis. Options are provided to adjust the stringency of remapping, and summary results are displayed on the web page. Full results can be downloaded for viewing in NCBI's Genome Workbench graphical viewer, and annotation data for the remapped features, as well as summary data, are also available for download.

An integrated application for viewing and analyzing sequence data. With Genome Workbench, you can view data in publicly available sequence databases at NCBI, and mix these data with your own data.

A graphical analysis tool that finds all open reading frames in a user's sequence or in a sequence already in the database. Sixteen different genetic codes can be used. The deduced amino acid sequence can be saved in various formats and searched against protein databases using BLAST.
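
The idea of scanning a sequence for open reading frames can be sketched in a few lines of Python. This is a simplified illustration, not the tool's own implementation: it assumes the standard genetic code only (ATG start; TAA, TAG, TGA stops), whereas the tool described above supports sixteen codes, and the function names and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code only
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, frame, start, end, orf) tuples. Coordinates are
    0-based and end-exclusive, relative to the strand being scanned.
    `min_codons` counts the stop codon as well.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None                   # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

An ORF that has a start codon but runs off the end of the sequence without a stop is not reported in this sketch; a real tool may treat such partial ORFs differently.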

The Primer-BLAST tool uses Primer3 to design PCR primers for a sequence template. The potential products are then automatically analyzed with a BLAST search against user-specified databases to check their specificity for the intended target.

A utility for computing alignments of proteins to genomic nucleotide sequence. It is based on a variation of the Needleman-Wunsch global alignment algorithm and specifically accounts for introns and splice signals. Due to this algorithm, ProSplign is accurate in determining splice sites and tolerant to sequencing errors.

Provides a configurable graphical display of a nucleotide or protein sequence and features that have been annotated on that sequence. In addition to use on NCBI sequence database pages, this viewer is available as an embeddable webpage component. Detailed documentation including an API Reference guide is available for developers wishing to embed the viewer in their own pages.

A utility for computing cDNA-to-Genomic sequence alignments. It is based on a variation of the Needleman-Wunsch global alignment algorithm and specifically accounts for introns and splice signals. Due to this algorithm, Splign is accurate in determining splice sites and tolerant to sequencing errors.
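
For readers unfamiliar with the Needleman-Wunsch algorithm mentioned above, here is a minimal Python sketch of the textbook version with a linear gap penalty. It does not reproduce the variation used by Splign or ProSplign, which additionally models introns and splice signals; the scoring values (match 1, mismatch -1, gap -1) and the function name are assumptions for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        """Substitution score for a[i-1] against b[j-1]."""
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),   # align the two characters
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

A spliced-alignment tool has to treat long gaps that correspond to introns very differently from ordinary insertions and deletions, which is why a plain linear gap penalty like the one in this sketch would not be sufficient for cDNA-to-genomic alignment.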

A system for quickly identifying segments of a nucleic acid sequence that may be of vector origin. VecScreen searches a query sequence for segments that match any sequence in a specialized non-redundant vector database (UniVec).