
D3G Release 19.06
===

D3G provides a collection of RNA data, drawn from publicly available databases as well as from our own experiments, to facilitate the development of oligonucleotide therapeutics. Please see the [release note](https://d3g.riken.jp/release/latest/) for details.




Note that this is an ___alpha release.___


This release provides RNA data sets for four species (human, mouse, crab-eating monkey, and common marmoset), which include:

* Gene models from RefSeq, Gencode, and ENSEMBL, obtained from the UCSC Genome Browser Database
* Our original gene models (CageAssociatedTranscriptome, abbreviated CAT) for crab-eating monkey and common marmoset, assembled from RNA-seq and CAGE

Nucleotide sequences of RNAs (before and after splicing) are also provided for a subset of the gene models.
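
To illustrate what "before and after splicing" means for a gene model, here is a minimal Python sketch. It assumes exon coordinates are 0-based, half-open intervals on the genome, as in UCSC genePred and BED files. The function names and the toy sequence are hypothetical and are not part of the D3G distribution; the actual D3G files may be organized differently.

```python
# Hypothetical helpers for illustration only; not part of D3G.
# Sequences use the DNA alphabet (T rather than U), as genome files do.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def unspliced_rna(genome, exons, strand):
    """Sequence before splicing: transcript start to end, introns included."""
    start = min(s for s, _ in exons)
    end = max(e for _, e in exons)
    seq = genome[start:end]
    return reverse_complement(seq) if strand == "-" else seq


def spliced_rna(genome, exons, strand):
    """Sequence after splicing: exons concatenated in genomic order."""
    seq = "".join(genome[s:e] for s, e in sorted(exons))
    return reverse_complement(seq) if strand == "-" else seq


# Toy chromosome with two exons separated by one intron (assumed data).
genome = "AAACCCGGGTTTACGT"
exons = [(0, 3), (6, 9)]

print(unspliced_rna(genome, exons, "+"))  # AAACCCGGG
print(spliced_rna(genome, exons, "+"))    # AAAGGG
print(spliced_rna(genome, exons, "-"))    # CCCTTT
```

For a minus-strand transcript the exons are still joined in genomic order and the result is reverse-complemented once at the end, which gives the same sequence as reverse-complementing each exon and joining them in transcript order.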



Data version and source
---

* human
	- Genome assembly: GRCh38/hg38.p12
	- RefSeq (curated): NCBI Homo sapiens Annotation Release 109 (2018-03-29)
	- Gencode: V30
* mouse
	- Genome assembly: GRCm38/mm10
	- RefSeq (curated): GCF_000001635.25_GRCm38.p5 (2017-08-04)
	- Gencode: VM18
* crab-eating monkey
	- Genome assembly: macFas5
	- RefSeq (curated): 2019-05-22
	- ENSEMBL: 95
	- CageAssociatedTranscriptome (original): 19.06
* common marmoset
	- Genome assembly: calJac3
	- RefSeq (curated): 2019-05-22
	- ENSEMBL: 91
	- CageAssociatedTranscriptome (original): 19.06

All data were downloaded from the UCSC Genome Browser Database (http://hgdownload.soe.ucsc.edu/goldenPath/) in June 2019.



Leadership and contact
---
The transcriptome profiling experiments on non-human primates and the database development were carried out by RIKEN, Shiga University of Medical Science, CIEA (Central Institute for Experimental Animals), and DBCLS.



	email: d3g@ml.riken.jp



Data use / embargo
---

We share this data set, in particular the transcriptome data of non-human primates, to contribute to drug development. We are still improving the transcriptome data and intend to describe the transcriptome landscape in one or more publications. Please respect the embargo on presenting analyses that use the pre-publication data we release via this website and the relevant archives. The embargo does not apply to analyses of a small number of loci, gene families, or oligonucleotide sequences; it applies to comprehensive, large-scale analyses.


How to cite
---
Please cite our database as follows:

	D3G: Database for Drug Development based on Genome and RNA sequences, https://d3g.riken.jp, 2019


Acknowledgement
---
We thank AMED (Japan Agency for Medical Research and Development), NIHS (National Institute of Health Sciences), JPMA (Japan Pharmaceutical Manufacturers Association), and the FANTOM consortium for helpful advice and fruitful discussions. We thank Dr. Hon for sharing the script used to build the CAGE Associated Transcriptome. The experiments and the database are financially supported by AMED.



References
---
* Genome Reference Consortium
	- Church DM, et al. Modernizing reference genome assemblies. PLoS Biol. 9:e1001091. 2011. doi: 10.1371/journal.pbio.1001091. PMID: 21750661
* RefSeq
	- O'Leary NA, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44:D733-45. 2016. doi: 10.1093/nar/gkv1189. PMID: 26553804
* Gencode
	- Harrow J, et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22:1760-74. 2012. doi: 10.1101/gr.135350.111. PMID: 22955987
* CAGE Associated Transcriptome
	- Hon CC, et al. An atlas of human long non-coding RNAs with accurate 5' ends. Nature. 543:199-204. 2017.

