News


New datasets, new drivers, new versioning approach

November 17th 2017

We have added three new datasets to CancerGD.org.

As these datasets cover a significantly larger number of cell lines than previous screens, we have also increased the coverage of driver genes. We now store dependencies for 102 driver genes.

The dataset of Tsherniak et al. contains a superset of the cell lines screened in Cowley et al. and uses an improved algorithm for identifying gene-specific dependencies. We previously stored dependencies identified using the Cowley et al. dataset, but we have dropped them from this release as they have now been superseded by those from Tsherniak et al.

We anticipate this will be a common pattern: new datasets will expand upon previously published studies, combining earlier screens with new ones. We will always endeavour to store the most up-to-date version of each dataset (e.g. Tsherniak et al. rather than Cowley et al.). However, for the sake of reproducibility, we will continue to provide older versions of the database in both SQLite and CSV format. We have added a new downloads page where you can download historic versions of the database.

We hope you find the updates useful and, as always, we welcome any feedback.