June 1, 2021

How to Compare and Cluster Every Known Genome in about an Hour

Author(s): Koren, Sergey

Given a massive collection of sequences, it is infeasible to perform pairwise alignment for basic tasks like sequence clustering and search. To address this problem, we demonstrate that the MinHash technique, first applied to clustering web pages, can be applied to biological sequences with similar effect, and extend this idea to include biologically relevant distance and significance measures. Our new tool, Mash, uses MinHash locality-sensitive hashing to reduce large sequences to a representative sketch and rapidly estimate pairwise distances between genomes or metagenomes. Using Mash, we explored several use cases, including a 5,000-fold size reduction and clustering of all 55,000 NCBI RefSeq genomes in 46 CPU hours. The resulting 93 MB sketch database includes all RefSeq genomes, effectively delineates known species boundaries, reconstructs approximate phylogenies, and can be searched in seconds using assembled genomes or raw sequencing runs from Illumina, Pacific Biosciences, and Oxford Nanopore. For metagenomics, Mash scales to thousands of samples and can replicate Human Microbiome Project and Global Ocean Survey results in a fraction of the time. Other potential applications include any problem where an approximate, global sequence distance is acceptable, e.g., to triage and cluster sequence data, assign species labels to unknown genomes, quickly identify mis-tracked samples, and search massive genomic databases. In addition, the Mash distance metric is based on simple set intersections, which are compatible with homomorphic encryption schemes. To facilitate integration with other software, Mash is implemented as a lightweight C++ toolkit and freely released under a BSD license at https://github.com/marbl/mash
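
The sketch-and-compare idea described above can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not Mash's actual implementation: it hashes k-mers with `hashlib.blake2b` as a stand-in hash function, ignores reverse complements, and uses function names (`kmer_hashes`, `sketch`, `jaccard_estimate`, `mash_distance`) and parameter values (`k=21`, `s=1000`) chosen for illustration. The distance formula, D = -(1/k) ln(2j / (1 + j)), follows the published Mash distance, where j is the Jaccard estimate computed from the two sketches.

```python
import hashlib
import math


def kmer_hashes(seq, k):
    """Return the set of 64-bit hashes of every k-mer in seq.

    Assumption: blake2b stands in for the hash function; reverse
    complements are not merged, unlike in a production tool.
    """
    hashes = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k].upper().encode()
        digest = hashlib.blake2b(kmer, digest_size=8).digest()
        hashes.add(int.from_bytes(digest, "big"))
    return hashes


def sketch(seq, k=21, s=1000):
    """Bottom-s MinHash sketch: the s smallest distinct k-mer hashes."""
    return sorted(kmer_hashes(seq, k))[:s]


def jaccard_estimate(sketch_a, sketch_b, s=1000):
    """Estimate the Jaccard index from two bottom-s sketches.

    Take the s smallest hashes of the merged sketches and count how
    many of them appear in both inputs (a simple set intersection).
    """
    set_a, set_b = set(sketch_a), set(sketch_b)
    merged = sorted(set_a | set_b)[:s]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def mash_distance(j, k=21):
    """Convert a Jaccard estimate j into a distance in [0, 1]."""
    if j <= 0:
        return 1.0  # no shared hashes: report the maximum distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


if __name__ == "__main__":
    # Hypothetical toy sequences; real inputs would be whole genomes.
    a = "ACGTTGCAAGGCTTAACCGGTTAGCATCGATCGGATCCATGCAAGT" * 3
    b = a.replace("GGATCC", "GGTTCC")
    j = jaccard_estimate(sketch(a), sketch(b))
    print("Jaccard estimate:", j, "distance:", mash_distance(j))
```

Because each genome is reduced to at most `s` integers, comparing two sketches costs a fixed amount of work regardless of genome length, which is what makes all-pairs comparison of tens of thousands of genomes tractable.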

Organization: Plant and Animal Genome
Year: 2016


