Bioinformatics with Swift: Ep. 5 — Computing GC Content

Let’s get back to work exploring another Rosalind bioinformatics challenge... in Swift!

Ocean Paradise (they/them)
5 min read · Oct 31, 2023

Welcome to another thrilling episode of our Swift-based bioinformatics journey. In this post, we’ll tackle Rosalind’s fifth bioinformatics challenge: computing the GC content of a set of DNA strings and determining which string has the highest GC content. As a bonus, we’ll explore the FASTA format used to label DNA strings. If you’re ready to unravel the genetic mysteries, let’s get started!

Dear reader, it really helps me out when you clap for my work, highlight sections you find interesting or important, comment below, and (most importantly) follow me! So if you enjoy the following article, please consider doing any of those tiny things to really brighten my day! :-D

Understanding the Problem

DNA strings contain genetic information composed of four nucleotides: Adenine (A), Thymine (T), Cytosine (C), and Guanine (G). In bioinformatics, understanding the composition of DNA strings is crucial.

The GC content of a DNA string is the percentage of symbols in the string that are either ‘C’ or ‘G’. For example, in the string “AGCTATAG,” the GC content is 37.5%, as 3 out of 8…
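The definition above can be sketched in a few lines of Swift. This is a minimal illustration, not necessarily the approach the rest of the article takes: the function name `gcContent(of:)` is a hypothetical helper introduced here, and it assumes the input uses uppercase nucleotide symbols only.

```swift
// Hypothetical helper (not from the original article): returns the GC content
// of a DNA string as a percentage between 0 and 100.
// Assumption: the string uses uppercase symbols (A, C, G, T).
func gcContent(of dna: String) -> Double {
    // An empty string has no symbols, so report 0 instead of dividing by zero.
    guard !dna.isEmpty else { return 0 }

    // Count the symbols that are either "C" or "G".
    let gcCount = dna.filter { $0 == "C" || $0 == "G" }.count

    // Convert the ratio to a percentage.
    return Double(gcCount) * 100 / Double(dna.count)
}

let example = "AGCTATAG"
print(gcContent(of: example))  // 37.5
```

Here 3 of the 8 symbols (G, C, G) count toward the total, which gives the 37.5% from the example. Returning 0 for an empty string is a design choice made for this sketch; returning an optional would be an equally reasonable alternative.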
