How to create a Unity Rhythm Game Part 2: Generating the Steps

Part 1

In the last article we went over how to create a parser to dynamically convert a valid .sm file into our own data structure (with the exception of some edge cases, such as the song having multiple bpms). So we now have a fully populated data structure that looks something like this:

SM Data Structure Example

If that didn’t make much sense, then don’t worry too much about it! The previous article goes over the structure in a little more detail. Now let’s tackle actually using this data. The plan is to dynamically spawn the arrows at the correct time so that when they fall down the screen (at a speed determined by the user’s difficulty) they hit the detection zone in sync with the song.

This may sound like a difficult task, but we can break it down rather easily with a simple physics equation:

Speed = Distance / Time

We want to find the time in the song at which we should spawn the arrow. We are given the speed through the difficulty the user has selected (the arrows/steps move faster at higher difficulties), and we pick a location off-screen as our fixed ‘spawn distance.’ With this information in mind we can re-arrange the equation to

Time = Distance / Speed

Now remember this equation, because it becomes fundamental to the calculation later.
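
To make the re-arranged equation concrete, here is a minimal sketch in plain C# (no Unity needed). The class and method names are made up for illustration, and the numbers in the usage note are hypothetical rather than the values the game uses. It assumes distance and speed share consistent units, for example world units and world units per second.

```csharp
using System;

public static class SpawnTiming
{
    // Time = Distance / Speed.
    // Returns how long an arrow needs to travel `distance` at `speed`.
    // Assumption: both values use consistent units (e.g. units and units per second).
    public static float LeadTime(float distance, float speed)
    {
        if (speed <= 0f)
        {
            throw new ArgumentOutOfRangeException(nameof(speed), "Speed must be positive.");
        }
        return distance / speed;
    }
}
```

With a spawn distance of 10 units and a speed of 5 units per second, LeadTime(10f, 5f) gives 2 seconds, so an arrow that should land 30 seconds into the song would need to spawn at the 28 second mark.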

The first thing is to set up the actual spawning process. To do this we need to initialize some of the variables we’ll need.

    public void InitSteps(Song_Parser.Metadata newSongData, 
                          Song_Parser.difficulties newDifficulty)
    {
        songData = newSongData;
        isInit = true;

        //We estimate how many seconds a single 'bar' will be in the song
        //Using the bpm provided in the song data
        barTime = (60.0f / songData.bpm) * 4.0f;
        
        difficulty = newDifficulty;
        distance = originalDistance;

        //We then use the provided difficulty to determine how fast the arrows 
        //will be going
        switch (difficulty)
        {
            case Song_Parser.difficulties.beginner:
                arrowSpeed = 0.007f;
                noteData = songData.beginner;
                break;
            case Song_Parser.difficulties.easy:
                arrowSpeed = 0.009f;
                noteData = songData.easy;
                break;
            case Song_Parser.difficulties.medium:
                arrowSpeed = 0.011f;
                noteData = songData.medium;
                break;
            case Song_Parser.difficulties.hard:
                arrowSpeed = 0.013f;
                noteData = songData.hard;
                break;
            case Song_Parser.difficulties.challenge:
                arrowSpeed = 0.016f;
                noteData = songData.challenge;
                break;
            default:
                goto case Song_Parser.difficulties.easy;
        }

        //This variable is needed when we look at changing the speed of the song
        //with the variable BPM mechanic
        originalArrowSpeed = arrowSpeed;
    }

A lot of what’s being done here is fairly straightforward: we are initializing the variables we need and setting the speed of the arrows to match the chosen difficulty. The one exception is this line:

barTime = (60.0f / songData.bpm) * 4.0f;

This line may seem a little weird, but what we’re doing is estimating how much time a bar will take in seconds given the song’s bpm. Bpm stands for ‘beats per minute’, meaning that if we had 120 bpm then we could represent that as

120 Beats = 60 seconds

So let’s take that logic and make it a little more generic

BPM Beats = 60 seconds

From this we can calculate the time taken for a single ‘beat’ by dividing 60 by the BPM

Time for a single beat = 60 / BPM

And we know that in most cases a bar will be 4 beats. The number of notes in a bar can differ, but when there are more notes they sit closer together, so the total time taken for the bar stays the same. With 4 beats to a bar we can extend the above equation to

Time for a Bar = 4 * Time for a single beat

Time for a Bar = 4 * (60 / BPM)

At 120 bpm that works out to 4 * (60 / 120) = 2 seconds per bar.

In case it wasn’t obvious yet, there’s gonna be a bit of math involved when calculating these values. I’ll go through it as best I can when it crops up.
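
As a quick sanity check on the bar-time calculation, here is a small standalone sketch that matches the barTime line in InitSteps, i.e. (60 / bpm) * 4. The class and method names are hypothetical, and the 4 beats per bar default is the same assumption the article makes.

```csharp
using System;

public static class BarTiming
{
    // Seconds taken by one beat: 60 seconds shared between `bpm` beats.
    public static float SecondsPerBeat(float bpm)
    {
        if (bpm <= 0f)
        {
            throw new ArgumentOutOfRangeException(nameof(bpm), "BPM must be positive.");
        }
        return 60.0f / bpm;
    }

    // Seconds taken by one bar, assuming 4 beats per bar unless told otherwise.
    public static float SecondsPerBar(float bpm, int beatsPerBar = 4)
    {
        return SecondsPerBeat(bpm) * beatsPerBar;
    }
}
```

At 120 bpm, SecondsPerBeat returns 0.5 and SecondsPerBar returns 2, which lines up with the hand calculation.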

So now that initialization is done, let’s look at the meat of the generation process.

    // Update is called once per frame
    void Update () 
    {
        //If we're done initializing the rest of the world
        //And we haven't gone through all the bars of the song yet
        if (isInit && barCount < noteData.bars.Count) 
        { 
            //We calculate the time offset using the s=d/t equation (t=d/s) 
            distance = originalDistance; 
            float timeOffset = distance / arrowSpeed;

            //Get the current time through the song 
            songTimer = heartAudio.time; 

            //If the current song time - the time Offset is greater than 
            //the time taken for all executed bars so far 
            //then it's time for us to spawn the next bar's notes 
            if (songTimer - timeOffset >= (barExecutedTime - barTime))
            {
                StartCoroutine(PlaceBar(noteData.bars[barCount++]));

                barExecutedTime += barTime;
            }
        }
    }

The above code lives in our Update method, which is a function called once per frame by the Unity engine. We need to keep this in mind, as it can affect how we structure our game logic.

We first check whether initialization has finished. If it has, we check how many bars of notes we have ‘spawned’ so far, and as long as we haven’t reached the end of the song we go into the main body of the method.

We then calculate the intended time offset to spawn the arrows using the current arrow’s speed and the intended spawn distance, using the equation we showed before; t=d/s.

The last piece of information we need to keep track of is the current time progress of the song (i.e. how far through the song we are in seconds). We get this through the heartAudio variable, which contains the Unity AudioSource object we’re playing music from.

Now that we have all the info we need, we can decide whether to spawn the next bar of notes on this frame or not. In order to do that we check whether the current time of the song, minus the offset required to spawn the arrows on-time is larger than the time taken to go through all the executed bars so far, minus one ‘bar time.’ The logic behind this is a little funny, and it took some trial and error to get it working, so don’t be disheartened if it doesn’t make a lot of sense to you.
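
If the condition is hard to picture, it can help to pull it out into a pure function and feed it numbers by hand. This is only a sketch of the same comparison used in Update; the class and method names are hypothetical, and the values in the usage note are made up.

```csharp
public static class SpawnCheck
{
    // Mirrors the check in Update:
    //   songTimer - timeOffset >= barExecutedTime - barTime
    // songTime:        how far through the song we are, in seconds
    // timeOffset:      how long an arrow takes to travel the spawn distance
    // barExecutedTime: running total of bar time added for bars spawned so far
    // barTime:         estimated length of a single bar, in seconds
    public static bool ShouldSpawnNextBar(float songTime, float timeOffset,
                                          float barExecutedTime, float barTime)
    {
        return songTime - timeOffset >= barExecutedTime - barTime;
    }
}
```

For example, with a bar time of 2 seconds, a time offset of 1 second and a barExecutedTime of 4 seconds, the check fails at 2.5 seconds into the song (1.5 is less than 2) and first passes at 3 seconds (2 is not less than 2).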

If this is true, we call a Unity-specific function called ‘StartCoroutine.’ StartCoroutine starts the function passed into it as a coroutine. A coroutine is not a separate thread: it still runs on Unity’s main thread, but it can pause itself partway through and be resumed by the engine on a later frame. If we didn’t have this then we’d need to wait until all the arrows in the bar were spawned before the update loop progressed. In some games this isn’t an issue, but in a rhythm game, where timing is key and the user is paying constant attention, we want anything lengthy to be spread across frames rather than holding up the main game loop.

We pass in a call to PlaceBar, handing it the bar stored at index barCount, and then increment barCount (adding 1 to its value). barCount lets us know which bar in our data structure we need to place next.

    IEnumerator PlaceBar(List<Song_Parser.Notes> bar)
    {
        for (int i = 0; i < bar.Count; i++)
        {
            if (bar[i].left)
            {
                GameObject obj = (GameObject)Instantiate(leftArrow, new Vector3(leftArrowBack.transform.position.x, leftArrowBack.transform.position.y + distance, leftArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = leftArrowBack;
            }
            if (bar[i].down)
            {
                GameObject obj = (GameObject)Instantiate(downArrow, new Vector3(downArrowBack.transform.position.x, downArrowBack.transform.position.y + distance, downArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = downArrowBack;
            }
            if (bar[i].up)
            {
                GameObject obj = (GameObject)Instantiate(upArrow, new Vector3(upArrowBack.transform.position.x, upArrowBack.transform.position.y + distance, upArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = upArrowBack;
            }
            if (bar[i].right)
            {
                GameObject obj = (GameObject)Instantiate(rightArrow, new Vector3(rightArrowBack.transform.position.x, rightArrowBack.transform.position.y + distance, rightArrowBack.transform.position.z - 0.3f), Quaternion.identity);
                obj.GetComponent<Arrow_Movement>().arrowBack = rightArrowBack;
            }
            yield return new WaitForSeconds((barTime / bar.Count) - Time.deltaTime);
        }
    }

Because we start this PlaceBar method through StartCoroutine, it has to have an ‘IEnumerator’ return type. This is also what lets us use Unity’s WaitForSeconds class to pause the coroutine for a certain period of time.

This method goes through all of the notes in that bar, and depending on which ‘step arrows’ are meant to be spawned, it creates an instance of that game object and initializes some stuff in their scripts.

After spawning the notes for a row, the coroutine waits for some time with the line:

yield return new WaitForSeconds((barTime / bar.Count) - Time.deltaTime);

In a usual method, a return would end that method’s execution and return a value to whatever called it. With a method that returns IEnumerator, however, we can use ‘yield return’ to return multiple things. What yield return actually does is hand a value back to the caller and pause the method at that point; when the method is resumed, it carries on from the line after the yield until it hits the next yield or reaches the end.

In this instance, we exploit this by returning a reference to the previously mentioned WaitForSeconds class. We get the method to wait for however long it takes to reach the time where the next note should appear. Otherwise, all the notes in the bar will be spawned at once, instead of separately when they should.
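
To see the pause-and-resume behaviour without Unity in the way, here is a small sketch using a plain C# iterator. Everything in it is a stand-in: the Run method plays the part of the engine by resuming the iterator itself, and the yielded number 500 stands in for a WaitForSeconds object. It isn’t how Unity schedules coroutines internally, just an illustration of what yield return does.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class YieldDemo
{
    // Records what happens, so the order of execution is visible afterwards.
    public static readonly List<string> Log = new List<string>();

    // A plain C# iterator standing in for a coroutine like PlaceBar.
    public static IEnumerator SpawnThreeNotes()
    {
        for (int i = 0; i < 3; i++)
        {
            Log.Add("spawn note " + i);
            // Hand a value back to the caller and pause here.
            // 500 is a stand-in for something like new WaitForSeconds(0.5f).
            yield return 500;
        }
        Log.Add("bar finished");
    }

    // Plays the role of the engine: resumes the iterator one step at a time.
    public static void Run()
    {
        Log.Clear();
        IEnumerator routine = SpawnThreeNotes();
        while (routine.MoveNext())
        {
            Log.Add("caller resumed, asked to wait " + routine.Current + " ms");
        }
    }
}
```

After calling YieldDemo.Run(), the log alternates between a ‘spawn note’ entry and a ‘caller resumed’ entry three times and then ends with ‘bar finished’. Control bounces back and forth between the method and its caller, all on one thread.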

We take ‘barTime’, which contains the time taken for a single bar from our initial estimate, and divide it by the number of notes in this bar (bar.Count) to find the time taken for each note in this bar. Finally, we subtract the current deltaTime. Delta time is the difference in time between this frame and the last; subtracting it is meant to stop delays caused by the instantiation process from making notes spawn late.
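
One thing worth knowing is that each wait here is relative to the previous one, so any small error can build up over a bar. A different approach (not the one used in this project) is to work out the absolute song time for every row up front and compare against the song timer. A minimal sketch, with hypothetical names:

```csharp
using System;

public static class RowTiming
{
    // Returns the song time (in seconds) at which each row of a bar is due,
    // assuming the rows are evenly spaced across the bar.
    // barStart: song time at which the bar begins
    // barTime:  length of the bar in seconds
    // rowCount: number of note rows in the bar (bar.Count in the article)
    public static float[] RowSpawnTimes(float barStart, float barTime, int rowCount)
    {
        if (rowCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(rowCount), "A bar needs at least one row.");
        }

        float interval = barTime / rowCount;
        var times = new float[rowCount];
        for (int i = 0; i < rowCount; i++)
        {
            times[i] = barStart + i * interval;
        }
        return times;
    }
}
```

For a 2 second bar starting at 4 seconds with 4 rows, this gives 4, 4.5, 5 and 5.5. The same bar with 8 rows would space them 0.25 seconds apart, which is the ‘more notes, closer together’ idea from earlier.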

With the steps generated, we add to the total barExecutedTime seen previously, and get ready to spawn the next bar!

If you were after a starting point on making a rhythm game then this will probably be enough for you. The next part of this writeup will be a quick one, discussing how I allowed the user to set a new BPM for a song by hitting the space bar to the rhythm wanted.

How to create a Unity Rhythm Game Part 1: Parsing the .SM file

Rhythm games are a genre I’ve only really been exposed to heavily this past year, thanks to one of my friends at my internship (Hi Ryan) being really into them. And I mean REALLY into them. Before this I thought the only rhythm games out there were Dance Dance Revolution and Guitar Hero/Rockband, the ones where you push a button as an icon reaches a certain point on the screen. However my eyes have been opened to the sins of my ignorance, as it seems Japan has decided to expand on the formula, and has gone “pfffffft. Fuck that weak shit, let’s add a vinyl record thing on it, a literal bongo drum, and maybe some giant dials for good measure.” Like, Japan took rhythm games to the extreme, and so it’s very easy to see why people would be inspired to give it a go themselves.

This was my thought process when I entered my University’s 48hr Game Jam a few months back. I wanted to make a simple rhythm game, adhering more to the formula of DDR, where you just had to press a button in time with an arrow on screen. However, in keeping with Japan’s crazy takes on the genre, I wanted to add a twist of my own: in my game you would be able to change the BPM of the song in real-time by tapping out a new bpm for it to follow. This would increase/decrease the speed of the song and therefore increase/decrease the score multiplier you would obtain.

With this long overdue article I’ll be walking you guys through the main parts of creating my game, starting with parsing the step files. For this project I decided to adapt the Stepmania files already provided. ‘Stepmania’ is a PC adaptation of the infamous DDR game, with all of its ‘step’ info stored in a text file.

The game has a huge available fan-base, meaning that I’d have a lot of songs to play around with from the get-go. It also saved me from designing a file format to store all the data I wanted.

First let’s have a look into the Stepmania format, stored in a .sm file. The sm files start with header information, which describes various bits of metadata for the song, including important information we’d need such as Song Name, Artist Name, BPM, and the file location of images the song would use in-game.

SM Header Example

After that comes the step data for each difficulty of the song. The steps for the song are separated into ‘bars’ of the song (I’d explain how bars in music work, but a lot of videos online explain it far better than I ever could!), with 4 number columns corresponding to the position of the ‘step’ and different number values corresponding to different note types.

The step information for the 'Butterfly' song

In the above image you can see for example ‘0010’ which would be nothing in the first, second or fourth column, and an arrow in the 3rd column. In the game this would look something like this:

Heartbeat Example
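
As a quick illustration of reading one of these rows, here is a minimal sketch that turns a row such as ‘0010’ into four flags. The NoteRow struct and the parser class are hypothetical names for this example only. The column order (left, down, up, right) follows the order the parser in this article uses, and anything other than ‘0’ is treated as a plain step.

```csharp
using System;

// Hypothetical stand-in for a single row of notes.
public struct NoteRow
{
    public bool Left;
    public bool Down;
    public bool Up;
    public bool Right;
}

public static class NoteRowParser
{
    // Converts a row such as "0010" into flags.
    // Assumption: columns are ordered left, down, up, right,
    // and any character other than '0' counts as a step.
    public static NoteRow Parse(string row)
    {
        if (row == null || row.Length < 4)
        {
            throw new ArgumentException("A note row needs at least 4 columns.", nameof(row));
        }

        return new NoteRow
        {
            Left = row[0] != '0',
            Down = row[1] != '0',
            Up = row[2] != '0',
            Right = row[3] != '0'
        };
    }
}
```

Parsing ‘0010’ sets only the third flag (Up), while ‘1001’ sets both Left and Right for a two-arrow step.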

So now that we’re on the same page, let’s talk about how we’re going to store all this data in our game. I went about creating a set of structures to house the various bits of information, including the song metadata and the steps themselves:

    //This structure contains all the information for this track
    public struct Metadata
    {
        //Is the song's structure valid?
        public bool valid;

        //The Title, Subtitle and Artist for the song
        public string title;
        public string subtitle;
        public string artist;

        //The file paths for the related images and song media
        public string bannerPath;
        public string backgroundPath;
        public string musicPath;

        //The offset that the song starts at compared to the step info
        public float offset;

        //The start and length of the sample that is played when selecting a song
        public float sampleStart;
        public float sampleLength;
       
        //The bpm the song is played at
        public float bpm;

        //The note data for each difficulty, 
        //as well as a boolean to check that data for that difficulty exists
        public NoteData beginner;
        public bool beginnerExists;
        public NoteData easy;
        public bool easyExists;
        public NoteData medium;
        public bool mediumExists;
        public NoteData hard;
        public bool hardExists;
        public NoteData challenge;
        public bool challengeExists;
    }

    //This structure contains all the bars for a song at a single difficulty
    public struct NoteData
    {
        public List<List<Notes>> bars;
    }

    //This structure contains note information for a single 'row' of notes
    //Right now it's just a simple "Is there a note there or not"
    //But this could be modified and expanded to support numerous note types
    public struct Notes
    {
        public bool left;
        public bool right;
        public bool up;
        public bool down;
    }

I explain in the code comments what each part of the structure contains, but the concept was that each song would have a Metadata instance, which would contain all the song’s information and all the step info for each difficulty (beginner, easy, medium, hard, challenge). The NoteData structure has a list of ‘Bars’, with each ‘Bar’ being represented by a list of Note-rows (‘0010’ in the text file would be a single instance of the ‘Notes’ structure). This is what contains the actual ‘steps’ of the song, the stuff the user will interact with.

The parsing of this file will come in multiple parts, but the first thing to be concerned with is actually extracting the information from the file in a format we can interact with. This is fairly straightforward, but I’ll include it for the sake of clarity.

        //Check if the file path is empty
        if (IsNullOrWhiteSpace(filePath))
        {
            //If so, Error and return invalid data
            Metadata tempMeta = new Metadata();
            tempMeta.valid = false;
            return tempMeta;
        }

        //Create a boolean variable that we'll use to check whether
        //we're currently parsing the notes or other metadata
        bool inNotes = false;

        Metadata songData = new Metadata();
        //Initialise Metadata
        //If it encounters any major errors during parsing, this is set to false and the song cannot be selected
        songData.valid = true;
        songData.beginnerExists = false;
        songData.easyExists = false;
        songData.mediumExists = false;
        songData.hardExists = false;
        songData.challengeExists = false;

        //Collect the raw data from the sm file all at once
        List<string> fileData = File.ReadAllLines(filePath).ToList();

        //Get the file directory, and make sure it ends with either forward or backslash
        string fileDir = Path.GetDirectoryName(filePath);
        if (!fileDir.EndsWith("\\") && !fileDir.EndsWith("/"))
        {
            fileDir += "\\";
        }

Above we kind of lay down the framework for our parser: we get the file path, check it’s valid and initialize our Metadata structure. With the groundwork in place we can start parsing the generic metadata. This will be used to display the song to the player correctly, taking advantage of the provided information to present the song in a more user-friendly way.

The next couple code segments are all within the containing for loop below, but will be separated for ease of reading.

        //Go through the file data
        for (int i = 0; i < fileData.Count; i++)
        {
            //Parse the data from the document
            string line = fileData[i].Trim();

            if (line.StartsWith("//"))
            {
                //It's a comment, ignore it and go to the next line
                continue;
            }
            else if (line.StartsWith("#"))
            {
                //the # symbol denotes generic metadata for the song
                string key = line.Substring(0, line.IndexOf(':')).Trim('#').Trim(':');

                switch (key.ToUpper())
                {
                    case "TITLE":
                        songData.title = line.Substring(line.IndexOf(':')).Trim(':').Trim(';');
                        break;
                    case "SUBTITLE":
                        songData.subtitle = line.Substring(line.IndexOf(':')).Trim(':').Trim(';');
                        break;
                    case "ARTIST":
                        songData.artist = line.Substring(line.IndexOf(':')).Trim(':').Trim(';');
                        break;
                    case "BANNER":
                        songData.bannerPath = fileDir + line.Substring(line.IndexOf(':')).Trim(':').Trim(';');
                        break;
                    case "BACKGROUND":
                        songData.backgroundPath = fileDir + line.Substring(line.IndexOf(':')).Trim(':').Trim(';');
                        break;
                    case "MUSIC":
                        songData.musicPath = fileDir + line.Substring(line.IndexOf(':')).Trim(':').Trim(';');
                        if (IsNullOrWhiteSpace(songData.musicPath) || !File.Exists(songData.musicPath))
                        {
                            //No music file found!
                            songData.musicPath = null;
                            songData.valid = false;
                        }
                        break;
                    case "OFFSET":
                        if (!float.TryParse(line.Substring(line.IndexOf(':')).Trim(':').Trim(';'), out songData.offset))
                        {
                            //Error Parsing
                            songData.offset = 0.0f;
                        }
                        break;
                    case "SAMPLESTART":
                        if (!float.TryParse(line.Substring(line.IndexOf(':')).Trim(':').Trim(';'), out songData.sampleStart))
                        {
                            //Error Parsing
                            songData.sampleStart = 0.0f;
                        }
                        break;
                    case "SAMPLELENGTH":
                        if (!float.TryParse(line.Substring(line.IndexOf(':')).Trim(':').Trim(';'), out songData.sampleLength))
                        {
                            //Error Parsing
                            songData.sampleLength = sampleLengthDefault;
                        }
                        break;
                    case "DISPLAYBPM":
                        if (!float.TryParse(line.Substring(line.IndexOf(':')).Trim(':').Trim(';'), out songData.bpm) || songData.bpm <= 0)
                        {
                            //Error Parsing - BPM not valid
                            songData.valid = false;
                            songData.bpm = 0.0f;
                        }
                        break;
                    case "NOTES":
                        inNotes = true;
                        break;
                    default:
                        break;
                }
            }

With the above code, we go through each line of the file and begin to parse it. As seen in the screenshots at the top of the page, the song’s information is specified with a ‘#’ followed by a key term, and we can use this to determine what information each line contains. Once we hit a ‘NOTES’ section we know the next info we find is note data. We flag that this is the case, allowing the next bit of code to be called:

            //If we're now parsing step data
            if (inNotes)
            {
                //We skip some feature we're not implementing for now
                if (line.ToLower().Contains("dance-double"))
                {
                    //And update the for loop we're in to adequately skip this section
                    for(int j = i; j < fileData.Count; j++)
                    {
                        if (fileData[j].Contains(";"))
                        {
                            i = j - 1;
                            break;
                        }
                    }
                }

                //Check if it's a difficulty
                if (line.ToLower().Contains("beginner") ||
                    line.ToLower().Contains("easy") ||
                    line.ToLower().Contains("medium") ||
                    line.ToLower().Contains("hard") ||
                    line.ToLower().Contains("challenge"))
                {
                    //And if it does have a difficulty declaration
                    //Then we're at the start of a 'step chart' 
                    string difficulty = line.Trim().Trim(':');

                    //We update the parsing for loop to after the current step chart, and also record the note data along the way
                    //We can then analyse the note data and parse it further
                    List<string> noteChart = new List<string>();
                    for (int j = i; j < fileData.Count; j++)
                    {
                        string noteLine = fileData[j].Trim();
                        if (noteLine.EndsWith(";"))
                        {
                            i = j - 1;
                            break;
                        }
                        else
                        {
                            noteChart.Add(noteLine);
                        }
                    }

                    //Here we determine what difficulty we're in, and begin parsing the accompanied note data
                    switch (difficulty.ToLower().Trim())
                    {
                        case "beginner":
                            songData.beginnerExists = true;
                            songData.beginner = ParseNotes(noteChart);
                            break;
                        case "easy":
                            songData.easyExists = true;
                            songData.easy = ParseNotes(noteChart);
                            break;
                        case "medium":
                            songData.mediumExists = true;
                            songData.medium = ParseNotes(noteChart);
                            break;
                        case "hard":
                            songData.hardExists = true;
                            songData.hard = ParseNotes(noteChart);
                            break;
                        case "challenge":
                            songData.challengeExists = true;
                            songData.challenge = ParseNotes(noteChart);
                            break;
                    }
                }
                if (line.EndsWith(";"))
                {
                    inNotes = false;
                }
            }
        }

We’re not quite at parsing the step chart yet. We first need to grab some accompanying data, namely the song’s difficulty. We then call another method called ‘ParseNotes’ and pass in the raw note data, where it will be translated into the NoteData structure and assigned to the appropriate variable.

    private NoteData ParseNotes(List<string> notes)
    {
        //We first instantiate our structures
        NoteData noteData = new NoteData();
        noteData.bars = new List<List<Notes>>();

        //And then work through each line of the raw note data
        List<Notes> bar = new List<Notes>();
        for(int i = 0; i < notes.Count; i++)
        {
            //Based on different line properties we can determine what data that
            //line contains, such as a semicolon dictating the end of the note data
            //or a comma indicating the end of that bar
            string line = notes[i].Trim();

            if (line.StartsWith(";"))
            {
                break;
            }

            if (line.EndsWith(","))
            {
                noteData.bars.Add(bar);
                bar = new List<Notes>();
            }
            else if (line.EndsWith(":"))
            {
                continue;
            }
            else if (line.Length >= 4)
            {
                //When we have a single 'note row' such as '0010' or '0110'
                //We check which columns will contain 'steps' and mark the appropriate flags
                Notes note = new Notes();
                note.left = false;
                note.down = false;
                note.up = false;
                note.right = false;

                
                if (line[0] != '0')
                {
                    note.left = true;
                }
                if (line[1] != '0')
                {
                    note.down = true;
                }
                if (line[2] != '0')
                {
                    note.up = true;
                }
                if (line[3] != '0')
                {
                    note.right = true;
                }

                //We then add this information to our current bar and continue until end
                bar.Add(note);
            }
        }

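        //The final bar ends with the chart's ';' rather than a ',' so it is
        //never added inside the loop; add it here so it isn't dropped
        if (bar.Count > 0)
        {
            noteData.bars.Add(bar);
        }

        return noteData;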
    }

And with the above code implemented, we now have our full parsing, allowing us to correctly convert the text data into our structure representation of the song’s data. And that’s the parsing done! The next step is to use this structure we have to spawn the arrows/steps in the game for the player to interact with.
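
If you want to check the bar-splitting on its own, here is a stripped-down sketch that keeps each row as a plain string instead of converting it to flags. The class name is hypothetical. It assumes the comma that ends a bar sits on its own line, and it also keeps the final bar, which is closed by the chart’s ‘;’ rather than a ‘,’.

```csharp
using System.Collections.Generic;

public static class BarSplitter
{
    // Groups raw chart lines into bars.
    // A line ending in ',' closes the current bar; a line starting with ';' ends the chart.
    // Assumption: the ',' separator is on a line of its own.
    public static List<List<string>> Split(IEnumerable<string> lines)
    {
        var bars = new List<List<string>>();
        var bar = new List<string>();

        foreach (string raw in lines)
        {
            string line = raw.Trim();

            if (line.StartsWith(";"))
            {
                break;
            }

            if (line.EndsWith(","))
            {
                bars.Add(bar);
                bar = new List<string>();
            }
            else if (line.EndsWith(":"))
            {
                // Chart header line, not note data.
                continue;
            }
            else if (line.Length >= 4)
            {
                bar.Add(line);
            }
        }

        // Keep the last bar, which has no trailing ',' to close it.
        if (bar.Count > 0)
        {
            bars.Add(bar);
        }
        return bars;
    }
}
```

Feeding it the lines 1000, 0100, 0010, 0001, a comma, then 1001, 0000, 0000, 0000 and a closing semicolon gives two bars of four rows each.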

There were a few ways to tackle this, one of them being to spawn all the ‘arrow’ instances at once, and have them all just scroll down throughout the song’s run time, however I felt this was a bit resource heavy, and wanted to opt for a bit of a more complex, but efficient route: Spawning the arrows in real-time to have them hit their targets at the exact moment in the song you want them to.

Easy, right?

Part 2